3
Are the Efficiency Metrics Used by Federal Research and Development Programs Sufficient and Outcome-Based?

The second question the committee was asked to address is whether the efficiency metrics adopted by various federal agencies for R&D programs are “sufficient” and “outcome-based.” The committee spent considerable time on this question, attempting to clarify why the issue has created such difficulties for agencies.

ATTEMPTING TO EVALUATE EFFICIENCY IN TERMS OF ULTIMATE OUTCOMES

In its guidance for undertaking Program Assessment Rating Tool (PART) evaluations, the Office of Management and Budget (OMB) clearly prefers that evaluation techniques should be related to the “outcomes” of the program; in other words, the metrics are to be outcome-based, including those for programs that perform R&D. For example, the R&D Investment Criteria state: “R&D programs should maintain a set of high priority, multi-year R&D objectives with annual performance outputs and milestones that show how one or more outcomes will be reached” (OMB 2007, p. 75).

In the case of the Environmental Protection Agency (EPA), that would mean that the efficiency of the research should be evaluated in terms of how much it contributes to improvements in the mission objectives of human health and environmental quality. However, even as OMB presses for the use of outcome metrics, it acknowledges substantial difficulties in applying them. OMB points out, for example, that when the ultimate outcome of a research program is lives saved or avoidance of property damage, it may be the product of local or state
actions, including political and regulatory actions that are beyond the agency’s control and distant in time from the original research.

The committee believes that a link with ultimate outcomes is not the correct criterion for determining the sufficiency of metrics for evaluating research. Indeed, after analyzing agencies’ attempts to measure outcome-based efficiency, the committee concluded that for most research programs ultimate-outcome-based metrics for evaluating the efficiency of research are neither achievable nor valid. The committee considers this issue to be of such importance for its report that it amplifies its reasoning as follows:

• There is often a large gap in time between the completion of research and the ultimate “outcome” of the research. In the case of EPA, for instance, the gap often is measured in years or even decades, commonly because the true outcome can be identified only by epidemiologic or ecologic studies that necessarily lag the original research itself. Thus, a retrospective outcome-based evaluation may be attempting to evaluate the “efficiency” of research conducted decades previously. Such an evaluation, if it can be done, may have little relevance to research being undertaken at the time of the evaluation.
• A number of entities over which the research program has no control are responsible for translating research results into outcomes. In the case of EPA, such translation can involve multiple steps even for problem-driven research. The EPA program office has to convert research results into a risk-management strategy that complies with legislative requirements. That strategy undergoes substantial review and comment by other government agencies, the regulated community, and the public before it can be adopted. It may even be subjected to judicial review. When it is finally adopted, state agencies usually perform the implementation chores with their own corresponding risk-management strategies and programs. Even then, no ultimate outcomes appear until people, businesses, or other government units take action in response to the programs and their accompanying rules and incentives. The initial research program has no influence over any of those steps. If the initial activity is core research, the number and variety of organizations and individuals involved in producing outcomes may be even greater.
• The results of research may change the nature of the outcome. The purpose of research is to produce knowledge, and new knowledge adds to the understanding of which outcomes are possible and desirable. To take another example of problem-driven research supported by EPA, suppose that results of a research project suggest that a particular chemical is toxic. That information may be only indicative, not definitive. EPA may launch a research program to confirm whether the chemical has toxic effects in humans or the natural environment. Results confirming toxicity would be expected to lead to a risk-management strategy that produces an ultimate outcome of reduced risk and improved health. In addition, EPA’s research on chemicals and development of toxicity screening tests provide industry with tools that impact their choice of which chemicals to develop for the market. If the research provides no evidence of toxicity, no risk-management strategy will be developed; that is, there will be no “ultimate outcome.” That would not mean that the research had no value; the “intermediate outcome” produced by the research would have provided knowledge that prevents unnecessary (and inefficient) actions, and the research would have been effective even though it did not produce reviewable ultimate outcomes.

Thus, ultimate outcomes of research are not useful criteria for measuring research efficiency, and ultimate-outcome-based metrics proposed by federal agencies to evaluate research efficiency cannot be sufficient.

PLACING “RESEARCH EFFICIENCY” IN PERSPECTIVE

If evaluating the efficiency of a research program in terms of ultimate outcomes is not feasible, what is the most appropriate (or “sufficient”) way to evaluate a research program? Of first priority, in the committee’s view, is an evaluation that is comprehensive, that is, one that applies all three categories of criteria used by PART (see Appendix G) and discussed above:

• Relevance: how well the research supports the agency’s mission, including the timeliness of the project or program.
• Quality: the contribution of the research to expanding knowledge in a field and some attributes that define sound research in any context: its soundness, accuracy, novelty, and reproducibility.
• Performance: described in terms of both effectiveness, meaning the ability to achieve useful results,¹ and efficiency, meaning the ability to achieve quality, relevance, and effectiveness in timely fashion with little waste.

Nonetheless, OMB representatives stated at the workshop that a program is unlikely to receive a favorable review without a positive efficiency grade even if quality, relevance, and effectiveness are demonstrated.
The committee rejected that approach because efficiency should not be evaluated independently but should be regarded as a relatively minor element of a comprehensive evaluation. As proposed in the PART guidance, efficiency constitutes only a portion of the performance criterion, which itself is one of the three major evaluation criteria.

¹ That is, research of quality, relevance, and efficiency is effective only if the information it produces is in usable form.

Efficiency, of course, is a desirable goal, and it should be measured to the extent possible. That is commonly the case with input and output functions, whose efficiency may be clearly reflected in quantitative terms of hours, personnel, dollars, or other standard metrics. But undue emphasis on the single criterion of efficiency leads to imbalances. For example, if the main objective of a program is to receive, review, and return the grant proposals of researchers, it is desirable to do so promptly; but no one would recommend reducing review time to zero, because that practice, although undoubtedly efficient, would inevitably reduce the quality of review.

Many have cautioned against the reliance solely on quantifiable metrics for making decisions about the value of a program. This is particularly true if these metrics are collected solely to evaluate efficiency, rather than assessing the larger issue of the program’s quality, relevance, and performance. Organizations often tend to manage what they measure, which can result in distortions in organizational emphasis and compromise the objectives of the program in question (Blau 1963). For example, in an assessment of the influence of the federal government’s statistical requirements to evaluate local programs, De Neufville (1987, p. 346) found that “the required statistics seldom genuinely informed or directly affected program decisions by what they showed. Rather, they were assemblages of numbers tacked onto proposals. Indeed the preparation of the statistics was often delegated to a junior staff member or data analyst and done independently of the rest of planning…Local governments had little incentive or occasion to use them in any analytic way…Thus the required statistics became merely window dressing—part of the ritual of grant getting. As such they were not particularly accurate, but they were accepted. Few bothered to point out their limitations. It simply did not matter.” Similarly, Weiss and Gruber’s (1987) review of federal collection and distribution of education statistics noted that these can be guided by political influences.

PROCESS EFFICIENCY AND INVESTMENT EFFICIENCY

In the committee’s view, the situation described in the preceding sections presents a conundrum, as follows:

• Demonstration of outcome-based efficiency of research programs is strongly urged for PART compliance.
• Ultimate-outcome-based metrics of research efficiency are neither achievable nor valid.

In the face of that conundrum, the committee found that PART asks two kinds of questions about efficiency: one explicit and one implicit. The explicit question applies to inputs and outputs, which should be identified and measured in their own right. Many cases of such efficiency can be characterized by the term process efficiency. That is most easily seen in aspects of R&D related to administration, facilities, and construction. Process efficiency can be measured by, for example, how fast a building is constructed, how closely a construction process adheres to budget, and what percentage of external grants is evaluated by peer review within a given period. Such activities can usually be described quantitatively (for example, in dollars or units of time) and measured against milestones, as described earlier in the case of earned-value management (EVM).

The implicit question about efficiency has to do with ultimate outcomes, for which PART prefers quantitative measures against milestones. In contrast to process activities, some major aspects of a research program cannot be evaluated in quantitative terms or against milestones.² The committee describes such aspects under the heading investment efficiency, the efficiency with which a research program is planned, funded, adjusted, and evaluated. Investment efficiency focuses on portfolio management, including the need to identify the most promising lines of research for achieving desired ultimate outcomes. It can be evaluated in part by assessing a program’s strategic planning architecture.

When an agency or research manager “invests” in research, the first step is to identify a desired outcome and a strategy to reach it. In general, investing in a research program involves close attention to three questions: are the right investments being made, is the research being performed at a high level of quality, and are timely and effective adjustments made in the multi-year course of the work to reflect new scientific information, new methods, and altered priorities? Those questions, especially the first, cannot be answered quantitatively, because the answers require judgment based on experience. Judgment is required to ensure that investment decisions are linked to strategic and multi-year plans (relevance), that the research is carried out at the highest level by the best people (quality), that funds are invested wisely in the right lines of research (effectiveness), and that the most economical management techniques are used to perform the research (efficiency). It is important to emphasize the value of ultimate outcomes for assessing the relevance of research.
The concept of investment efficiency may be applied to studies that guide the next set of research projects or stepwise development of analytic tools or other products (Boer 2002).³ An appropriate way to evaluate investment efficiency is to use expert-review panels, as described in Chapter 2. These considerations are indeed addressed by various PART questions in sections 1 and 2. However, the concept of investment efficiency, central to the performance of research, is not addressed in questions 3.4 or 4.3, those that specifically deal with efficiency.

² In its guidance, OMB recognizes the problem, concluding that “it may be difficult to express efficiency measures in terms of outcome. In such cases, acceptable efficiency measures could focus on how to produce a given output level with fewer resources” (OMB 2006, p. 10).

³ Boer proposes a method for valuing plans by which the value may be analyzed quantitatively and increased by good management over time.

WHAT ARE “SUFFICIENT” METRICS OF PROCESS EFFICIENCY?

As indicated in Chapter 2, no efficiency metrics currently used by agencies and approved by OMB to comply with PART succeed in measuring investment efficiency. Instead, the metrics address issues that fall under the heading of process efficiency. Process-efficiency metrics should meet the test of “sufficiency,” however, and several questions can help in the framing of such a test. As the committee was tasked with evaluating whether these efficiency measures were “sufficient,” it has developed its own questions for evaluating sufficiency below.

First, does the metric cover a representative portion of the program’s operations? Metrics that pertain to only a small part of a program fail to indicate convincingly whether the program as a whole is managed efficiently. They may also create misguided incentives for program managers to improve the small portions of the program being assessed to the detriment of the rest. It is possible to use different metrics for different parts of a program, and indeed this approach may be useful to research-program managers; however, it is difficult to combine different metrics into a single number that represents the efficiency of an entire program. EPA acknowledged this difficulty when it noted that a major problem in applying any single metric across even a single agency is the variations among programs (see Appendix B).

Second, does the metric address both outputs and inputs of the program? The goal of a research program should be to produce desired outputs quickly and at minimal cost (that is, with minimal inputs) without diminishing their quality. Thus, a metric of efficiency should measure whether a program is producing its intended outputs. However, the use of such a metric requires that the program have some quality-assurance and quality-control (QA/QC) process to ensure that the drive for increased efficiency does not diminish the quality of the outputs.

Two other questions, although they do not directly determine sufficiency, should be asked about a proposed metric.
The first is whether its use is likely to create undesirable incentives for researchers and research managers. A common adage in program evaluation is that “you get what you measure.” A measurement system should not set up incentives that are detrimental to the operation of the program being evaluated (Drickhamer 2004). For example, program managers can often make adjustments that “meet the measure” without actually improving, and sometimes while adversely affecting, the program being evaluated.⁴ Grizzle (2002) discussed the unintended consequences of the pervasive practice of performance measurement. She noted that “we expect that measuring efficiency leads to greater efficiency and measuring outcomes leads to better outcomes, but we don’t always get the results we expect.”

⁴ Drickhamer (2004) includes an example provided by Andy Carlino, a management consultant: “Carlino says he once worked for a large organization where plant managers received a bonus if they reduced direct labor. The easiest way to do that is to automate, which is what happened at this company in a big way. The result, he recalls, was more downtime and poor quality, which required more support by indirect labor, which led directly to customer quality and delivery issues. At the end of the day, direct labor went down, but total costs increased.”

A second question, appropriate for all levels of compliance, is whether collecting the required information adds sizable administrative costs. Evaluation requirements, particularly when not carefully attuned to the program being evaluated, can cause program managers, administrators, and ultimately taxpayers substantial expense in collecting and processing information; the committee heard statements to that effect from research-intensive agencies attempting to comply with PART. Under the best of circumstances, the evaluation metric should depend on data already being collected to manage the program effectively and efficiently. Otherwise, the effort to measure the program’s efficiency can reduce the efficiency desired by both the agency and OMB.

A CRITIQUE OF THE EFFICIENCY METRICS USED BY FEDERAL RESEARCH PROGRAMS

In light of those questions, it is appropriate to ask how well the metrics used by federal research programs meet the test of sufficiency. Chapter 2 described types of efficiency metrics that have been proposed or adopted by federal agencies to comply with PART (see Appendix E for details). The committee has examined nine of those metrics in the context of the four questions posed above and produced the following assessment.

1. Time to Process, Review, and Award Grants

Several agencies use time required to process grant requests as an efficiency metric. Such a metric is valuable if the awarding of grants is the purpose and primary output of the research program. In such cases, it would also satisfy the criterion of covering a substantial portion of the research program. One weakness, however, is that the measurement unit of inputs is time rather than a more inclusive metric of resources, such as total applicable administrative costs.
As a result, changes in the program’s administrative budget can result in changes in the metric that do not truly reflect changes in efficiency.⁵ A second problem is that continual “improvement” of this metric (reducing the time required) meets a point of diminishing returns. Some amount of time is required to conduct efficient peer review and otherwise identify the best research proposals. Excessively reducing the resources (including time) devoted to those efforts could substantially reduce the quality, relevance, effectiveness, and even efficiency of the research being funded.

⁵ For example, if a 20% decrease in administrative budget resulted in a 10% increase in the time required to award grants, this would appear as a decrease in efficiency according to the metric although it would be measured as an increase in efficiency if total costs were used as the denominator.
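
The arithmetic in the footnote's example (a 20% budget decrease paired with a 10% increase in award time) can be sketched in a few lines of Python. This is an illustration only, not a method used by the committee or any agency: the baseline budget, baseline award time, and number of grants are hypothetical values, and the number of grants awarded is assumed to stay the same in both periods.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# Hypothetical baseline values (assumptions, not taken from the report).
baseline_budget = 1_000_000.0  # administrative dollars per year
baseline_days = 100.0          # average days to award a grant
grants_awarded = 200           # assumed unchanged between the two periods

# Changes from the footnote's example: budget down 20%, award time up 10%.
new_budget = baseline_budget * 0.80
new_days = baseline_days * 1.10

# Time-based metric: more days per award reads as lower efficiency.
time_change = pct_change(baseline_days, new_days)

# Cost-based metric: fewer administrative dollars per grant reads as higher efficiency.
cost_change = pct_change(baseline_budget / grants_awarded,
                         new_budget / grants_awarded)

print(f"Time to award: {time_change:+.1f}%")
print(f"Administrative cost per grant: {cost_change:+.1f}%")
```

Under these assumptions the same change looks worse on the time metric and better on the cost metric, which is why the choice of input unit matters.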

2. Proportion of Research Budget Consumed by Administrative Functions (Overhead Ratio)

This metric has the advantage of being able to cover the entire research program. A disadvantage is that it includes no measurement of research outputs. It cannot be improved indefinitely; some amount of administrative cost is necessary for managing a high-quality, relevant, effective, and, indeed, efficient research program. If administrative costs are reduced too far, all those characteristics will be lessened.

3. Publications Per Full-Time Equivalent or Per Dollar

Some agencies have proposed an efficiency metric of publications per FTE or per dollar. It is a useful metric for programs whose primary purpose is to produce publications, because it considers both inputs (FTEs or dollars) and outputs (publications). Here again some mechanism is needed to evaluate the relevance, quality, and effectiveness of the publications. Using a dollar metric rather than an FTE metric provides a better indication of efficiency because it considers total resources.

Some agencies use the number of peer-reviewed publications as a measure of program quality. A problem with this practice is that publication rates vary substantially among scientific disciplines (Geisler 2000). Therefore, using such a metric for a program that supports research in different disciplines can provide misleading results unless the publication rates in each discipline are normalized, for instance, by relating them to the mean rate for the discipline.⁶

4. Percentage of Work That Is Peer-Reviewed

Some agencies are apparently in the process of adopting a way to measure the portion of the research program that has been subjected to peer review as a metric of efficiency. Peer review is normally used to ensure the quality of research, not its efficiency.
Asking experts who are experienced in managing research programs to evaluate the programs is an appropriate way to improve efficiency, but it is not clear that the agencies adopting this metric include such people in their review panels.

⁶ See NRC 2003 for further discussion of the problems involved in using such bibliometric analyses.

5. Average Cost Per Measurement or Analysis

Using such a metric might be sufficient for an agency if a large component of the “research” program is devoted to fairly repetitive or recurrent operations, such as analyzing the constituents of geologic or water samples, and adequate QA/QC procedures are incorporated into the analytic efforts. An advantage of this metric is that it considers both outputs and total costs. However, repetitive analyses or measurements are not normally a major component of an agency’s research program, so they are unlikely to yield results that apply to a research effort as a whole.

6. Speed of Response to Information Requests

Several agencies use some metric of time required to respond to information requests. The technique suffers from the same weaknesses as the metric of time to respond to research requests and is even more likely to result in diminished quality because a QA/QC function, such as consumer satisfaction, is rarely incorporated into such programs to measure the quality of responses. Responding to information requests does not account for a substantial portion of an agency’s research budget, so it is unlikely to measure the efficiency of a substantial portion of its research program.

7. Cost-Sharing

In a few instances, agencies are apparently using cost-sharing as a metric of efficiency. Cost-sharing may be a proxy for the quality, relevance, or effectiveness of some research to the extent that other agencies or private entities may be willing to share the costs of research that they consider to be of high quality, relevance, or effectiveness for their own missions. Cost-sharing does reduce the cost of research to the sponsoring agency, but it is not a metric of efficiency in itself; an increase in cost-sharing does not reduce the total resources devoted to research. It also fails to address research outputs.

8. Quality or Cost of Equipment and Other Inputs

Several agencies have adopted metrics that are related to the cost or efficiency of inputs to the research effort rather than to the research itself.
Obviously, lower cost of inputs can result in lower-cost and therefore more efficient research, as long as the quality of the inputs does not correlate positively with their cost. Indeed, improved equipment (for example, more powerful computers or more advanced analytic equipment) can be important in improving the efficiency of research. However, this metric itself does not measure research efficiency. It does not consider outputs, and it focuses on only some inputs.

9. Variance from Schedule and Cost

EPA has recently agreed to attempt to use a variation of EVM to measure research efficiency. EVM measures the degree to which research outputs conform to scheduled costs along a timeline, but in itself it measures neither value nor efficiency. It produces a quantitative metric of adherence to a schedule and a budget. If the schedule or budget is inefficient, it is measuring how well the program adheres to an inefficient process.

This metric relies for its legitimacy on careful research planning and advanced understanding of the research to be conducted.⁷ If a process generates a good research plan with distinct outputs (or milestones) produced in an efficient manner, measuring the agency’s success in adhering to its schedule does become a metric of the efficiency of research management. It also satisfies the criteria of including outputs and applying to the entire research effort. Thus, it can be a sufficient metric of process efficiency so long as the underlying planning process incorporates the criteria of quality, relevance, and performance.

On the basis of those evaluations, the committee concludes that there may be some utility in certain proposed metrics for evaluating the process efficiency of research programs, particularly reduction in time or cost, on the basis of milestones, and reduction in overhead rate. There may also be applications for some metrics in certain EPA research programs, including: reduction in average cost per measurement or analysis, adjusted for needed improvements; reduction in time or cost of site assessments; and reduction in time to process, review, and award extramural grants.

In all cases, caution should be used in applying the metrics, as consideration must be given to the type of program being evaluated.

FACTORS THAT REDUCE THE EFFICIENCY OF RESEARCH

Many forces outside the control of the researcher, the research manager, or OMB can reduce the efficiency of research, often in unexpected ways.
Because these other forces can appreciably reduce the value of efficiency as a criterion by which to measure the results or operation of a research program, they are relevant here. For example,

• The efficiency of a research program is almost always adversely affected by reductions in funding. A program is designed in anticipation of a funding schedule. If funding is reduced after substantial funds are spent but before results are obtained, activities cannot be completed, and outputs will be lower than planned.
• When personnel ceilings are lowered, research agencies must hire contractors for research, and this is generally more expensive than in-house research.
• Infrastructure support consumes a large portion of the EPA Office of Research and Development (ORD) budget. Because the size and number of laboratories and other entities are often controlled by political forces outside the agency, ORD may be unable to manage infrastructure efficiently.
• Inefficiencies may be introduced when large portions of the budget are consumed by congressional earmarks. That almost always constitutes a budget reduction because the earmarks are taken out of the budget that the agency had intended to use to support its strategic and multi-year plans at a particular level.

⁷ The method comes from the construction industry, in which scheduling and expected costs are understood better than they are in research.

Still other factors may confound attempts to achieve and evaluate efficiency by formal, quantitative means. For example, the most efficient strategy in some situations is to spend more money, not less; a familiar example is the purchase of more expensive, faster computers. Or a research program may begin a search for an environmental hazard with little clue about its identity, and by luck a scientist may discover the compound immediately; does this raise the program’s efficiency? Such examples seem to support the argument that an experienced and flexible research manager who makes use of quantitative tools as appropriate is the best “mechanism” for efficiently producing new knowledge, products, or techniques.

EVALUATING RESEARCH EFFICIENCY IN INDUSTRY

The committee reviewed information from the Industrial Research Institute (IRI), an association of companies established to enhance technical innovation in industry. IRI has been actively involved in developing metrics for use in evaluating R&D, studying the measurement of R&D effectiveness and productivity, and devising a menu of 33 metrics to be used in evaluating the effectiveness and productivity of R&D activities (Tipping et al. 1995).

A recent industry study surveyed 90 companies with revenues exceeding $1 billion in order to understand their efforts to monitor and manage the performance of their R&D activities.
Preliminary evidence indicates that higher-performing companies differ from lower-performing companies in the metrics they use to evaluate their R&D programs. For instance, higher-performing companies are more likely to track pipeline productivity metrics (such as revenues from new products and the value of the portfolio in the pipeline) and portfolio health metrics (such as percentage of portfolio in short-, medium-, and long-term projects) than lower-performing companies, which are less likely to focus on business outcome metrics (such as margin growth or incremental market share) and typically use more metrics to manage the R&D organization (D. Garettson, RTEC, personal communication, December 7, 2007).

Other studies published in the technology-management literature conclude that efficiency is best evaluated secondarily to effectiveness. For example, Schumann et al. (1995) noted that the “key to efficiency is maximizing the use of internal resources, minimizing the time it takes to develop the technology, and maximizing the knowledge about the technology that product developers have readily available.” However, they added that “rather than focus on efficiency, the focus of quality in R&D should be on effectiveness. The leverage here is ten to hundreds of times the R&D costs. . . . After the R&D organization has developed effectiveness, it can turn its attention to efficiency. The result must not be either/or, but rather simultaneous effectiveness and efficiency; i.e., doing the right things rightly.”

THE SHORTCOMINGS OF RETROSPECTIVE REVIEW

Finally, PART calls for retrospective review of research programs. The most recent R&D Investment Criteria (OMB 2007, p. 72) state, “Retrospective review of whether investments were well-directed, efficient, and productive is essential for validating program design and instilling confidence that future investments will be wisely invested.” Although periodic retrospective reviews for relevance are appropriate, as suggested in the Criteria (p. 74), the retrospective analysis can be an unreliable indicator of quality (recommended every 3-5 years; p. 75) and performance (recommended annually; p. 76), for several reasons:

• The size of any research program varies each year according to budget, so the amount and kind of work done also varies. Year-to-year comparisons can be invalid unless there is a constant (inflation-adjusted) stream of funding.
• Retrospective analysis cannot demonstrate that resources might not have been put to more productive uses elsewhere.
• Retrospective analyses are unlikely to influence future investment decisions, because they focus on expenditures in the past, when conditions and personnel were probably different.

Again, the value of any analysis depends on the experience and perspective of those who perform it and of those who integrate the results with other information to make research decisions.
SUMMARY AND RECOMMENDATIONS

Despite the desire of OMB for agencies to use outcome-based metrics to evaluate research efficiency, no such metric has been demonstrated, so none can be “sufficient.”

Meaningful evaluation of the efficiency of research programs at federal agencies can take two distinct approaches. First, the inputs and outputs of a program can be evaluated in the context of process efficiency by using quantitative metrics, such as dollars or hours. Process-efficiency metrics cannot be applied to ultimate outcomes, but they can and should be applied to such capital-intensive R&D activities as construction, facility operation, and administration.

Second, research effectiveness can be evaluated in the context of investment efficiency by using expert-review panels to consider the relevance, quality, and performance of research programs. Research investment includes the activities in which a research program is planned, funded, adjusted, and evaluated. Excellence in those activities is most likely to lead to desired outcomes. An expert panel should begin its evaluation by examining a program in terms of its relevance, quality, and effectiveness, including how well the research is appropriate to strategic and multi-year plans. Once a panel has evaluated relevance, quality, and effectiveness, it is well positioned to judge how efficiently research is carried out.

The committee concludes that most evaluation metrics applied by federal agencies to R&D programs have been neither outcome-based nor sufficient. They have not been outcome-based, because ultimate-outcome-based efficiency metrics for research programs are neither achievable nor valid. Among the reasons is that ultimate outcomes are often removed in time from the research itself and may be influenced and even generated by entities beyond the control of the research program. They have not been sufficient, because most evaluation metrics purporting to measure process efficiency do not evaluate an entire program, do not evaluate the research itself, or fall short for other reasons explained in connection with the nine metrics evaluated above.

The use of inappropriate metrics to evaluate research can have adverse effects on agency performance and reputation. At the least, inappropriate metrics can provide an erroneous evaluation of performance at a considerable cost in data collection and analysis, not to mention disputes and appeals. At worst, program managers might alter their planning or management primarily to seek favorable PART ratings and thus compromise the results of research programs and ultimately weaken their outcomes. Agencies and oversight bodies alike should regard the evaluation of efficiency as a relatively minor part of the evaluation of research programs.

REFERENCES

Blau, P.M. 1963. The Dynamics of Bureaucracy: A Study of Interpersonal Relationships in Two Government Agencies. Chicago: University of Chicago Press.
Boer, F.P. 2002. The Real Options Solution: Finding Total Value in a High-Risk World, 1st Ed. New York: John Wiley & Sons.
de Neufville, J.I. 1987. Federal statistics in local governments. Pp. 343-362 in The Politics of Numbers, W. Alonso, and P. Starr, eds. New York: Russell Sage.
Drickhamer, D. 2004. You get what you measure. Ind. Week, Aug. 1, 2004 [online]. Available: http://www.industryweek.com/CurrentArticles/Asp/articles.asp?ArticleId=1658 [accessed Nov. 9, 2007].
Geisler, E. 2000. The Metrics of Science and Technology. Westport, CT: Quorum Books.
Grizzle, G. 2002. Performance measurement and dysfunction: The dark side of quantifying work. Public Perform. Manage. Rev. 25(4):363-369.
NRC (National Research Council). 2003. The Measure of STAR: Review of the U.S. Environmental Protection Agency's Science to Achieve Results (STAR) Research Grants Program. Washington, DC: The National Academies Press.
OMB (Office of Management and Budget). 2006. Guide to the Program Assessment Rating Tool (PART). Office of Management and Budget. March 2006 [online]. Available: http://www.whitehouse.gov/omb/part/fy2006/2006_guidance_final.pdf [accessed Nov. 7, 2007].
OMB (Office of Management and Budget). 2007. Guide to the Program Assessment Rating Tool (PART). Office of Management and Budget. January 2007 [online]. Available: http://stinet.dtic.mil/cgibin/GetTRDoc?AD=ADA471562&Location=U2&doc=GetTRDoc.pdf [accessed Nov. 7, 2007].
Schumann, P.A., D.L. Ransley, and C.L. Prestwood. 1995. Measuring R&D performance. Res. Technol. Manage. 38(3):45-54.
Tipping, J.W., E. Zeffren, and A.R. Fusfeld. 1995. Assessing the value of your technology. Res. Technol. Manage. 38(5):22-39.
Weiss, J.A. and J.I. Gruber. 1987. The managed irrelevance of federal education statistics. Pp. 363-391 in The Politics of Numbers, W. Alonso, and P. Starr, eds. New York: Russell Sage.