Appendix B
Evaluating the Efficiency of Research and Development Programs at the Environmental Protection Agency: Workshop Summary

With oversight by the Committee on Science, Engineering, and Public Policy and the Board on Environmental Studies and Toxicology, the Committee on Evaluating the Efficiency of Research and Development Programs at the U.S. Environmental Protection Agency (EPA) organized a public workshop in April 2007 (the full presentations made at the workshop are available in the Public Access File of the National Research Council created for this project).

Representatives of the Office of Management and Budget (OMB), the EPA, other federal agencies that perform research, and industry addressed the following questions in the context of the Government Performance and Results Act of 1993 (GPRA) and the Program Assessment Rating Tool (PART):

  1. What efficiency measures are currently used for EPA R&D programs and other federally funded R&D programs?

  2. Are the efficiency measures sufficient? Are they outcome-based?

  3. What principles should guide the development of efficiency measures for federally funded R&D programs in general?

  4. What efficiency measures should be used specifically for EPA’s basic and applied R&D programs?

PRESENTATION BY OFFICE OF MANAGEMENT AND BUDGET STAFF

The rationale for using PART is that taxpayers deserve to have their money spent wisely to create the maximal benefit. OMB developed PART in 2002 because the reporting process associated with GPRA was losing momentum. The office wanted another opportunity to focus on defining success and on the design and implementation of programs. OMB has decided that efficiency should be measured because R&D programs need to maintain a set of high-priority, multiyear objectives with annual performance outputs and milestones that show how one or more outcomes will be reached despite limited resources.

PART is used for several purposes. The most basic is to evaluate the success of programs. The second is to monitor the annual improvement plans required of each program.

Evaluation of the first two PART criteria, quality and relevance, primarily by expert review, has caused few problems. Application of the performance criterion, especially the measures of efficiency, has proved to be a challenge. As a result, OMB has approached implementation of that third criterion as a learning process.

The two relevant PART questions concerning the efficiency of R&D are questions 3.4 and 4.3. Question 3.4 asks, “Does the program have procedures (e.g., competitive sourcing/cost comparisons, IT improvements, appropriate incentives) to measure and achieve efficiencies in program execution?” A way to measure efficiency is required for a “yes” response. Question 4.3 asks, “Does the program demonstrate improved efficiencies or cost effectiveness in achieving program goals each year?” Answering question 4.3 is predicated on a “yes” response to question 3.4, and improved efficiencies or cost effectiveness in achieving goals should be described in terms of dollars when possible.

For the President’s Management Agenda (PMA), the agency is given a “yellow” score when at least 50% of agency programs rated by PART have at least one efficiency measure and a “green” score when all agency programs rated by PART have at least one efficiency measure.
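
The PMA scoring thresholds described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the summary does not say what score applies below 50%, so returning None in that case is an assumption of the sketch.

```python
def pma_efficiency_score(has_efficiency_measure):
    """Score an agency from a list of booleans, one per PART-rated program.

    Each entry says whether that program has at least one efficiency
    measure. Only the "yellow" and "green" thresholds come from the text;
    returning None below 50% is an assumption of this sketch.
    """
    if not has_efficiency_measure:
        raise ValueError("at least one PART-rated program is required")
    if all(has_efficiency_measure):
        return "green"
    share = sum(has_efficiency_measure) / len(has_efficiency_measure)
    if share >= 0.5:
        return "yellow"
    return None
```

For example, an agency with four rated programs, two of which have an efficiency measure, would score "yellow" under this reading.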

The meaning of efficiency, as OMB has applied PART, includes both outcomes or outputs for a given amount of inputs and inputs for a given amount of outcomes or outputs. Outcome efficiency might be measured in the economic terms of benefit-cost ratio, cost-benefit ratio, or cost effectiveness. Output efficiency might be measured in terms of productivity (output/input) or unit cost (input/output) or with respect to a standard or benchmark.

For outcomes, attribution of success or failure is inexact and may be based on indicators as diverse as improved targeting of beneficiaries or customers, a radically different mode of intervention, productivity improvements, or cost reductions. For outputs, efficiency might be described in relation to a program’s resources, such as the use of labor or material, improved capability, or procurement. PART also requires that measures of outcome efficiency “consider the benefit to the customer and serve as an indicator of the program’s operational performance.”

Output efficiencies have various potential criteria. They must reflect efficient use of resources, measure changes over time that should correspond to a decrease or increase in related costs, and include an assessment of the comparability of the kinds of outputs produced.
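
The output-efficiency and outcome-efficiency ratios named above can be written out as a minimal sketch. The function names and the figures in the usage note are hypothetical and are not taken from the workshop.

```python
def productivity(output_units, input_cost):
    """Output produced per unit of input (output/input)."""
    return output_units / input_cost


def unit_cost(output_units, input_cost):
    """Input consumed per unit of output (input/output)."""
    return input_cost / output_units


def benefit_cost_ratio(benefits, costs):
    """Monetized benefits divided by costs; its reciprocal is the cost-benefit ratio."""
    return benefits / costs
```

With hypothetical figures of 40 assessments produced for $2,000,000, `unit_cost(40, 2_000_000)` gives $50,000 per assessment, and `productivity` is simply the reciprocal of that value.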

OMB also attached high priority to assessing R&D programs at the project level; that is, there must be single-year and multiyear R&D objectives, with annual performance outputs, to track how a program (an aggregation of projects) will improve scientific understanding and its application. Programs must also provide schedules with annual milestones for future competition, decision, and termination points, highlighting changes from previous schedules. The problem is that basic R&D does not fit those criteria, and much applied R&D does so only with difficulty.

OMB has suggested the use of “earned-value management” (EVM) as a technique for tracking R&D efficiency. EVM plots expenditures against time, beginning with actual cost in dollars and comparing it with current earned value and planned value. EPA has agreed to use EVM as an efficiency-assessment tool on a pilot basis, although no agency is using it for basic research.

OMB sees several difficulties in applying PART to research. One is the concern that new PART requirements will cause agencies to favor research that fits those measures and to defund research that does not fit them. Furthermore, OMB has found it hard to devise efficiency measures for research that can be used to identify improvement each year, as is expected generally under PART. OMB’s view is that although EVM is effective for parts of programs, such as construction projects, it is difficult to apply it to entire R&D programs.

PRESENTATION BY ENVIRONMENTAL PROTECTION AGENCY STAFF

EPA representatives described the difficulties presented in finding an appropriate way to evaluate the performance of their research programs, especially with respect to the efficiency criterion. As a result, OMB gave EPA a “yellow” rating for the Budget and Performance Integration initiative under the PMA.
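
As a rough illustration of the EVM comparison that OMB described (actual cost against earned value and planned value), the conventional EVM variances and indices can be computed as follows. The class and field names are hypothetical, the figures in the usage note are invented, and the summary does not say which of these quantities OMB or EPA actually tracks.

```python
from dataclasses import dataclass


@dataclass
class EvmSnapshot:
    """Cumulative dollar values for one project at one point in time."""
    planned_value: float  # budgeted cost of the work scheduled to date
    earned_value: float   # budgeted cost of the work actually completed
    actual_cost: float    # dollars actually spent to date

    @property
    def cost_variance(self):
        # Negative means the completed work cost more than budgeted.
        return self.earned_value - self.actual_cost

    @property
    def schedule_variance(self):
        # Negative means less work is complete than was scheduled.
        return self.earned_value - self.planned_value

    @property
    def cost_performance_index(self):
        return self.earned_value / self.actual_cost

    @property
    def schedule_performance_index(self):
        return self.earned_value / self.planned_value
```

A hypothetical project with `EvmSnapshot(planned_value=500_000, earned_value=400_000, actual_cost=450_000)` is both over cost and behind schedule, because both variances are negative. The calculation presumes that the work can be budgeted and scheduled in advance, which is the property that presenters said basic research lacks.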
After experimenting with several possibilities, the agency decided to use the number of peer-reviewed papers published per full-time equivalent (FTE) as an efficiency measure for its Water Quality Research Program (WQRP). The PART Appeals Board ruled that EPA could use publications on condition that the WQRP develop an “outcome-oriented efficiency measure.” That agreement helped EPA achieve a “green” rating in March 2007 on the PMA.

EPA recognized the limitation of using publication citations, which are seen as better for measuring productivity than for measuring efficiency. One issue is the quality of the publications. Another is that publications are not a useful metric for many applied-research programs, especially in ecologic fields, in which researchers publish fewer papers than in, for example, toxicology. A major problem in applying any single metric across even a single agency is the variation among programs. A large percentage of the budget of the human-health program goes to extramural grants, which cannot be evaluated by the same measures as EPA’s extensive inhouse laboratory system.

EPA did note that its Office of Research and Development (ORD) does a bibliographic analysis of every program. It also quantifies the extent to which ORD research is used to support regulations. Other evaluation tools are client surveys and the average time spent in producing assessments.

EPA considered other measures, such as research vs overhead and citations per dollar invested. For ecologic research, it tried a “version of EVM,” comparing projected costs for a long-term goal with actual costs in the context of scheduled output for the goal. A problem is that goals are planned on a multiyear cycle and are difficult to measure annually.

OMB would not accept the use of expert review to measure efficiency. A problem for every agency is that OMB examiners vary widely in their knowledge of research and their views of what is acceptable. One examiner may accept a particular efficiency measure for multiple programs that another does not accept.

Discussion focused on the concern that budget allocations might shift in the direction of “efficient” programs with little regard for the quality of the science. EPA acknowledged that low PART scores sometimes mean less money for a program. Particularly vulnerable was basic research or a long-term study with unclear outcomes, such as the search for a causal connection between drinking-water quality and cancer, in which the agency has been looking for “proxies” that have logical linkage to outcomes.

With regard to negotiating the application of EVM, it has been applied as a short-term solution. EPA’s intention was to work out alternative measures that would work not just for EPA but also for other research-funding agencies.

In response to a question as to whether EPA could align the progress of a multiyear program with the budget, EPA noted that each long-term research plan is revised and updated annually by a research-coordination team.
EPA relies on customer surveys and decision documents to indicate how results of research are used.

PRESENTATION BY DEPARTMENT OF ENERGY STAFF

Staff of the Department of Energy (DOE) described the impact of PART on the many activities of the DOE Office of Science. In 2002, DOE received low PART scores (50% and 60%) because of its performance measures. The agency revised its system and raised its scores to the 80s and 90s.

For evaluating the quality and relevance of R&D, DOE depends on peer review by committees of visitors. It had not reviewed performance before the creation of PART, so it established a committee to test appropriate metrics. It tried using the number of hours that large DOE facilities were available to users, but because the facilities were all fully subscribed, this metric was not useful for annual improvement. OMB asked for a new measurement involving dollars per unit of work; after long discussions, DOE responded that the use of a single unit-per-dollar measure would not be effective. Instead, it proposed a detailed examination of management procedures to be reviewed regularly by expert reviewers knowledgeable about the processes.

DOE uses EVM for the construction phase of a facility but does not apply it to R&D or the operation of facilities. DOE has reviewed the practices of other agencies and major corporations and found no useful models. No one knew how to define value for the kinds of projects in the DOE portfolio, so performance could not be established by using a dollar value. Therefore, DOE turned again to expert reviewers and asked them to quantify the value of a project, assign risk and probability curves, and then conduct EVM analysis.

DOE noted that the director of the Office of Science and Technology Policy had set up a committee on this issue. Its assignment is to examine the literature for ways to identify prospective benefits of research, “something no one is presently able to do,” and to seek input from all federal agencies on useful tools for evaluating research. The charge is to describe a mechanism for measuring the value of research and to assign a cost to compliance. Neither GPRA nor PART addresses agencies’ costs to comply with the data-gathering, analytic, and reporting requirements, which can be considerable. Other agencies also expressed concern about the time and budgetary costs of compliance, including a statement by the National Institutes of Health (NIH) staff that 250 people worked full-time for 3 months to comply with PART for the NIH extramural program.

For DOE, PART has a natural application in engineering and other predictable processes. Research represented a modest part of all the R&D work done, and the direction and outcomes were never as clear and specific as building a facility. One goal of OMB was to draw a boundary around administrative costs and reduce them. For example, one measure at DOE is to maintain total administrative overhead costs in relation to total program costs at 12%.
But DOE recognized the trap of attempting to drive down administrative costs continuously. As one example, the Energy Efficiency and Renewable Energy program uses an overhead metric but only for operational and construction programs, not for R&D. The program also experimented with using the relationship between the corporate program-management line and the total program R&D budget but found it to be “gameable.”

For R&D, DOE uses the “alternative efficiency measure” of peer review for all portfolios every 2nd or 3rd year. It is fairly cost-effective, allowing the agency to look at what is proposed and how well it is performed, identify ideas that lack merit, discontinue inefficient processes, redirect R&D, or terminate a poorly performing project.

PRESENTATION BY NATIONAL SCIENCE FOUNDATION STAFF

The National Science Foundation (NSF) evaluates its programs, using strategic outcome goals (discovery, learning, and research infrastructure) and annual assessments of its investments in long-term research. That is done by an external expert Advisory Committee for GPRA Performance Assessment. It reviews program accomplishments foundation-wide and submits a report to the director with conclusions and recommendations. NSF has also initiated a new annual stewardship-goals assessment with eight annual performance goals. The assessment focuses on proposal processes, program administration, and management.

A well-known NSF approximation of a measurement consists of the program-portfolio level assessments performed every 3 years by external committees of visitors. This process, called merit review, is a detailed and long examination of both technical merit and broader impacts of research.

NSF tracks efficiency primarily in two ways. One is to measure the time to decision on research awards, which is important to researchers who depend on grant support. NSF is able to inform 70% of applicants within 6 months. The second is to measure facility cost, schedule, and operation. A goal for new facilities is to keep cost overruns and schedule variances for construction to less than 10% of the approved project plan for 90% of the facilities, and a parallel goal for operating facilities is to keep operating time lost because of unscheduled downtime to less than 10% for 90% of the facilities.

PRESENTATION BY NATIONAL INSTITUTES OF HEALTH STAFF

NIH created an Office of Portfolio Analysis and Strategic Initiatives to examine systemic assessments and practice the “science of science management.” Two general points emerged: the difficulty of using a business-model approach to measure efficiency in science and NIH success in using PART on research and research-support activities. Some 99% of the NIH portfolio had been “PART-ed”; 95% of programs were rated effective, and the other 5% were rated moderately effective.
The review of extramural research is limited to elements of the program under NIH’s management control.

With respect to both the extramural and intramural programs, NIH claims some success in improved management. The extramural-research program has achieved cost savings through improved grant administration. The intramural-research program has saved money by reallocating laboratory resources. The building and facilities program has monitored its property condition index. The extramural construction program has saved funds by expanding the use of electronic management tools.

In the business-model approach used by PART, efficiency has three aspects: time, cost, and deliverables. Efficiency can be increased by improving any one of them as long as the other two do not worsen. In scientific discovery, however, variables are largely unknown; because the outcome is unpredictable knowledge, the inputs of time, cost, and resources are difficult to estimate. Some inputs may also be fixed by scientific methods. If the goal of a project is to produce a microarray, deliverables cannot be “increased” by producing two or three microarrays.

There are other reasons why science does not fit easily with this type of business model. In business, risk is usually undesirable; in research, whether in the private sector or the public sector, high-risk projects are strongly associated with breakthrough innovative outcomes. Nor does the business model capture the null hypothesis, which states that a negative result gives valuable information. Changing direction in a project may look like poor management, but it may be good science. The outcome may be unexpected or lead to an unexpected benefit, as occurs with drug benefits. If multiple teams are doing the same research, there is no way to calculate the relative value of each team. Finally, because of the government’s public-health responsibilities, including such issues as rare diseases, costs and benefits are different from those in for-profit enterprises whose measure is new product sales.

PRESENTATION BY NATIONAL AERONAUTICS AND SPACE ADMINISTRATION STAFF

The National Aeronautics and Space Administration (NASA) has focused on aligning the PART process and the annual GPRA process to yield a single set of externally reported measures. That has allowed it, for example, to link the monitoring of mission cost and schedule performance metrics and GPRA outcomes. Recently, NASA has moved away from agencywide measures of efficiency toward program-specific measures. The future focus, in complying with PART, is on finding efficiencies in operational activities and supporting business processes that lead to science and R&D products. NASA is using PART measures in the complex launch process, for example, and to find safe ways to reduce the size of the Space Shuttle workforce.
It plans to use them in other ways, such as increasing the on-time availability and operation of ground test facilities and reducing the cost per minute of network support for space missions.

NASA has found that the definitions and guidance for PART efficiency measures are most useful for repetitive, stable, and baselined processes and for some aspects of the management of R&D, such as financial management, contracting, travel-processing, and capital-assets tracking. But for long-term research, NASA is unable, for instance, to put an efficiency measure on finding the dark matter of the universe. Much of what NASA does is make discoveries and prototypes on unrepeatable time scales dictated by science. NASA’s efficiency measures tend to be process-oriented, not outcome-oriented.

NASA urged more flexibility for the process: for example, recognizing that short-term decreases in efficiency might lead to long-term efficiency gains and recognizing the need to balance effectiveness and efficiency.

PRESENTATION BY NATIONAL INSTITUTE FOR OCCUPATIONAL SAFETY AND HEALTH STAFF

The National Institute for Occupational Safety and Health (NIOSH) is not a regulatory body; it serves as the research partner of the Occupational Safety and Health Administration in the Department of Labor, although it is organizationally part of the Centers for Disease Control and Prevention in the Department of Health and Human Services. It uses independent expert review to evaluate its 30 research programs, which exist in a “matrix” with substantial overlaps. The programs are relatively small, with budgets of $5-35 million. Eight NIOSH programs are being studied by other ad hoc committees of the National Research Council for relevance, impact, and emerging issues.

NIOSH research results can be divided into outputs, intermediate outcomes, and outcomes:

  • Outputs include peer-reviewed publications, NIOSH publications, communications to regulatory agencies or Congress, research methods, control technologies and patents, and training and information products.
  • Intermediate outcomes include regulations, guidance, standards, training and education programs, and pilot technologies.
  • End outcomes include reductions in fatalities, injuries, illnesses, and exposures to hazards.

Some efficiency measures are used for PART, beginning with percentage of grant award and funding decisions made available to applicants within 9 months while a credible and efficient peer-review system is maintained.

NIOSH has considered its own principles for progress on research-program efficiency measures, including the degree of control over efficiency variables, refinements of all PART definitions for R&D, and the “need for impacts” to drive efficiency. Several potential efficiency measures have emerged, including

  • Correlation between research-activity funding and congruence of activity goals over time.
  • Correlation between funding and number, quality, representativeness, and potential value of research partnerships over time.
  • The correlations above with the use of research.

PRESENTATION BY PROCTER & GAMBLE STAFF

Procter & Gamble (P&G) maintains a considerable middle-term and long-term research effort in hazard characterization, risk assessment, and development of core competences. Efficiency as measured by time to market is critical for firms such as P&G. The impact on corporate profits of being the first-to-market can be substantial. P&G’s short-term research supports new product initiatives and investigates unusual toxicity. Most of its efficiency measures are designed to save time in product development, increase confidence about safety, and build external relations (although this is not quantifiable).

PRESENTATION BY IBM STAFF

The company uses efficiency measures for some kinds of activities, such as

  • Return on investment in the summer internship program and graduate fellowship program: What percentage of the recipients return as regular IBM research employees?
  • A “Bureaucracy Busters” initiative to reduce bureaucracy in laboratory support, information-technology support, human-resources processes, and business processes.
  • Tracking of the patent-evaluation process.
  • Customer-satisfaction surveys to evaluate the effects of service reductions.
  • Measurement of response time and turnaround time for external contracts.
  • Measurement of span-of-responsibility for secretarial support.

IBM representatives agreed that basic research is hard to measure and that the structure of EVM was almost antithetical to the performance of basic research. By EVM standards, surprise is bad. In basic research, surprise is good. EVM is oriented toward projects, not exploratory work in which an answer is not self-evident at the beginning.

Some intrinsic challenges in assessing basic research are to define value and to specify its recipients. It is desirable to measure outcomes rather than outputs because outcomes are a “cleaner” effectiveness measure and have a “clear value.” In measuring research on water quality, however, outcomes are unknowable, and such an output as the number of publications per FTE may be the best approach available.
Evaluating the quality of research is not hard, but evaluating the impact of research is much more difficult for a corporation until a place is established in the market.

PRESENTATION BY DOW CHEMICAL CORPORATION STAFF

Dow spends only a small percentage of its R&D budget on science, and it is aimed primarily at ensuring that products will not harm human health or the environment. Inhouse expertise is supported for several reasons:

  • To maintain state-of-the-art competence.
  • To help to translate innovations into use by business customers.
  • To benchmark to external standards of cost, timing, and quality.

In research, Dow works to exploit laboratory-integrated research strengths, including toxicity testing, analytic research, and mode-of-action research. The company also collaborates with various research partners to gain access to new technology and expertise and to enhance credibility through publication and participation in the scientific community.

PRESENTATION BY ALCOA STAFF

Alcoa spends about 1% of sales on research (of which 75% is inhouse) and is just now beginning to look at efficiency measures. For example, a return-on-investment calculation would include the following:

  • Variable cost improvement.
  • Margin impact from organic growth.
  • Capital avoidance.
  • Cost avoidance.

The annual impact of those four metrics over a 5-year period is compared with the total R&D budget. The resulting metric is used to evaluate the overall value of R&D programs and the current budget focus. Although the reported numbers constitute a lagging indicator, the company tries to encourage business-case development and, whenever possible, projects the expected financial impact of current and future projects. Projects with a clear path to value creation are more likely to be funded than projects with no clear business gains.

Possible measures to improve efficiency presented by Alcoa include the following:

  • For greater up-front business-case development:
    - Identify the customer.
    - Apply customer support and commitment.
    - Use a rigorous process for value capture.
    - Insist on transparency.
  • For a stage-gate process:
    - Establish objectives and timetables.
    - Require completion before additional funding.
  • For project review:
    - Have periodic review by a mix of supporters and skeptics to test objectives, feasibility, progress, and potential for success.
  • For program review:
    - Aggregate R&D expenditures by laboratory group or identifiable programs and publish value capture or success rate for each annually.
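
The return-on-investment comparison that Alcoa described earlier in its presentation (four impact metrics over a 5-year period set against the total R&D budget) might be computed along the following lines. The summary does not give the exact formula, so dividing summed impact by summed budget is an assumption of this sketch, and the dictionary keys and function name are hypothetical.

```python
# Hypothetical keys for the four impact metrics named in the presentation.
METRICS = (
    "variable_cost_improvement",
    "margin_impact_from_organic_growth",
    "capital_avoidance",
    "cost_avoidance",
)


def rd_value_ratio(annual_impacts, annual_rd_budgets):
    """Total impact of the four metrics divided by total R&D budget.

    annual_impacts: one dict per year, keyed by METRICS, in dollars.
    annual_rd_budgets: one R&D budget figure per year, in dollars.
    Treating the comparison as a simple ratio is an assumption.
    """
    if len(annual_impacts) != len(annual_rd_budgets):
        raise ValueError("impacts and budgets must cover the same years")
    total_impact = sum(
        sum(year[metric] for metric in METRICS) for year in annual_impacts
    )
    return total_impact / sum(annual_rd_budgets)
```

A ratio above 1 would mean that the reported impact over the period exceeded R&D spending. Because the inputs are realized results, the figure remains a lagging indicator, as the presenters noted.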