Biomedical and Behavioral Research Scientists: Their Training and Supply: Volume I: Findings (1989)

Chapter: 4. Towards Measuring the Effectiveness of NRSA Training Programs

Suggested Citation:"4. Towards Measuring the Effectiveness of NRSA Training Programs." Institute of Medicine. 1989. Biomedical and Behavioral Research Scientists: Their Training and Supply: Volume I: Findings. Washington, DC: The National Academies Press. doi: 10.17226/9912.

CHAPTER 4

TOWARD MEASURING THE EFFECTIVENESS OF NRSA TRAINING PROGRAMS

OVERVIEW

NRSA personnel programs are designed to ensure the adequate supply and quality of biomedical and behavioral researchers. The principal mechanisms are fellowships, which influence individual career choices, and training grants, which also strengthen institutional training capabilities. However, the complexity of these NRSA programs, as well as methodological and data problems, makes it difficult to measure their effectiveness.

Recent studies of NRSA programs at NIH do suggest that participants outperform nonparticipants in terms of subsequent involvement in research during their careers. These differences held for a wide variety of performance measures, including grant applications, grants received, publication counts, and citation counts. It cannot be concluded from the studies, however, that the training programs were responsible for these differences. There have been no such evaluations of NRSA programs at ADAMHA or HRSA.

Much of the information needed for more rigorous evaluations is available in existing data sets. Other information could be gathered through surveys of former participants. More information is needed on the determinants of a research career and on the process used to select trainees. The biggest methodological problem is the lack of adequate control groups.

In the case of physician/scientists, program evaluations are more complex. Few M.D. trainees go on to careers of bench-level research, yet their clinical research is vital in applying new knowledge of molecular biology to patient care. Several reports have recommended changes in the program of study in training grant programs for physician/scientists.

Background

This chapter considers the two most advanced pools in the education pipeline: predoctoral and postdoctoral educational programs in biomedical and behavioral science.1
The intent here is to examine how effective the National Research Service Award (NRSA) programs are in training individuals who move into successful research careers that meet national needs. A related concern is how variations in effectiveness are related to program substance. With credible information about program effectiveness, plus more refined and thorough versions of the cost data presented in the Executive Summary, it will become possible to determine whether these programs demonstrate acceptable cost-effectiveness. The committee was aware from the outset that it would be impossible to provide definitive information about the effectiveness of these training programs because of insufficient time and inadequate prior research and data bases. Our more realistic

1We consider the words "training" and "education" to be synonymous, but the term "training program" in the context of this report often refers to specific educational efforts and participants supported in part by National Institutes of Health (NIH) or Alcohol, Drug Abuse, and Mental Health Administration (ADAMHA) funds. In this context, training programs may be considered distinct from fellowship programs.

ambition was to derive a program of research and data improvement that, when implemented, could provide important steps toward definitive evaluations.

The Education Pipeline

Vital national interests require adequate supplies of highly qualified health-related scientists. As was stressed in the Executive Summary, our education pipeline--from elementary schools to universities and professional schools--is the key mechanism for assuring both the adequacy of the supply and its quality. Public policy must focus on effective and efficient ways to ensure that the highest quality and appropriate numbers of scientists are produced by the pipeline while at the same time containing costs. However, the complexity of producing new entrants into science makes it difficult to affect the course of the process. In addition to being leaky, the U.S. educational system is decentralized, ill-coordinated, and only loosely coupled.

In an ideal world the lower levels of an educational system would guide properly prepared young people toward the universities and colleges. These institutions would in turn provide basic scientific training to especially talented young people and encourage them to pursue scientific and professional training and apprenticeships in graduate and professional schools. In that perfect educational system, every young person who showed promise would advance along the pipeline promptly and in proportion to that promise. In reality, however, local school boards run primary and secondary schools under loose state coordination, universities operate under a variety of jurisdictions, and science departments enjoy a remarkable degree of autonomy within universities. In this less-than-ideal world, therefore, the choice points that move one toward scientific and professional occupations are unclear, both to students and to educational institutions. This lack of clarity, together with the cumulative nature of the pipeline, makes it easier to get out than to stay in.
A second result of this uncoordinated educational system is that policies directed at just one of the critical points cannot produce maximum effects, because the processes taking place at any one critical juncture only partially control the total production scheme. The only exception may consist of policies directed at the last stage--graduate and professional schools. Even though the flow is smallest at this point, the degree of leakage appears to be among the largest. Policies that stem this leakage could simultaneously effect improvements in the quality of training.

This discussion requires two caveats. First, even if new pipeline policies are implemented, their effects may take years to become discernible because of the length of the pipeline. Second, as in other areas of human behavior, public policy may be only a minor factor in shaping the flow of personnel into the work force of science; endogenous processes dominate the shaping of that flow. As a result, the effects of policy innovations may be slight and will be manifested only over long time periods. Their detection requires very sensitive measurement, and their analysis needs sophisticated research.

THE AIMS AND EFFECTS OF TRAINING AND FELLOWSHIP PROGRAMS

The overall goals of NRSA personnel programs are easier to state than to achieve. As a matter of policy, they are intended to ensure that the supply of biomedical and behavioral research personnel is sufficient to meet the demand, that their quality is high enough to meet the needs of a constantly improving level of biomedical research, and that the pool of skills is responsive to shifts in the demand for various kinds of specialized personnel. To reach these goals, the policy uses a number of devices whose adoption is based on assumptions, explicit or implicit, concerning how occupational choices are made, how biomedical research skills are acquired, and what the market for biomedical research

personnel will be. It is useful to examine the first two sets of assumptions to see how closely they match the actual programs pursued and to consider the alternatives to those assumptions that were not adopted. (Labor market issues were considered earlier in this report.)

Occupational Choice

The most direct goal of NRSA programs is to influence the occupational choices of potential research personnel. The implicit model is that critical choices at each point in the career path are influenced by the balance between anticipated benefits and costs of alternative paths. NRSA programs are designed to influence the balances by lowering costs of particular choices through stipends. The effectiveness of this strategy depends not only on when the stipend is offered, but also on the array of alternative choices offered by the environment. In many engineering fields, for example, graduate stipends must compete against immediate employment as a B.S. engineer. Similar competitive circumstances face fellowship programs designed to recruit M.D.s into research.

Training

Another purpose of NRSA programs is to provide students with training opportunities that may not otherwise be available to them. (In this context, training is meant to cover both formal course work and apprentice-like participation in research.) This purpose can be achieved in two ways: either by enhancing the ability of academic departments to provide training, or by improving the range and quality of training choices available to students. Training grants may be best at enhancing institutional training capacities, for instance by expanding the number of traineeships, providing a departmental focus, or enhancing opportunities for cross-disciplinary training and research. However, it is an open question whether trainees receive different educational experiences than other graduate students in the same departments.
Stipends may release students from the necessity to support themselves by unrelated employment, but traineeships may also compete with employment that is directly related to training, particularly research assistantships in which graduate students directly participate in faculty research as apprentices. It is unclear whether the quality of training is affected by being employed as a trainee rather than as a research assistant. Fellowships awarded to individuals provide fewer advantages to departments, but they too enhance training opportunities for individuals and may also enhance the research of the faculty sponsors.

Training Efficiency

All other things being equal, the shorter the training period, the more research personnel can be produced. Traineeships and fellowships are believed to shorten the training period by making it less necessary for their incumbents to engage in income-generating activities that are not training opportunities: the more time devoted to training, the quicker it is attained. Several caveats must be taken into account in assessing this argument. First, the greater efficiency of the entire biomedical and behavioral research personnel "industry" can only be attained if there are qualified and promising candidates for training who cannot be accommodated by the industry's capacity. Second, stipends are fungible--they can substitute for job earnings unrelated to training, but they can also increase the

consumption of goods and services or even prolong one's stay in a training position. The fungibility of stipends also allows departments to use stipends to substitute for other funds, thereby increasing the resources available for other purposes or, more likely, increasing the number of graduate students. The connection between traineeship or fellowship strategies and increased efficiency is not necessarily causal except where expansion is required as a condition for awarding a grant.

Honor

Because fellowships and traineeships are awarded mainly in competition, they honor those who win the awards and hold the resulting positions. Honor may affect subsequent performance by increasing self-esteem, self-confidence, and the expectations of others. This effect may be reduced if traineeships are doled out in the same manner as other support, while fellowships awarded by national competitions may carry additional honor.

Merit-Tested Selection

Traineeships and fellowships presumably go to the most promising among the pool of eligible candidates. Ironically, those most likely to be selected are also those most likely to become biomedical researchers without the traineeship or fellowship in question. As a result it is difficult to estimate the net effects of biomedical and behavioral research personnel programs. If every promising candidate receives some support, there will be no available controls--no persons of equal merit who were not chosen. (Statutorily ineligible persons, such as foreign nationals, differ from those chosen in other important respects.) Programs that use merit-tested selection can only be evaluated for their net effects by drastically altering the selection process in what may be regarded as undesirable ways. It might be tempting, for example, to randomly deny fellowships and traineeships to selected persons in order to form controls, but such a strategy would surely produce both unsatisfactory controls and strong opposition.
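The selection problem just described can be illustrated with a small simulation. Everything below is a hypothetical sketch: the latent "promise" score, the top-tenth selection rule, and the sample size are invented for illustration, and the award is deliberately given no causal effect at all.

```python
import random

random.seed(0)

# Hypothetical model: a latent "promise" score drives both selection
# (awards go to the most promising tenth) and later research output.
# The award itself is assumed here to have NO causal effect.
N = 10_000
promise = [random.gauss(0, 1) for _ in range(N)]
cutoff = sorted(promise, reverse=True)[N // 10]  # top ~10 percent win awards

def outcome(p):
    # Research output depends on promise plus noise -- not on the award.
    return p + random.gauss(0, 1)

awardee_out = [outcome(p) for p in promise if p >= cutoff]
other_out = [outcome(p) for p in promise if p < cutoff]

mean = lambda xs: sum(xs) / len(xs)
gap = mean(awardee_out) - mean(other_out)
print(f"observed awardee advantage: {gap:.2f}")  # large, despite a zero true effect
```

A naive comparison of awardees with everyone else shows a large "effect" that is entirely an artifact of merit-tested selection, which is exactly why outcome gaps alone cannot establish program effectiveness.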
Marginal Effects

A final policy concern is the marginal effects of the programs: how much would be gained from expanding the program? Marginal effects are especially of interest for programs that are not likely to be terminated, have not reached saturation coverage, and in which policy concerns center around whether the program should be expanded (or contracted). Biomedical and behavioral research training programs are unlikely candidates for termination, but their level of support does sometimes come under scrutiny. The issue of coverage saturation is not settled: there may or may not be additional traineeship or fellowship candidates who are qualified for support. The import of this discussion is that, whatever estimates are made of the effects of the biomedical research personnel programs, attention should be given in the first place to their marginal effects in preference to estimates of main effects.

THE EVALUATION OF TRAINING AND FELLOWSHIP PROGRAMS

Recent evaluation activities related to the NRSA programs have addressed the above questions and serve to identify areas of highest priority for future research. (A detailed discussion of these activities is found in the commissioned paper by Georgine Pion in Volume III of this report.) Such evaluations may serve two different purposes: definitive or descriptive. That is, an evaluation may aim at a definitive statement describing the effects of a program (i.e., a statement of how the world would be different if the program did not exist and evidence of the truth of that statement that is sufficiently rigorous to be acceptable to the scientific community). This requires well-designed research: the programs may have small effects, delayed effects, effects that differ for different subpopulations and under different conditions, and control groups are difficult

to find. There has been no evaluation of an NRSA program to date that would meet reasonable scientific standards for a demonstration of causality. Recent evaluation activities have instead sought the much more modest goal of providing some facts about certain aspects of the program. Most were outcome studies that examined selected aspects of the subsequent careers of recipients and compared them to persons who did not receive NRSA support. These outcome studies are reviewed in the next section, as are the gaps in knowledge about NRSA that appear important to fill.

These evaluations do not include any causal inference, but they can still support judgments about the attributes of appropriate policy. For example, if practically all graduates of a particular training program have outstanding research careers, it may be judged good policy to continue the program, even if the program had no causal effect on its recipients' careers. If on the other hand very few graduates of a program ever enter research, it may be judged that the program is not worthwhile, even if the program does have a causal effect on those who do succeed. Most evaluations fall between these two extremes.

Outcomes

What happens to persons who receive research training support from the NRSA program? Recent attempts to answer this question focused on indicators of whether the graduates are engaged in health-related research and measures of scientific productivity (e.g., grant application data, publications, citations).2 These studies typically construct a "comparison group" of persons who did not undergo NRSA training, with which to compare the performance of those who were in the NRSA program. In practice, however, comparison groups have been poorly matched; perfect matching is likely to be impossible. This methodological problem weakens the conclusions that might be drawn about the effect of NRSA training programs on performance differences.
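One common repair, matching each awardee to a nonrecipient from the same degree-year and field cell, can be sketched as follows. The records and matching variables here are hypothetical; real studies would match on many more attributes.

```python
from collections import defaultdict

# Hypothetical records: (person_id, degree_year, field, received_award)
people = [
    (1, 1980, "biochemistry", True),
    (2, 1980, "biochemistry", False),
    (3, 1981, "physiology", True),
    (4, 1981, "physiology", False),
    (5, 1981, "physiology", False),
    (6, 1982, "psychology", True),  # no same-cell control exists
]

# Pool the nonrecipients by their (degree_year, field) matching cell.
pool = defaultdict(list)
for pid, year, field, award in people:
    if not award:
        pool[(year, field)].append(pid)

# Greedily assign each awardee a control from the same cell, if any remain.
matches = {}
for pid, year, field, award in people:
    if award:
        cell = pool[(year, field)]
        matches[pid] = cell.pop(0) if cell else None  # None = unmatched

print(matches)  # {1: 2, 3: 4, 6: None}
```

Unmatched awardees (like person 6 above) are precisely what weakens real comparison groups: either the awardee is dropped, shrinking and biasing the sample, or a poorer match from some other cell is accepted.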
The consistent finding of almost all of the evaluation studies is that NRSA awardees outperform comparison group members in terms of research involvement during their careers. The magnitude of the difference between participants and the comparison group depends in part on the composition of the comparison group. For example, Coggeshall and Brown used two groups for comparison with those who received pre-Ph.D. support under NRSA.3 The first group consisted of age-matched Ph.D.s who received their degrees from departments that had received an NIH training grant but who had not received an NIH stipend themselves. The second comparison group consisted of Ph.D.s who had received their degrees from other departments and who had not received NIH support themselves. The study found that the performance of participants in NIH-sponsored predoctoral training modestly exceeds the performance of nonparticipants from the same departments and greatly exceeds the performance of the second comparison group. This was true for a wide variety of performance measures, including postdoctoral research support, subsequent involvement in NIH research, publication counts, and citation counts. Given that the first comparison group should have received exactly the same graduate education as the NIH trainees, the differences suggest that at least some training grant directors are effectively selecting their better Ph.D. students for the NIH award.

A second study found that NIH post-Ph.D. awardees also go on to have more research-intensive careers than do members of two comparison groups: (1) biomedical science Ph.D.s who indicated on a survey that they planned to take a postdoctoral

2For an extensive discussion of the concept of productivity, see the paper by Helen H. Gee in Volume III of this report.

3Porter Coggeshall and Prudence Brown, The Career Achievements of NIH Predoctoral Trainees and Fellows, Washington, D.C.: National Academy Press, 1984.

appointment but did not receive NIH support and (2) biomedical science Ph.D.s without postdoctoral plans.4 Again, the differences appear on many performance measures (grant applications, publication counts, citation rates) and were much greater for the second comparison group than for the first.

Garrison and Brown also studied the post-training performance of NIH M.D. postdoctoral awardees. Here, because research is unlikely to be the career goal of a physician, there are substantial problems in developing an informative comparison group. Comparison data were collected for (1) all M.D.s and (2) a subset of M.D.s who said, a few years after their degree, that their primary activity was either research or teaching. Not surprisingly, the proportion of NIH M.D. postdoctoral awardees who were engaged in research or teaching exceeded that of the typical physician. The subsequent research involvement of M.D.s who had received an NIH fellowship also greatly exceeded that of the comparison group of self-identified researchers and teachers. However, M.D.s who had received NIH postdoctoral support under a training grant were less likely to be involved in research than the comparison group of self-identified researchers and teachers. This unexpected result merits replication; if the finding is repeated, it merits further investigation as part of a program of research into outcomes of NRSA training programs.

A more interesting question is how these outcomes are related to program characteristics. The best information would let one estimate how outcomes for NRSA trainees would change if small amounts of funds were shifted among programs (e.g., from institutional training grants to fellowships awarded to individuals). Training grant programs and individual fellowship applications are assigned priority scores to describe the scientific merit of each application, and (in most NIH programs) funding decisions for each award are made in priority score order.
The "payline" is the point at which funds run out. If the outcome for persons who receive training under an application that is close to the payline for each type of grant were known, then one could estimate how the outcomes for NRSA trainees would change if small amounts of funds were shifted among programs. However, none of the studies addressed the relationship between outcome and the priority score given to the fellowship or training grant application.

There is some information about the average outcome for recipients of various components of the NRSA awards. It is important to compare only programs with a reasonable chance of having comparable results. For example, one must expect that post-Ph.D. programs will produce a higher return in researchers per trainee than predoctoral programs because of the greater commitment to research demonstrated by those persons who have successfully completed the Ph.D. and applied for a postdoctoral appointment. Also, because the current M.D. curriculum provides little research training, one must expect a greater return in researchers per trainee from post-Ph.D. programs than from post-M.D. programs. To do better it would be necessary for selected M.D.s to have had enough research experience to be similarly committed to a research career. (See page 58, "A Special Note on the Training of Physician/Scientists.")

The most comparable programs are training grant and fellowship programs aimed at persons with the same previous research training. The few evaluation studies that addressed this issue found that fellows outperformed trainees on most measures of subsequent research involvement. The differences were less pronounced for Ph.D.s than for M.D.s, however: 62 percent of post-Ph.D. fellows applied for an NIH or ADAMHA

4Howard Garrison and Prudence Brown, The Career Achievements of NIH Postdoctoral Trainees and Fellows, Washington, D.C.: National Academy Press, 1986.

research grant, compared to 52 percent of post-Ph.D. trainees; for M.D.s, the corresponding figures are 43 percent for fellows and 17 percent for trainees.5

It would be desirable to know how outcomes are related to other aspects of the NRSA program. The section on the training requirements for physician/scientists (below) notes the empirical evidence supporting the hypothesis that the length of time spent in postdoctoral research training is a strong predictor of the subsequent research involvement of M.D.s. What is not known, however, is the amount of time M.D. recipients of NRSA awards spend in research training supported by non-NRSA mechanisms, such as privately supported fellowships. There also have been no adequate studies of the outcomes of two of the most promising (and expensive) ways of training physician/researchers: the Medical Scientist Training Program and the Physician/Scientist Award program. Because they both provide longer periods of NIH-supported research training, they may also yield substantially higher returns to research than the more traditional fellowship and training programs, but the facts currently are unknown.

Most evaluation activities have focused on NRSA programs administered by NIH. The few studies that included ADAMHA awardees tended to use fewer outcome measures. There have been no evaluations of the NRSA programs sponsored by the Health Resources and Services Administration. Similarly, there have been no evaluations of the effect of training grants on the training capacity or training efficiency of recipient institutions.
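As a rough check on fellow-versus-trainee differences like those reported above (62 versus 52 percent for post-Ph.D.s, 43 versus 17 percent for M.D.s), a standard two-proportion z-test can be applied. The group sizes of 400 used below are hypothetical placeholders, since the chapter does not report the underlying counts.

```python
from math import sqrt, erf

def two_proportion_z(p1, n1, p2, n2):
    """Two-sided z-test for a difference between two proportions
    (normal approximation with a pooled standard error)."""
    x1, x2 = p1 * n1, p2 * n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # 2 * (1 - Phi(|z|))
    return z, p_value

# Rates from the report; the group sizes are invented for illustration.
z_phd, p_phd = two_proportion_z(0.62, 400, 0.52, 400)
z_md, p_md = two_proportion_z(0.43, 400, 0.17, 400)
print(f"Ph.D. fellows vs. trainees: z = {z_phd:.1f}, p = {p_phd:.3g}")
print(f"M.D. fellows vs. trainees:  z = {z_md:.1f}, p = {p_md:.3g}")
```

With equal-sized groups the M.D. gap is far more decisive than the Ph.D. gap, matching the report's qualitative statement; the actual significance would of course depend on the true group sizes.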
Data Needs for Program Evaluation

Program statistics on the number and characteristics of persons receiving each type of award are the most basic information about the training received by NRSA recipients.6 To provide this information, NIH sponsored the creation of the Trainee Fellow File, which provides information on all NRSA students, and the Consolidated Grant Application File, which contains information on programs for advanced research training and on institutional awards. One deficiency in these data bases is the difficulty involved in constructing definitions of attributes, such as field of study, that will provide consistent time series. A second problem is the lack of information about program outcomes. A third deficiency is the lack of a set of adequate measures for career outcomes, including scientific productivity. The proposals for a framework for evaluating program effectiveness and for an evaluation data matrix (discussed below) would remove many of the difficulties involved in the use of these data.

The evaluation matrix proposed in the appendix would deal with the ease of use of currently available statistics. However, there are three areas where all currently available statistics are inadequate: research participation by physicians,7 non-NRSA sources of support for research training, and program evaluations by former trainees. In many medical schools the faculty roster conducted by the Association of American Medical Colleges (AAMC) is not answered by the individual faculty member; as a result, the information in the survey is frequently out of date or otherwise inaccurate. The only firm information available on the amount of time that physicians spend on research comes from

5Garrison and Brown, op. cit., Tables 4.2 and 5.2A.

6See the appendix for a further description of existing and proposed data sets discussed in this chapter.

7Research participation by Ph.D.s is covered in the SDR.

a one-time survey of the faculty of departments of internal medicine.8 Information is lacking on other specialties. Information about sources of support would be obtained most accurately from a survey of the training sponsors, although it could also be collected on an individual basis. This information would give a more accurate picture of the total training received by NRSA recipients and would greatly facilitate the design of more effective evaluation studies.

Some outcome measures are available from data sets such as the SDR and the Institute for Scientific Information's Science Citation Index. Other basic measures are available only from former trainees themselves (and, where appropriate, from credible comparison groups). Former trainees' assessments of the impact of NRSA training on their subsequent careers are just one basic set of information that could be of substantial value in future versions of this report. Other valuable items would include sense of satisfaction with one's career and sense of contribution to the field. Because the SDR is based on a small sample, it is usually inappropriate as the source of inferences about small populations such as NRSA trainees in a given field of science. In this case, occasional surveys of former trainees and appropriate control groups are altogether warranted.

Very little is known about the process used to select trainees for institutional grants. No records are kept of unfunded applicants or of persons who are offered a traineeship but turn it down. The lack of such basic information about the demand for training makes it difficult to assess important parameters of the program, such as the level of stipends and the effects of the payback provision.

Finally, and perhaps most importantly, there is a great need for basic research on the determinants of a research career. NRSA programs attempt to intervene in a complex decision process that is poorly understood.
Little is known about how the characteristics of a training program affect the research abilities of persons who participate in that training. Although the recent evaluation studies suggest that NRSA training is correlated with success as a researcher, the correlations are very small: the total effect of NRSA training and other indicators of preexisting quality explained only 6-14 percent of the variance in outcome measures.9 Better understanding of the factors that influence career decisions and research ability is the key to designing more effective and efficient training programs.

A FRAMEWORK FOR PROGRAM EVALUATION

There are two major limitations in conducting an adequate evaluation of NRSA programs: (1) inadequate control groups with which to compare the awardees and (2) inadequate measures of many of the outcomes that need to be assessed. As discussed above, the problem of control groups is related to the process by which trainees are selected: those selected might have more successful careers (by whatever measure) than those not selected, independent of the advantages provided by the training program. An ideal experimental design would consist of choosing trainees randomly, independent of their characteristics, so that differences in career outcomes could be attributed to the effect of the training program. This ideal methodological approach is unreasonable in practice.

A reasonable and practical alternative to random selection would be careful study of the process that

8G. S. Levey, et al., "Postdoctoral Research Training of Full-time Faculty in Academic Departments of Medicine," Annals of Internal Medicine, vol. 109, no. 5 (September 1988), pp. 414-418; their findings are discussed in Chapter 5 of this report.

9Coggeshall and Brown, op. cit.; Garrison and Brown, op. cit.

determines selection as an NRSA trainee. This approach would also provide insights that can be used to improve the selection process and, to the extent that the process is modeled adequately, it would be possible to introduce statistical controls into the analysis of the effects of being a trainee on career outcomes. Consequently, the committee's first recommendation in designing future evaluation studies is to include detailed information on the process by which trainees are selected from all applicants.

The next step is to model the effects of the training program on career outcomes, including productivity measures. The commissioned paper by Helen H. Gee (see Volume III of this report) establishes guidelines that should be used in planning productivity assessments, including a number of general points about the use of productivity measures for NRSA programs:

o Define program goals specifically enough to provide guidance in constructing measures of their success. For example, "contribute to the research enterprise" does not narrow down the many ways this can be accomplished--through publications, patents, administration, and teaching.

o Recognize multiple pathways (activities and career paths) that can lead to those goals by designing evaluation studies that assess the variety of potential outcomes.

o Exclude those scientists whose career paths and research productivity cannot be assessed adequately with available methods and data. For example, if methods for assessing the productivity of nonacademic scientists are not practical, those scientists should be excluded from comparisons with other groups.

o Identify the uses to which the results of the assessment are to be put and let them guide the design of evaluation studies. Evaluations designed to assist program managers, for example, will not necessarily provide the information required by those making policy decisions.
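The small correlations reported in the evaluation studies can be made concrete with a brief numerical sketch. The Python fragment below uses invented publication counts--not data from any study cited in this chapter--to show how the share of outcome variance explained by a 0/1 training indicator is computed, and how a real difference between trainees and a comparison group can still leave most of the variance in outcomes unexplained:

```python
# Illustrative sketch only: the publication counts are invented, not drawn
# from the NRSA evaluation studies. For a single 0/1 predictor, the
# least-squares fit predicts each group's mean, so R-squared reduces to
# the between-group sum of squares over the total sum of squares.

def r_squared(outcome, trained):
    """Variance in `outcome` explained by a binary training indicator."""
    n = len(outcome)
    grand_mean = sum(outcome) / n
    groups = {0: [], 1: []}
    for y, t in zip(outcome, trained):
        groups[t].append(y)
    ss_total = sum((y - grand_mean) ** 2 for y in outcome)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    return ss_between / ss_total

# Hypothetical publication counts over a decade: trainees average somewhat
# higher, but the spread within each group dominates, so R-squared stays
# small -- the pattern the evaluation studies report (6-14 percent).
trainees = [4, 7, 2, 9, 5, 8, 3, 6]
controls = [3, 5, 1, 7, 4, 6, 2, 5]
outcome = trainees + controls
trained = [1] * len(trainees) + [0] * len(controls)
print(round(r_squared(outcome, trained), 3))  # prints 0.096
```

In an actual evaluation, measures of the selection process would enter as additional covariates in a multiple regression, so that the statistical controls described above separate the effect of training from the effects of preexisting quality.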
Recent evaluation studies have tended to focus exclusively on the single measure of publications and the single characteristic of whether or not the trainee sought funding from NIH. The committee recommends that future studies consider a broader spectrum of outcome measures, including the following:

o receipt of a Ph.D. (for predoctoral trainees);
o time required to complete the Ph.D. (for predoctoral trainees);
o years of postdoctoral training;
o type of employer;
o type of work activity;
o pursuit and receipt of NIH and ADAMHA funding;
o publications and citations; and
o area of research.

For the evaluations to be most effective, these measures (many of which have been used in other studies) should be followed over an extended period of the career rather than be measured only at a single point in time. Longitudinal studies should track changes in employers, work activity, grant activity, publications, citations, and area of research over at least the first decade of the career. Statistical comparisons of the career activities of trainees and the control group will provide much better insight into the effectiveness of NRSA training programs.

In summary, it is possible to design and carry out research that will produce unbiased estimates of marginal program effects by carefully expanding the program to

include additional trainees and fellows. Persons selected under this controlled expansion would need to be followed over a period of time. Furthermore, the heterogeneity of the programs' aims and mechanisms also creates difficulties because there are many kinds of intended effects and additional side effects--some desirable, some simply benign, and others possibly subversive of the main aims of the programs. Thus, the committee recommends that two evaluations of program effects be undertaken:

1. a comprehensive assessment of the effects on institutions, departments, and individual trainees and fellows and
2. a less comprehensive evaluation of the effects of program participation on individual awardees.

A SPECIAL NOTE ON THE TRAINING OF PHYSICIAN/SCIENTISTS

Program evaluation for clinical investigators is further complicated by the complexities of training and tracking the academic physician/scientist. M.D. faculty are supported in their research training not only through NRSA fellowships and institutional training grants, but also by a variety of foundations and volunteer health agencies. Thus, receipt of NIH support for post-M.D. training and receipt of post-M.D. training are not synonymous. Evaluation is further complicated when application for and receipt of NIH research grants (generically called R01) by former trainees are used as program outcome variables. The subsequent careers of M.D. awardees may involve (1) no research, (2) bench-type research, or (3) academic "hands-on" patient research. It is predominantly those in the second category--a comparatively small number--who are likely to apply for and obtain NIH R01 research support. Yet there is evidence that far more M.D. faculty in the third category--perhaps as many as 50-60 percent of the total NRSA M.D. trainees--are doing productive clinical investigation, but not at the bench level that is generally required for NIH R01 funding.
There is a vital need for well-trained clinical investigators who can take the enormous explosion of knowledge in molecular biology and apply it to the care of patients. Over the last decade, however, it has become increasingly difficult for clinical investigators doing hands-on patient research to obtain funding through NIH. These individuals may account for a very large proportion of the "unsuccessful" trainees from the NRSA institutional grants. If so, they must be identified and quantified for adequate program evaluation. The same holds true for the cadre of clinical investigators who will have to be trained in the methodologies of epidemiology, biostatistics, health services research, economics, and outcome assessment in the near future.

James Wyngaarden, a former director of NIH, has emphasized that research training and career development programs have a priority for NIH "virtually equal to the support of research project grants."10 He also acknowledges, however, that NIH-sponsored training programs have variable success rates, the least certain being the traditional training programs for physician/scientists. Far too few of these M.D. trainees apply for and receive NIH research grants, according to Wyngaarden, and some training programs merely serve as support vehicles for subspecialty clinical training. He calls for a comprehensive, critical review of NIH research training programs, specifically examining whether current training programs for physician/scientists should be modified.

10J. B. Wyngaarden (memorandum to BID Directors and OD Staff), "Review of NIH's Biomedical Research Training Program," April 19, 1989.

Wyngaarden's position is echoed by Lloyd H. Smith (see Volume III of this report). Smith's position is that the serious physician/scientist must receive in-depth training in a scientific discipline relevant to medicine and that rigorous scientific training can rarely be achieved in a specialty division of a clinical department. Smith argues that the training of physician/scientists should be comparable to Ph.D. programs in rigor and scope and that the physician should not be burdened with clinical responsibilities during the research training period. Smith believes that at least three years of rigorous training in modern biological science is usually necessary for most individuals to achieve independence as an investigator.

Smith's paper buttresses remarks made by Joseph Goldstein in his 1986 address to the American Society for Clinical Investigation.11 Paraphrasing from that address: intelligence, curiosity, and drive are necessary but not sufficient for the productive physician/scientist; there must also be technical skill and the ability to reduce a complicated clinical phenomenon to a manageable biochemical problem. Given the complexities of modern biomedical research, a clinical investigator must have a sophisticated understanding of the fundamental sciences, a mentor in the sciences to direct development, the opportunity to learn techniques, and uninterrupted time in the laboratory to conduct the research.

Those committee members who have experience in the training of physician/scientists endorse the suggestions made by Smith and Goldstein and suggest that the following changes be made in the postdoctoral institutional training programs for physician/scientists:

o a true consortium between the clinical and preclinical departments of the institution, with shared responsibility for the design and administration of the program;
o selection of trainees based on evidence of some previous experience in research and overall promise;
o formal course work in the physical and biochemical sciences sufficient to give graduates a theoretical background comparable to that of those with graduate degrees in the biological sciences;
o not less than three years of research training, consisting primarily of direct research experience under the supervision of a mentor; and
o modules of instruction, specifically tailored to the needs of the physician trainee, in such areas as basic laboratory techniques, chromatography, radioimmunoassay, protein purification, advanced instrumental techniques, fundamental principles of enzymology and molecular biology, subcellular fractionation techniques, computer technology, evaluation of experimental data, epidemiology, and statistics and data base management, as well as grant and manuscript writing.

11J. L. Goldstein, "On the Origin and Prevention of PAIDS (Paralyzed Academic Investigator's Disease Syndrome)," Journal of Clinical Investigation, vol. 78, 1986, pp. 848-854.
12G. S. Levey, et al., op. cit.

A 1986 survey of full-time faculty in departments of medicine made similar recommendations regarding postdoctoral research training.12 The survey identified several

features of training experiences that were associated with the faculty member's currently being an active researcher, including the following:

1. Most postdoctoral training occurred in medical schools, and the primary source of funding was NIH.
2. For faculty members with an M.D. degree, the length of training was a significant predictor of subsequently being an active researcher and principal investigator on a peer-reviewed research grant.
3. The average length of time between the end of postdoctoral research training and obtaining the first peer-reviewed research grant was 24 months, regardless of length of training, source of training support, training site, or type of academic degree (M.D., M.D./Ph.D., or Ph.D.).
4. Respondents advocated incorporating formal course work, particularly in the basic sciences and statistics, within the structure of the postgraduate training programs, with less time allocated to patient care.
5. Contributing factors to being a successful researcher in academic medicine include the following: two or more years of postdoctoral research training, including formal course work in the fundamental sciences pertinent to biomedical research; two to three years of full research funding from an academic institution until the first extramural grant is obtained; and the investigator's commitment of at least 33 percent of time to research activities.

The former trainees, upon reflection, favored changing the curriculum to include more formal course work and training in fundamentals, particularly mathematics/computer science, statistics, research techniques, grant administration, and medical writing. Of equal interest, a majority (65 percent) wanted less time devoted to clinical medicine during the training program.
The committee as a whole finds merit in these suggestions and recommends that NIH establish a committee, conference, or study to consider whether changes should be made in the program of study under postgraduate institutional training grants for physician/scientists.

Deficiencies in the evaluation of these training programs are described in detail in the commissioned paper by Georgine Pion (see Volume III of this report). There are inherent difficulties in retrospective survey designs, and evaluation must focus on the career development of those who trained as many as 10-15 years ago to determine the long-range effects of these training programs. The evaluation must also define what constitutes "success." For example, although this committee may conclude that institutional training grants need to be revised, it also recognizes that even in their current form these programs have made a positive contribution. Graduates of the institutional training grants have populated all the clinical departments in our medical schools and, even during their short training periods, they have done valuable research work in the laboratories of established investigators.
