
Large-Scale Biomedical Science: Exploring Strategies for Future Research (2003)

Chapter: 3. Models of Large-Scale Science

Suggested Citation:"3. Models of Large-Scale Science." Institute of Medicine and National Research Council. 2003. Large-Scale Biomedical Science: Exploring Strategies for Future Research. Washington, DC: The National Academies Press. doi: 10.17226/10718.


Models of Large-Scale Science

To further elaborate on the concept of large-scale biomedical science as defined in this report, this chapter provides an overview of several examples of past and current large-scale projects or strategies in biology and other fields. It begins with a summary of the Human Genome Project (HGP), the largest and most visible large-scale science project in biology to date. Many examples are drawn from NCI, in part because NCI has a longer history and more extensive experience with directed, large-scale projects compared to other branches of NIH, and also because a major focus of this report is on cancer research. Several initiatives recently launched by other branches of NIH are described in detail, followed by examples of National Science Foundation (NSF) programs, industry consortia, public-private collaborations, and initiatives sponsored by private foundations. The chapter concludes with an example of a nonbiology model of large-scale science for contrast: that of the Defense Advanced Research Projects Agency (DARPA). The DARPA model is commonly cited as a potential strategy for undertaking large-scale, high-risk, and goal-oriented research, but this model has rarely been replicated in biology. A review of federally funded large-scale research projects in nonbiology fields such as high-energy physics is provided in the Appendix.

The common theme among the examples described in this chapter is that they are all formal programs launched by funding agencies, foundations, or industry. There is certainly no shortage of other ideas for potential large-scale biomedical research projects among scientists. Without an

initiative by a funder, however, individual scientists may find it very difficult to obtain the funding necessary to launch an expensive, long-term, large-scale project because of the nature of traditional funding mechanisms (see Chapter 4).

Another common thread among these projects is their dependence on new or developing technologies. Technical innovations drive scientific discovery and determine what can be accomplished in the field. The pace and variety of new innovations have increased greatly in recent years, in turn increasing the feasibility of and opportunities for large-scale projects in biology (see Box 3-1). For example, the advent of DNA arrays and the development of software for analyzing the data they generate have made it feasible to study the entire transcriptional profiles of cells in health and disease or under various conditions. However, such projects are not only much larger in scale, but also much more expensive to undertake.

THE HUMAN GENOME PROJECT

Ever since the discoveries of genetic inheritance and the chemical structure of DNA, there has been interest in "unlocking the secrets of life" by deciphering the information encoded in the genome. Initially, scientists concentrated on small pieces of the puzzle because they lacked the ability to investigate genetic material efficiently on a large scale. As technological advances were made, however,1 some molecular biologists began to discuss the feasibility and potential value of mapping and sequencing the entire human genome (see Figure 3-1). The first editorial published in a major scientific journal advocating a large-scale approach to sequence the human genome brought the concept to the scientific mainstream, with an emphasis on cancer research (Dulbecco, 1986). Nobel laureate Renato Dulbecco suggested that a project to map the human genome was the best way to make progress in the "war on cancer," which had been launched by the Nixon Administration in 1971. Dulbecco compared the significance of such a project to that of the U.S. space program, arguing that a genomic approach would facilitate a greater understanding of the genetic changes that lead to cancer, which would be essential in eradicating the disease. But he also noted that research on other diseases would certainly benefit as well.

At about the same time, a number of influential scientists were publicly discussing and advocating the possibility of sequencing the entire human genome (reviewed by Sulston and Ferry, 2002; Davies, 2001; Cook-Deegan, 1994; Kevles and Hood, 1992). In May 1985, Robert Sinsheimer, chancellor of the University of California, Santa Cruz, and a well-known molecular biologist, brought together a group of leading American and European molecular biologists to discuss the technical prospects for a human genome project.
At this symposium on DNA sequencing, one of the strongest advocates for a large-scale HGP was Walter Gilbert, a Nobel laureate from Harvard who had developed one of the first methods for sequencing DNA. The following year, in early March 1986, Charles DeLisi, director of the Office of Health and Environmental Research at the U.S. Department of Energy (DOE), held a workshop to discuss the idea of undertaking an HGP under DOE. Although DOE may not have appeared to be the logical choice of a federal agency to oversee such a project, it had a long-standing research program on the effects of radiation on mutation rates, and the Life Sciences Division at Los Alamos National Laboratory had already established GenBank, a major database for DNA sequences, in 1983.

1 These technical advances included recombinant DNA methods, DNA sequencing methods, techniques for genetic mapping, and computer analysis.

May 1985 -- Robert Sinsheimer, UCSC chancellor, hosts a meeting to discuss the technical prospects of the HGP.
March 1986 -- Editorial by Renato Dulbecco suggests that the HGP is the best way to make progress in the War on Cancer.
March 1986 -- Charles DeLisi holds a workshop to discuss the possibility of a DOE-sponsored HGP.
May 1986 -- A molecular biology meeting at Cold Spring Harbor includes a special session to discuss the possibility of the HGP.
February 1988 -- A report from the U.S. National Research Council endorses the HGP.
April 1988 -- The Congressional Office of Technology Assessment endorses the HGP.
September 1988 -- NIH establishes the Office of Human Genome Research, with James Watson as its head.
October 1989 -- The new NIH office becomes the National Center for Human Genome Research (NCHGR).
April 1990 -- NIH and DOE publish a 5-year mapping and sequencing plan, with a projected budget of $200 million/year.
1991 -- NIH funds ~175 genome projects, with an average grant size of ~$300,000/year.
July 1991 -- Craig Venter, then at NIH, reveals that NIH has applied for patents on expressed sequence tags (ESTs) identified by his laboratory.
April 1992 -- Watson resigns as head of NCHGR. Francis Collins appointed as his replacement in 1993.
June 1992 -- Venter leaves NIH to set up The Institute for Genomic Research (TIGR), a non-profit devoted to identifying human genes using EST methods.
October 1993 -- NIH and DOE publish a revised 5-year plan, with full completion expected in 2005.
October 1993 -- The Wellcome Trust and the U.K. Medical Research Council open the Sanger Center to sequence the human genome and model organisms.
September 1994 -- French and American researchers publish a complete genetic linkage map of the human genome, one year ahead of schedule.
December 1995 -- Another group of American and French scientists publishes a physical map of the human genome containing 15,000 marker sequences.
February 1996 -- International HGP partners agree to release sequence data into public databases within 24 hours.
January 1997 -- NCHGR renamed as National Human Genome Research Institute (NHGRI).
October 1997 -- Only 3 percent of the human genome is sequenced in finished form by the projected midway point of the 15-year HGP.
1998 -- ABI PRISM 3700 automated sequencing machines enter the laboratory market.
May 1998 -- Craig Venter announces formation of a company, later named Celera, to sequence the human genome in 3 years, using the whole-genome shotgun approach.
May 1998 -- The Wellcome Trust announces that it will double its support for the HGP.
May 1998 -- Collins redirects the bulk of available NHGRI funds to three sequencing centers.
October 1998 -- NIH and DOE publish new goals for 1998-2003, expecting a working draft of the genome by 2003, and a full sequence by 2005.
March 1999 -- NIH moves the expected date for release of a working draft ahead to spring of 2000.
March 2000 -- Celera and academic collaborators release a draft sequence of the fruit fly genome, obtained using the whole-genome shotgun method.
March 2000 -- Possibility for collaboration between Celera and the public HGP wanes. Disagreement over data access is a major obstacle.
June 2000 -- HGP and Celera jointly announce a working draft of the human genome sequence.

FIGURE 3-1 A timeline of the Human Genome Project. SOURCE: Adapted from Macilwain (2000:983).

DOE was also accustomed to big-science projects involving sophisticated technologies. It tended to oversee big, bureaucratic, goal-oriented projects, in contrast to the smaller, hypothesis-driven research that was the standard at NIH. DeLisi, formerly chief of mathematical biology at NIH, had been exploring the feasibility of such a project, and in 1986 he proposed a plan for a 5-year DOE HGP that would comprise physical mapping, development of automated high-speed sequencing, and research into computer analysis of sequence data.

Soon after, in May 1986, a meeting on molecular biology hosted by James Watson at Cold Spring Harbor included a special session dedicated to discussing the possibility of an HGP. During this session, Walter Gilbert estimated the cost of sequencing the human genome at $3 billion (approximately $1 per base). Many scientists opposed the endeavor on the basis of cost, as they assumed it would take funding away from other projects. The project was also viewed by many as a forced transition away from hypothesis-driven science to a directed, hierarchical mode of big science. Many argued that sequencing efforts should focus on the genes rather than the entire genome, which included large areas of repetitive DNA of unknown function. Searching for and characterizing genes hypothesized to be associated with human diseases was thought by opponents of the project to be a more scientifically valid approach than "blindly sequencing the genome." Advocates for the project, however, argued that a large-scale HGP would be a less risky undertaking than big-science programs in space or physics. A failed space mission or particle accelerator would be extremely expensive and would be unlikely to yield partial benefits. In contrast, accomplishing even some of the goals of the HGP (e.g., an incomplete map or a partial sequence) would likely be very beneficial.
Others suggested, however, that such a project would not advance medical science, because knowing the sequence of a gene does not necessarily foster progress in developing new treatments. For example, the single base-change mutation responsible for sickle cell anemia has been known for more than 20 years, but no therapies based on this knowledge have yet been developed. Many biologists also viewed DOE's efforts as a means of expanding its influence and involvement in biological research, as there were questions at the time about the future of the National Laboratories, given the volatility of national defense and energy policy since the 1970s (Cook-Deegan, 1994). They argued that a federally funded large-scale HGP, if undertaken at all, should be carried out through NIH.

One incentive for undertaking a federally funded HGP was to maintain a U.S. lead in biotechnology. In the late 1980s, genome efforts were gaining momentum in several other countries as well (reviewed by

Davies, 2001; Cook-Deegan, 1994; Kevles and Hood, 1992; Sulston and Ferry, 2002). In 1988, the European Community proposed the launch of a European Human Genome Project. A modified proposal was adopted in 1989, authorizing a 3-year commitment of 15 million euros, 7 percent of which would be devoted to ethical issues. Meanwhile, human genome programs at the national level were also prospering in Europe. For example, in 1989 the British government committed itself to a formal human genome program, funded at 11 million pounds per year for the first 3 years. In France, the Centre d'Etude du Polymorphisme Humain (CEPH), a key player in developing the genetic linkage map of the human genome, was founded by Nobelist Jean Dausset with funds from a scientific award and gifts from a private French donor. Through additional support from the Howard Hughes Medical Institute (HHMI), CEPH made clones of its DNA available to dozens of researchers in Europe, North America, and Africa. Japan, which had thus far been involved only marginally in biotechnology research, was also pushing hard to develop new automated sequencing technologies, with the objective of a major sequencing initiative.

In 1988, the international Human Genome Organization (HUGO) was formed, primarily with funding from HHMI and the Imperial Cancer Research Fund in Great Britain (Kevles and Hood, 1992). Its goals were to help coordinate human genome research internationally; to foster exchanges of data, materials, and technologies; and to encourage genomic studies of organisms other than human beings, such as mice.

Because of the controversies surrounding the proposed U.S. HGP, the National Research Council (NRC) was commissioned to undertake a study to determine a strategy for the project.
The NRC study, chaired by Bruce Alberts, generated a report (NRC, 1988) advocating an international program led by the United States and containing the following recommendations:

· Postponing large-scale sequencing until the necessary technology could be improved, thereby reducing the cost per base (estimated to be about a 5-year delay)
· Making technology development for sequencing a high priority
· Focusing first on mapping the human genome
· Characterizing the genomes of model organisms (e.g., mouse, fruit fly, yeast, bacteria)
· Providing $200 million in funding per year for up to 15 years

The report did not make a recommendation as to whether NIH or DOE should oversee the project. In 1988, however, NIH and DOE reached an agreement on their working relationship for the next 5 years: NIH would primarily map the chromosomes, while DOE would develop technologies and informatics, with collaboration occurring between the two

agencies in overlapping areas. In 1988, DeLisi submitted a budget from DOE of $12 million.

In the same year, NIH Director James Wyngaarden offered James Watson, Nobel laureate and codiscoverer of the helical structure of DNA, the position of associate director of human genome research. Watson built political support for the project, and made a commitment to devote about 5 percent of its budget to the study of the project's ethical, legal, and social implications.2 In October 1989, the unit became the National Center for Human Genome Research, with a budget of $60 million for fiscal year 1990 (Davies, 2001).

The HGP actually entailed three related endeavors: genetic mapping, physical mapping, and sequencing. Genetic mapping is accomplished by determining the order and approximate location of genetic markers, such as genes and polymorphisms, on each chromosome. Physical mapping involves breaking each chromosome into small, ordered, overlapping fragments and placing these fragments into vectors that can easily be stored and replicated. For the sequencing phase, fragments of each chromosome are processed to determine the base pair code.3

The U.S. HGP was inaugurated as a formal federal program in 1991, receiving about $135 million. Seven NIH centers were involved: five focused on human gene mapping, one on mouse gene mapping, and one on yeast chromosome sequencing. These centers were supported on a competitive, peer-reviewed basis. In 1991, the largest center budget was $4 million, divided among several research groups. The genome installations at DOE's National Laboratories were focused on developing technologies for mapping, sequencing, and informatics. Four additional projects, funded jointly by NIH and DOE, were engaged in large-scale sequencing efforts and innovations.
In addition, dozens of smaller, investigator-initiated gene mapping and sequencing projects aimed at single disease-associated genes were funded by NIH in laboratories across the country. For example, in 1991 NIH funded about 175 different genome projects, with an average grant size of $312,000 a year (about 1.5 times the average grant size for basic research, and about equal to the average AIDS research grant). Thus, the HGP initially was characterized more by loose coordination, local freedom, and programmatic and institutional pluralism than by strong central management or external hierarchy (Kevles and Hood, 1992).

2 This commitment of NIH funds to ethical debate was unprecedented, as was making bioethics an integral part of an NIH biological research program.

3 The original plan called for carefully orchestrated sequencing of the fragments derived from physical mapping; more recently, however, a "shotgun" method has been used to sequence random fragments from a chromosome, followed by application of computer algorithms to determine the order of the sequence fragments.

Criticism of the program continued, however, especially with regard to funding priorities at NIH. During the late 1980s, the proportion of grants funded by NIH fell from 40 percent to less than 25 percent (Davis, 1990). For example, the National Institute of General Medical Sciences (NIGMS) awarded more than 900 new and competing renewal grants for projects unrelated to the genome in 1988; in 1990, it awarded only 550, a 43 percent decrease. Across NIH, the total number of grants had fallen from 6,000 to 4,600 a year (fewer than the number funded in 1981). This drop caused great consternation among biomedical scientists, and many assumed that it was due directly to the transfer of funds to the HGP, though close examination of concurrent changes in NIH funding patterns suggests that this was not the case. In the mid-1980s, the average grant period was extended from 3.3 to 4.3 years to provide greater stability for funded projects and reduce the frequency of grant applications; the average amount of funding per grant also increased significantly. But this in turn reduced the funds available for new awards or renewals. During the same period, the production of Ph.D. scientists in the field of biomedicine greatly increased, so more people were competing for grant money. Supporters of the HGP argued that the project was bringing appropriations to biomedical research that simply would not otherwise have been received. In any case, NIH expenditures on the project in 1991 accounted for only 1 percent of the agency's total budget of $8 billion (Kevles and Hood, 1992). In addition, the project's deliberate emphasis on technological and methodological innovation was contrary to the tradition and preference of many in the biomedical research community.
However, much progress in biomedical science has been fostered and accelerated by sophisticated tools and technologies, often those developed through work in other fields, such as the physical sciences (Varmus, 1999). Furthermore, unlike technologies in the field of high-energy physics, those in biology tend to become smaller, cheaper, and more widely obtainable and dispersed as they improve. Thus technology development in biology is more likely to benefit a large number of scientists in the long run, rather than making the field more exclusive. The HGP faced a new challenge in 1992 when James Watson resigned. Earlier that year, a controversy had arisen regarding patent applications on gene fragments. J. Craig Venter, who was working at NIH at the time, had used a high-throughput technique for sequencing fragments of genes from cDNA libraries (known as expressed sequence tags, or ESTs). NIH applied for patents on hundreds of ESTs on Venter's behalf. The patents were eventually rejected by the Patent Office on the grounds that they did not meet the criteria of nonobviousness, novelty, and utility. Initial rejection of an

application is not unusual, and NIH had the option to appeal the decision, but in 1994 a decision was made to abandon the effort. These patent applications were widely criticized by the scientific community at large, and the issues surrounding DNA patents continue to be controversial. Francis Collins was appointed in 1993 to be Watson's successor. Collins had been among the first to identify a human disease gene (for cystic fibrosis) through positional cloning, a technique that relies on genetic and physical mapping. By the time of his new appointment, he had also been involved in the discovery of several additional disease genes4 using similar methods. The HGP soon faced new criticism. By 1997, the midpoint of the 15-year project, only 3 percent of the human genome had been sequenced in finished form, and there were many technical difficulties with the physical maps of the chromosomes (Rowen et al., 1997; Anderson, 1993). Although the first 6 years of the project had deliberately focused on smaller genomes and on the development of techniques that would allow for a more efficient and cost-effective approach to large-scale sequencing of the human genome, sequencing technologies had not yet been sufficiently improved to either dramatically speed the sequencing process or reduce the cost (Pennisi, 1998). As a result, there was concern about whether the project could be completed within the projected timeframe or budget. In 1998, the technology of DNA sequencing took a major step forward when the Applied Biosystems Incorporated (ABI) PRISM 3700 entered the laboratory market (Davies, 2001; Wade, 2001). While not the first automated sequencer, the ABI PRISM was still an evolutionary advance over existing commercial automation because it provided increased capacity and throughput.
It incorporated two major modifications to the original Sanger sequencing method: it used fluorescent dyes instead of radioactivity to label the DNA fragments, so that a laser detector and computer could identify and record each letter in the sequence as the DNA fragments were eluted; and it separated DNA fragments in ultrathin capillary tubes filled with a polymer solution, rather than the traditional polyacrylamide slab gels. These improvements were the inspiration of Michael Hunkapiller, and the machines were produced by ABI, originally an independent company that had been purchased by the scientific instrument maker Perkin-Elmer (PE) and now a subsidiary of Applera. As a result of these technological advances, DNA samples could be separated much more quickly, and several samples could be processed each day using very small volumes of reagents. The new machines required only about 15 minutes of human intervention every 24 hours, compared with 8 hours

4 The genes for neurofibromatosis 1 and Huntington's disease.

for the traditional machine. These changes cut sequencing time by 60 percent, reduced labor costs by 90 percent, and produced sequence about eight times faster (about 1 million bases a day) than traditional sequencing methods (Davies, 2001). The new sequencing machines were used early on by Craig Venter, who had left NIH in 1992 to found The Institute for Genomic Research (TIGR), a nonprofit organization devoted initially to identifying expressed human genes using EST methods. The organization had since branched out into other areas of genomic research, such as sequencing the genomes of bacteria. It was also a major player in the federally funded HGP. TIGR was the first center to use and verify the effectiveness of the "shotgun" method for sequencing the relatively small, simple genomes of microbes. The advent of the new sequencing machines led Hunkapiller to consider the possibility of rapidly sequencing the entire human genome using a similar approach, and he brought the idea to Venter. In 1998, Venter left TIGR to found Celera, initially an independent subsidiary of PE Corporation and now a subsidiary of Applera (the same company that produced the ABI PRISM 3700 sequencing machines), with the goal of doing just that. The feasibility of such a project was widely questioned in the scientific community. The PRISM sequencers were still largely untested, and the shotgun method had never been used on anything other than bacterial genomes. Many predicted that the final product would likely have many more gaps and errors than would result from the methodical approach of the public project because of the size, repetitiveness, and complexity of mammalian genomes as compared with microbial genomes.
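The shotgun idea described here (and in footnote 3) can be illustrated with a toy example: sequence short random fragments, then let an algorithm infer their order from overlaps. The greedy overlap-merge sketch below is a deliberately simplified illustration of that principle, not the algorithm TIGR or Celera actually used; real assemblers must also contend with sequencing errors, repeated sequence, and paired-end constraints, which is exactly why skeptics doubted the method would scale to mammalian genomes. All sequences and names in the sketch are invented for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that matches a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the two fragments with the largest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_i is None:  # no remaining overlaps; stop merging
            break
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return "".join(frags)

# Random overlapping "reads" drawn from a short made-up sequence:
reads = ["GTGCAATT", "ATGGCGTGC", "AATTGACC"]
print(greedy_assemble(reads))  # ATGGCGTGCAATTGACC
```

Even in this toy form, the sketch shows why computing power mattered as much as sequencing chemistry: the order of the fragments is recovered entirely in software.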
Venter and colleagues (1998) argued that these challenges could be overcome, and Celera launched a test project to sequence the genome of the fruit fly Drosophila, a complex eukaryote whose genome was about one-twentieth the size of the human genome. It took Celera 4 months to prepare a rough sequence draft of the Drosophila genome, suggesting that the human genome could be deciphered in this way as well (Loafer, 2000; Pennisi, 2000a). To accomplish the goal of producing a complete rough draft of the human genome sequence by 2001 (4 years ahead of the public project's timetable), Celera purchased about 300 PRISM 3700 sequencers and a supercomputer for sequence analysis. The company also recruited a large number of people who specialized in developing algorithms and software for sifting through and organizing the huge amounts of data to be generated. Most notable was Gene Myers, who had already been working on shotgun assembly algorithms at the University of Arizona. Venter estimated that the total cost to sequence the human genome would be about $200-500 million. By this time, $1.9 billion had already been invested in the publicly funded HGP, but questions were raised as to whether Celera's efforts would now make continuation of the public project redundant and unnecessary. On the other hand, supporters of the public project believed the new challenge from Celera was ample reason to accelerate the public effort. Some of the concern stemmed from the potential commercial exploitation of genomic data, although the company had announced that it would seek patents on only 100-300 genes. The Celera business plan entailed selling access to sequence analysis, such as information on gene identification, DNA variants, medical relevance, and comparisons with other species. Celera still planned to release raw sequence data free of charge, but only every 3 months, as opposed to every 24 hours as in the public project (Davies, 2001).

Shortly after the launch of Celera, the Wellcome Trust doubled support for the Sanger Center, Great Britain's main sequencing center in the public effort. Francis Collins also suggested producing a public rough draft of the sequence first, by 2001, to coincide with Celera's target date. The public consortium would then release a finished, "gold-standard" version by the original deadline in 2004, a goal that Celera had never established. To meet this new deadline, Collins redirected the bulk of available NIH funds to just three centers, announcing that these three centers would receive $80 million over 5 years. At about the same time, the Wellcome Trust announced that it would provide another $7 million to the Sanger Center. Thus the lion's share of the draft sequence would be produced by five major genome centers: Sanger, three centers funded by NIH (Whitehead Institute, Washington University, and Baylor College of Medicine), and DOE's Joint Genome Institute. To meet the new goal, hundreds of PRISM sequencers (or similar machines) were purchased by the publicly funded centers (Davies, 2001).
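The scale of Celera's bet can be checked with a rough calculation. Taking the figures reported above (about 1 million bases per machine per day, roughly 300 machines, a roughly 3-billion-base genome) and assuming, purely for illustration, about fivefold shotgun oversampling (the coverage factor is an assumption of this sketch, not a number from the text), raw sequencing alone works out to a matter of weeks:

```python
# Back-of-envelope estimate of Celera's raw sequencing time.
# Per-machine rate and machine count are taken from the text above;
# the genome size (~3 Gb) and 5x coverage are illustrative assumptions.
bases_per_machine_per_day = 1_000_000
machines = 300
genome_size = 3_000_000_000
coverage = 5  # assumed shotgun oversampling factor

raw_bases_needed = genome_size * coverage
days = raw_bases_needed / (machines * bases_per_machine_per_day)
print(f"~{days:.0f} days of raw sequencing")  # ~50 days
```

Even allowing generously for machine downtime, sample preparation, and assembly, an estimate of this kind makes clear why a 2001 rough draft looked feasible to Celera yet implausible to observers accustomed to earlier sequencing rates.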
The competition and animosity between the public and private efforts to sequence the genome escalated (reviewed by Davies, 2001; Wade, 2001), but as the self-imposed deadline to finish the draft sequence approached, a compromise was brokered between the leaders of the two projects. On June 26, 2000, Craig Venter and Francis Collins came together for a White House press conference to formally announce completion of the draft sequence. The first publications on the draft sequences appeared about 7 months later in the journals Nature and Science (Lander et al., 2001; Venter et al., 2001). Science has been criticized for its decision to publish Celera's analysis because the company was allowed to post its data in its own database with some restrictions on its use, rather than depositing the sequence into a public database such as GenBank, as is usually required for publication. Leaders of the public project have also noted that Celera's analysis was dependent upon access to the public databases, suggesting that the company's shotgun method alone could

not have produced an assembled sequence of high quality (Waterston et al., 2002a). The public consortium has continued its efforts to analyze the sequence and to fill in gaps and correct errors; completion of the finished version was announced in April 2003 (Pennisi, 2003). However, the rough draft sequence is now freely available to any biomedical scientist in the world. Recently, a draft of the mouse genome was also published (Waterston et al., 2002b). These sequences provide a rich resource for biomedical research. The process of identifying disease-related genes, once an expensive and arduous undertaking, has become a rapid, highly automated process limited primarily by access to the relevant human populations. Of course, the lag time between finding a gene and developing a clinically relevant therapy for a disease is still likely to be quite long. Nonetheless, the completion of the HGP has accelerated the pace of biomedical discovery. The sequence is likely to have an equally dramatic effect on other areas of basic biological research, such as evolutionary biology. The HGP's goal of producing a powerful research tool has been met, in spite of the criticism and controversy surrounding the project. Knowledge of the human sequence, as well as those of model organisms, has already greatly facilitated basic research in such areas as microarray analysis and proteomics. There is also, as noted, great hope for developing clinical applications of the new knowledge to directly advance human health. Questions may still be raised, however, as to whether the project was carried out in the most effective and efficient manner, or even whether competition from the private sector was a positive force in finishing the project. With the completion of the HGP has come an increased interest in taking on additional publicly funded large-scale biology projects.
Thus, these are important questions to address when considering a new large-scale undertaking in biomedical science.

PAST EXAMPLES OF LARGE-SCALE PROJECTS FUNDED BY NCI

Three large-scale programs developed by NCI in the 1950s and 1960s, while perhaps not strictly meeting the working definition of large-scale science used for this report, may prove instructive in understanding some of the issues relevant to NCI's more recent large-scale initiatives. Although NCI's extramural grants program, like those of most branches of NIH, has supported mostly investigator-initiated projects funded on the basis of scientific peer review, a markedly different approach was used for much of the research carried out under these three programs in Cancer Chemotherapy, Chemical Carcinogenesis, and Cancer Viruses. Each of these programs entailed large-scale, directed research and often employed the

contract funding mechanism, with comparatively little input from and control by the scientific community; rather, NCI staff assumed responsibility for the programs and had authority over the assignment of research contracts to investigators (reviewed by Rettig, 1977). Over time, contract research grew to be a substantial portion of the NCI budget, about 80 percent of which was devoted to these three directed programs in 1971 (U.S. Department of Health, Education, and Welfare, 1973).

Cancer Chemotherapy Program

The cancer chemotherapy program was launched in 1955. During World War II, it was discovered that nitrogen mustard could induce temporary remissions in certain forms of leukemia and lymphoma, and this discovery led to the search for additional chemical agents for cancer treatment. The methodical search for chemotherapeutics took place in multiple stages. First, a large number of chemical compounds were procured and screened for antitumor effects. Promising compounds were then evaluated for toxicity, first in animals and then in humans. Finally, compounds were tested in human clinical trials for therapeutic effect. Between 1955 and the late 1970s, more than 500,000 chemicals were tested on laboratory animals in NCI's chemotherapy program. Several hundred of these chemicals had also been tested in clinical trials, and about 45 chemicals had been found to have some effect against 29 forms of cancer (DeVita and Goldin, 1984). One of the great challenges for the program was establishing the protocols and appropriate animal models for screening the antitumor effects of compounds. Early on, the contract research system appeared to be a logical approach for large-scale screening of chemicals, especially given the substantial need for animal production facilities.
Administrative integration of the program components was less complicated using a centrally managed contract system as opposed to a more traditional program of extramural grants. A large portion of the contract work was actually performed by private industrial firms. For many years, this program was the subject of great controversy within the scientific community, dividing scientists committed to fundamental research and those with a focus on targeted or directed research. The program was widely criticized for its dependence on contract research and its lack of communication with the scientific community. Indeed, a 1965 White House report, commissioned to determine whether Americans were getting their money's worth from NIH-sponsored medical research, singled out the cancer chemotherapy program for harsh criticism. The report noted that many medical scientists had questioned whether the cost of the program could be justified by its output. The

review group for the program concluded that a substantial fraction of the contract work done within the program was of relatively low scientific quality and showed evidence of inadequate central supervision (U.S. President's NIH Study Committee, 1965). Another independent committee was appointed by the secretary of Health, Education, and Welfare in 1966 to review the funding of NIH research, including the cancer chemotherapy program. Chaired by Jack Ruina, who had extensive experience with grant and contract support of research and development in the Department of Defense, the committee concluded that the grant mechanism was inappropriate for directed research and development programs, and that contracts should be used instead. Nonetheless, the scientific community continued to express dissatisfaction with NCI's directed research efforts. The committee's report stated that plans for directed research, including objectives, justification, expected funding levels, management plans, and types of contractors, should be submitted to an appropriate advisory council for review and approval prior to a program's initiation, termination, or substantial change in scale or direction. The committee recommended, however, that once a program had been initiated, a program manager take full responsibility for its execution and oversight. The committee further urged NIH to take significant steps to make career opportunities and status for program managers more attractive. Moreover, it recommended that the practice of using intramural scientists to oversee directed research be replaced with a strong, independent management structure (U.S. Department of Health, Education, and Welfare, 1966).
In spite of this last recommendation, however, NCI staff who managed the directed research programs also continued to have responsibility for related aspects of the intramural program because of the difficulty in recruiting outside scientific talent to assume these management roles. This situation led to conflicts regarding the promotion and tenure of intramural research staff (Rettig, 1977). The staff's administrative responsibilities for the directed research programs reduced the amount of time they could spend on the conduct of their own research; thus they often published fewer papers than scientists from other branches of NIH. Because the traditional criteria for promotion and tenure stressed productivity in the form of published scientific articles, NCI staff members were often at a disadvantage in tenure and promotion decisions, which were reviewed collectively by the scientific directors of all the NIH institutes. In 1975 the cancer chemotherapy program was combined with the surgery and radiation branches of NCI to form the Clinical Oncology Program (DeVita and Goldin, 1984). The development of therapeutic agents continues to be a focus of NCI's Developmental Therapeutics Program (DTP). Among the chemical compounds that have been slated for clinical development since at least 1981, 13 have been approved by the U.S. Food and Drug Administration (FDA).5 According to a 1995 report (known as the Bishop-Calabresi report) that reviewed NCI's intramural program and was undertaken at the request of then-director Richard Klausner, this program has become an international resource, available to academic and commercial investigators alike (National Cancer Advisory Board, 1995). However, the report criticized the program for being intellectually isolated and underutilized by both the intramural and extramural communities of NCI, in part because of a failure to reach out to the larger community of scientists. The report further criticized the program for a lack of flexibility in its tactics and strategies, and identified problems with accountability and review. NCI has since initiated a new program called Rapid Access to Intervention Development (RAID). The goal of RAID6 is to speed up the preclinical testing of promising drugs by targeting academic laboratories that have novel candidate compounds, but lack the specific resources or expertise needed to develop them further.

Chemical Carcinogenesis Program

NCI's second large-scale, directed program was launched in 1962. The goal of the chemical carcinogenesis program was to evaluate suspected chemical compounds for their cancer-causing properties, using one of two approaches: the first was to analyze occupational settings in which humans were known to be exposed to measurable amounts of specific chemicals; the second was to undertake epidemiological studies to ascertain major differences in the forms and incidence of cancer among various locations and cultures. Of the three NCI programs discussed here, chemical carcinogenesis received the least amount of funding and attention, and was not remarkably productive.
The criteria for determining whether a given chemical is carcinogenic were (and still are) very difficult to establish, and a major obstacle to overcome was again the development of biological tests or models that could predict carcinogenic effects in humans. The undertaking was also seen as potentially leading to conflict with and regulation of the chemical industry, which was not the usual purview of NCI (Rettig, 1977). The research efforts under the program were run by project officers who were trained as research scientists. The project officers set up contracts with various extramural investigators to collaborate within the scope of their own research expertise, and they coauthored extramural

5 See <http://dtp.nci.nih.gov/docs/idrugs/drugstatus.html> [accessed 1/02/03].
6 See <http://dtp.nci.nih.gov/docs/raid/raid_pp.html>.

research publications. Contracts were reviewed by project officers, along with an advisory group, about once a month. The reviewers considered future needs in addition to examining the status of current research contracts. In the late 1970s and early 1980s, extensive changes took place within the chemical carcinogenesis program. The first was that program officers were no longer permitted to be associated with extramural research projects. The review process was completely removed from within the individual intramural programs, and outside review was initiated. The officers still provided oversight for contracts, but could not provide any scientific input, and they were no longer included on extramural publications. The second major change was the separation of testing and basic research. The carcinogen-testing portion of the program was moved to the National Institute of Environmental Health Sciences (NIEHS) and was called the National Toxicology Program (NTP). The NTP is affiliated with FDA, and is funded jointly by FDA, the Centers for Disease Control and Prevention (CDC), and NIEHS (its current annual budget is $160 million). The NTP produces the Report on Carcinogens, a list of all substances that either are known to be human carcinogens or may reasonably be anticipated to be human carcinogens, and to which a significant number of people in the United States are exposed. However, the report does not present quantitative assessments of carcinogenic risk. Basic research in carcinogenesis is now funded through a branch within the Division of Cancer Biology. This basic research is supported by grants, and there are currently no large-scale projects.

Cancer Virus Program

NCI's third major contract program, the special virus cancer program, was established in 1964.
Scientists knew that certain types of cancer in chickens and rodents could be induced by viruses, and this knowledge led to the hypothesis that viruses could also be cancer-causing agents in humans. The goal of the program was to identify such causative viruses and to develop preventive vaccines against them. Eventually, the contract mechanism for this program was replaced by the more traditional investigator-initiated grants, in large part because of scathing criticism in a report on how the contract research program was being run (Culliton, 1974; National Cancer Advisory Board, 1974). In particular, the report criticized the contract proposal process because it was dominated by program officials, included potential and actual contractors in the review of proposals, lacked scientific rigor, and was inaccessible to the larger virology community. In addition, the contract research program represented an extension of the intramural research work of some program scientists.

As a result of the report, the contract review process was modified to make it more open and rigorous (Rettig, 1977). The formal program ultimately faded away, but research on viruses was continued through other programs at NCI and NIH. The cancer virus program could be considered a significant failure of directed research since it did not lead directly to the identification of any viruses that cause human cancer; however, it had many indirect, beneficial effects on the scientific community. Many viruses (mostly RNA viruses) were found to cause cancer in a variety of animals, but investigators had begun to doubt whether any human cancers could be linked to viruses. The first human leukemia virus (HTLV-1) was then identified and characterized by two independent laboratories in the early 1980s (Yoshida et al., 1982; Gallo et al., 1982). Later, it was discovered that the Epstein-Barr virus (a DNA virus) could also cause human cancer. It is now known that two of the most common cancers in the worldwide population, cervical and liver cancer, are caused by virus infections (reviewed by Gallo, 1999). Furthermore, the recognition of viral oncogenes has led to the identification of cellular oncogenes and tumor suppressor genes, which play an important role in most non-virus-associated human cancers. These discoveries did not result directly from the targeted research of the cancer virus program, but certainly were aided indirectly by the groundwork and scientific infrastructure developed by that program. An unintended but beneficial return on the investment in cancer virology was the development of technologies for the field of molecular biology, the purification and production of reverse transcriptase being a prime example. Research on human immunodeficiency virus (HIV) also benefited greatly from the work on retroviruses that was undertaken through the cancer virus program.
Ironically, however, if technology development had been the stated goal of the program, it most likely would have received less funding. At the time the program was initiated, Congress and NIH were not very receptive to funding programs aimed simply at developing biological technologies. In the current environment, technology development may be a more acceptable goal in and of itself.

RECENTLY DEVELOPED LARGE-SCALE PROJECTS AT NCI

The Cancer Genome Anatomy Project

The Cancer Genome Anatomy Project (CGAP) is an interdisciplinary program established and administered by NCI to generate the information and technological tools needed to decipher the molecular anatomy of cancer cells. It was launched after extensive input had been gathered from a committee of external scientists who were considered leaders in the

field of cancer biology. The goal of the CGAP is to achieve a comprehensive molecular characterization of normal, precancerous, and malignant cells in order to determine the molecular changes that occur when a normal cell is transformed into a cancer cell, and then to apply that knowledge to the prevention, detection, and management of cancer.7 Since its inception in 1996, the program has encompassed four primary initiatives:

· The Human Tumor Gene Index identifies genes expressed during the development of human tumors.
· The Cancer Chromosome Aberration Project characterizes the chromosomal alterations associated with malignant transformation.
· The Genetic Annotation Index identifies and characterizes the polymorphisms associated with cancer.
· The Mouse Tumor Gene Index identifies genes expressed during the development of mouse tumors.

The goals of CGAP clearly overlap extensively with some of the goals of the HGP (for example, identifying expressed genes and using experimental high-throughput technologies). In fact, the director of CGAP was first hired by NIH to oversee technology initiatives for the HGP.8 The program was started when NCI was becoming more open to administrative experimentation and to an approach to project management focused on solving problems. The program director reports directly to the NCI director, and is expected to move the field ahead as quickly as possible. This is a somewhat fragile arrangement, as it depends on an NCI director who supports technology development, as well as an institutional culture that welcomes an aggressive program management style similar to the DARPA model (see page 74), with an openness to a directive, problem-solving funding mode that does not always rely on external peer review for project selection. The project includes both intramural and contract funding.
All data and materials from CGAP are shared openly and quickly with the research community without restrictions. The CGAP website includes databases containing genomic data for human and mouse, including ESTs, gene expression patterns, single nucleotide polymorphisms (SNPs), cluster assemblies, and cytogenetic information. Informatics tools to query and analyze the data are also developed by the program and made available online. In addition, NCI provides information on new experimental methods and makes biological reagents developed through the program available to researchers at cost.

7 See <http://cgap.nci.nih.gov/>.
8 Robert Strausberg, director of NCI's CGAP, in presentations to the National Cancer Policy Board.

Investigators funded through the program are required to sign an agreement stating that they will not patent the sequences they acquire. For sequencing projects, NCI has obtained a declaration of "exceptional circumstances" under the Bayh-Dole Act, meaning that contractors do not retain title to inventions developed with the federal funds. NCI is thereby able to mandate immediate disclosure of data by its contractors. In the future, the program may become involved in the development of functional genomics and proteomics databases. The ultimate goal in any case is to develop tools and infrastructures for the scientific community.

Early Detection Research Network

The Early Detection Research Network (EDRN)9 is a relatively new, large-scale program of the Cancer Biomarkers Research Group in the Division of Cancer Prevention at NCI. EDRN is a national network whose purpose is to establish a scientific consortium of investigators with resources for basic, translational, and clinical research aimed at developing, evaluating, and validating biomarkers for earlier cancer detection and risk assessment. The network was established in response to concerns in the field that bringing validated biomarkers into the clinic would require a pooling of resources and expertise. It encourages collaboration and rapid dissemination of information among investigators. In an attempt to bridge the gap between laboratory advances and the clinical adoption of biomarkers, EDRN has brought organizations with varied interests and corporate cultures together in a single scientific consortium.10 Because many steps are necessary to ensure that a marker is accurate, reproducible, and practical for medical application, the consortium is organized into four working and two oversight components.
The working components are as follows:

· Biomarker Developmental Laboratories that identify, characterize, and refine techniques for finding molecular, genetic, and biologic signs of cancer
· Clinical and Epidemiological Centers that focus on providing the network with blood, tissue, other biological samples, and medical information on families with histories of cancer
· Biomarker Validation Laboratories that standardize tests and prepare them for clinical trials, serving as crucial intermediaries between the Biomarker Developmental Laboratories and clinical practice

9 See <http://edrn.nci.nih.gov>.
10 See <http://www.nih.gov/news/pr/may2000/nci-16.htm>.

· A Data Management and Coordination Center to develop standards for data reporting and to study new statistical methods for analyzing biomarkers

The oversight components consist of a steering committee and an advisory committee. The steering committee provides major scientific management oversight, and has responsibility for developing and implementing protocols, designs, and operations. This committee determines which markers identified by the Biomarker Developmental Laboratories should advance to the Biomarker Validation Laboratories. Its members are principal investigators from the funded laboratories and centers, NCI program staff, and other ad hoc members invited by the committee.

The advisory committee reviews the progress of the network, recommends new research initiatives, and ensures that the network is responsive to promising opportunities in early-detection research and risk assessment. Its members are predominantly investigators who are not in the EDRN.

Funding through the program is based on peer review, using criteria established by the steering committee to meet the objectives and needs of the EDRN. Collaborations between funded network investigators and investigators from U.S. and foreign institutes and industries are also encouraged. This type of collaboration is referred to as associate membership.

In 1999, NCI awarded nearly $8 million to create 18 Biomarker Developmental Laboratories.11 They are searching for potential biomarkers by analyzing thousands of samples of breast, prostate, ovarian, lung, bladder, and other cancers. Nine of these 18 grantees are collaborating with industry. In the spring of 2000, EDRN awarded an additional $18 million in first-year funding for nine Clinical and Epidemiological Centers, three Biomarker Validation Laboratories, and the Data Management and Coordinating Center.
Unconventional Innovations Program

NCI's Office of Technology and Industrial Relations (OTIR) was established with the mission of speeding the progress of cancer research by encouraging the development of new technologies and promoting scientific collaborations between NCI and the private sector. It serves as a point of access to NCI for private industry and technology developers, and plays a key role in the management of several programs for NCI, including the Unconventional Innovations Program (UIP).12 The UIP was

11 See <http://newscenter.cancer.gov/>. 12 See <http://otir.nci.nih.gov/otir/index.html>.

created in 1998 with the intent of fostering risky technology development to improve progress in cancer research, a goal that was not a traditional aim at NIH.13 Specifically, the program sought technology platforms integrating noninvasive sensing of molecular alterations in vivo with transmission of information to an external monitor, controlled intervention specific for the molecular profile, and monitoring of the intervention. The program was targeted to invest $48 million over a 5-year period. The first five contracts were issued through the program in 1999, totaling about $11 million over 3 years. In 2000, four contracts were issued, totaling about $9 million over 3 years.

Before soliciting the first round of applications for UIP funding, NCI requested input on new opportunities for the detection and treatment of cancer at the earliest stages by calling for "white papers" describing those opportunities.14 The interest and involvement of investigators from disciplines that have not traditionally received support from NCI were specifically recruited. Ideas and information submitted by investigators contributed to the development of the first Broad Agency Announcement (BAA) solicitation for the UIP in 1999. A BAA stipulates technical goals but does not specify how to achieve them, so applicants are encouraged to propose different technological approaches. This mechanism is commonly used by DARPA (see page 74) and the Office of Naval Research, but it is an unusual approach within NIH. However, the selection process for UIP contracts, similar to most NIH funding mechanisms, is based on peer review. All proposals are evaluated by a peer review group known as the Technology Evaluation Panel, which considers four criteria: potential contribution and relevance to the UIP, technical approach, the applicant's capabilities, and plans and capability to accomplish technology maturation.
The management style of the UIP also resembles that of DARPA, involving continued interaction between NCI staff and awardees. Yearly meetings of principal investigators funded through the UIP are held to provide a forum for discussing progress, forging collaborations among investigators, showcasing complementary programs and resources, and soliciting feedback from investigators. (At NCI, convening a planning group is common, but regular meetings of awardees are not.) One goal of these meetings is to bring together a critical mass of investigators in a particular area who might otherwise not communicate with each other.

The UIP clearly differs from the core scientific programs at NIH in

13 Carol Dahl, former director of the NCI's Unconventional Innovations Program, in a presentation to the National Cancer Policy Board, July 16, 2002. 14 See <http://amb.nci.nih.gov/> [accessed 1/10/00].

being focused on high-risk technology development, as opposed to hypothesis-driven research. An explicit mandate of the UIP is to develop enabling technologies and build infrastructure to advance an entire field, as well as to create new fields. The program objectives expressly call for the development of technologies that target quantum improvements in existing technologies or entirely new approaches, rather than incremental improvements to the state of the art.15 The original request for white papers defined the ultimate goal of the program as follows:

Building on the work of CGAP in molecular profiling of tumors, the NCI wishes to create technology platforms that will revolutionize cancer detection, diagnosis and treatment. The NCI is interested in identifying technology systems or components that will enable sensing of molecular alterations in the body in a way that is highly sensitive and specific, yet non-intrusive. The technology system should additionally serve as the platform for, or have a seamless integration with, capabilities for the intervention specific for the detected molecular profile. Building on this ambitious objective will require the development and integration of a series of capabilities including highly specific molecular recognition, signaling capability, controllable intervention capabilities, methods for monitoring intervention release and impact, and biotolerance. This will require the input and collaboration of investigators from a variety of disciplines, many of which have not traditionally engaged in cancer research.16

Although a number of papers have been published recently by the awardees of the program, it is still too early to measure the program's success.

15 "This program seeks to stimulate development of radically new technologies in cancer care that can transform what is now impossible into the realm of the possible for detecting, diagnosing, and intervening in cancer at its earliest stages of development." See <http://otir.nci.nih.gov/tech/uip.html>. 16 See <http://amb.nci.nih.gov/> [accessed 1/10/00].
However, satisfaction with the progress of the program was sufficient for NCI to enter into a new collaboration with the National Aeronautics and Space Administration (NASA) for a project with goals that are complementary to those of the UIP. A joint NASA/NCI solicitation was released on January 3, 2001, to support fundamental technologies for the development of biomolecular sensors.

Mouse Models of Human Cancers Consortium

The Mouse Models of Human Cancers Consortium (MMHCC), assembled from multidisciplinary teams of scientists, was established in 1999 for the collaborative development, characterization, and validation of mouse models that parallel the ways in which human cancers develop,

progress, and respond to therapy or preventive agents. As in the case of CGAP, the MMHCC was launched with considerable input from the scientific community. One goal of the program is to define the standards by which to validate the models for their relevance to human cancer biology and for testing therapy, prevention, early detection, or diagnostic strategies. Ultimately, the Consortium is responsible for choosing which existing mouse cancer models warrant full characterization for their relevance to human cancer, and which new models should be derived and characterized when no model exists for a given malignancy.

The purpose of implementing the MMHCC was to accelerate the pace at which mouse models are made available to the research community for further investigation or application. The consortium enables interactions to foster the rapid exchange of ideas, information, and technology. NCI works with the consortium to organize workshops and symposia, to provide information about the models and related technology, and to plan for distribution of the validated mouse models to the cancer research community.

Funding for members of the MMHCC is available through both NIH intramural projects and the U01 funding mechanism (see Chapter 4 for an explanation of the various NIH funding mechanisms).17 The program has thereby supported many small individual projects with grants similar in size to a typical R01 grant. However, the U01 mechanism is a cooperative agreement, in which substantial NCI scientific and programmatic involvement with the investigators is expected. Oversight is provided at three levels: the NCI program director, a steering committee, and an advisory group. The program director, an extramural scientist administrator of NCI, has substantial authority to assist, guide, coordinate, and participate in the conduct of the Consortium's activities, and also serves as a voting member of the steering committee.
The steering committee, which meets twice a year, is the main governing board of the MMHCC. It sets priorities for model derivation, defines the parameters for model validation, identifies technological impediments to success and strategies for overcoming them, and decides when models should be made available to the cancer research community for individual investigator-initiated projects. Committee voting members include the principal investigator and an additional senior investigator from each U01 or NIH intramural project, the NCI program director, and three members of the NCI Mouse Models Advisory Group. The advisory group consists of NCI and NIH extramural staff who represent the breadth of scientific expertise and program responsibilities that relate to the goals

17 RFA CA-98-013, 1998. See <http://grants.nih.gov/grants/guide/rfa-files/rfa-ca-98-013.html>.

of the MMHCC. It meets regularly to review the progress of the MMHCC, to advise the NCI program director about emerging scientific and technological advances that could further the consortium's goals, and to collaborate on the design and implementation of MMHCC workshops and symposia.

Funding decisions are based on peer review of the scientific merit of applications, in which investigators are asked to address questions regarding the available infrastructure, plans for model derivation, available technology, and plans for interactions with other MMHCC members. The standard review criteria include the significance of the project, the scientific approach, the level of innovation, the qualifications of investigators, and the research environment.

The Consortium was officially launched when 19 groups of investigators from more than 30 institutions were provided with MMHCC funding to develop and evaluate mouse models for cancers of eight major organ systems: breast, prostate, lung, ovary, skin, blood and lymph system, colon, and brain.18 The Consortium has since grown into an international collaboration involving more than 70 institutions.

Specialized Programs of Research Excellence

In 1992, NCI established the Specialized Programs of Research Excellence (SPOREs)19 to promote interdisciplinary research through a special $20 million appropriation from Congress. The program focuses on translational research, with the goal of enhancing communication and cooperation between basic and clinical scientists in order to move basic research findings from the laboratory to the clinic more quickly. SPORE scientists are expected to work as teams rather than as independent investigators, with the hope that such collaborations will allow scientists to tackle research questions that could not otherwise be addressed.
SPORE grant applications undergo traditional peer review, but are also assessed on a number of criteria specific to the program. Each proposed research project must be led by co-principal investigators with expertise in basic and clinical research, and must include at least four independent investigators who currently serve as principal investigators on other peer-reviewed research grants. Proposals must also include a minimum of four research projects that represent a "balance and diversity" of translational objectives, such as screening, prevention, diagnosis, and treatment. In addition, NCI requires that a portion of the funds be used to collect and distribute patient tissues and other biological samples.

18 See <http://www.nih.gov/news/pr/dec99/nci-28.htm>. 19 See <http://spores.nci.nih.gov/>.

SPORE proposals must also include a plan for evaluating the scientific progress and translational potential of all projects, as well as plans for replacing the projects as necessary. This is most often accomplished through annual meetings at which SPORE scientists share data, assess research progress, and identify new research opportunities and priorities. Replacement projects are reviewed by NCI program staff, but do not undergo additional peer review.

SPORE grants are limited to $1.75 million in direct costs and $2.75 million in total costs for 5 years. When first launched, the program solicited grant applications through requests for applications (RFAs), but more recently it has switched to program announcements (PAs) in order to broaden the investigator-initiated applications for all types of cancers (for more information on RFAs and PAs, see Chapter 4). In either case, the P50 funding mechanism (specialized center grant; see Box 4-7 in Chapter 4) has been used to provide grant money through the program.

In 2002, NCI funded SPOREs to study cancers of the breast, prostate, lung, gastrointestinal tract, ovary, genitourinary tract, brain, skin, and head and neck, as well as lymphoma. In the coming years, NCI plans to increase the use of the SPORE mechanism to provide funding for other major cancers, including gynecological tumors, leukemia, myeloma, and pancreatic cancer. However, the report from a recent review of the SPORE program noted that while it is a vital component of NCI's translational research effort, it cannot continue to grow at its present rate. The report also recommended that NCI make a concerted effort to improve the efficiency, effectiveness, and evaluation of SPOREs (National Cancer Advisory Board, 2003).

The Molecular Targets Laboratory

NCI recently awarded a $40 million, 5-year contract to Harvard University to establish a Molecular Targets Laboratory.
The goal of this laboratory is to develop research tools, such as protein arrays, and to synthesize thousands of small molecules and screen them for their biological effects (ScienceScope, 2002). Small molecules identified in such screens can provide versatile research tools for the study of protein function (reviewed by Stockwell, 2000) that can be rapidly adopted by many laboratories, and also provide the first step toward the development of a new therapeutic drug. The data produced by the Harvard group will form the basis for an NCI-sponsored database on chemical genetics. Known as ChemBank among some supporters, this database would essentially serve as a chemical version of GenBank, NIH's online repository for genetic data (Adam, 2001a). NCI hopes that scientists from around the world will also deposit their data on the effects of small molecules on proteins, cell pathways, and tissue formation.

The new facility will be an outgrowth of the Harvard Institute of Chemistry and Cell Biology (ICCB),20 which was founded in 1997 as a collaboration of academic scientists and industrial partners with funding from Merck, Merck KGaA in Germany, and the NCI. The ICCB was established to facilitate collaborations between chemists and cell biologists, and to conduct high-throughput screens of chemical libraries.

RECENT EXAMPLES FROM OTHER BRANCHES OF NIH

The recent doubling of the NIH budget provided new opportunities for the initiation of several large-scale research efforts that might not have been feasible or acceptable to the research community in the past. The relatively large and rapid funding increase allowed NIH to launch new programs even while increasing the number of traditional, investigator-initiated grants (known as R01 grants; see Box 4-7). This phenomenon is perhaps most striking for the National Institute of General Medical Sciences (NIGMS), traditionally known as the "R01 Institute," which established several new large-scale initiatives in recent years, several of which are described below. A program established by the National Institute of Allergy and Infectious Diseases for distributing tools and reagents made possible by large-scale genomics projects is also described.

NIGMS Glue Grants

NIGMS launched a new initiative to fund large-scale collaborative projects in 1999. This initiative was the result of consultations with leaders in the scientific community who said that the most challenging biological problems require the expertise and input of large, multifaceted groups of scientists. The projects are referred to as "glue grants" because they are meant to provide the resources necessary to bring scientists together to focus on a research topic, with the goal of addressing problems beyond the reach of individual investigators.
An RFA was issued in 1999,22 with the expectation that participating investigators would already hold funded research grants related to a proposed topic of study that was of central importance to biomedical science and to the mission of NIGMS. Support for new individual research projects was not the intent of these large-scale project awards; rather, a significant level of support was offered so that investigators could extend their research efforts by forming a consortium to approach a research problem of overarching importance in a comprehensive and highly integrated fashion. It was noted in the RFA that:

Biomedical science has entered a new era where these collaborations are becoming critical to rapid progress. This is the result of several factors. First, not every laboratory has the breadth to pursue problems that increasingly must be solved through the application of a multitude of approaches. These include the involvement of fields such as physics, engineering, mathematics, and computer science that were previously considered peripheral to mainstream biomedical science. Second, the ability to attack large projects that involve considerable data collection and technology development requires the collaboration of many groups and laboratories. Finally, large-scale, expensive technologies such as combinatorial chemistry, DNA chips, high throughput mass spectrometric analysis, etc., are not readily available to all laboratories that could benefit from their use. These technologies require specialized expertise, but could lend themselves to management by specialists who collaborate or offer services to others.

In the fall of 2000, NIGMS announced that it would provide $5 million for the first year to a consortium of basic scientists called the Alliance for Cellular Signaling (AFCS), with the expectation of spending a projected total of $25 million on the project over the course of 5 years.23 The project aims to study all aspects of cellular communication in two cell types: cardiomyocytes and B-cells. The primary goal of the effort is to map the immense complexity of intracellular signals in both cell types, with the ultimate objective of being able to search for and test "in silico"24 new therapeutic compounds that affect these signaling pathways.

20 See <http://sbweb.med.harvard.edu/~iccb/>. 21 Judith Greenberg, acting director of NIGMS, in a presentation to the National Cancer Policy Board on July 16, 2002. 22 RFA GM-99-007, May 26, 1999. See <http://grants.nih.gov/grants/guide/rfa-files/RFA-GM-99-007.html>.
The AFCS is a consortium of approximately 50 scientists working at 20 different academic institutions around the country. AFCS investigators work in core laboratories located at several different academic centers, including the California Institute of Technology in Pasadena; the San Francisco Veterans Administration Medical Center; Stanford University; the University of California, San Diego; and the University of Texas Southwestern. Two biotechnology companies will also participate in AFCS studies by providing custom-made materials, such as antisense reagents (ISIS Pharmaceuticals of Carlsbad, California) and two-hybrid analysis technology, a method used to track interactions between proteins inside cells (Myriad Genetics, Inc., of Salt Lake City, Utah).

23 NIH News release, September 5, 2001. See <http://www.nigms.nih.gov/funding/gluegrant_release.html>. 24 Using a computer model rather than traditional laboratory experiments.

All of the data produced in the core laboratories will be deposited immediately in a publicly accessible database, and investigators will relinquish patent rights to the information. Once the data have been posted publicly, any scientist, whether a member of the AFCS or not, can use them for research that may lead to patents. The consortium also uses virtual conferencing via Internet2, a university-based version of the Internet, to encourage open and rapid communication among members.

In addition to support from NIGMS, other funding for the AFCS project will be provided by several nonprofit organizations and pharmaceutical companies. They include Eli Lilly and Company, Johnson and Johnson, the Merck Genome Research Institute, Novartis Pharmaceuticals Corporation, Chiron Therapeutics, Aventis, and the Agouron Institute.

In the fall of 2001, NIGMS announced the provision of $8 million for a second glue grant to the Cell Migration Consortium. The institute plans to spend an estimated $38 million on the project over the next 5 years. The project will bring together a large group of disparate scientists (biologists, chemists, biophysicists, optical physicists, mathematicians, computer scientists, geneticists, and engineers) from 12 academic medical centers across the country to study the mechanism of how cells move. A secondary goal of the Consortium is to facilitate the translation of new discoveries in cell migration into the development of novel therapeutic drugs and treatments. Understanding of cell migration could potentially lead to advances against a variety of diseases, such as cancer, in which cell movement leads to lethal metastases. Two additional glue grants have since been awarded, for a study of Inflammation and the Host Response to Injury and for a Consortium for Functional Glycomics.

The selection of proposed consortia for funding is based on traditional NIH peer review.
The standard review criteria are used, including the significance of the proposed project, the experimental approach, the degree of innovation, the qualifications of investigators, and the scientific environment. Applications are actually made in two phases. Phase I applicants submit an overview of the proposed large-scale project for peer review. The purpose of this first phase is to provide resources for detailed planning to applicants who have demonstrated the selection of an appropriate complex biological problem, an innovative plan, and appropriate commitments to its solution from participating investigators and institutions. Successful Phase I applicants receive a $25,000 planning grant, and those applicants who receive awards are eligible to submit a more extensively planned and detailed application for a Phase II award to support the large-scale project itself. Phase II applications must provide specific intermediate goals (milestones) and a timeline for their accomplishment. These goals are adjusted annually at the award anniversary date to incorporate accomplishments made to date, progress in the field, and input from an advisory committee. Applications must also include an administrative management plan, a project management plan, and a plan for data sharing and intellectual property.

In addition to the principal investigator and participating investigators, other essential components of a large-scale collaborative project include a steering committee, an external advisory committee, and a program director. The steering committee is largely responsible for governance of the project and plays a major role in developing goals and operating procedures. The committee is chaired by the principal investigator, and its membership is chosen from participating investigators and project staff. The external advisory committee meets annually with the steering committee to assess progress and provide feedback on proposed goals for the next year of support. The members of this committee, who are not involved in the project, are appointed by the principal investigator in consultation with the steering committee and with the approval of the NIGMS program director after the Phase II award has been made. The NIGMS program director has considerable influence over the project by facilitating interactions between the steering and advisory committees and by facilitating communication with the scientific community directly affected by the collaborative project. The program director also serves as a voting member of the steering committee.

The RFA for large-scale collaborative projects was reannounced in 2001.25 In addition, a related PA was published in 2000. The purpose of this initiative, entitled "Integrative and Collaborative Approaches to Research,"26 is to provide groups of currently funded investigators at different institutions with additional support for collaborative and integrative activities.
The initiative is intended to support collaborative research and resources on a modest scale, involving a small number of funded investigators working on a common problem. The maximum direct cost per year is $300,000. Unlike an RFA, a PA is an ongoing announcement for which there is no set-aside of funds.

NIGMS Protein Structure Initiative

NIGMS recently launched a new large-scale, cooperative effort known as the Protein Structure Initiative (PSI) (Smaglik, 2000). The goal of the 10-year project is to foster the new field of structural genomics.27 Following the completion of the human and other genome projects, a crucial next step in understanding biology is determining the structure and function of the entire set of gene products (Burley, 2000). Sequences from the human genome are being analyzed to identify distinct protein families. Structural genomics uses these computational analyses, along with structural determinations of the protein products, to advance the study of protein function.

The project will take place in two distinct stages. The first 5 years will be focused on technology development, while the remaining 5 years will be devoted to determining the structures of proteins in various protein families from different organisms, including bacteria, yeast, roundworms, fruit flies, and humans. In September 2001, NIGMS awarded almost $30 million to seven research centers, each receiving approximately $4 million for the first year. The Institute anticipates spending a total of around $150 million on these projects over 5 years. The projects at the research centers are intended to serve as pilots leading to subsequent large-scale research networks in structural genomics. The first goal is to improve and automate methodologies for X-ray crystallography and nuclear magnetic resonance spectroscopy. Although structure determination techniques have advanced dramatically in recent years, they are still time-consuming and labor-intensive. The centers are attempting to speed up and decrease the cost of every aspect of the process: protein family classification and target selection, protein expression, protein purification, sample preparation (crystallization or isotopic labeling), structure determination, and analyses of results. The effort to develop high-throughput technologies will require the skills of chemists, engineers, and computer scientists, as well as biologists.

25 RFA GM-01-004, February 28, 2001. See <http://grants.nih.gov/grants/guide/rfa-files/RFA-GM-01-004.html>. 26 PA-00-099, May 24, 2000. See <http://grants.nih.gov/grants/guide/pa-files/PA-00-099.html>.
Unlike the field of genomics, which was accelerated by robotic DNA sequencers, structural biology and proteomics are unlikely to be dominated by a single technology (Service, 2001c). Moreover, a recent International Conference on Structural Genomics revealed that technology development is complex and unpredictable (Service, 2002).

The second 5-year phase was intended to focus on full-scale production. The plan was to organize all known proteins into structural families based on their genetic sequences. The goal was then to determine the structure of a few proteins from each family, for a total of about 10,000 protein structures by the end of 10 years. However, the current pace of the effort suggests that this goal is unlikely to be achieved in the expected timeframe, so NIGMS will need to make difficult decisions about how to proceed for the second 5 years (Service, 2002). The information generated

27 NIH news release, September 26, 2000; see <http://www.nigms.nih.gov/news/releases/SGpilots.html>.

in the second phase is intended to form the foundation of a public resource linking sequence, structural, and functional information. This resource could also allow scientists to use gene sequences to predict the approximate structures of other proteins.

There is also much interest among pharmaceutical and biotechnology companies in pursuing structural genomics projects (Smaglik, 2000; Service, 2001a-c). However, industry researchers are more likely to focus on medically relevant proteins, rather than whole classes of proteins. Moreover, companies are often more interested in the structures of proteins with different compounds bound to them than in the structure of the protein alone. The public project, in contrast, seeks breadth of basic data through the selection of proteins covering a wide variety of structures. The main goal of the NIGMS-sponsored project is to develop a detailed database that can serve as a valuable resource and research tool for scientists engaged in both basic and clinically relevant research. In this regard, the project is quite similar to the publicly funded HGP. Nonetheless, some public-private collaborations in structural genomics have also been initiated (Stevens et al., 2001; Service, 2002). For example, the NIGMS-funded Joint Center for Structural Genomics (JCSG) has contracted work with the Genomics Institute of the Novartis Research Foundation, which is collaborating with biotechnology companies like Syrrx to speed technology development. The JCSG is also seeking collaborations with international structural genomics consortia to improve high-throughput technologies. Such consortia have been launched in many countries, including Japan, Great Britain, and Canada, in the last 2 years (Stevens et al., 2001).

The NIGMS-funded PSI encompasses two PAs28 and an RFA. (For more information on the PA and RFA funding mechanisms, see Chapter 4.)
These announcements resulted in part from recommendations made at three NIGMS-sponsored workshops on structural genomics, held in 1998 and 1999. The RFA29 was issued in 1999 and again in 2000, but will not be reissued. The seven awards described above, plus two more awarded in the second round, were made through the RFA. The two PAs encourage scientists to develop new methods and technologies for enhancing the efficiency of structure determination through high-throughput approaches. The PAs are ongoing and provide support for traditional individual research grants (R01), program projects (P01), and small-business research grants (Small Business Innovation Research [SBIR]/Small Business Technology Transfer [STTR]).

In the case of grants made through the RFA (using the P50 research center award mechanism), NIH has set forth a number of special requirements for application and post-award management. Applicants are solely responsible for the planning, direction, and execution of their projects, so effective plans for management and administration of the research center are crucial for the application process. The principal investigator is expected to make any adjustments in scientific direction necessary to accommodate the continually changing technological environment. Each research center must appoint its own external scientific advisory committee, composed of research scientists not involved in the consortium, to provide independent assessment and advice to the principal investigator and staff. This committee is expected to meet at least twice each year. Significant changes in project direction must be reported to NIGMS staff, and scientific and programmatic visits to the grantee are conducted to ensure that the project remains focused on appropriate goals, incorporates new technological advances, and makes sufficient progress. The benchmarks used to assess progress may be changed annually, and NIGMS may include outside consultants in the annual progress review. Funds may be reduced or withheld for failure to meet milestones agreed upon by grantees and NIH staff. In addition, grant recipients are required to attend annual meetings at NIH to discuss progress and results.

Grant applicants must also present plans for adherence to several policies adopted by NIGMS regarding research training, intellectual property, and data release. Because the research projects of the PSI involve extensive data collection and technology development with limited hypothesis-driven aspects, NIGMS generally considers them inappropriate as research training projects for graduate students and postdoctoral scientists. The work is more likely to require project managers and technicians.

28 PA-99-117, June 25, 1999; PA-99-116, June 25, 1999.
29 RFA GM-99-009, June 3, 1999; RFA GM-00-006, July 24, 2000 (reannouncement).
Thus, applicants planning to employ graduate students or postdoctoral fellows on their project must justify the request.

NIGMS monitors its grant recipients' activities with respect to patenting the structural results and technology developments as well. The results of the structural genomics projects are meant to be freely available for use by the entire research community, and therefore must be deposited promptly,30 prior to publication, in the Protein Data Bank (PDB),

30 According to the NIGMS Statement on Coordinate Deposition for Structural Genomics, an international agreement called for releasing structure information on most proteins soon after completion, but setting aside some structures for a limited period of time (less than 6 months) to allow for application for patents. Because the NIH research centers are just beginning their work, it is unclear how much time is needed to ensure that the results are accurate and to prepare the results for publication and deposition in the Protein Data Bank. The current goal of the PSI is to limit this time to 4 to 6 weeks. This should also be adequate time for the investigators to file patent applications for protein structures of commercial interest. See <http://www.nigms.nih.gov/funding/psi.html> [accessed 9/24/01].

which is in the public domain. Grantees are also required to develop and maintain their own public website containing information on strategies for target selection, the status of research on these proteins, technological and methodology findings, high-throughput approaches, efficiency, and cost analyses.

The Pathogen Functional Genomics Resource Center

The National Institute of Allergy and Infectious Diseases (NIAID) recently established a centralized facility providing the research community with resources for conducting functional genomics research on human pathogens and invertebrate vectors.31 NIAID awarded a 5-year, $25 million contract to TIGR to establish the Pathogen Functional Genomics Resource Center (PFGRC), which will provide scientists with microarrays, gene clones, and other reagents and tools for genomics research (Malakoff, 2001). A scientific advisory committee provides advice to NIAID to assist in guiding the activities of the PFGRC. The impetus for the new center was in part to avoid funding duplicate requests to NIAID by centralizing some toolmaking and training activities, but the center can also now make standardized research tools more easily available to microbial researchers, including those at small institutions who would not otherwise have such access. The Institute plans to select ten organisms for reagent development in the next 3 years, three of which are being developed for the first year.32 The PFGRC also aims to support the development of emerging genomic technologies and to train scientists in the latest techniques in functional genomics.

Because the microarrays are limited in quantity, scientists interested in obtaining them must submit brief proposals to NIAID describing research plans for their utilization. The microarrays will be provided (150 slides for a given organism per request) for both exploratory/developmental and established research projects.
Proposals, limited to five pages, must include a research plan stating the specific aims, the significance of the research question, the potential impact on the field, and the experimental design to be used. Applicants must also provide documentation that they have access to the resources and expertise necessary to design, perform, and interpret the experiments, including data analysis. In addition, requesters must agree to NIAID's data release policy, which requires the timely dissemination of microarray data in either a publicly developed database supported by the PFGRC or another publicly available

31 See <http://www.niaid.nih.gov/dmid/genomes/pfgrc/>.
32 Further details about how organisms are selected can be found at <http://pfgrc.tigr.org>.

database, as designated by NIAID. Requests are reviewed in a confidential manner by a committee following the usual NIH peer review criteria (significance, approach, investigator, and environment). Investigators are selected on the overall merit of the proposal, but the availability of reagents may also be taken into consideration. NIAID anticipates that the review process can be completed within 2 weeks of the deadline for receipt of applications.

The Women's Health Initiative

The NIH launched the Women's Health Initiative (WHI) in 1991 with the broad goal of investigating strategies for the prevention and control of some of the most common causes of morbidity and mortality among postmenopausal women, including cancer, cardiovascular disease, and osteoporotic fractures.33 In October 1997, the WHI was transferred to the National Heart, Lung, and Blood Institute (NHLBI), where it has functioned as a consortium effort led by NHLBI in cooperation with NCI and the National Institute of Arthritis and Musculoskeletal and Skin Diseases. The WHI is one of the largest studies of its kind ever undertaken in the United States (see Table 3-1). The effects of hormone replacement therapy (HRT) and diet on the health of postmenopausal women were investigated for almost a decade prior to the WHI. Because of a lack of funds, however, no studies of sufficient size and duration to test with confidence the value and risks of these approaches had been initiated (Rossouw et al., 1995). The WHI involves more than 40 centers nationwide and 162,000 women aged 50-79, about 18 percent of whom represent minority groups. Enrollment in the study began in 1993 and ended in 1998. Participants will be followed for 8 to 12 years.
The WHI consists of three studies:

· A clinical trial that tests the effects of three different prevention approaches (HRT, diet modification, and calcium and vitamin D supplementation) on heart disease, cancer risk, and osteoporosis. All three approaches are being studied using a randomized, controlled trial design. Depending on their eligibility, women chose to enroll in one, two, or all three parts of the clinical study. Altogether, the three components involve 68,000 women who are randomized to receive the different interventions.

· An observational study involving about 94,000 women to investigate the interplay among health, lifestyle, and other disease risk factors. The goal is to identify predictors and biological markers for disease. The

33 See <http://www.nhlbi.nih.gov/health/public/heart/other/whi/wmn_hlt.htm>.

TABLE 3-1 Women's Health Initiative Costs

                                     Average Cost per Year   Total Cost, All Years (~15 years)
Clinical Trial and                   (in millions            (in millions
Observational Study Budget           of dollars)             of dollars)

Clinical Coordinating Center            11.7                    175.5
40 Clinical Centers                     35.3                    530.0
Total                                   47.0                    705.5

Clinical Trial by Component
Calcium, Vitamin D                       1.2                     18.2
Hormone Replacement Therapy             15.5                    232.4
Dietary Modification                    27.7                    415.1
Observational Study                      2.6                     39.8
Total                                   47.0                    705.5

Clinical and Observational           Number       Average Cost per Year   Total Cost, All Years
Study Cost per Participant           Enrolled     (in dollars)            (in dollars)

Observational Study                   93,676         28                      425
Calcium/Vitamin D                     36,282         33                      501
Hormone Replacement Therapy           27,347        567                    8,499
Dietary Modification                  48,836        567                    8,499
All Clinical Trials and
  Observational Studies              161,809*       291                    4,360

Community Prevention Study           1994    1995   1996   1997   1998   1999   Total
Total Cost (in millions of dollars)  0.16    4.0    4.0    4.0    4.2    4.0    20.36

Women's Health Initiative Total Cost (in millions of dollars): 725.8

*Because some participants may be enrolled in more than one study or trial at the same time, this number represents the total number of enrolled participants, but is not a sum of the numbers above it.

SOURCE: Personal communication with Jacques Rossouw, director, Women's Health Initiative, Office of the Director, NHLBI, March 2002.

women receive no specific intervention, but their medical history and health habits are followed over the course of the study.

· A community prevention study to determine how women can best be encouraged to adopt healthful behaviors, such as an improved eating plan, nutritional supplementation, smoking cessation, physical activity, and early detection of treatable health problems. Conducted through eight community prevention centers based at universities, the 5-year study is aimed at developing model programs that can be implemented nationwide. This study entails a unique 5-year cooperative venture with CDC.

The Fred Hutchinson Cancer Research Center in Seattle, Washington, serves as the WHI Clinical Coordinating Center for data collection, management, and analysis. The WHI is a large-scale-science project not so much because it is employing high technology to discover biological processes, but more because of its size and collaborative nature. The initiative is focused on studying the impact of practical and feasible interventions for diseases common among women, involving hundreds of investigators at scores of institutions, at a cost of hundreds of millions of dollars.

Surprising findings were recently reported for the WHI's HRT trial, and that portion of the study was terminated 3 years early for ethical reasons on the basis of those results (Rossouw et al., 2002; Enserink, 2002b). Although many previous observational studies had indicated that HRT was beneficial for reducing cardiovascular disease, the randomized, controlled trial of the WHI showed an increase in heart disease, stroke, and pulmonary embolisms, as well as an increase in invasive breast cancer, among women taking estrogen and progesterone. Although HRT reduced the incidence of bone fractures and colorectal cancer, these benefits did not outweigh the other risks.
A similar large-scale study of HRT in the United Kingdom34 was also halted as a result of the apparent risks identified by the WHI study, despite criticism of the design and analysis of the U.S. study (Enserink, 2002a; Couzin and Enserink, 2002).

VACCINE RESEARCH

A large-scale approach to research is becoming the norm in the field of vaccine development, especially with respect to AIDS (acquired immune deficiency syndrome) vaccine research. The field has been boosted by a large influx of funding in recent years from both federal and philanthropic sources. The Bill and Melinda Gates Foundation has provided more than $125 million for the International AIDS Vaccine Initiative since its creation in 1999 (Cohen, 2002), and NIH has made a similar investment in targeted vaccine research in the United States.

In May 1997, President Clinton set a goal to develop an AIDS vaccine within 10 years. NIH responded by creating the Vaccine Research Center (VRC),35 a state-of-the-art biomedical research laboratory to facilitate multidisciplinary research aimed at vaccine development. Although the primary focus of VRC research is the development of an AIDS vaccine, the center also has a broader mission to advance the development of vaccines for all diseases, based on the premise that what is learned with other diseases may be helpful in the research on AIDS, and vice versa. The center focuses primarily on the preclinical and early clinical stages of vaccine development, but works closely with the HIV Vaccine Trials Network, which conducts all phases of clinical trials.

A novel venture within the NIH intramural research program, the VRC receives joint funding from NIAID and NCI and is spearheaded by NIAID, NCI, and the NIH Office of AIDS Research. A new building for the center, costing between $35 million and $40 million, officially opened in spring 2001. The center had an operating budget of $26 million for fiscal year 2000, but the budget has increased as the program has expanded to full capacity. The center employs about 100 scientists and support staff, including tenure-track scientists, staff scientists, postdoctoral fellows, and graduate students, drawn from an array of disciplines such as immunology, virology, and vaccine development (Gershon, 2000). The VRC also works with scientists in academic, clinical, and industrial laboratories through a program of national and international collaborations. In addition, the VRC is directed to actively seek industrial partners for the development, efficacy testing, and marketing of vaccines.

34 Women's International Study of Long Duration Oestrogen After Menopause (WISDOM).
NATIONAL SCIENCE FOUNDATION'S SCIENCE AND TECHNOLOGY CENTERS PROGRAM

The Science and Technology Centers (STC) Program of the National Science Foundation (NSF) was established in 1987 to fund basic research and education activities and to encourage technology transfer and innovative approaches to interdisciplinary programs.36 The program offered a novel approach to research by creating large, multidisciplinary programs at universities. STC grants, which are open to researchers working in any area typically supported by NSF, provide up to $20 million over 5 years, with a possibility of 5 additional years of support pending the results of an extensive midterm review. The program thus provides a mechanism by which the basic research community can take a relatively long-term view of science. The goals of the STC Program are to enable academic research teams to:

· Exploit opportunities in science and engineering in which the complexity of the research problems or the resources needed to solve them require the advantages of collaborative relationships that can best be provided by campus-based research centers.

· Involve students, research scientists, and engineers in partnerships to enhance the training and employability of professionals through an awareness of potential applications for scientific discoveries.

· Provide long-term, stable funding at a level that encourages risk taking and ensures a solid foundation for attracting quality undergraduate and graduate students (with special emphasis on women and minorities) into science and technology careers.

· Facilitate the transfer of knowledge among academia, industry, and national laboratories.

Thus STCs are expected not only to serve as critical national resources for research, but also to improve education in local schools; strengthen undergraduate and graduate training; improve minority representation in the sciences; and develop collaborations with other academic institutions, industry, and the community.

There have been four competitions for STC grants, with the first centers being funded in 1989. Six new centers were selected in 2002, bringing the total number of STC awards to date to 36. Only 2 centers have been terminated for falling short of their stated goals. Selection entails a 2-year process of proposal development, review, revision, and site visits, with a proposal's management plan being key to an applicant's success. Funding is provided through cooperative agreements with NSF and comes with extensive hands-on supervision by NSF officials (Mervis, 2002).

35 See <http://www.niaid.nih.gov/vrc/default.htm>.
36 See <http://www.nsf.gov/od/oia/programs/stc/start.htm>.
Initially, the program was controversial.37 Scientists worried that the proposed centers would drain funds from NSF's traditional support for individual investigators, or that they would promote applied research at the expense of basic science (Mervis, 2002). However, the program's annual budget of $45 million is only 1.1 percent of NSF's overall research budget, and many centers focus on very basic research. Indeed, agency officials report that many countries have sought NSF's advice in creating similar programs (Mervis, 2002). Furthermore, a 1996 review of the program by the NRC was quite positive. The review panel concluded:

37 The program likely would have been even more controversial if many issues had not already been addressed in the late 1970s in establishing the Engineering Research Centers, which were NSF's first large-scale multidisciplinary centers in universities.

Most STCs are producing high-quality, world-class research that would not have been possible without a center structure and presence.... The design of the STC program has produced an effective means for identifying particularly important scientific problems that require a center mode of support. Many STCs also provide a model for the creative interaction of scientists, engineers, and students in various disciplines and across academic, industry, and other institutional boundaries (NRC, 1996, p. 2).

The panel also suggested that the center approach was a valuable and necessary tool in NSF's portfolio of support mechanisms, and that the nation and NSF were getting a good return on their relatively small investment. One program cited in the report as particularly successful was the Center for Biological Timing. The panel noted that this center had produced an impressive scholarly output in terms of both quantity and quality, and that these studies could realistically have been accomplished only through center support because of their complexity and long-term nature, as well as the unlikelihood of their being supported through traditional investigator-initiated programs. Indeed, the panel concluded in general that the STC mode of support allows certain types of research problems to be addressed that otherwise would not be taken up. The panel noted that research problems fall along a spectrum, with some being well suited to an individual-investigator approach to inquiry, others to a center mode, and still others to a facility model; STCs thus serve as one means of support that helps balance the NSF portfolio of funding instruments.

Thus, the NRC panel recommended that NSF continue the STC program. A number of additional recommendations for improving the program included placing greater weight on scientific and administrative leadership in evaluating proposals for STCs and in the periodic reviews of centers (NRC, 1996).
Two other independent reviews at about the same time came to similarly positive conclusions, and also resulted in recommendations for improving administration and oversight of the program (NAPA, 1995; ABT Associates, 1996).

THE SNP CONSORTIUM

Single nucleotide polymorphisms, or SNPs, are common, small variations that occur in human DNA throughout the genome. These polymorphic markers can be used to map and identify important genes associated with diseases, and thereby provide a valuable resource for taking the first step in developing new diagnostic tests or therapies. They can also themselves be responsible for genetic differences that predispose some individuals to disease and that underlie variability in individual responses to treatment.

The potential value of SNPs generated great interest in both the public and private sectors in identifying and mapping a large number of polymorphic markers, and discussions about establishing a public-private consortium to undertake such a project began in 1998. This type of cooperative arrangement may appear to be at odds with the business goals of private companies, but it was widely recognized that the industry would be better off if information on SNPs were made freely available to all, without the restrictions that could develop if many different organizations held patents on markers scattered throughout the genome. Through collaboration, a high-density, high-quality map could be created more quickly, and with shared financial risk and less duplication of effort, than if each company pursued development of a SNP map on its own. These discussions led to the establishment of the SNP Consortium in 1999.

The SNP Consortium38 is a nonprofit entity comprising the Wellcome Trust and a group of pharmaceutical and technical companies.39 Its mission is to identify SNPs distributed evenly throughout the human genome and to make information on these SNPs available to the public without intellectual property restrictions. It is governed by a board composed of representatives of the member organizations and led by an independent chairman. The consortium participants provide oversight and technical expertise for the project, and also direct the effort to ensure the public availability of the SNPs that are generated. The consortium files patent applications, but with the declared policy of later abandoning them or converting them to a statutory registration of invention, which simply precludes others from patenting the discovery.
Confirmed SNPs have been placed in the public domain at quarterly intervals as they have become available, thus providing free and equal access to all in the worldwide medical research community.

38 See <http://snp.cshl.org/index.html>.
39 In addition to the Wellcome Trust, the SNP Consortium encompasses 13 pharmaceutical and technical companies: APBiotech, AstraZeneca PLC, Aventis Pharma, Bayer AG, Bristol-Myers Squibb Company, F. Hoffman-La Roche, Glaxo Wellcome PLC, IBM, Motorola, Novartis Pharmaceuticals, Pfizer Inc., Searle, and SmithKline Beecham PLC. The work supported by the consortium is performed at four major centers for molecular genetics: the Whitehead Institute for Biomedical Research, Washington University School of Medicine in St. Louis, the Wellcome Trust's Sanger Centre, and the Stanford Human Genome Center. The Cold Spring Harbor Laboratory maintains the consortium's databases. Orchid BioSciences, Inc., performs third-party validation and quality control testing on SNPs identified through the consortium's research.

Recently, the SNP Consortium collaborated with the International Human Genome Sequencing Consortium40 to publish a paper in the journal Nature describing a map of 1.42 million validated SNPs distributed throughout the human genome (Sachidanandam et al., 2001). Using DNA from a diversified, representative panel of anonymous volunteers, the collaborators identified, on average, one SNP for every 1.9 kilobases of DNA. Such collaboration further demonstrates that public-private cooperation can be an efficient means of developing basic research tools.

In the case of SNP analysis, however, international cooperation was perhaps not as strong as it had been for the Human Genome Project. The SNP Consortium invited Japanese companies to participate in the project, but they declined the offer. Instead, 40 Japanese drug firms decided to provide a total of $10 million to university researchers in Japan to study SNPs in that country's population. They will establish their own database of SNPs, but these data will also be made freely available to other scientists (Sciencescope, 2000).

A new public-private consortium was recently established to build further on the work of both the SNP Consortium and the HGP. The $100 million HapMap project, with funds from six countries41 and several pharmaceutical companies, aims to map about 300,000 haplotypes from four populations in Africa, Asia, and the United States within 3 years (Couzin, 2002b; Adam, 2001b). Haplotypes are sets of genetic markers that are close enough together on a particular chromosome to be inherited together. Using SNPs alone to identify disease-associated genes can be difficult and expensive, partly because it is difficult to trace individual SNPs in a genome containing 3 billion base pairs. Haplotype analysis will reduce background noise and should make the search for genes easier and faster because the many individual markers are consolidated into more manageable clusters.
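The marker density quoted above follows from simple arithmetic. The sketch below is illustrative only; the 2.7-gigabase figure for mapped sequence is an assumption consistent with the reported spacing, not a number taken from this report:

```python
# Back-of-the-envelope check of the reported SNP map density.
# Assumption: ~2.7 Gb of mapped (euchromatic) sequence, which is
# consistent with the quoted figure of roughly one SNP per 1.9 kb.
MAPPED_BASES = 2.7e9
VALIDATED_SNPS = 1.42e6

spacing_kb = MAPPED_BASES / VALIDATED_SNPS / 1000
print(f"one SNP per {spacing_kb:.1f} kb on average")
```

Note that dividing the full 3-billion-base genome by 1.42 million SNPs would give a slightly wider spacing (about 2.1 kb), which is why the mapped-sequence assumption above is stated explicitly.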
Scientists realized only recently that a haplotype map might be feasible when they discovered that relatively large blocks of DNA are inherited in this way. Computer simulations predicted that DNA haplotypes

40 This collaborative effort was funded by the National Human Genome Research Institute and the SNP Consortium. Three academic genome research centers (the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts; Washington University School of Medicine in St. Louis; and the Sanger Centre in Hinxton, United Kingdom) participated directly in this collaboration. The International Human Genome Sequencing Consortium includes scientists at 16 institutions in France, Germany, Japan, China, Great Britain, and the United States, with funding from government agencies and public charities in several countries.
41 Funders include NIH in the United States ($40 million) and the Wellcome Trust in the United Kingdom ($25 million).

would be only about 10,000 bases or fewer. To their surprise, genome researchers have found that haplotype blocks tend to be much larger (up to 100,000 base pairs), and that many such blocks come in just a few different versions. For example, within some sequence stretches of 50,000 bases, only four or five patterns of SNPs, or haplotypes, might account for 80-90 percent of the population. It is not clear why this occurs, but some chromosome regions may be less likely than others to recombine during meiosis, leading to conservation of the DNA blocks (Helmuth, 2001).

Haplotypes are found by analyzing genotype data, so the new collaboration will essentially be a high-throughput genotyping effort. The work will be done by several biotechnology companies and public laboratories, including the Sanger Centre and the Whitehead Institute, but decisions are still pending on such issues as how data collection will be standardized, how the map will be structured, and how the work will be divided. It is hoped that the new map will provide an invaluable tool to simplify the search for associations between DNA variations and complex diseases such as cancer, diabetes, and mental illness. However, many scientists, especially population geneticists, have questioned the value of generating a haplotype map at this time, arguing that there is too little information on the usefulness of such a map or how best to proceed (Couzin, 2002a).

There is also great interest in developing more efficient, cost-effective technologies for high-throughput analysis of SNPs (Chicurel, 2001). Without such improvements, screening large populations to search for disease- or therapy-associated genes could still be impractical. A number of investigators are attempting to improve on the current technology, but to date no coordinated effort has been made.
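The "few versions per block" observation described above can be illustrated with a toy computation. The haplotype strings and counts below are invented for illustration (they are not data from any study); the point is simply that tabulating haplotype frequencies within a block shows how a handful of patterns can cover most sampled chromosomes:

```python
from collections import Counter

# Hypothetical sample: 100 chromosomes typed at four SNP positions
# within one haplotype block. Patterns and counts are invented.
chromosomes = (["AGTC"] * 45 + ["AGCC"] * 25 + ["TGTC"] * 12 +
               ["AATC"] * 8 + ["TACC"] * 5 + ["TATT"] * 3 + ["AACT"] * 2)

counts = Counter(chromosomes)
covered = sum(n for _, n in counts.most_common(4))  # four commonest haplotypes
share = 100 * covered / len(chromosomes)
print(f"top 4 of {len(counts)} haplotypes cover {share:.0f}% of chromosomes")
# → top 4 of 7 haplotypes cover 90% of chromosomes
```

In a real study the patterns come from genotyped SNPs, but the same tabulation logic underlies the choice of "tag" SNPs that distinguish the common haplotypes in a block.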
HUMAN PROTEOME ORGANIZATION

The Human Proteome Organization (HUPO) is an international alliance of industry, academic, and government scientists aimed at determining the structure and function of all proteins made by the human body (Kaiser, 2002; Abbott, 2001). The mission42 of HUPO is threefold: to consolidate national and regional proteome organizations; to engage in scientific and educational activities that encourage the spread of proteomics technologies, as well as the free dissemination of knowledge pertaining to the human proteome and that of model organisms; and to assist in the coordination of public proteome initiatives. The organization's formation was spurred by concerns that in the absence of such a coordinated effort,

42 See <http://www.hupo.org/>.

individual companies would generate their own basic proteomics data and protect them through trade secrecy. The organizers hope to include more countries than participated in the HGP, and plan to generate funding contributions from companies, with matching government funds.

HUPO participants have proposed five initial research and technology development projects to garner interest from potential funders (see Box 3-2). Several companies have already offered financial support, and a number of countries are launching initiatives related to HUPO's goals. The NIGMS Alliance for Cellular Signaling is one such initiative, but a broader role for NIH in a global proteomics project remains unclear. Some U.S. proteomics experts have proposed establishing a few pilot large-scale centers to identify proteins en masse, with uniform standards, from healthy and diseased tissues and blood serum (Kaiser, 2002). But many others question the sensitivity and specificity of current mass spectrometers, suggesting that such an undertaking would be premature, and that it would be more useful to fund individual investigators to study small parts of large, complex protein networks (Check, 2002).

HOWARD HUGHES MEDICAL INSTITUTE

The Howard Hughes Medical Institute (HHMI) provides an example of an alternative strategy that could be used to undertake large-scale research projects. HHMI is a nonprofit medical research organization that employs more than 300 biomedical scientists across the United States at more than 70 universities, medical centers, and other research organizations. It also maintains a grants program aimed at enhancing science education at all levels. One of the world's largest philanthropies, HHMI had an endowment in mid-2000 of approximately $13 billion, and $600 million

72 LARGE-SCALE BIOMEDICAL SCIENCE was disbursed for medical research ($466 million), science education, and related activities. Created by Hughes in 1953, the Institute has always been committed to basic research, with the charge of probing "the genesis of life itself."43 The organization's charter states that "the primary purpose and objective of the Howard Hughes Medical Institute shall be the promotion of human knowledge within the field of the basic sciences (principally the field of medical research and medical education) and the effective application thereof for the benefit of mankind." The Institute draws a clear distinction between itself and other foundations that provide money for biomedical research in that it operates as an organization with investigators across the country. Hughes investigators are employed by the Institute but con- duct their research in the laboratories of their host institutions. The Institute's work has traditionally focused on five main areas of research: cell biology, genetics, immunology, neuroscience, and structural biology. More recently, clinical science programs have been added, as well as a new focus on bioinformatics. Investigators are free to pursue their own research interests without the burden of writing detailed proposals for each project, but their research progress is reviewed by HHMI every 5 years. Scientists who are not renewed as HHMI investigators are pro- vided with additional phase-out funds for 2-3 years so they will have an opportunity to seek other funds or gradually scale back their activities. This approach also eases the strain on affected staff and trainees in the lab who need time to seek other positions. In what was perhaps the Institute's first foray into large-scale science (as defined in this report), HHMI held an Informational Forum on the Human Genome at NIH in 1986. 
Subsequently, HHMI played a role in the HGP by supporting several databases, including one at Yale University; one at the Centre d'Etude du Polymorphisme Humain in Paris; and one at the Jackson Laboratory in Bar Harbor, Maine (Cook-Deegan, 1994).

Recently, HHMI announced a novel research endeavor for the organization. This new 10-year, $500 million project44 may be viewed as another form of large-scale science funded by a nonprofit organization. HHMI plans to build a permanent biomedical research center that will develop advanced technology for biomedical scientists and provide a collaborative setting for the development of new research tools. Slated to open in 2005, the new center will have an annual operating budget of about $50 million (Kaiser, 2001). Research topics have not yet been fully defined, but are likely to focus on such areas as bioinformatics, proteomics, and imaging tools (e.g., electron microscopy). Investigators are likely to include computational scientists, chemists, physicists, engineers, and biomedical scientists with cross-disciplinary expertise.

The center will provide laboratories for up to 24 investigators (who will not have tenure), plus their research staffs, for a total of 200-300 people. In addition, laboratories and other facilities will be built for visiting researchers and core scientific support resources. Visiting scientists will be able to stay for as little as a few weeks or may take a sabbatical year. Organizers hope this format will allow for rapid shifts into new areas that show unusual scientific promise and for quick adaptation of new discoveries for use in biological research and health-related sciences.

For collaborative research at the new center, HHMI will request proposals from the scientific community at large, as well as from its own investigators. The Institute will seek out proposals focused on cutting-edge scientific and technological goals, and will give preference to projects that bring together diverse individuals and expertise from different environments. To be successful, proposals will have to demonstrate originality, creativity, and a high degree of scientific risk taking. One goal of these collaborations is to ensure that all HHMI investigators, regardless of their home institution's facilities, can obtain access to expensive, high-technology tools and the expertise needed to run them (Kaiser, 2001).

HHMI leaders have acknowledged that the kind of research they are proposing for the center is more typically undertaken by biotechnology companies. The Institute will encourage patenting of discoveries made at the center, which may foster the launch of new startup companies. However, the generation of royalty revenues or new private businesses is not a stated goal of the Institute (Kaiser, 2001).

43 http://www.hhmi.org/.
44 See <http://www.hhmi.org/news/020101.html>.
Because this project is still in the very early stages of planning, predicting its effectiveness or impact on the broader scientific community is impossible. Nonetheless, it provides a novel model for consideration.

SYNCHROTRON RESOURCES AT THE NATIONAL LABORATORIES

Two NIH institutes, NIGMS and NCI, are providing $23 million over three years to support the design and construction of a user facility at Argonne National Laboratory's Advanced Photon Source (APS), the newest and most advanced synchrotron in the country. After two years of planning, NIGMS and NCI, which represent two-thirds of the life-science synchrotron user community, finalized an agreement early in 2002 to increase synchrotron resources by constructing three new beam lines at Argonne's APS that will be fully operational by 2005. The facility is operated by the University of Chicago, but beam time will be administered by NIH. Half of the beam time will be allocated to peer-reviewed research; NIGMS and NCI grantees will have access to the beam through a peer-review process for research grants. Twenty-five percent of the beam time will be divided between NIGMS and NCI for special projects, and the remaining beam time will be reserved for staff use and maintenance. The NIGMS/NCI facility will be fine-tuned to focus on the aspects of X-rays most useful for biological studies. Demand for beam time is increasing because of such projects as the NIGMS PPSI. NCI is particularly interested in how the synchrotron facilities will advance the study of cancer-related molecules, because an understanding of detailed protein structure will help cancer researchers develop targeted drug therapies. NIGMS and NCI anticipate that information about molecular structures will allow scientists to help develop new medicines and diagnostic techniques. Once construction is complete, operating costs for the beam line are estimated to be $4 million a year, of which NCI has committed $1 million annually (Cancer Letter, 2001; Softcheck, 2002).

DEFENSE ADVANCED RESEARCH PROJECTS AGENCY

The Defense Advanced Research Projects Agency (DARPA) provides another alternative strategy for undertaking large-scale research projects. DARPA is the central research and development organization for the Department of Defense. It manages and directs selected basic and applied research and development projects for the department, with a focus on projects in which the risk and potential payoff are both very high, and in which success could provide dramatic advances for traditional military roles and missions.

The agency was created in 1958 by President Eisenhower following the Soviet Union's surprise launch of Sputnik (Malakoff, 1999). An investigation blamed delays in the U.S. military satellite program on bureaucratic infighting and an unwillingness to take risks.
Intent on keeping the United States at the forefront of technological innovation, Eisenhower ordered Pentagon planners to create an agency that would be completely different from the conventional military research and development structure and, in fact, would serve as a deliberate counterpoint to traditional thinking and approaches. The new agency relied on a small group of experts to look beyond near-term military needs and to fund areas offering great potential to revolutionize military capabilities. Today, the emphasis is still on seeking out and pursuing novel ideas. A list of the agency's founding principles, which are still followed, is provided in Box 3-3.

Best known for its role in developing the Internet (Norberg and O'Neill, 1996), DARPA has funded work focused primarily on computer and software development, engineering, materials science, microelectronics, and robotics. The agency has had only a limited and very recent interest in basic molecular biology, and most of its biology research relates to just one function: protecting personnel against biological weapons. However, some of this work could potentially have broader implications for biological research, such as novel approaches for DNA sequencing (Alper, 1999) or sophisticated biosensors. Funding for research on this topic began in 1997, with contracts totaling about $50 million going to biotechnology ventures and nonprofit organizations. Although a panel of expert advisors provided some input in launching this program, it is run essentially the same as all other DARPA programs, with hands-on oversight by carefully selected program managers (Marshall, 1997).

With an annual budget of $2 billion, DARPA's small group of about 125 program managers has extensive power to direct high-risk projects that would not normally fare well in peer review. A DARPA program manager will typically spend as much as $40 million on contracts to industry, academic, and government laboratories for one or more projects.

The contracts call for defined deliverables and allow less-promising work to be canceled easily. The agency aims to complete 20 percent of ongoing projects each year, and renewals are not made, although projects are occasionally reformulated for a subsequent attempt. The funded researchers often attend team meetings, file frequent reports, and work cooperatively with other contractors.

Program managers are selected on the basis of their technical expertise and their aspiration to leave their mark on a field. They stay for an average of 4 years and often return to their primary field of research when their term is over. In addition to their technical expertise, they must demonstrate bureaucratic skills, as they must lobby for their portion of the DARPA budget, and be able to move established research communities in a particular direction or create new collaborations in disparate fields. Program managers identify opportunities in science or technology that appear promising, and then make decisions about whom to fund in pursuing the ideas. They may make the latter decisions by probing the network of experts in a field to identify the most appropriate researchers, or by using written specifications to invite experts in the field to apply for funds.

Program managers have only two layers of supervision: an office director and the DARPA director, who reports to the Secretary of Defense. These supervisors monitor the performance of the managers and hold them accountable for advancing their fields, but a major criterion for success is positive peer assessment of the manager's performance. This arrangement is in stark contrast to the current model at NIH, in which peer review is used to select proposals from a competitive pool of grant applications, rather than to assess the performance of program managers. NIH grant management staff generally have a comparatively passive role in project selection.
It can also be difficult to determine whether the selected grant portfolios are actually meeting the goals of NIH programs.

Ultimately, the strength of DARPA has been in pursuing innovative research directions to create new fields, or in solving specific technical problems by fostering the development of new technologies. The agency is not responsible for sustaining fields in the long run, as is NIH. Thus, adopting a DARPA model of funding for all NIH programs would be unworkable. However, the addition of some DARPA-like programs to the traditional NIH portfolio might add valuable research that would not otherwise be undertaken.

Indeed, some leaders at NIH, including former director Harold Varmus, have recently expressed interest in adopting some DARPA-like programs at NIH to spark innovation (Malakoff, 1999). Under the leadership of NCI director Richard Klausner, NCI has even launched a pilot program modeled in part after DARPA, as well as other agencies, such as NASA.

The Unconventional Innovations Program (discussed earlier) emulates the DARPA approach by assembling interdisciplinary research teams and pressing them to share information, with the goal of producing breakthroughs in cancer detection technologies. NCI's traditional peer review panels still play a major role in selecting projects, but agency managers are more involved in program oversight than is usual. The program seeks input from and collaboration with investigators who have not traditionally been engaged in biomedical research.

Despite these new developments and the past successes of DARPA, however, such programs do not come without difficult challenges and criticism. One of the greatest challenges to undertaking DARPA-like programs may be the difficulty of recruiting effective managers. The DARPA model works best when the manager is an intellectual peer of the scientists being funded. But for biomedical scientists, a 4-year absence from the laboratory and the resultant lack of published scientific papers during that period could very well be disastrous from a long-range career perspective. In addition, university-based scientists in particular often feel uncomfortable with aggressive supervision and team-dominated research, and biomedical scientists have opposed most initiatives that involve strong external control in the past. Furthermore, it is not uncommon for DARPA-funded projects to fail in meeting their intended goals. This is to be expected, given the high-risk nature of the work, but it may not be a popular approach in other fields. And even when its projects have been successful, DARPA has had difficulty in moving some findings into the military venue or the marketplace (Malakoff, 1999). All of these issues need to be weighed carefully in attempting to emulate the DARPA program in other fields of research.
SUMMARY

As is clear from the examples described in this chapter, the characteristics of large-scale biomedical research projects can vary greatly, even when such research is defined relatively narrowly. However, the examples presented here share many common themes, characteristics, and issues. For example, most are dependent on technology in the sense that they require the use of expensive technologies, the development of novel technologies, refinements to current technologies, or standardization of the way technologies are used and how the information generated is interpreted and analyzed.

Another common feature of the examples described here is a great need for planning, organizational structure, and oversight. The capacity of a large-scale project to efficiently and effectively produce data and other end products that are novel and valuable to the scientific community can be determined by its design and the skill of the individuals who oversee the work. Many of the large-scale projects described here are also quite collaborative and interdisciplinary in nature. For example, the needs for data assessment and technology development mandate the collaboration of scientists who may not have been involved traditionally in biological research, such as engineers, physicists, and computer scientists. This new approach to biology creates additional challenges in communication across disciplines, and can also lead to difficult questions regarding training and career advancement. If interdisciplinary scientists do not fit well into the traditional models of academic science departments, it may be difficult to assess their contributions and compensate them fairly with promotions and tenure. These issues are also relevant to managers of large-scale projects, who are crucial to the success of the effort, but often do not find themselves on traditional academic career paths, and may be given relatively little credit for the accomplishments of the project. These topics are covered in more detail in Chapters 4 through 6.

One issue common to all large-scale biomedical research projects that generate research tools or databases of information is that of accessibility. Concerns are often raised regarding intellectual property rights, open communication among researchers, and public dissemination of data and information. Such concerns may be especially pertinent when for-profit entities are involved in the undertaking. Most projects to date have adopted a policy of making data publicly available, at least in raw form. Research tools and reagents generated through large-scale projects funded by NIH are also often made available to other scientists at cost, but doing so requires a considerable commitment of NIH resources and infrastructure support.
Clearly, such matters need to be thoroughly addressed before a large-scale project is launched. Chapter 7 examines these issues in greater detail.

The issue of peer review also appears to be extremely important for large-scale projects in biology. Many of the early attempts by NCI to undertake large-scale, directed projects drew harsh criticism because of a lack of peer review, which has been fairly standard for NIH funding. Traditionally, NIH decisions about which projects and investigators to fund have been made following peer review of project proposals in grant applications. But peer review could also take other forms, such as reviewing the progress and achievements of grant recipients to determine whether funding should continue or whether the project's goals or objectives should be altered. Peer review might also focus on the performance of the program managers who make decisions about which projects and people to fund, as is done under DARPA. Recently, NIH has developed some new large-scale programs that incorporate novel approaches to peer review, whereby steering and advisory committees whose members include scientists not directly involved with the project assess progress and provide advice on future directions. It is still too early to determine how effective these mechanisms are, but thus far they appear to be acceptable to the scientific community. These topics are addressed in more detail in Chapter 4.
