National Academies Press: OpenBook

Achievements of the National Plant Genome Initiative and New Horizons in Plant Biology (2008)

Chapter: 3 Recommendations and Goals: New Horizons in Plant Genomics

Suggested Citation:"3 Recommendations and Goals: New Horizons in Plant Genomics." National Research Council. 2008. Achievements of the National Plant Genome Initiative and New Horizons in Plant Biology. Washington, DC: The National Academies Press. doi: 10.17226/12054.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

3 Recommendations and Goals: New Horizons in Plant Genomics

“Our greatest responsibility is to be good ancestors.”
Dr. Jonas Salk

THE FUTURE OF PLANT GENOME RESEARCH

Plant science today lies at the nexus of potential solutions for global problems that are challenging a human population of more than 6 billion people today and that is projected to reach 9 billion by 2054 (United Nations 1999). Plants are extremely important sources of food, fiber, energy, and animal feed, yet plant biologists are only beginning to understand the fundamental principles of how plants grow and develop; how they cope with daily, seasonal, biotic and abiotic changes in their environment; how they participate in complex communities in diverse ecosystems; and how they evolved. Provision of adequate food and nutrition, expanded alternative energy sources, and sustainable environmental stewardship will require the development of new technologies for agricultural solutions that rest on detailed scientific knowledge. An understanding of the principles underlying plant growth, development, and reproduction will enable scientists to play a role in securing global health, the global economy, and the global environment by providing new options for improving productivity and reducing the environmental footprint of agriculture. The key to understanding those principles is basic research done in the context of the revolution of genome-based science.

The committee strongly recommends that the next wave of National Plant Genome Initiative (NPGI) research should have as its top priority innovative, competitive peer-reviewed basic science aimed at detailed and system-wide understanding of the functions of individual genes, how those functions are connected in networks, and how they control plant growth, form, function, performance, and evolution.

The last 10 years have witnessed an explosion of knowledge regarding the various individual pathways that control plant growth and development. Biologists now better understand the principles underlying how plants perceive changes in their ambient environment; how they respond to pathogens; how they build flowers, leaves, and roots; and how various classes of hormone receptors direct plant growth. Several plant genomes have been sequenced, a few of which were sequenced to high quality. These discoveries, coupled with continued genome sequencing and resequencing, are the springboard for the next 10 to 20 years, a time during which fundamental research would have the definition of a plant that is more than “the sum of the parts” as its goal.

Because of the federal research and development investments made over the last 20 years, plant biology is at the doorstep of an era of unprecedented large dataset collection, systems-wide analyses of those data, model building, and ever more precise hypothesis testing. The fruits of this research will be deeper understanding of how plant genomes condition important traits. However, the current knowledge is simply too underdeveloped, and translation of that knowledge is too costly or too imprecise, for the majority of desired applications. Thus, NPGI should aim to produce knowledge and tools for efficient trait modification and technology leaps so that genomic information can be translated effectively into environmentally sustainable products of benefit to humankind. The committee recommends the following guiding principles to achieve those goals.
• The committee strongly endorses the conclusions of the 2002 NRC report, The National Plant Genome Initiative: Objectives for 2003–2008, that studies aimed at defining core concepts of molecular and developmental plant biology are best undertaken rapidly and efficiently in model plant systems. Basic discovery that can be most rapidly and efficiently done in these systems should receive high priority. The committee advocates deep investment in the broadest possible set of genomics tools for these carefully selected systems. These systems would be chosen on the basis that they can provide vital paradigms that inform many other aspects of NPGI and can maximally leverage continued, independent investments in Arabidopsis genome science.

• Because the diversity of plant form and function utilized by humans is very broad, the committee strongly endorses the approach that parts of the overall genomics toolkit be deployed to investigate specific aspects of plant tissue and organ development, environmental adaptations, or biochemical processes that are not well represented in core model species. This will include a great deal of genome sequencing along the entire plant phylogeny to inform comparative functional studies. However, descriptive functional studies aimed at gathering parallel datasets merely because they are derived from crop species would not be a good use of resources and are best avoided.

• The committee recognizes the critical need for the development and deployment of field-robust, high-resolution genotyping and phenotyping methods for use in molecular-assisted plant breeding across a broad swath of crops. These methods will require DNA sequencing (though certainly not always full genome sequencing) and substantial population sampling to define informative markers. They will require technological breakthroughs at genotype and phenotype levels to produce simple, robust methods available to plant breeders in the United States and around the world. These activities are crucial if the ultimate benefit of the NPGI discovery engine is to be realized. Hence, a scaffold of genomic tools is needed in each of the major crops in order to translate model organism concepts to them.

• The committee suggests that the priorities for NPGI and associated plant sciences be framed towards addressing the large challenges facing humanity, including bioenergy, climate change, sustainability, and human nutrition. The committee envisions the growing enablement of genomic tools, systems biology, and trait modification capabilities in a wider range of species than those currently emphasized. However, investments in those tools are only justified when there is a clear social goal and when the technologies for data collection, hypothesis testing, and trait modification become reasonably efficient and robust.

• The committee’s nine recommendations for NPGI priorities in the future are listed in Box 3-1.
Each recommendation has a set of goals on three different time horizons: The 5-year goals represent immediate, pragmatic “next steps” in plant genome science, 10-year goals require significant development of new tools and resources to enable transformative solutions to real world problems, and 20-year “achievements” reflect the committee’s desire to define some admittedly long-range, high-risk, high-reward areas that would significantly alter society’s ability to understand how plants work.

Box 3-1
Overall Recommendations

RECOMMENDATION 1: Expand plant genome sequencing, plant-associated microbial sequencing, and plant-associated metagenome sequencing, and associated high quality annotation by (a) using the Department of Energy’s Joint Genome Institute’s sequencing capacity to generally serve plant sciences and (b) empowering individual principal investigators or collaborative groups to access and utilize next generation sequencing technologies for a broad spectrum of genomics and metagenomics discovery.

RECOMMENDATION 2: Develop “omics” resources and toolkits at high resolution in a few, carefully chosen plant species, including expansion and deeper investment in currently leading model species.

RECOMMENDATION 3: Develop “omics” resources at a broader, shallower level across a number of additional species to (a) expand the phylogenetic scope of functional inference, particularly when this is justified to test clearly specified hypotheses, (b) understand physiological and developmental processes to a depth that is not feasible in the model systems, and (c) provide the foundation to improve U.S. competitiveness of important crop and tree species.

RECOMMENDATION 4: Use systems-level approaches to understand plant growth and development in controlled and relevant environments, with the goal to create the iPlant, a large family of mathematical models that generate computable plants genuinely predictive of plant system behavior under a range of environmental conditions.

RECOMMENDATION 5: Increase the understanding of plant evolution, domestication, and performance in various ecological settings via investment in comparative genomics, and in the metagenomics of living communities of interacting organisms.

RECOMMENDATION 6: Enable translation of basic plant genomics towards sustainable deliverables in the field, and continue to use NPGI as a foundation for new, agency-specific, mission-oriented plant improvement programs.

RECOMMENDATION 7: Develop and deploy sustainable, adaptable, interoperable, accessible, and evolvable computational tools to support and enhance Recommendations 1–6.

RECOMMENDATION 8: Improve the recruitment of the best, broadly trained scientists into plant sciences.

RECOMMENDATION 9: Promote outreach on plant genomics and related issues that are critical to educating the American public on the value of genomics-based innovations.

TOOLS FOR PLANT GENOME RESEARCH IN THE 21ST CENTURY

One of the most remarkable impacts of genomics projects is the development and application of facile technologies that allow the global analysis of cellular components, including genes, proteins, and metabolites. After their invention, high-impact technologies are disseminated for use by individual laboratories and by “data production centers” that generate large amounts of data to benefit the entire scientific community. The number of hypothesis-driven, single investigator research projects will have to grow so that creative scientists can avail themselves of these technologies and capture the resultant benefits for society. It is equally imperative that groups of investigators, whether within or across institutions, be supported for collaborative projects when they are scientifically warranted. Collaborative group formation, however, should not be a requirement for funding because these can be “forced marriages of convenience” that are often not synergistic in their output.

Finally, the plant genomics community has benefited from the establishment of high-throughput production centers and will continue to do so. These are particularly well suited to generation of data and resources for use by the broader community. In principle, production centers that produce physical or information resources have the advantages of higher efficiency and uniform quality control standards to ensure that useful reagents and information are produced. The guiding principle would be that the quality of information produced by a resource center be equal to or greater than that typically produced by an individual research laboratory.

Examples of genomics technologies that have had significant impact on the plant biology community include T-DNA and transposon tagging strategies, DNA microarrays, and mass spectrometry. These technologies are now sufficiently widespread that they are accessible to most researchers for individual experiments. In plant sciences, the accessibility of these technologies can be largely attributed to NPGI and the Arabidopsis 2010 Project of the National Science Foundation (NSF). At the same time, these technologies also are used in production projects. For example, the ends of T-DNA and transposon insertions are sequenced to locate the position of each in the genome.
TILLING collections now exist for various species, and they allow investigators to screen for point mutations using polymerase chain reaction. DNA microarrays are used for large-scale analysis of gene expression and mapping transcription factor binding sites. Mass spectrometry is readily used for large-scale mapping of protein-protein interactions.

DNA Sequencing: The Basis of Genomics

Recognizing and taking advantage of opportunities to “upgrade” large-scale datasets as new, quantitative, rapid, and cost-effective technologies are released is critical to NPGI. It is also important that NPGI lead the development of such technologies, which would then drive their deployment via the mission-based member agencies like the U.S. Department of Agriculture (USDA) and the U.S. Department of Energy (DOE). An example of opportunities for “data upgrades” is the new, high-throughput next-generation DNA sequencing technologies that have emerged in the last year, such as pyrophosphate sequencing (454 Life Sciences™/Roche) and localized cluster sequencing (Illumina, Inc.). Others will no doubt emerge very soon.

The new technologies grew out of a specific funding mechanism from the National Human Genome Research Institute (NHGRI) to support the goal of sequencing a human genome for $1,000. The technologies have had enormous impact on the sequencing of new genomes and have profoundly altered the ability to resequence, at a huge savings, natural variants of species where a reference genome sequence already exists. The new, next-generation DNA sequencing technologies will revolutionize the ability to map transcribed regions and transcription factor binding sites across a genome and to address how these phenotypes change over developmental time and in response to various stresses.

Resequencing technologies can open new vistas in creative analysis of natural variation and evolution, and in understanding the complexity of organisms present in environmental samples of plants and their associated microorganisms. In turn, next-generation DNA sequencing technologies have created demand for new informatics tools that can deal with the collection and assembly of small DNA fragments. This interplay results in a familiar and compelling cycle: important new technologies drive the creation of new ancillary technologies and create horizons for new biological experimentation that were previously unreachable, leading to new levels of detailed experimental understanding.

Thus, the committee can now credibly propose to use genomics to understand the principles underlying plant genome structure and evolution. Understanding how plant genomes expand and contract through polyploidization, segmental duplication, and subsequent loss or silencing of genetic information, for example, is now within reach.
Furthermore, what were once puzzling and unappreciated features of plant genomes, such as the very high proportion occupied by transposons in some lineages, can be understood within a solid theoretical framework with genome sequencing on the scale recommended in this report. The committee does not, however, anticipate that physical chromosome maps and complete draft sequences will be required for all projects. Judicious choices for genome sequencing, in addition to those species listed below in Table 3-1, should consider how polyploidization has led to variable plant gene function and the evolution of novel traits of interest.

RECOMMENDATION 1: Expand plant genome sequencing, plant-associated microbial sequencing, and plant-associated metagenome sequencing, and associated high quality annotation by (a) using the Department of Energy’s Joint Genome Institute’s sequencing capacity to generally serve plant sciences and (b) empowering individual principal investigators or collaborative groups to access and utilize next generation sequencing technologies for a broad spectrum of genomics and metagenomics discovery.

As noted in Chapter 1, genome sequence is the raw material for functional, evolutionary, and translational tool development at the center of plant genome sciences. Plant sciences will benefit from the generation of the first “reference genome” sequences for a growing number of species that define key points in plant evolution. Next-generation sequencing will enable both “reference genome” sequencing and resequencing for purposes of population and evolutionary genomics (see also below). As an example, it is likely that sequences from closely related species will constitute a powerful way to inform the functional biology of target genomes (from patterns of evolutionary conservation of sequence motifs, functional domains, and so on). Hence, for every species whose genome is chosen for a reference sequencing project, parallel sequence analysis of a related taxon of appropriate evolutionary distance (something on the order of 30 to 50 percent divergence at silent sites being optimally informative) would be appropriate.

Furthermore, the metagenomes of cultivated plants and plants in natural ecosystem communities will provide rich arenas for future discovery of important interorganismal associations that have positive or negative impact on plant performance (NRC 2007a). Metagenomics has been embraced by NIH, and has led to a major program on the human metabiome. A similar large-scale investment in plant-associated metagenomics is justified because of the diversity of plant-associated microbial communities and their impact on plant productivity. For example, the communities of microorganisms associated with candidate perennial biofuels crops, in monoculture or in more natural assemblages, are not well understood. As another example, the rhizosphere community, both microbial and animal, can influence root growth and development.
Certain microorganisms can protect plants from other pathogenic microorganisms. Hence, a merging of metagenomics with root genomics would be rewarding. Thus, the committee strongly endorses the recommendation that NPGI make major investments in both plant genome and large-scale metagenomics sequencing efforts.

The unique role played by the Department of Energy’s Joint Genome Institute (JGI) in the service of NPGI is critical. Although there are several high-throughput genome centers devoted to the missions of NHGRI, only JGI has plant biology as a central component of its mission. JGI has established a peer-reviewed policy for high-impact reference plant genome sequencing, which it has implemented successfully (see Chapter 2). The economies of scale gained from JGI’s expertise and throughput, especially with its addition of next-generation sequencing capabilities, are unlikely to be matched by another sequencing center that has a deep interest in plant genomics. JGI is thus uniquely placed for the development of projects that combine traditional Sanger sequencing with the next-generation sequencing technologies that will lower the costs of reference sequencing considerably and allow economies of scale for resequencing projects. Table 3-1 provides a list of species for

which one could argue a strong case for inclusion in NPGI genome sequencing plans over the next 10 years.

Even at the economies of scale provided by combining JGI’s throughput and next-generation sequencing, the sequencing of one reference genome from each of the species listed in Table 3-1 will be costly. Therefore, use of other criteria to prioritize the list is necessary. The committee’s recommendation for criteria to prioritize organisms for sequencing, as applied to different sets of biological and technological issues, can be found in Recommendations 2–4 below and in the 2002 NRC report.

JGI should also seek to upgrade its basic and limited annotations, preferably via collaboration with groups containing the relevant expertise or by expanding its own activities in this area. Interaction between JGI and the NSF’s Plant Cyberinfrastructure awardees could be synergistic in this regard.

The committee therefore considers it highly desirable that DOE continue to take a broad view of JGI’s unique position in the plant science community. It is critical to the success of NPGI that JGI continue to serve a broad remit for sequencing and resequencing of plant genomes, a remit not limited to only the sequencing of plants that are directly important to bioenergy production. To narrow JGI’s mission would imperil a successful pillar of the NPGI infrastructure.

The next-generation sequencing technologies and supporting bioinformatics will also make resequencing of many different genotypes of small genome species a reasonable goal for individual principal investigators (PIs) or for groups of PIs.
Resequencing is a critical new tool in the genomics toolkit because it allows scientists to understand how individuals vary at the DNA level, and how that variation shapes differences between individuals of the same species, and across short evolutionary distances by sequencing individuals of closely related species. Resequencing is especially important in the context of understanding evolutionary mechanisms and the natural diversity of plant form and function. Resequencing is already having a powerful impact on Arabidopsis genomics (Clark et al. 2007; Kim et al. 2007), and a project underway to resequence many rice relatives will certainly have similar impact in the understanding of rice evolution and domestication.

It seems reasonable that the JGI would take the lead on generating a broad swath of new plant genome sequences, because plant science still requires many high-quality draft sequences to serve as reference sequences for those species and branches of the evolutionary tree. In addition, other existing large-scale sequencing centers could be recruited to participate in NPGI activities. The costs of sequencing will likely drop, and many of the major crop species could be sequenced. Furthermore, multiple reference sequences might be necessary to cover the major haplotypes of a given species, if the haplotypes are divergent enough from one another. By contrast, resequencing efforts could be done by individual laboratories with access to the new sequencing technologies, or consortia of investigators

interested in specific questions in population, evolution, and ecology that require a large resequencing component. Indeed, the Arabidopsis Landsberg-er and Cvi-0 accessions have been resequenced in two weeks each at a fraction of the cost of the original reference Col-0 sequence (J. Ecker, Salk Institute, personal communication, October 20, 2007).

Goals for Sequencing (Recommendation 1)

5-year goals

• Sequence the genomes of 25–50 strategically chosen plants and resequence the genomes of hundreds, if not thousands, of wild accessions of the plants chosen for the full “omics” effort. These sequencing programs would be accompanied by standards-based annotation.

20-year achievements

• Hundreds of reference plant genomes will be draft sequenced to high coverage and annotated for comparative purposes and development of mapping tools. These will blanket the plant evolutionary scale.
• Tens of thousands of plant genomes, or more, will exist as annotated resequences.

“Omics” Resources and Toolkits

RECOMMENDATION 2: Develop “omics” resources and toolkits at high resolution in a few, carefully chosen plant species, including expansion and deeper investment in currently leading model species.

All well-planned genome initiatives involve systematic development of resources that enable next-generation experimentation. NPGI is no exception. These resources include tools for genomics, epigenomics, transcriptomics, proteomics, and metabolomics, often referred to collectively as “omics” tools. The tools result from large datasets that, for example, catalog mRNAs or small RNAs, proteins, or metabolites. But they also result in experimental materials, such as mutant plants, cDNA clones, and recombinant proteins. Development of omics tools most commonly requires high-throughput, computationally intense methods, and it is technology-driven. As computation and technology advance, so do the quality and quantity of omics data and resources. The utility of omics tools depends on accessibility and applicability to a broad community of researchers.

TABLE 3-1  Desirable Reference Genome Sequences Not Currently Funded. This Table Lists Other Future Projects of Direct Relevance to Food, Feed, and Fuel Needs of the United States

Species | Common Name | Genome Size (Gb) | Notes
Phaseolus vulgaris | Common bean | 0.5 | An important crop in its own right, Phaseolus is also an unduplicated outgroup for the recent soybean tetraploidy.
Pinus taeda | Loblolly pine | 20 | Wood crop, forest resources. Other gymnosperms (for example, spruce) also desirable, but note extremely large genome size.
Pennisetum glaucum | Pearl millet | 2.7 | Drought tolerant grass, cereal of “last resort.”
Panicum capillare | Diploid switch grass | 0.5 | A genetically tractable diploid relative of the tetraploid Panicum virgatum (switchgrass), a leading biofuel crop.
Triticum aestivum | Hexaploid bread wheat | 17 | Hexaploid wheat and its diploid relatives, which are major sources of nutrition around the world and a system for understanding genetic effects of domestication and polyploidy.
Aegilops speltoides, Triticum monococcum, Aegilops tauschii | Diploid wheats related to progenitors of bread wheat | 2-4 | (note shared with Triticum aestivum above)
Malus x domestica | Apple | 0.7 | Along with peach, these two rosaceous crops are at strategic phylogenetic distances for intrafamily sequence comparisons.
Fragaria vesca | Strawberry | 0.2 | (note shared with Malus x domestica above)
Musa acuminata | Banana, plantain | 0.6 | Outgroup for grasses and the grass-specific paleotetraploidy, and therefore key to understanding important crops, especially in developing world. Vulnerable through limited genetic diversity.
Citrus sinensis | Sweet orange | 0.4 | Major U.S. crop that is highly sensitive to frost. Genome sequencing could aid genetic improvement for cold resistance.
Marchantia polymorpha | Liverwort | 0.4 | Primitive land plant that will assist in understanding the polarization of changes along the stem leading to angiosperms and gymnosperms.
Manihot esculenta | Cassava | 0.8 | Source of carbohydrates in developing world. Sample sequencing project is underway, with no full genome commitment.

TABLE 3-1  Continued

Species | Common Name | Genome Size (Gb) | Notes
Gossypium sp. | Diploid cotton; polyploidy cottons | >1.0 | Valuable fiber crop is polyploid, with diploid relatives. Sample sequencing project is underway, but no full genome commitment.
Saccharum officinarum, Miscanthus sinensis | Sugarcane, Chinese silver grass | 2-3 | Rapidly growing C4 grasses with potential for biofuel feedstocks. Sugarcane is octoploid, Miscanthus is diploid, also providing a rich system for studying polyploidy.
Citrullus vulgaris | Watermelon | 0.5 | Would provide a cost-effective reference genome for cucurbits.
Lactuca sp. | Lettuce | 2.3 | Diverse complex of species provides rich gene pool for breeding hardier varieties.
Solanum tuberosum | Potato | 0.9 | Although related to tomato, potatoes were independently domesticated.
Solanum chacoense | Wild potato | 0.6 | Important comparators within Solanaceae.
Ipomoea sp. | Morning glory | 0.7-1 | Morning glory, diploid closely related to sweet potatoes, which are typically polyploid. Potential genetic model system for tuber formation.
Helianthus annuus | Sunflower | 2.4 | Important source of edible oil worldwide.
Antirrhinum majus | Snapdragon | 1.6 | Genetic model system that would be invigorated by genomic resources.
Medicago sativa | Alfalfa | 0.9 | Forage crop, tetraploid relative to M. truncatula model system.
Boechera holboellii | Rockcress | 0.2 | Model system for asexual (apomictic) reproduction in plants, with an international user community. Closely related to Arabidopsis.

Useful, integrative, Web-based computational resources that allow the broader community of scientists to derive high value and to form testable hypotheses are a critical component of a full omics effort.
For example, what good is an omics project to assemble a deep catalog of molecular and metabolic responses to drought stress if plant biologists working on important problems of drought stress cannot access, synthesize, understand, and analyze the data? Furthermore, these Web-based

computational tools should integrate as many omics resources as possible. Integration of distinct datasets, originating from different laboratories and production centers, is essential to understand the whole plant system.

Although the technology to derive omics data can be applied to a broad range of plants, including crop species, the extent to which value can be extracted from the resulting data and resources depends on a number of factors. The criteria that enable the most productive generation and utilization of omics resources have not changed significantly since their articulation in the 2002 NRC report. However, the decrease in costs has certainly opened new vistas for some resequencing projects where a reference genome sequence now exists.

Criteria for Investing in a “Full Omics Effort”

Priority would be given to “full omics” efforts for a small, select set of plant species, in which the most advanced resources and tools are developed and applied, and the data disseminated in a comprehensive format. For some resources, the full omics effort might involve invention or development of new technology. The full omics effort also involves redevelopment or refinement of resources as technology advances. In all cases, however, the tools and resources would serve as models, references, and guideposts for the broader plant science community. Maintaining a set of intensely studied full omics effort plant systems, especially through cutting-edge development of technological resources, is viewed by the committee as critical for subsequent discovery and application by a broad plant science community that extends beyond those working in core model systems.

Detailed, and still relevant, criteria for full omics support were outlined previously (NRC 2002). They are summarized here:

• Complete, well-annotated genome sequence. The importance of a high-quality, well-annotated genome sequence for development of the full omics toolkit cannot be overstated. The genome sequence provides the backbone resource on which all other resources depend. One cannot adequately interpret, for example, gene expression patterns from a microarray dataset without understanding where all the introns and exons occur. Similarly, epigenomic resources, such as maps of DNA methylation, cannot be assembled and placed in context without a properly annotated, complete genome sequence that identifies both protein-coding and noncoding genes.
• Extensive bioinformatics resources. Well-developed, community standards-based bioinformatics resources for a plant species are critical to place omics data in context. Community-supported resources, such as the Arabidopsis Information Resource (TAIR) or Gramene, serve as both clearinghouses for omics data and

sources for information required to develop omics tools. As omics data in model systems grow, so do the needs for better, more integrative bioinformatics and computation. Of particular importance in the future are computational methods, tools, and databases that integrate disparate omics data from multiple technology platforms and laboratories.
• Traits of a “classic” model system. Omics data, like most other types of scientific data, have to stand up to experimental validation and serve hypothesis-driven experimental science. Omics data have little value if robust experimental tools that meet community standards are not readily available. Therefore, omics tools and resources are best developed and tested in plant species that have served as model systems for basic experimental science. Broadly applicable model systems have several common features, including well-established and easily managed genetic properties (for example, the ability to self- and cross-fertilize, and abundant genetic markers); reasonably fast generation cycle times; small size; simple genetic transformation methods; well-documented protocols for biochemical, cell biological, and physiological assays; and easily accessed public resources, genetic stocks, and other experimental tools.
• Ability to make “knockouts” or “knockdown” mutants. Omics data allow researchers to form hypotheses about gene or protein functionality. Testing these hypotheses frequently requires analysis of mutants with defects in genes of interest. Public availability of indexed mutant collections (based on T-DNA insertions or transposon-generated mutations), and more recently gene silencing collections, for the Arabidopsis and maize research communities has had an enormous positive impact on both the rate of progress and quality of the resulting science. Indeed, assembling and indexing of these mutant collections are themselves omics activities (functional genomics).
• Genetic resources that allow definitions of gene and genome function across the range of available natural variation. Combined with next-generation DNA sequencing technologies, the domestication of plants from wild progenitors, and the ability to tap the breadth of ancestral genotypes as they are reflected in modern germplasm, is a new and powerful criterion for selection of a species for a full omics build-out. See the specific list of criteria for such selection in Recommendation 3 later in this chapter.
• A critical mass community of scientists. Formation of high-quality omics resources and subsequent creative utilization requires a vibrant, enthusiastic community of scientists working on a common plant taxon. Again, the aggregating effect of a well-established plant model system promotes development and utilization of tools. Among the Arabidopsis worldwide community, for example, over 1,000 laboratories are listed on TAIR. Similarly, the list of maize “cooperators” at MaizeGDB approaches 3,000 individuals and organizations worldwide.

In view of these criteria, and in light of limits to financial resources, only a handful of plant species will warrant the investment to generate and exploit the full omics toolkit. As appropriate species are selected, they should represent major evolutionary branches within the plant kingdom (as described in the 2002 NRC report) and should leverage the productive investments in crop species made to date by NPGI and by the Arabidopsis 2010 Project.

RECOMMENDATION 3: Develop “omics” resources at a broader, shallower level across a number of additional species to (a) expand the phylogenetic scope of functional inference, particularly when this is justified to test clearly specified hypotheses, (b) understand physiological and developmental processes to a depth that is not feasible in the model systems, and (c) provide the foundation to improve U.S. competitiveness of important crop and tree species.

Criteria for Investing in a “Partial Omics Effort”

Although only a few species will fulfill the criteria for full omics treatment, a broad range of plant species will warrant development of a partial omics toolkit. There is not a strict, sanctioned list that defines which omics comprise a “partial omics toolkit,” as the omics most relevant to a particular species or crop plant will vary. However, omics that are anticipated to have particularly broad relevance to many plants include transcriptomics (mRNA), small RNAomics, proteomics, and possibly epigenomics (for example, genome-wide DNA methylation patterns).

Generally, partial omics plant species will have important practical, economic, or scientific features that are biologically distinct from those of intensely studied model plant species. Partial omics efforts enable concepts to emerge, and to be significantly extended, from model species through comparative approaches. This concept was discussed in detail earlier (NRC 2002); that discussion remains germane and can now be expanded to include more species because costs for many of the omics tools have dropped and will certainly continue to do so over the coming decade. Criteria for consideration for partial omics tool development include the following:

• Important clade-specific processes and attributes. Many plant species possess unique but important and interesting biological features that can be addressed using less than full omics approaches. Development of diverse fruits, formation and properties of fiber and wood, and adaptation to fill broad niches or to tolerate extreme environments are but a few examples of processes or attributes that cannot be studied sufficiently using core model systems.
• Excellent comparative properties. A major outcome of plant-based genomics over the last decade has been a revision of concepts about genome evolution. Many more questions can now be addressed using omics technology applied to close relatives of model and reference plants and to key points and transitions in plant evolution. In particular, high-throughput sequencing technology and computation can now provide vast datasets to increase scientists’ understanding of natural variation, speciation, clade-specific adaptations, polyploidy, and many other evolutionary events.
• Leveraging, adding value to a significant community, and making best use of existing resources. NPGI has invested heavily in crop plant genomics. In many instances, the investments have provided omics resources that can now be exploited through translational genomics. Although often overlooked, the public-sector plant breeding community capacity for field phenotyping in crops such as wheat and soybean is substantial. However, exploiting that capacity will require access to cutting-edge field phenotyping and genotyping tools, some of which will require new technologies to be specifically developed (for example, sophisticated remote sensing over time). In addition, national needs (for example, biofuels and specialty crops) are dictating both private and public research investments beyond the scope of NPGI funding. Getting the most out of current and previous productive investments needs to be considered in future omics investments of the NPGI.

In addition to the development of new resources, both the full and partial omics efforts will require the development of physical and information resources. Physical resources include items such as clone and mutant collections.
Information resources include DNA sequences, datasets that catalog information such as protein-protein interactions, gene expression, and transcription factor binding. Those resources are extremely valuable for both individual researchers and those attempting to understand whole systems. A particularly relevant example is provided by the seed collections that are the backbone of mutant identification in any species. The committee notes that creation of user-friendly data interfaces for these physical resources requires a different skill set than data analysis. Information resources are also particularly powerful for elucidating the emergent properties of systems. Global analysis of regulatory circuits in yeast has revealed logic concerning network organization and common regulatory motifs. Information resources are critical to modeling biological properties of plants. Such information is a prerequisite for proper prediction of biological properties and outcomes of plant manipulation.

Goals for “Omics” (Recommendations 2 and 3)

5-year goals

• Develop two to three centers for technology development, application to experiments, data management and analysis, and user interfaces. These centers should also curate the data generated by individual laboratories. These centers will serve broad communities of researchers.
• Develop full omics kits for a small set of model plant species that, via comparative approaches, can benefit a broad range of plants. Initially, the full omics toolkits would be built for Arabidopsis and a few other plants. This could be expanded as the price to conduct functional genomics decreases, and as the community resources required for a full omics effort are established in other species.
• Use the full omics toolkit to accelerate definition of gene function across the full omics model species and their sequenced closest relatives.
• Begin development of, or expand existing, partial omics kits for plant species that have important economic, ecological, developmental, or biological features that cannot be fully studied in the core model species. Some examples to consider: a legume (for nodulation studies), tomato (for fruit ripening), a major woody tree such as poplar, and additional high-value agriculture crops.
• Invest in the application and development of new methods to analyze gene sequences and allelic, genomic, and population-level DNA variation, gene expression, protein expression and interactions, and metabolite expression. Although the committee does not suggest that fully independent technologies be developed for plant research, it recognizes that some technology development might be necessary to adapt existing technologies currently used for animals and microorganisms to plant systems. This could begin with the full omics species and expand outward as costs drop and overall funding increases. The data obtained constitute the core information needed for developing an integrated systems approach, as outlined in the next section. Although the human genome project will develop many of these protocols, NPGI still needs to play an active role in technology development. The technologies include, but are not limited to:
  o Methods to quantitatively monitor and analyze gene expression, including in specific cell types, at a greatly reduced cost.
  o Methods to critically characterize all protein-DNA interactions to identify regulatory sites on the genome of Arabidopsis and perhaps one additional model organism (a model organism ENCODE, or MOD-ENCODE, project). This should be enlarged as costs allow.
  o High-throughput production of affinity reagents to follow protein abundance, localization, and other applications.
  o Methods to monitor the expression of all proteins and other molecules (including low abundance) in a small cellular sample.

  o Methods to monitor all protein modifications in a small cellular sample.
  o Analyzing thousands of protein-protein interactions in parallel.
  o Analyzing large number of enzyme activities from diverse plant species, and at low cost.

10-year goals

• Implement high-throughput phenotyping and omics data collection under both controlled and field conditions.
• Understand the functional consequences of epigenomics.
  o Describe the full suite of changes to chromatin (histones, methylation) as a function of development and response to environment.
  o Define the genomic content of all small and ncRNAs.
  o Define functions for all miRNA and tasiRNAs (and all the others).
• Develop computational protocols for systematic integration of disparate omics datasets from different plant species.
• Create genome-wide knockouts and complete full-length cDNA collections; strive for accurate genome annotation, functional analysis and other omics tools for both core full omics models, and leading partial omics species.
• Implement translation of discoveries in model plants to most key crop plant species, including transformation methods for all major crop species.
• Use artificial chromosomes for rapid analysis and manipulation of complex traits.

20-year achievements

• In core model plants, omics data, high-resolution imaging, computation, and modeling will yield accurate, predictable mathematical models for major growth, signaling, and response pathways.
• Technology for de novo sequencing and resequencing, coupled with computing, will bring analysis of all plant species within the genomics umbrella.
• Technology will allow metaomics analysis of plants in communities, over single and multiple generations.
RECOMMENDATION 4: Use systems-level approaches to understand plant growth and development in controlled and relevant environments, with the goal to create the iPlant, a large family of mathematical models that generate computable plants genuinely predictive of plant system behavior under a range of environmental conditions.

NEW HORIZONS FOR PLANT GENOME RESEARCH

From Reductionism to Systems Biology to Modeling and Back

NPGI research to date has generated impressive advances with respect to generating DNA genome and mRNA sequences from basic model plants like Arabidopsis and key crop plant species like maize and rice, with more in process (Chapter 2). Further, the NSF Arabidopsis 2010 Project and the Interagency Working Group on Plant Genomes NPGI have defined an initial set of key pathway components required for the regulation and manipulation of plant growth and development and of plant responses to pathogens and environmental stress (see Chapter 2). As with any discipline, the reductionist experimental approach mostly creates sporadic information: a static biological “parts list.” Critically, however, reductionist genetics-based data are explicitly linked to genetic causality, and thus results from this kind of research are typically associated with high confidence. The parts list is meant to reconstruct a dynamic view of the living plant that will then be used as the basis for crop improvement. Although the genome sequence of any given plant is a critical piece of information, it is not by itself a blueprint for understanding genome function.

The emergent properties of complex systems cannot be elucidated by examining the individual parts; rather, a systems-level approach is required to consider how many components act in concert. Gene-by-gene approaches, and even single network approaches, cannot lead to understanding of the diverse function(s) of every reference plant gene. To date about 20 percent of plant protein-encoding open reading frames, the (ORF)eome genes, still have unknown functions. Over 40 percent of plant genes can be considered poorly characterized with respect to their function.
These unknowns are the tip of the “anony-nome” iceberg, as the noncoding and small RNA genomic content are yet to be fully described, let alone functionally characterized. Considering that genes can have several functions when integrated over developmental time and in various environments, only a small fraction of plant genes can be said to be functionally characterized.

Representation of the regulatory plant needs to move beyond wall posters depicting two-dimensional metabolic pathways toward expansive, dynamic representations of plant processes that require multidimensional data visualization. Only with multidimensional visualization can the constantly switching circuitry used by plants while they are acting in their environments be observed with sufficient fidelity and resolution.

The systems biology approach recognizes that any given gene functions only in the context of particular regulatory modules made up of interacting nodes and hubs (other gene products and their interactions), and that each regulatory module is

in constant flux with respect to others to form a biological network. The individual plant itself constitutes an organismal network of such component biological networks, and ultimately each organism participates in larger ecological networks made up of many types of organisms and their biotic and abiotic interactions.

The grand challenges toward achieving this level of characterization are

• A scalable view of global regulatory networks, from intracellular events to whole ecosystems, arrived at by large-scale collection of system-wide data sets from naturally variable genotypes (from genomes to phenomes) assessed across growth and stress conditions, and in association with other organisms like pathogens, mutualists, commensals, and symbionts.
• Families of models incorporating these datasets that both describe system behavior and predict outcomes of subsequent system perturbations.
• Validated computational representations of individual plant cells, tissues, and, eventually, whole plants interacting within their multiorganismal communities that enable accurate prediction of growth patterns and responses of plants under diverse natural environmental conditions.

Characterizing regulatory networks in a systems manner that remains firmly mechanistically based is a major challenge for plant genomics over the next two decades. Networks define plant growth and development, and plant responses to environmental challenges. It will therefore be imperative to elucidate the networks underlying the traits of interest to scientists and breeders alike, in both controlled environments and under conditions relevant to their ecological and economic properties.
These traits include growth and development through the life cycle, heterosis, abiotic stress tolerances, the control of flowering and fruit or seed production, the mechanisms of resistance to pathogens and herbivores, and symbioses with beneficial microbes, to name some of the most important.

Many of the basic intracellular algorithms can likely be determined from a few well-studied models; heterogeneous data collection and analysis will not be necessary for every plant species. As discussed in the section on omics toolkits, many of the most important basic lessons learned from the systems view of model species can be applied to other species. The plant research communities working on Arabidopsis and other well-developed species are beginning to accept and adopt the systems view and approach; incentives could be provided to the research communities to broaden these efforts.

However, as studies progress beyond basic plant biology and experimentally controlled environments and into specialized biological processes and uncontrolled environmental responses across the breadth of plant biology, the data from the models might fail to make useful predictions about higher level, emergent properties such as water and nutrient use efficiency, susceptibility to complex diseases, and yield. A major effort will, therefore, be needed to calibrate, and extend, beyond algorithms based on the elite model organisms. Selection of these species would be in alignment with the criteria for partial omics kits discussed above.

The long term output of systems-based research on regulatory networks should be a greatly enhanced ability to create custom plants highly tailored to meet a required need. Those needs could include predicting the capacity to produce adequate food and fuel under specific environmental challenges—for example, changes in pathogen loads or abiotic stresses associated with climate change.

Molecular Regulatory Networks

Plants might appear to the human eye to be “sessile,” but they are just as dynamic in organizing cellular activities and sensing and responding to a constantly changing environment as animals. Indeed, plants’ inability to move into and out of different environments has resulted in the evolution of remarkably powerful biosensing capabilities (for example, light quantity and quality, volatile molecules, presence of pathogens compared to symbionts and mutualists) that are necessary for their survival in a given niche. Although plant biologists have made important progress in defining components of the key regulatory modules and their hubs in plant species, a far more detailed and dynamic view is needed.

Biological networks have all evolved to be generally robust—that is, they are resistant to most stochastic changes they experience. However, the system robustness often comes at the price of fragility to unusual perturbations. The fragility can be exploited by the use of perturbing factors (for example, mutations, chemical treatments, stresses, and others) to probe how a network responds to the alteration of a given node.
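The idea of probing a network by altering one node at a time can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the network, its node names (signal, hubTF, geneA through geneD), and the "downstream nodes lost" score are all hypothetical and do not come from any NPGI data set. It removes each node in turn and counts how many other nodes can no longer be reached from the input signal.

```python
from collections import deque

# Hypothetical toy regulatory network (an assumption for illustration only).
# Edges point from a regulator to its targets.
NETWORK = {
    "signal": ["hubTF"],
    "hubTF": ["geneA", "geneB", "geneC"],
    "geneA": ["geneD"],
    "geneB": ["geneD"],
    "geneC": [],
    "geneD": [],
}


def reachable(network, source, removed=None):
    """Return the set of nodes reachable from source, skipping a removed node."""
    if source == removed:
        return set()
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for target in network.get(node, []):
            if target != removed and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def knockout_impact(network, source):
    """For each non-source node, count the other nodes lost when it is removed."""
    baseline = reachable(network, source)
    impact = {}
    for node in network:
        if node == source:
            continue
        remaining = reachable(network, source, removed=node)
        # The knocked-out node itself is not counted as a loss.
        impact[node] = len(baseline - remaining - {node})
    return impact


impact = knockout_impact(NETWORK, "signal")
for node, lost in sorted(impact.items()):
    print(f"knock out {node}: {lost} downstream node(s) lost")
```

In this toy network, removing geneA or geneB alone loses nothing because the two are redundant routes to geneD, while removing the hub disconnects everything downstream. That is the robust-yet-fragile pattern described above, reduced to its simplest form; real regulatory networks would need quantitative, condition-specific measures rather than simple reachability.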
An understanding of the composition and regulation of all plant transcriptional networks is necessary to predict plant responses to perturbations. Plants have greatly expanded transcription factor families when compared to animal genomes, making straightforward loss-of-function genetic approaches less productive than systems approaches using omics tools. How the transcriptional networks are organized in a hierarchical fashion, the dynamics over multiple timescales (from seconds to seasons) of these networks, and how the transcriptional networks are layered with many other control modes (posttranslational regulation, protein dynamics and complex formation along with ion and metabolome fluxes and metabolic enzyme activity) need to be defined. Gene families encoding proteins responsible for regulated protein degradation are also expanded in plants, representing another layer of regulatory modules that needs to be fully elucidated. Given the recent development of mass spectrometry-based methods for profiling

protease and kinase activities, these types of experiments are now within reach of the NPGI programs.

Preparing for the Application of Epigenomics to Crops

Plant biologists’ knowledge of how epigenetic changes in the genome control critical plant processes is growing rapidly, but is still at a relatively early stage. For instance, from studying model systems, it is known that key developmental changes in plants can be controlled by small RNAs. The role that small RNAs and changes in the transcriptional status of chromatin play in gene expression is still largely a “black box” in crops. Given its relatively low cost, cataloging small RNA populations in several crop species could be an initial step towards understanding the mechanisms of epigenetic change across diverse crop species. It is also now feasible to use chromatin immunoprecipitation methods together with tiling arrays or deep sequencing to describe changes to histones and DNA methylation during development and response to environment in a variety of crop species.

Model Development, Iterative Experimental Refinement, and the iPlant

The large-scale data collections that are required to elucidate biological networks present a set of tough challenges to any research community. Although plant biologists can deposit omics data sets into well-organized databases, it is difficult to extract or predict emergent properties of a system from database formats. Such large data sets also overcome any intuitive ability to build component interactions into networks. Two key approaches are needed to solve those problems. The first is to develop next-generation visualization tools that enable plant biologists to envisage and explore the complex operating networks and their outputs from database information.
The second is to develop fully parameterized scalable mathematical models (both deterministic and stochastic) that describe every relevant biochemical reaction (transcriptional, protein dynamic, metabolic), cellular morphogenesis event (division, expansion, cell-cell interactions, organogenesis), environmental response (biotic and abiotic), and community interaction (involving multiple individuals and species) that can then be validated against the growing large-scale data sets (for example, temporal, spatial, and perturbagen response) collected by the NPGI community. Network models are then refined in a hypothesis-driven, experimental manner, and iterative refinement between modeling and experimentation generates clusters of evolving computable plants that are genuinely predictive of system behavior under any given condition—the iPlant.

The iPlant models will be a community product. They will depend on a substantial cyberinfrastructure to standardize and curate experimental data from both large-scale projects and individual investigator-driven data. The latter is an absolutely critical component, as model refinement and testing largely depend on the particular biological domain expertise of individual scientists and their students and colleagues. The models need to provide useful and detailed predictions to agriculturally relevant specialists, especially physiologists and breeders, in order to ultimately serve the economic interests of U.S. agriculture.

The iPlant will be constantly refined as predictions from models are tested, and will become the basis for creating highly effective novel plant strains for food, fuel, and fiber. This provides a detailed rationale for integrating germplasm, natural variation, and transgenes into the development and deployment of highly modified, or entirely new, chromosomes—the effects of which on the individual plant and its environment can be accurately predicted. Importantly, the iPlant will represent a collection of powerful research tools, not only allowing investigators to collaborate and explore extant data in an unprecedented manner, but also helping to focus investment in experimental resources on the most productive lines of inquiry for both large-scale collaborative and individual investigator projects.

The iPlant will likely reveal biological principles that might also be found in similar endeavors in microorganisms and animals. Such common themes of biological systems architecture will accelerate biologists’ understanding of all organisms, and shed light on how the knowledge can be used for the benefit of humankind.
In addition, those common themes of modular biology will themselves help reduce the considerable computational power required for the success of a biological “moonshot” such as iPlant, as researchers can make simplifications and abstractions in the absence of fully parameterized equations for a given process. The iPlant will also present novel educational paradigms, as it can be used to generate many “virtual plants” that are far smaller algorithmic clones of the iPlant, to be used for effective education and community outreach programs not only for plant science, but for biological systems in general.

Goals for Research in Molecular Regulatory Networks (Recommendation 4)

5-year goals

• Gather homogeneous data sets, and develop dynamic gene expression databases, for one or two “core omics” models—with a focus on key, major developmental transitions (for example, seedling emergence, transition to, and time course through, flowering) and environmental interactions (pathogen infection, a very few abiotic stress conditions).
• Collect temporal data for the above.

• Develop a plant MOD-ENCODE project.
    o Spatial transcription factor expression databases (laser captured dissecting microscope; single cell / FACS collected).
    o Binding site definition for all transcription factors.
    o Chromatin immunoprecipitation sequence of transcription factor targets (selected tissues, cells, and time domains).
    o Network analysis of gene expression patterns versus factor binding sites.
• Begin to create the protein interactome map for several key organs at discrete times in development. The following would need to be achieved in order to do this:
    o Develop a fluorescent tagged protein resource for each cDNA.
    o Create interactome over time map.
    o Adapt automated green fluorescent protein (GFP) imaging platforms to measure more plant tissues.
    o Create searchable database of community GFP imaging experiments.
    o Incorporate image analysis expertise from disparate fields (physics, defense, neuroscience).
    o Determine the number of unique cell types in one or two models. Curate collection of GFP enhancer trap lines to completely represent all cell types, with markers for nucleus, cytoplasm and plasma membrane in each plant cell type.
    o Initiate dynamic four-dimensional imaging database for cellular morphology through most developmental stages using GFP collections.
• Begin mapping posttranslational modifications.
    o Refine technologies for mass spectrometry-based phosphomapping in plants.
    o Expand phosphome for soluble and membrane proteins.
    o Match kinases with phosphosites using orthologous approaches.
    o Adapt enzyme activity profiling to measure enzyme activation (primarily kinases and proteases) under many growth conditions.
• Further analyze the ionome and metabolome (including hormonome) at discrete developmental time points.
    o Greatly expand application of technology platforms throughout plant science.
    o Perform spatiotemporal mapping of major ion constituents.
    o Define plant metabolic networks in key areas such as cell wall synthesis.
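The temporal data sets called for in these goals are the kind of measurements against which the deterministic and stochastic model families described earlier in this chapter would be validated. As a minimal sketch, and not a model proposed in this report, the Python example below treats a single hypothetical transcript with a constant synthesis rate and first-order degradation. The function names, the rate parameters k_syn and k_deg, and their values are illustrative assumptions. The deterministic version integrates dm/dt = k_syn - k_deg * m with Euler steps; the stochastic version simulates the same two reactions with the Gillespie algorithm.

```python
import math
import random


def deterministic_mrna(k_syn, k_deg, t_end, dt=0.001):
    """Euler integration of dm/dt = k_syn - k_deg * m, starting from m = 0."""
    m = 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        m += dt * (k_syn - k_deg * m)
    return m


def stochastic_mrna(k_syn, k_deg, t_end, seed=0):
    """Gillespie simulation of synthesis and degradation (requires k_syn > 0).

    Returns a list of (time, copy_number) pairs, one per reaction event.
    """
    rng = random.Random(seed)
    t, m = 0.0, 0
    trajectory = [(t, m)]
    while True:
        rate_syn = k_syn
        rate_deg = k_deg * m
        total = rate_syn + rate_deg
        t += rng.expovariate(total)  # waiting time to the next event
        if t >= t_end:
            break
        if rng.random() * total < rate_syn:
            m += 1  # synthesis event
        else:
            m -= 1  # degradation event (only possible when m > 0)
        trajectory.append((t, m))
    return trajectory


def time_average(trajectory, t_start, t_end):
    """Time-weighted mean copy number over [t_start, t_end] for a step trajectory."""
    total = 0.0
    following = trajectory[1:] + [(t_end, None)]
    for (t0, m0), (t1, _) in zip(trajectory, following):
        lo, hi = max(t0, t_start), min(t1, t_end)
        if hi > lo:
            total += m0 * (hi - lo)
    return total / (t_end - t_start)


if __name__ == "__main__":
    k_syn, k_deg = 10.0, 1.0  # assumed rates, arbitrary units
    print("deterministic level at t = 10:", deterministic_mrna(k_syn, k_deg, 10.0))
    run = stochastic_mrna(k_syn, k_deg, 200.0, seed=1)
    print("stochastic time average, t = 20 to 200:", time_average(run, 20.0, 200.0))
    print("analytic steady state k_syn / k_deg:", k_syn / k_deg)
```

The deterministic model gives one smooth curve approaching k_syn / k_deg, whereas each stochastic run fluctuates around that level. Comparing both kinds of output with replicated time-course measurements is one simple way such models could be checked against data, under the assumptions stated above.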

10-year goals

Phenomics

• Develop phenotypic response data to extend iPlant model data to a second tier of models to enable emergent, organismal properties to be incorporated.
    o Develop assays for measuring real-time genetic and epigenetic changes under stress conditions, pathogen infection, other dynamic conditions.
• Continue progress in cyberinfrastructure and informatics.
    o Create advanced, integrative portals to combine data sets and to make valid comparisons between distinct omics data sets.
    o Create and make available intuitive tools for network reconstructions.
    o Generate visualization platforms for plant data from molecules to ecosystems.
    o Initiate iPlant project in selected signaling networks and organogenesis scenarios.

Structural genomics

• Generate 3D structure for a member of each unique protein family in plants.

20-year achievements

• Primary biochemical functions for all plant proteins will be known.
• Posttranslational modification of all plant proteins will be measured under many conditions.
• Plant interactomes will be defined in vivo, along with subcellular localizations.
• Chemical probes will be commonplace for many plant enzymes (inhibitors, antagonists, allosteric activators, agonists).
• Metabolic fluxes and ion levels will be known for all tissues throughout development.
• Regulatory networks driving morphogenesis and environmental response will be mostly known.
• iPlant interface will be fully operational, data collection and refinement will continue.
• iPlant algorithms will be developed and their accuracy and value will have been critically tested in many partial omics model species that represent major crop types and scientifically important species.
• Synthetic biology will be in widespread use in plant science.
• Remote sensing phenomics will be established for majority of world’s crops,

reference plant experimental sites, and wide range of selected niches—automated and satellite controlled.

Evolutionary, Ecological and Communities Genomics

RECOMMENDATION 5: Increase the understanding of plant evolution, domestication, and performance in various ecological settings via investment in comparative genomics, and in the metagenomics of living communities of interacting organisms.

Selection of plants with particular traits by humans, the process of domestication, has increased production in crop species, altered many plant developmental characteristics in relation to the wild ancestors of crop species, and allowed humans to sculpt ecosystems dramatically through agriculture and urbanization. The selection process can be viewed as rapid, facilitated evolution for human use. A better understanding of the basis of phenotypic diversity in developmental, environmental, and evolutionary contexts will allow better manipulation of crop systems for improved agricultural productivity and enable the preservation of near-wild ecosystems.

Some plant genome sequences are already completed and more are expected to be complete in the near future (see Chapter 2). In most systems, high-throughput approaches have been undertaken with the goal to analyze gene expression data for various tissues, organs, and growth conditions. The next challenge is to understand how the network of gene regulation gives rise to phenotypes (see Recommendation 3). Understanding the networks that ultimately determine plant form and function is greatly facilitated by comparison of natural, large-scale genomic sequence variation and gene expression data both within and across species.

Understanding and managing natural genetic variation is central to all efforts in crop improvement. Natural variation provides the basis for hybrid vigor, local adaptation, and biodiversity.
In agricultural and natural populations, trait-genotype association studies and quantitative trait locus mapping will play a major role in analysis of trait variation. Advances over the next decade will revolutionize the understanding of trait variation in plants. Genomic analyses of genetic diversity are needed for association studies to discover agriculturally important genes and to serve as a source of genetic polymorphisms for functional analyses of plant biology.

Comparative genomics data cannot yet be fully exploited in plants, as has been done so successfully in the primate lineage for example, because of insufficient genome sequence coverage both within key model species and across

species from different parts of the green tree of life. Hence, the ability to understand how the evolution of natural variation sculpts plant form and function at a mechanistic level, and how that variation allows plants to exploit particular ecosystem niches at a population level, is currently limited largely by the amount of genome sequence available (see above).

Environmental Benefits of Plant Genomics Research

Photosynthetic organisms play a central role in all of the Earth’s major ecosystems. As a result, understanding how plants function, and how to modify and improve their ability to carry out specific physiological processes—the ultimate goals of plant genome research—could have large benefits for terrestrial and linked aquatic ecosystems. Because NPGI programs improve the basic knowledge and infrastructure for understanding, managing, and breeding all plants, their environmental impacts will be deep and far reaching.

Genomics research is the fundamental enabling technology that allows plant scientists to optimize plant yields and quality as raw materials for energy production. Plants are a main converter of carbon dioxide in the atmosphere to fixed carbon. Engineering crops, including forest trees, to better sequester carbon offers the potential to increase the conversion of atmospheric CO2 into soil biomass, ameliorating or even reversing the increase in atmospheric CO2 generated by combustion of fossil fuels. In addition, there will likely be a dramatically expanded role for plants in the generation of liquid fuel alternatives to oil and its refined products. Ethanol from corn and sugar cane, though only a first-generation approach to alternative fuel with limited environmental benefits and supply capacity, has nonetheless validated the concept of using plant-derived biomass as a source of fuel for transportation.
In the future, this is likely to come increasingly from lignocellulosic sources including grasses such as Miscanthus and switchgrass, woody tissues from fast-growing tree species, and cellulosic crop waste (DOE 2005a). These developments will not only slow the rate of depletion of fossil fuels and decrease the net addition of CO2 into the atmosphere, but also provide a scientifically feasible, economically realistic medium-term solution for U.S. energy security and independence from foreign oil.

The first generation of genetically engineered crops, though not resulting from modern genomics research and NPGI per se, nonetheless relied on earlier genomic information, including GenBank and other electronic databases of gene function and conservation, similar to the myriad informatic resources that have since been expanded as a result of NPGI and related efforts. These earlier resources informed gene isolation, modification, and analysis. Experience from the first generation of these crops (summarized by Traxler 2004; Delmer 2005; Fernandez-Cornejo and

Caswell 2006) shows how genomic resources generated by NPGI can lead to substantial environmental benefits for agricultural systems. For instance,

• Varieties expressing insecticidal proteins from Bacillus thuringiensis often require fewer insecticide applications than nonengineered varieties, leading to improvements in the biological diversity within agroecosystems.
• Reductions in pesticide application, or improvements in ecotoxicological properties of pesticides, can reduce ecological impacts on farm workers and water systems (FAOSTAT 2007; Huang et al. 2005; Huang et al. 2002; Qaim and Zilberman 2003).
• Reductions in tillage associated with herbicide-tolerant crops appear to promote a number of environmental values, most notably improved soil carbon, reduced erosion, reduced energy expenditure from tillage, and improved wildlife habitat.

Because of the complexities of agroecosystems and human behavior, the overall environmental impacts from changes in plant varieties are difficult to predict. Nonetheless, new types of plants whose improvement is a direct result of genomics research either via conventional breeding or genetic engineering will likely provide substantial environmental benefits. Examples of benefits are as follows:

• Increased efficiency of extracting nitrogen, phosphorus, and potassium from the soil, or improvement in its availability to animals (Raboy 2007) would decrease use of fertilizer and lead directly to a decrease in energy use and in eutrophication of wetland ecosystems that drain agricultural lands, and associated lower costs of production.
• Increased drought tolerance and decreased use of water (Tuberosa and Salvi 2006) would reduce energy costs from irrigation systems, unsustainable changes to water tables, and salinization and associated dysgenic changes to soil quality in arid agricultural systems.
• Modifying lignocellulose chemistry in trees and grasses would lead to a diminished need for chemical or physical pretreatment during pulping or fermentation to produce ethanol, with an associated decrease in energy use, toxicity of effluents, and processing costs (Chapple et al. 2007).
• Ability to remediate toxic chemicals in the environment would improve either by sequestering them from the soil for harvest and removal, or by enzymatically breaking them down into less persistent or toxic forms (Doty et al. 2000).

Genomics and the Major Transitions in Plant Evolution

There are more than 250,000 species of plants, representing a wide variety of growth habits, adaptive responses, and potentially useful traits. Modern plant genetics and genomics can provide experimental and computational tools for comparing genomes and for providing insights about the similarities and differences among organisms, the basis of ecologic adaptations, and their origins and persistence. There is untapped value in natural variation as a source of functional information since variant alleles can express altered function that might not be phenotypically uncovered in typical forward or reverse mutant genetic screens because of functional redundancy or epistasis. Furthermore, natural variants are enriched for alleles that exhibit high fitness in harsh natural environments, including associations with microbial communities that can influence plant productivity.

Many important issues in evolutionary and ecologic genomics can be addressed through comparisons among a diverse array of sequenced species. In the 2002 NRC report, the development of genomics resources in species outside the model species Arabidopsis, other reference species, and their crop relatives was advocated. In response to this recommendation, NPGI, largely under the auspices of JGI and following JGI’s community-based, peer-reviewed system, undertook sequencing of reference genotypes of the plant species listed in Chapter 2. For several of these species, NPGI is positioned to take advantage of large collections of natural variation in the form of diverse cultivars, accessions, and wild relatives within each species and next-generation sequencing technologies.
The natural variation available encapsulates various important plant characteristics including, among others, perennial and annual life cycles, broad differences in the ability to grow in different ecosystems and respond to pathogen and abiotic stress, different mating systems, and different ploidy levels.

Research in plant evolutionary genomics can build on resources developed in NPGI and the various Arabidopsis programs and has great potential to bring fundamental scientific advances in years to come. Maintaining a balance between crop-centered and nondomesticated plant species and populations is critical because diverse experimental systems expand scientists’ ability to examine hypotheses and biological functions. In addition, genomic analyses of the wild relatives of crop plants could identify agriculturally important sources of genetic diversity. Those analyses are essential to understanding the responses of forests and grasslands to changes in climate and other environmental conditions. Both intra- and interspecific comparative genomics approaches will allow unique analyses of questions in plant biology, which would be difficult or impossible outside of NPGI because of associated genomics costs.

To ensure that the genomics investment in additional species builds effectively on existing resources, it was recommended (NRC 2002) that the evolutionary-genomics community pursue the selection of a modest number (10–20) of key species spanning critical evolutionary nodes in preparation for community-wide genomic investigation over the next 10–15 years. In response, several large community-driven projects concerning the evolutionary and ecological genomics of a cluster of crucifer species surrounding Arabidopsis thaliana, including A. lyrata, Capsella, Brassica, Thellungiella, and Boechera, have recently begun. These important genome sequences will significantly enhance the ability to define gene function and genome evolution trends, and will significantly expand the utility of Arabidopsis as a broadly relevant model. As well, the sequencing of two additional species, Mimulus and Aquilegia (Columbine), will open new genomics-based vistas in systems that are historic and well-characterized models of evolutionary and ecological plant biology.

In addition, the genome sequencing of a moss, Physcomitrella patens (a good experimental model for an important lineage of plants), and a lower vascular plant, Selaginella, fills important gaps in our understanding of plant evolution and diversity. Further efforts to generate bacterial artificial chromosome (BAC) libraries from 22 nodal species and 12 wild rice relatives were also funded (NSF 2007c). This trend is laudable and should continue because it provides an excellent way to leverage reference genome sequences and focus efforts using reference species to determine gene function and network operation to maximum effect. For example, the imminent completion of the first reference maize genome would open the door to sequencing of close relatives such as Zea diploperennis and Zea luxurians. The diploid Tripsacum dactyloides would be valuable to validate gene predictions and regulatory sites in maize and to understand evolutionary trends over a 4–5 million year evolutionary timeframe.
The overall goals of evolutionary and ecological genomics, as articulated above, have been readily adopted by the Arabidopsis, rice, and maize genetics communities, where a history of community-based resource development has been facilitated by NPGI and the NSF’s Arabidopsis Genome and 2010 Programs. Consensus building within the evolutionary biology community led to community-developed goals and subsequent sequencing efforts at JGI on the species described above. The potential value of these exercises to crop improvement cannot be overstated: The more that is known about how the diversity of plant life evolved, the better equipped scientists will be to manipulate basic plant processes for practical and sustainable benefit. These facts provide additional support for the argument that an expanded role for JGI, or another, equivalent large scale sequencing center, is critical to the overall success of NPGI.

While efforts to date are laudable, the committee strongly recommends a significant expansion of evolutionary and ecological genomics, and of the study of plants in association with other organisms (interspecies interactions genomics). Tools for genomic studies of diverse evolutionary and ecological focal species need to be explicitly comparative and developed with nonspecialists in mind. The aim would be to broaden the community of researchers who have access to and can effectively use plant-genomic tools and data in the future. Technologies such as low cost resequencing, initial reference genome sequencing, and association genetics greatly lower the threshold for which new species can be considered useful experimental organisms. The iterative reduction in costs is relevant for both plants and for the associated organisms that can add to or detract from plant performance. Hence, the set of desirable criteria for choosing new reference species for evolutionary, ecological, and “interspecies interactions” genomics programs would include (for both the plants and where relevant the “interacting organisms” under study):

• Distributed position in the phylogeny and clustered locations to bring added value to related, sequenced species.
• Genome size, with emphasis on small genomes and, at first, simple ploidy.
• Genetic tractability.
• Ease of crossing and population development.
• Ease of growth.
• Short generation time, except where this is precluded due to interest in ecologically or economically important species (for example, perennial or woody species).
• Availability of existing tools such as germplasm collections, mapping populations, genetic and physical maps, large insert clone collections, mutant collections, and genetic transformation.
• Size of the research community versus required investment.
• Economic importance of the focal species or close relatives.
• Exceptional properties of interest, such as selected species that have undergone recent polyploidization.
Goals for Plant Evolutionary, Ecological, and Communities Genomics (Recommendation 5)

5- to 10-year goals

• Sequence species representing the major phylogenetic nodes of plant diversity and clusters of relatives to bring added value to related, sequenced species.
• Sequence microbial species representing the major phylogenetic nodes of pathogen, commensal, and symbiotic diversity, and clusters of their relatives.
• Sequence and characterize several plant associated metagenomes from cultivated and natural systems using well-understood model systems as departure points.
• Improve understanding of natural genetic variation at the levels of nucleotide polymorphism and trait variation in selected species.
• Develop representative models to address major evolutionary issues, such as speciation, adaptive trait evolution, adaptation to environmental conditions, and evolution between and within populations.
• Study domestication in crop plants to understand evolutionary principles, agriculturally important genes, natural genetic variation among cultivars and landraces, adaptation to heterogeneous environments and changing climates, and phylogenetic differences among plant groups.

10-year goals

• Develop representative systems to analyze how plants interact with their associated microbial and insect communities (from pathogens and herbivores to commensals and pollinators), in the laboratory and in ecological settings, both natural and cultivated, over co-evolutionary time scales.
• Study in depth the developmental and adaptive mechanisms that are ecologically or economically important. Studies would include the diversity of adaptations to abiotic and biotic stresses, development of complex structures such as wood and fruits, and mechanisms for control of complex life cycles and phase changes.
• Explain core evolutionary principles, such as speciation and adaptation mechanisms, the role of genetic drift and of reproductive systems on evolution, and the evolutionary processes that influence complex trait variation.
• Develop new informatics tools for massive data sets and for integration of diverse data types, including genomes, pathways, phenotypes, and environmental context.
• Develop efficient transgenic methods to bring to nonmodel species.
20-year achievements
• Pan-genomic data that are visually intuitive, manipulable, useful, and quantitative will be integrated and easily accessible.
• The role of natural and artificial selection in shaping genetic variation and genome structure will be understood.
• The role of epigenetics in development, heterosis, adaptation, evolution, and genome architecture will be explained.
• The mechanisms and consequences of polyploidy and copy number variation will be explained.

• The convergence of systems biology and evolutionary biology will be facilitated by integrating ecologists with genetics, genomics, and evolutionary biology communities.
Translating Plant Genomics into Plant Improvement
RECOMMENDATION 6. Enable translation of basic plant genomics towards sustainable deliverables in the field, and continue to use NPGI as a foundation for new, agency-specific, mission-oriented plant improvement programs.
The first 10 years of the NPGI have made a strong start toward understanding the fundamental challenge of how plants work. To translate knowledge from the basic science at the core of NPGI into commercial innovation most effectively, and to accelerate the pace of translation to practical outcomes, additional enabling tools and methods for enhanced transfer from model systems to crop species should be developed. Given the considerable fundamental advances in understanding plant genomes and the substantial progress in translation (discussed above) since NPGI’s recent addition of this area of research to its portfolio (NRC 2002), the arguments for a continuation and expansion of translational genomics are clear and strong. A broad genomic survey, beginning with draft genome sequencing and expressed sequence tags, is needed in as many crop species as possible to enable translation. These genomic surveys can be prioritized on the basis of criteria including taxonomic diversity, the likelihood of the information gathered being “pulled through” into commercial breeding by a critical-mass community of scientists (in both the private and public sectors), and the economic footprint of the crop. The prioritization is analogous to the differentiation between investment in “full” or “partial” omics toolkits outlined above in Recommendation 2.
Challenges Translating from Models to Multiple Species with Specialized Traits
For many of the traits in high demand by consumers or important to industry, model systems provide only very basic leads. They include nutriceutical-associated traits such as antioxidants, fiber, vitamins, and allergens; consumer acceptance traits such as aroma, texture, and ripening; vegetative architecture, woody tissue structure, and growth habit; adaptations to specific and complex abiotic and biotic stresses; and crop-specific domestication, which might differ significantly from the domestication traits that have been studied in cereals.

Trees present special challenges for translational genomics. Commercially valuable tree phenotypes are typically polygenic. The unspecialized manner in which most wood and pulp feedstocks are grown in the United States reduces the value of genomics investment. The outcrossing and heterozygous structure of breeding populations, coupled with multiple-year delays to onset of flowering, makes genetic introgression difficult. Marker-assisted breeding might not be cost-effective even where strong marker-trait associations are discovered (Johnson et al. 2000; Strauss et al. 1992). A large number of genotypes are typically used in production plantations. There are high costs in time and space to gather statistically reliable phenotypic information about growth, physiology, and wood properties. For these production systems, the costs and efficiencies of genotyping and phenotypic assessment methods have to continue to improve if translation is to be effective.
The Unique Challenges of Plant Breeding
Plant genomics has specific requirements that can differ significantly from the requirements of human genomics, for example. More importantly, marker-assisted breeding projects benefit from large plant population sizes (tens of thousands to millions): the more individual recombinants that can be tested in an introgression program, the fewer generations are required to cleanly recombine the genes of interest from two or more genomes. However, because marker-assisted breeding in plant biology must span many species (compared to human population genetics, where only one species is relevant), there will continue to be large costs associated with genomic sequencing and subsequent definition of a robust SNP set for association mapping.
Additionally, it would ultimately be desirable to develop methods for SNP detection and mapping “in the field,” and this is likely to drive technology development that differs from the technologies now being successfully applied in human genetics research.
Accelerating the Practical Application of Plant Genomics: Markers, Markers, and More Markers
Clearly, plant scientists’ thirst for high-throughput, cost-effective DNA markers has not been quenched. Marker technology empowers gene and quantitative trait loci (QTL) discovery and marker-assisted selection in crop-breeding programs, and facilitates the study of plant evolution across all taxonomic levels. The application of marker technologies has significantly advanced the study of both single- and multi-genic traits of economic importance, has become a standard tool for many plant breeders, and has been widely employed in research studying the genomic conservation among related plant genomes and the evolution of plants. However,

markers are not inexpensive enough to assay or abundant enough to be employed for breeding in most economic plant species. Likewise, the infrastructure, informatics, and expertise for most of those species are not adequate to maximize economic benefit from markers. The decrease in cost of DNA sequence acquisition, which leads to an ever-increasing availability of DNA sequence for crop plants and their wild relatives, will provide a template for additional SNP discovery. In some of the major crop species, resequencing of multiple genotypes will enable massive SNP discovery, and will provide the first steps toward enabling association genetics in these important crops. SNP marker development could be expanded for a number of crop plants that are of considerable economic importance but do not occupy major U.S. land areas and for plants that are being targeted to meet the nation’s emerging biofuel needs. Forest tree species, particularly the conifers with their exceedingly large genomes, will not be feasible to sequence completely with current technologies, but survey sequencing to identify nucleotide polymorphisms is already underway and further support is worthwhile and beneficial. An essential component in turning the polymorphisms discovered by comparative sequencing into useful breeding tools is the development of detection technology that can be deployed in multiple species at low cost. Although polymorphism detection tools are well developed in the private sector, particularly for the large commodity crops, a major emphasis on developing and disseminating technology that can be cheaply deployed in relatively understudied species is necessary. Providing breeders with training in the applications of these newly developed detection technologies in ongoing cultivar development programs is critical.
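The idea of associating DNA markers with traits can be made concrete with a minimal sketch. The example below is illustrative only and is not a method described in this report: it assumes each SNP is coded as an allele dosage (0, 1, or 2 copies), estimates a least-squares effect per marker, and ranks markers by that effect. The marker names, the data, and the function names `marker_effect` and `rank_markers` are hypothetical. A real association analysis would also account for relatedness among individuals, population structure, and multiple testing.

```python
from statistics import mean


def marker_effect(dosages, phenotypes):
    """Least-squares slope of phenotype on allele dosage (0, 1, or 2 copies)."""
    if len(dosages) != len(phenotypes) or len(dosages) < 2:
        raise ValueError("need two equal-length samples with at least two plants")
    d_bar, p_bar = mean(dosages), mean(phenotypes)
    sxx = sum((d - d_bar) ** 2 for d in dosages)
    if sxx == 0:
        raise ValueError("marker is monomorphic in this sample")
    sxy = sum((d - d_bar) * (p - p_bar) for d, p in zip(dosages, phenotypes))
    return sxy / sxx


def rank_markers(genotypes, phenotypes):
    """Sort marker names by absolute estimated effect, largest first."""
    effects = {name: marker_effect(d, phenotypes) for name, d in genotypes.items()}
    return sorted(effects, key=lambda name: abs(effects[name]), reverse=True)


# Hypothetical data: six plants, two SNP markers, one measured trait.
genotypes = {
    "snp_a": [0, 1, 2, 0, 1, 2],
    "snp_b": [0, 0, 1, 1, 2, 2],
}
trait = [10.0, 12.0, 14.0, 10.0, 12.0, 14.0]
ranking = rank_markers(genotypes, trait)
```

In this toy data set the trait tracks `snp_a` more closely than `snp_b`, so `snp_a` ranks first. The cost argument in the text is visible in the data structure itself: every additional marker is another column to genotype in every plant.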
Understanding Plant Phenotypes at Multiple Resolutions
Economic plants are grown for their unique and valuable phenotypes. Manifestation of those phenotypes is a result of the interaction between specific genotypes and their environment. The environment affects plant growth and development and the biochemical composition of plants, and can even result in a change of heritable genetic variation. Scientific understanding of plants grown in the context of their environment is inadequate at present. Crop models need to be studied in the diverse set of environments that they might experience in production settings to generate predictive models. Functional and comparative genomics offer powerful tools to understand the mechanisms controlling interactions of plants with their environment, and the underlying genes and gene interactions. However, without a high level of precision in the collection of phenotypic data, the ability to determine genotype will vastly outpace the ability to associate it with a specific phenotype. One component of high-resolution phenotyping will be the development of novel

biomarkers that accurately report the physiological status of the plant in the context of its environment in real time and the biotic or abiotic stresses it encounters. As genomics research is moved to the field, expertise in supporting areas including informatics, statistics, and quantitative biology will be required. Research programs that integrate genomic scientists, plant breeders, ecologists, physiologists, informaticists, and engineers will be able to measure and interpret massive data sets that describe real-time growth and development of plants in the context of complex environmental signals and challenges. The programs need to include the development of novel engineering solutions for remote sensing of plant physiology from the population (field) level down to the level of an organ of an individual. Because production crops and systems are diverse, numerous plant model systems will be needed to fully explore the breadth of economically important plant systems. Examining the feasibility of creating centers where standardized, efficient, high-throughput phenotyping technologies can be applied to several major crop and tree species would be a logical next step. Such centers are already established in the private sector.
Producing Plants That Overcome the Limitations of Their Environment
NPGI targeted biotic stresses as part of its initial translational genomics effort. Initial successes in the understanding of plant responses to biotic stress demand that this area of research be scaled up and expanded to include abiotic stress improvement as a core goal for the next 10-year cycle of NPGI.
The ability to sequence multiple isolates of a single plant pathogen species cost-effectively, coupled with the availability of dense marker maps in several crop species as well as large mapping of natural populations, allows detailed studies of the interaction of crop plants with their pathogens and insect enemies. In addition, dissecting the genetics of traits that allow plants to perform in suboptimal environments (for example, environments that are critically limited by water, temperature, nutrients, and light) is particularly important in preparation for a changing global environment. Developing innovative approaches to phenotype large plant populations with the necessary throughput, precision, and cost-effectiveness to provide replicated data collection within and across multiple environments is critical to studying plants in such environments.
Unlocking the Genic Treasures Within our Plant Germplasm Collections
The United States spends considerable resources on the collection, maintenance, and characterization of crop germplasm collections. These collections have been routinely used to identify and deploy the genetic diversity necessary for plant breeders to respond to pest adaptation, the unintended import of new pests, and

changing consumer demands. The collections also contain many desirable alleles for more complex traits, such as seed protein content and productivity. Historically, the approach to finding desirable alleles was based solely on the accession’s phenotype. A quantitative trait by its very nature is the result of the effects of multiple genes and likely their interactions. One potential application of densely saturated DNA marker maps in regions surrounding important QTL of crop species would be identifying germplasm accessions with additional allelic diversity for those QTL, or identifying accessions that likely possess positive alleles at previously undetected QTL. Research is encouraged into novel approaches to mine the U.S. germplasm collections for unique alleles and introgress them into improved cultivars. Hence, a better understanding of the wealth of genes present in these collections via genomic analyses and more effective methods for their deployment would be worthwhile.
Reaping Rewards from Genome Comparisons
Plant biology has a vast phylogenetic sweep. As such, genomics increasingly becomes a unifying theme for plant biology. Plant breeders can take advantage of the variation across the genomes of closely related populations, as well as the distant relatives of crop species. The availability of genomic resources and a clear understanding of the evolutionary divergence of various plant species can facilitate the identification of agriculturally important genes in crop species through knowledge of their genomic location in a model species. More research in comparative genomics is necessary to expand the value and utility of the model plant species, as discussed under Recommendation 1, above. As a five-year goal, NPGI should provide tools for the successful deployment of improved crop cultivars from improved alleles discovered from comparative mapping.
Integrating Informatics Resources for Breeders, Evolutionary Biologists, and Ecologists
The importance of informatics cannot be overstated in any genomics program. Informatics will become increasingly important as plant biologists attempt to push the fundamental findings of NPGI toward applications that benefit human welfare. The last five years have seen advances and even mergers in the development of plant sequence-based and breeder-based databases. Although there is much more to achieve, the recent advances have been empowering to plant scientists. The development and implementation of software programs that clarify the relatedness of individuals within a study population and enable identification of associations between DNA markers and phenotypic traits have advanced efforts to find the

genes underlying important phenotypes in the U.S. crop germplasm collections. In addition, these research advancements are now being applied to study the genetic makeup of natural populations in the environment. Plant genomic databases have become standard tools of genomic researchers, and the advances in translational genomics have enhanced their utility to crop breeders and environmental scientists. On one hand, the ability to compare DNA sequences across plant species is essential to leverage information from one species to another. On the other hand, plant breeders need to be able to navigate from sequences to genes to phenotypes within a single species. Although many plant genomic databases contain similar sorts of information, the way this information is accessed by genomicists and breeders is often different. To address the needs of diverse users, the most effective and efficient approach might be for database developers to create different entry portals for a unified database. Such entry portals would provide scientists of various backgrounds a user-friendly interface that enables direct access to relevant data (for example, through multiple ways to access and manipulate the data). For crop breeders, the ability to navigate from maps to genes to traits to plant accession by augmenting existing databases with phenotypic data (to include expression data QTL) and genotypic data is important.
Enabling Plant Improvement Through Plant Transformation
NPGI has not made a significant investment in the development of transgenic tools to support functional and translational genomics. Many of the advances that will soon be possible as a result of the rapidly advancing knowledge of plant genomics will require the ability to routinely and cost-effectively insert modified genes, or replace alleles, in a number of different plant species for research or for deployment in breeding populations.
Effective transgenic tools will, at the minimum, be required to rigorously test the function of genes identified via mapping or association studies. Beyond basic descriptions of gene expression, transgenic methods are generally the most powerful means for determination of gene function due to the wide variety of modifications to expression that can be introduced and studied. The practical exploitation of many of the advances made possible by genomics will be best carried out by introducing a desired trait in a highly specific way without the limitations of generation time or the feasible size of breeding populations. Transformation is especially valuable in outcrossing and long-lived crops with complex genomes, such as trees and grasses important for bioenergy, because of their long generation times and intolerance of inbreeding. Transformation does not need to be used solely for wide transfers of genetic information, which are the applications that elicit the most social concern. It can also be used to modify native genes or patterns of gene expression, and to increase

trait diversity in highly bred or elite clonal varieties. Certain technologies, such as “all-native transformation” (Rommens 2007), which relies on plant-derived promoters, plant genes conferring traits of interest, marker-free integration, and “P-DNA” sequences sufficiently similar to T-DNA borders to mediate DNA delivery, are among the relevant new transformation technologies that may find higher levels of social acceptance among the wider public, and especially the organic and sustainable agriculture communities that have mostly rejected transgenic technologies to date.
Artificial Chromosomes: Stacked Traits in a Breeding Block
Current transgenic techniques are ill-suited to the engineering of complex pathways and multigenic traits because they can introduce only a few genes at a time. Moreover, these genes are inserted into the host chromosomes, creating unpredictable effects for both host gene function and the function of the transgene. Plant artificial chromosomes (or minichromosomes) offer the potential to engineer entire pathways and stack independent traits efficiently in one easily controlled, autonomous DNA molecule that does not integrate into the host chromosomes. Similar technologies in yeast and bacteria (yeast artificial chromosomes and BACs) were revolutionary. Engineered minichromosomes, made through de novo artificial chromosome construction or by natural formation (Yu et al. 2007), will provide a similar quantum leap forward for plant engineering. They will enable the delivery of gene stacks to improve biofuel feedstocks; to produce food crops that are resistant to multiple diseases, pests, and abiotic stress; and to generate plants capable of synthesizing molecules useful for medicinal or pharmaceutical applications.
Realizing these potentials depends on developing minichromosomes in multiple plant species that can be easily engineered and function with high fidelity. That development will require a combination of practical plant biotechnology and basic research into the mechanisms of chromosome structure and function.
Goals for Translation (Recommendation 6)
5-year goals
• Broaden genomic DNA sequencing survey to as many crop species as possible.
• Expand marker development and SNP marker discovery, and develop polymorphism tools that can be deployed in multiple species at low cost.
• Develop novel approaches to mine unique alleles from existing germplasm.

• Deploy improved crop cultivars from improved alleles that were discovered as the result of comparative mapping between model organisms and crops.
• Test new approaches to improve transformation and allele-replacement capabilities.
• Amass and analyze temporal and developmental catalogues of mRNA and small RNA populations under different growth and stress conditions.
• Conduct feasibility studies on the creation of a few centers where standardized, efficient, high-throughput phenotyping technologies can be applied to several species.
10-year goals
• Develop novel biomarkers that accurately report the physiological status of the plant in the context of its environment in real time.
• Develop crop plant genomics databases that easily access complete genotypic and phenotypic data in the field.
• Establish public field evaluation centers to integrate these data from biochemistry to agronomy.
• Create data-mining and modeling approaches for dissecting complex trait architectures, and visualization tools for understanding these architectures.
• Create rapid field-based phenotyping approaches and field-based physiological sensors that include remote sensing of individual plant performance.
• Establish models that accurately predict agronomic phenotype from genotype.
• Demonstrate efficient site-directed transformation and allele-replacement capabilities in a limited number of species.
• Develop artificial and replacement chromosome technology for major crop species using alleles of specific genes for specific traits.
• Develop containment mechanisms for transgenes and artificial chromosomes that function across a range of species.
20-year achievements
• Accessible, useful, integrated informatics resources and tools will be available and used by breeders, biotechnologists, and environmental scientists.
• Artificial chromosomes and transformation will be used routinely as a precise mutagenesis and gene transfer tool, in combination with a customized suite of containment methods when needed, in a wide variety of crop species.
• Plant biologists will have the capability to produce a plant with specific performance characteristics.

• Synthetic research and crop plants will be widely disseminated.
• Land- and air-based remote sensing of biomarkers (phenomics) will be deployed for the majority of the world’s key crops, at experimental sites and a wide range of selected niches, fully automated and satellite controlled.
Informatics, the Mesh That Envelops Plant Genomics
RECOMMENDATION 7: Develop and deploy sustainable, adaptable, interoperable, accessible, and evolvable computational tools to support and enhance Recommendations 1–6.
NPGI’s future success relies on the development of the computational methods, tools, and databases that enable the integration of disparate data from multiple technology platforms and geographically distributed laboratories. These methods would provide a wide variety of means for collecting and collating data, create new visualization paradigms, and enable researchers to quickly integrate new analytical methods. The successful implementation of computational infrastructure will ensure that every genomic effort has useful payoffs that are beyond the original goals of individual projects by providing an overarching structure to access, manipulate, and view the data flowing out of each endeavor. Multiple perspectives are necessary to generate a complete picture of the informatics needs. This section is divided into three views on the challenges: data collection and exchange, data management, and organizational requirements that provide a means of involving the plant genome community. The common thematic element of the three views is the importance of open standards and specifications that are widely published and are developed through community efforts.
Examples of current standardization efforts that should be supported and emulated are the Gene Ontology, the Plant Associated Microbes Gene Ontology (PAMGO, a domain-specific contributor to the Gene Ontology), the Open Biological Ontologies, and Plant Ontology efforts. The integration of knowledge and the interoperability of tools are only possible through a sustained commitment to standardization. Standardization ensures the reproducibility of computational analyses, and a high degree of confidence is critical for the further accrual of knowledge. It also reduces the cost and risks to plant genome research by removing the reliance on individual developers and suppliers, providing more flexibility in software development, enhancing the integration of systems, and investing the ownership of the informatics in individual members of the community. One role of NPGI will be to support standardization efforts because they benefit the entire plant research community.

Data Collection and Exchange
Pattern recognition, from which inferences are drawn and hypotheses are formed for subsequent experimental testing, is best achieved from uniformly gathered, high-quality data. Improved informatics technology does not necessarily alter the process by which the information is gathered, but it improves the speed and reliability of inference generation from larger and larger volumes of data. Data typically flow from their original collection and processing to an organized repository where they can be retrieved and analyzed. Hence, technological improvements can enable scientists to find important connections between data that previously were maintained in isolation.
Collection
Broadly speaking, the resolution of the information collected correlates with the experimental approach; observation of individual samples provides highly nuanced and detailed information over a limited number of cases, whereas high-throughput studies in which multiple samples are examined concurrently cover a larger portion of the genome but with a certain loss of detail. NPGI aims to integrate these strategies so that research data are combined to generate highly detailed information across the breadth of entire genomes. The term “well-annotated” implies three characteristics. First, the annotation comprehensively covers the entire range of genomic features from protein-coding genes to functional RNAs to regulatory regions to binding sites. Second, the annotation is complete: every occurrence of every different feature type is identified and precisely and accurately located. Third, all of the ancillary biological data associated with these features (such as phenotypic descriptions, time of expression, location of expression, and so forth) are richly described.
Because the genome sequence provides the backbone resource on which all other resources depend, its utility depends on it being as comprehensive, complete, precise, and detailed as possible. Ideally these data should be captured at the source, including as much biological context as possible and integration with existing data sets in real time; delay often leads to omissions and inaccuracies. Portable genotyping devices are on the foreseeable horizon. These devices will incorporate cameras, image recognition software that allows the specimen to be compared to similar plants, the capability to access locally installed and remote databases, and global positioning systems so that the location of a sample is automatically collected. Tools and standards need to be developed that allow pan-genomic data to be collected in a way that enables data integration that is accessible: visual, easily manipulated, and quantitative.
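Capturing data at the source, with location and context attached, can be sketched as a small record type. The example below is an illustration under stated assumptions, not a specification from this report: the field names, the `FieldSample` class, the `capture` and `to_json` helpers, the sample identifier, the coordinates, and the placeholder vocabulary term `TERM:0000001` are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FieldSample:
    sample_id: str
    species: str
    latitude: float      # decimal degrees, as reported by a GPS receiver
    longitude: float
    collected_at: str    # ISO 8601 timestamp in UTC
    trait_terms: tuple = ()  # identifiers from a shared vocabulary (placeholders here)

    def __post_init__(self):
        # Reject impossible coordinates at capture time rather than downstream.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")


def capture(sample_id, species, latitude, longitude, trait_terms=(), when=None):
    """Build a record at collection time; `when` defaults to the current UTC time."""
    when = when or datetime.now(timezone.utc)
    return FieldSample(sample_id, species, latitude, longitude,
                       when.isoformat(), tuple(trait_terms))


def to_json(sample):
    """Serialize with sorted keys so that records are easy to compare and exchange."""
    return json.dumps(asdict(sample), sort_keys=True)


# Hypothetical sample; the identifier, coordinates, and term are made up.
record = capture("S-0001", "Zea mays", 42.03, -93.63,
                 trait_terms=["TERM:0000001"],
                 when=datetime(2008, 1, 15, 9, 30, tzinfo=timezone.utc))
```

Validating coordinates and stamping the time when the record is created reflects the point made above that delay between collection and recording tends to introduce omissions and inaccuracies.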

Exchange
There are many technical challenges to data exchange. For certain key components, the solution came from adopting an effective community standard after alternative methods had been developed and tested. Widely distributed data are only accessible if there are transport mechanisms in place that use standard protocols for retrieving information. The success of the Internet is based upon the universal acceptance of a single transport mechanism as the standard: the Transmission Control Protocol and Internet Protocol (TCP/IP). Similarly, standardization upon common query languages such as Structured Query Language (SQL) has made efficient query of immense data sets possible. Convergence on shared data compression techniques has provided the speed and performance needed to efficiently retrieve voluminous data sets from remote locations. Common syntaxes, such as Extensible Markup Language (XML), allow basic syntactic parsing of the data that are collected. The outstanding challenge is to interpret the data semantically, which requires the development of a shared descriptive language to enable the data integration needed for plant genome research. All stakeholders need to be involved in the development of these standards. Standards cannot be dictated by the developers, but they have to evolve based on real-world usage. In their development, the standards need to be applied to actual data sets being generated by researchers.
Data Management
Databases permeate the plant genome community. They have three important roles: for information management and analyses, as community resources, and as archives. Databases are invaluable for laboratory information management and experimental analysis. High-throughput experiments and geographically distributed projects would be difficult without a database component for managing the data that are generated.
Databases also provide a centralized community resource (for example, TAIR or Gramene databases). Laboratory databases need to have a clearly articulated description of how and when the transfer of data into the community repositories will be achieved. Archival databases are distinguished from “in use” databases in that the latter use the most accurate information currently available to draw inferences for building experimental hypotheses. Archival databases typically include all the data as a historical repository to understand the path to the current best available data. Hence, the most useful community databases, such as TAIR and Gramene, clearly indicate how they distinguish between archival and current data.
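The distinction between archival and “in use” data can be illustrated with a minimal sketch. It is not a description of how TAIR, Gramene, or any other database is implemented; the `AnnotationStore` class, its methods, and the feature identifier `gene_0001` are hypothetical. The store keeps every version of an annotation (the archival view) and exposes the most recent one separately (the current view).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationVersion:
    feature_id: str
    version: int
    description: str


class AnnotationStore:
    """Keeps the full history of each annotation and exposes the latest version."""

    def __init__(self):
        self._history = {}  # feature_id -> list of versions, oldest first

    def add(self, feature_id, description):
        """Append a new version; earlier versions are never overwritten."""
        versions = self._history.setdefault(feature_id, [])
        record = AnnotationVersion(feature_id, len(versions) + 1, description)
        versions.append(record)
        return record

    def current(self, feature_id):
        """The 'in use' view: the best available annotation right now."""
        return self._history[feature_id][-1]

    def archive(self, feature_id):
        """The archival view: every version, in the order it was recorded."""
        return tuple(self._history[feature_id])


# Hypothetical feature whose annotation is refined over time.
store = AnnotationStore()
store.add("gene_0001", "predicted protein-coding gene")
store.add("gene_0001", "kinase, supported by expression data")
```

Here `store.current("gene_0001")` returns only the second version, while `store.archive("gene_0001")` returns both, which is the path to the current best available data that an archive is meant to preserve.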

Organizational Requirements and Involving the Community

To ensure adoption, it is essential that plant biologists direct the building of the bioinformatics resources, tools, and standards. Without such involvement even the most computationally sophisticated tools will not likely be used. Successful implementation of informatics technology for plant genomics would require the technology to be:

• Sustainable. A mechanism for maintaining a useful knowledge environment over time is necessary.
• Adaptable. The informatics resource needs to continuously absorb and incorporate new knowledge in its subject domain.
• Interoperable. The components of the system have to be easily integrated with each other, so that new knowledge can be gained by comparative studies across multiple data forms and levels of biological organization.
• Accessible. Information and annotations need to be interpretable by a wide variety of plant scientists, breeders, and students with basic informatics training, not just genomic scientists.
• Evolvable. The informatics resource needs to quickly adapt to the requirements of the plant genomics research community and these requirements would be the prime selection pressure on the technology.

The large amount of information housed in genomics databases and the expected explosion in data generate pressures on the organization of research to effectively mine that data. The plant research community will have to place greater emphasis on integrating bioinformatics approaches into its work. The committee proposes a national strategy for bioinformatics that includes training, collaboration with large data centers, and bioinformatics-oriented research, such as the creation of specialized databases, new analytic tools, and semantic standards.

Goals for Informatics (Recommendation 7)

5-year goals

• Develop and adopt open semantic standards and data exchange specifications for the plant genomics community.
• Develop software utilities for using these standards to describe data as they are collected initially in individual laboratories, including the use of widely understood terms and language to promote accessibility.
• Develop an architectural infrastructure so that each project could explain how it fits into that architecture.

• Develop new informatics tools for the integration of massive data sets of diverse data types, including genomes, pathways, phenotypes, physiology, and environmental context.

20-year achievements

• Plant biologists will have a comprehensive knowledge environment for plant genomics that comprises highly detailed biological information across the breadth of entire genomes.
• Data-capturing tools that are fully integrated with the knowledge generation by the researchers who collect the original data will be available; the tools will have a seamless operation from data generation to data management.

EDUCATION AND OUTREACH

Education

RECOMMENDATION 8: Improve the recruitment of the best broadly trained scientists into plant sciences.

In the last decade, public awareness of the dramatic advances in biology has focused primarily on medical applications and controversies. The high profile of the Human Genome Project and explorations of medically relevant questions have overshadowed progress in plant genomics, at least in the eye of the public. Attracting new and diverse scientists to plant genomics will likely necessitate reaching out to students who might not have considered plant biology or genomics, or indeed biology at all, when they commenced their training.

Converging challenges (environmental degradation and resource limitations, climate change, increasing demand for food, and demand for renewable energy sources) will require solutions that are based both in conservation and technology. The latter will require increased numbers of appropriately trained scientists over the next 20 years. While all STEM disciplines will provide some of the solutions to these global problems, plant science has a leading role to play.

A well-known catalyst to attract highly skilled scientists to important and emerging research areas is funding, particularly at early career points. Graduate-level and postdoctoral fellowships are among the best mechanisms to recruit talented scientists into a field. It follows, therefore, that targeted Ph.D. and postdoctoral fellowship programs, within the context of NPGI, will infuse the plant science community with more and better young scientists to address key societal problems.

Given that plant genomic analysis is becoming increasingly interdisciplinary, new scientists should be given incentives to enter the field from computational disciplines, such as computer science or statistics. Similarly, as genome sequencing expands to capture diversity and natural selection, more students with backgrounds in ecology may also be drawn to plant genomics. International cooperation will likely continue to play a significant role in translational genomics, given the expanding repertoire of plant species for which genome sequence and genomic tools are available, coupled with the increasingly global nature of scientific inquiry, agricultural markets, and global concerns about sustainability.

Despite the large number and expansive scope of education and outreach initiatives, metrics to measure their impact and success against their goals are essentially lacking. Although there is a pervasive expectation among reviewers and panelists that grant recipients would engage in some form of outreach activity, there is little expectation that the efficacy of that outreach will be formally assessed. It is essential that funding agencies institute rigorous reporting requirements, mechanisms to track the longer-term career paths of trainees, and standardized metrics to assess outreach initiatives, so that they can evaluate whether the investments were well made.

NPGI should be a leader in education of interdisciplinary scientists. The recommended future directions for plant genomics, outlined above, will require a cadre of students who bring broad skills and knowledge to a wide-ranging set of problems.
In particular, the next generation of plant genomics practitioners needs to be adept at computational and systems-level approaches, and at least comfortable with modern scripting languages. As the nature of research and the questions posed throughout the discipline shift dramatically to include data sets requiring sophisticated statistical analysis, future students will need the confidence to approach biological phenomena quantitatively (Bialek and Botstein 2004).

At the graduate and postdoctoral levels, the need for computational expertise currently often entails ad hoc arrangements to bring students trained in one of the disciplines “up to speed” in the other on a need-to-know basis, although methodological and cultural differences between the fields of biology and computer science pose significant challenges (Zauhar 2001). A variety of curricular models have evolved to bridge gaps in communication between biologists and computer scientists (Dyer and LeBlanc 2002; Gerstein et al. 2007; Pevzner 2004), and there has been rapid growth in the number of graduate and even undergraduate-level programs in computational biology and bioinformatics (Zauhar 2001). Centers and departments devoted to systems biology represent perhaps the most high-profile approach to integrating computer modeling, large-scale data analysis, and empirical biological research (Check 2003; Ideker 2004). At least in the short term, however, those programs are unlikely to meet the dramatically rising demand for scientists with interdisciplinary expertise that includes bioinformatics (Zauhar 2001).

The NRC report Rising Above the Gathering Storm (NRC 2007b) recommended increasing the number of U.S. citizens pursuing graduate study in “areas of national need” by funding 5,000 new graduate fellowships each year. NPGI should build mechanisms to ensure that the number of graduate and undergraduate students with rigorous training in both biological and quantitative approaches to plant genomics is sufficient to support a thriving research and development job environment in both the public and private sectors. By leading with new opportunities for graduate support in bioinformatics and computational biology within the context of plant genomics, NPGI could bolster the image of plant science as an exciting alternative to the biomedical fields for ambitious and creative students.

Students trained in engineering and computational sciences might represent an untapped resource whose skills and inclinations could make them valuable contributors to plant genomics. In particular, engineers are familiar with systems that behave imperfectly, and their systems-level perspective has already enabled important strides in modeling biological regulatory circuitry (Wiley et al. 2003). Collaborative relationships with faculty in engineering could lead to unique training opportunities. NPGI researchers interested in establishing connections with colleagues in engineering might consider the emerging field of synthetic biology (Endy 2005) as a possible example of common ground.

As a growing number of Ph.D. umbrella programs require a course in genomics and bioinformatics of all their students, it is incumbent upon NPGI-funded faculty members to insist on standards in their institutions’ training programs that will meet their research needs.
NPGI-funded PIs could be encouraged to offer modules or other shared teaching formats in these courses.

Eventually, most incoming graduate students will be able to fulfill requirements in bioinformatics through their undergraduate education, and important inroads have already been made in developing bioinformatics curricular materials suitable for even introductory-level biology courses (Honts 2003; Campbell and Heyer 2007). However, until those fields trickle down to become standard course offerings at the undergraduate level, graduate programs will need to provide them to the incoming students. Summer internships are one path by which interested undergraduates can become acquainted with, and gain proficiency in, bioinformatics and computational biology. Table 3-2 lists several such programs.

TABLE 3-2 Representative REU Programs and Summer Internships for Undergraduates in Computational Biology, Bioinformatics, and Systems Biology

Field                                                 Institution                                   Website
Computational and systems biology                     Iowa State University                         http://www.bioinformatics.iastate.edu/BBSI
Computational and systems biology                     Massachusetts Institute of Technology         http://csbi.mit.edu/website/outreach_programs/summerintern
Computational biology                                 University of Connecticut Health Center       http://www.nrcam.ucnh.edu/news/positions.html#intern
Bioinformatics/computational biology                  University of Maryland Baltimore County       http://www.umbc.edu/SPCB/
Bioinformatics/genome science                         University of Southern California             http://cegs.cmb.usc.edu/academics/bigs/BIGS.html
Computational biology/bioengineering/bioinformatics   University of Pittsburgh                      http://www.ccbb.pitt.edu/BBSI/index.htm
Bioinformatics/bioengineering                         Virginia Commonwealth University              http://www.vcu.edu/csbc/bbsi/
Bioinformatics and computational biology              Cold Spring Harbor Laboratory                 http://www.cshl.edu/URP/nsf~reu
Computational genomics                                Kansas State University                       http://www.kddresearch.org/REU/Summer-2003/announcement.html
Fungal genomics and computational biology             University of Georgia                         http://www.genetics.uga.edu/undergrad_fgcb.html
Bioinformatics                                        Loyola University, Chicago                    http://reu.cs.luc.edu
Systems biology                                       Harvard                                       http://sysbio.harvard.edu/csb/jobs/undergraduate.html
Bioinformatics                                        California State University, Los Angeles      http://instructional1.calstatela.edu/jmomand2/
Genomics/bioinformatics                               J. Craig Venter Institute                     http://www.jcvi.org/education/internship.php
Bioinformatics                                        Greater Philadelphia Bioinformatics Alliance  http://www.gpba-bio.com/educ_internships.asp

The NRC report BIO 2010 advocates encouraging all students to pursue independent research as early as possible in their career (NRC 2003). These research experiences reinforce, clarify, or increase students’ interest in postgraduate education (Lopatto 2004; Seymour et al. 2004) and can result in enhanced confidence in attributes related to “thinking and working like a scientist,” gains in communication and practical skills, and enhanced preparation for graduate school (Seymour et al. 2004).
Students also acquire realistic insights into the process of scientific inquiry (Gafney 2001). However, undergraduate research experiences do not appear to attract significant numbers of previously uninterested students to a career that requires a postgraduate degree (Hunter et al. 2006; Lopatto 2004; Seymour et al. 2004). One important caveat pertains to programs that recruit first-year students from underrepresented groups: Such programs might indeed stimulate a student’s interest in graduate school, because these students are among those least likely to have had exposure to the idea of graduate school as an option (Seymour et al. 2004). NPGI could promote, and then carefully monitor over time, the expansion of undergraduate research opportunities that result in an expanded and diverse plant genomics community.

Students and faculty members alike cite the importance of dedicated mentoring as a key factor contributing to students’ positive responses to their research experience (Lopatto 2003). The primary mentors for many undergraduate researchers, though, are graduate students and postdoctoral fellows, who may have little or no experience in teaching or mentoring younger scientists, and who could benefit from a recently developed mentor training program (Handelsman et al. 2005) that has been validated for effectiveness at 11 institutions (Pfund et al. 2006).

Introductory laboratory courses that engage students in interdisciplinary investigations in plant sciences and genomics are another avenue to promote student interest in research at a time when their career choices are still relatively fluid. Students might assimilate new information more effectively through inquiry-based, collaborative activities than through traditional classroom learning alone (Wood and Gentile 2003). Inquiry-based pedagogical activities in genomics that address significant, novel questions are under development by single institutions (Washington University 2005), by a consortium of small liberal arts colleges working together with Columbia University’s Genome Sequencing Center (Carleton College 2007), and by JGI.
The Howard Hughes Medical Institute has recently initiated plans for a national genomics research course for undergraduate freshmen. Students from colleges and universities around the country will work collectively on the same research questions, sharing data and results (HHMI 2007). These innovative programs illustrate ways that education can be more fully integrated into NPGI-funded research.

However, it is absolutely vital that already overburdened PIs, or groups of PIs, receive sufficient extra funds, beyond those required to perform their research in an increasingly competitive funding environment, to devote dedicated personnel to these endeavors. For example, NPGI could consider establishing a new category of PIs dedicated to education, as pioneered by the Howard Hughes Medical Institute through its teaching investigators program. Large plant genomics centers should appoint professional education managers as full-time outreach coordinators. Initiatives to be organized at this level could include summer-long funded research internships for community college and high school teachers who wish to develop inquiry-based activities that involve students in the practice of science.

Dissemination of information about educational activities deemed successful, as assessed by rigorous outcomes-based metrics, should also be a higher priority for NPGI. One venue for sharing “what works” is to hold sessions devoted to education at the large professional society meetings. By working more closely with the Botanical Society of America and the National Association of Biology Teachers, NPGI can reach thousands of members who teach at the undergraduate and precollege level.

Plant Genomics, National Competitiveness, and International Collaboration

International partnerships like those described in Chapter 2 provide opportunities for U.S. researchers and students to gain valuable experience in a foreign research setting. In an increasingly global scientific arena, U.S. competitiveness will be enhanced by training a cadre of young scientists who understand the advantages of different research environments, the scope of fundamental issues such as food security, and the challenges to national security posed by agricultural constraints. Grand challenge programs that are truly visionary will likely be international in focus, and will require researchers who can design creative and productive programs that are not limited by a single perspective. In this regard, NPGI could seek collaborative funding opportunities with various foundations that are concerned with global agricultural issues, as well as other traditional international partners.

The groundwork and personal connections that are needed to help structure successful international research programs are often fostered during the formative years of a scientist’s career. Such international research networks harness the creative energy of a young, mobile generation of scientists and the economic power of the emerging economies of Asia (specifically China and India) and of the European Union, the United States, and Australia to provide training, education, and research infrastructure, and to ensure public access to data and information.
These considerations suggest a need to increase the opportunity for international training, particularly for our graduate students.

For all of the adopted education recommendations (see below), NPGI should build robust and peer-reviewed methods for assessment. Furthermore, IWG agencies should require all NPGI PIs to report the previous educational background, citizenship, and subsequent career paths for every individual funded by an NPGI grant. NPGI needs to establish a mechanism to collect these data in a centralized location and a set of quantitative criteria by which goals for training can be articulated and measured against this dataset.

Outreach

RECOMMENDATION 9: Promote outreach on plant genomics and related issues that are critical to educating the American public on the value of genomics-based innovations.

Many research programs include components to reach out beyond the scientific community and emphasize the importance of increasing the public understanding of science. Outreach programs in plant genomics are important because end users (food consumers, breeders, farmers, and others) are more likely to apply or use products and tools of plant genomic research if they understand the value and benefits of those products and tools, and their potential risks.

As with education, NPGI should build robust and peer-reviewed methods for assessment of any adopted outreach recommendations. Because the goals of such activities will determine the metrics to be used to measure success, the goals of the education and outreach activities have to be clearly defined. For example, if one goal of workshops for K-12 teachers and summer internships for high school students is to broaden the targeted populations’ understanding about plant science, genomics, and biotechnology, rigorous surveys of participants’ knowledge before and after each program are necessary to assess the impact of the workshop. Longer-term assessment could include occasional follow-up questionnaires to document the broader impact of participation in the workshops on the science curriculum at the teachers’ home schools. For programs with an explicit focus on plant biotechnology, student attitudes about biotechnology could also be monitored before and after the activity or internship.

Likewise, one common approach to K-12 outreach in science education is a short-term classroom visit or series of visits by a researcher. The visiting researchers might lead a hands-on activity or talk with the students about societal implications of their research. The goals of such visits are to generate enthusiasm among students for science, improve the image of scientists, and promote science literacy (Laursen et al. 2007).
There is little direct evidence that classroom visits achieve those goals. One qualitative assessment of a best-case “scientist in the classroom” program documented some measures of success and several benefits and some potential costs to the graduate students who participate in the program (Laursen et al. 2007).

Scientists who wish to develop an outreach program might not know how to do so effectively and might be unfamiliar with existing resources that could guide them and prevent unnecessary duplication of efforts. In the face of increasing time pressures on principal investigators, graduate students, and postdoctoral fellows, there is little sense in researchers “reinventing the wheel” with respect to outreach and pre-college education (Dolan et al. 2004). Additional support from NPGI for personnel explicitly trained in outreach who help PIs and graduate students to define, achieve, and further their outreach goals, including outreach to extension and breeder groups, is critical for the translation of NPGI science into tangible benefits to society.

As has been observed (Labov 2006), “the kinds of experiences (or lack thereof) in science that students encounter during their K-12 years will have direct consequences on what college-level instructors will be able to accomplish in their own classrooms and teaching laboratories.” By joining forces to expand implementation of an existing program such as the Partnership for Research and Education in Plants (Box 3-2), rather than cobbling together a forced activity lacking a well-considered rationale, NPGI investigators could have a national impact on high school education.

BOX 3-2
The Partnership for Research and Education in Plants

The Partnership for Research and Education in Plants (PREP; Virginia Polytechnic Institute and State University 2006), funded by NSF’s Arabidopsis 2010 Project, NIH, and the American Society of Plant Biologists, allows high school students to contribute to real research projects. PREP has involved over 10,000 students, 54 teachers, and 26 scientists in six states. It is the brainchild of a biology teacher, a plant geneticist, and a faculty-level outreach coordinator working together (Dolan et al. 2004). High school students design experiments to characterize novel Arabidopsis mutants. The students collect and analyze data on growth and development of the plant lines, and report the data in an online notebook that facilitates interactions with the partner researchers (peers and professional scientists). PREP exemplifies at least three of the four principles of instructional design advocated by a recent NRC report on successful laboratory exercises: 1) they are designed with clear learning outcomes in mind; 2) they integrate the learning of science content with learning about the process of science; and 3) they incorporate ongoing student reflection and discussion (NRC 2005b).

Bringing Genomics to the Sustainable, Local, and Organic Agriculture Communities

The communities of small-scale and organic farmers are expanding in both numbers and in economic and political clout, driven by rising consumer demand for sustainable and locally grown food. This market sector is likely to grow, especially if food transportation costs rise dramatically. Philosophical interest in plant genomics among these groups is likely to benefit from clear communication that genomic research is not necessarily tied to deployment of transgenic plants, and from tangible and relevant outcomes in the form of cultivars that are well suited for particular, local, and often low technological input, agricultural niches.

The committee suggests that NPGI investigate creative mechanisms to translate its research into benefits for such growers. An example is the Public Seed Initiative at Cornell (Cornell University 2005), whose focus is on developing, maintaining, and distributing seeds for cultivars of fruits, vegetables, and grains adapted to the needs of organic and local fresh-market growers. By continuing its current focus on applications of genomic tools to marker-assisted selection for diverse crops and traits, NPGI will also provide useful information to this grower community. For example, both large- and small-scale farmers benefit from markers that speed the development of cultivars with enhanced disease resistance.

In addition, NPGI is poised to apply technological advances in metagenomics, metabolomics, and systems biology to characterize the complex interdependences among species that are considered important to various cultivation systems, including organic. For example, identification of microbial population structures and the nature of metabolites found in disease suppressive soil ecosystems will help to guide the development of agronomic practices that reduce the need for pesticide use. One interesting case could be investigation of permacultures. Permacultures have minimal need for fertilization, irrigation, or pesticide usage because ecological processes common to forestry ecosystems or agroecosystems are used to maximize the yield of edible species in perennial agroforestry and polyculture systems (Jacke 2005; Mollison 1988).

To identify specific traits that are needed in new food crop cultivars, NPGI could take steps to engage small-scale farmers and relevant trade groups and to facilitate direct interactions between farmers, breeders, and researchers. This could entail interactions between genome scientists and producers, at extension events, county fairs, and local farmers’ markets, to address genomics applications germane to this arena, and participation from producers to identify what portions of the genomics toolkit are most relevant to them and with respect to what traits.

Since the officially sanctioned organic farming community has banned applications involving human directed recombinant DNA manipulations (for example, genetically modified organisms [GMO]; AMS 2007), even the most well-meaning efforts to create common ground between genomic researchers and organic farmers could be derailed by negative grower or public perceptions that simplistically equate plant genomics with genetically engineered plants and/or proprietary technologies owned by multinational corporations. By fostering a climate of enhanced understanding and promoting connections among researchers, farmers, small-scale seed producers, and nonprofit organizations, NPGI researchers might pave the way for acceptance among small scale growers of a variety of plant genomics technologies.

Examples of technologies that NPGI might seek to further develop and communicate as part of this effort are improved forms of “all-native” or “cisgenic” transformation (discussed earlier in this chapter). These basic approaches need to be made efficient in a variety of species, improved so that mutagenesis during gene transfer is minimized, and made more precise via gene targeting and allele replacement capability. They also need to be publicly accessible (that is, not dominated by private sector patents) so that localized, plant variety- and region-specific use is feasible.

Ethical, Legal, and Social Issues in Plant Genomics

In the Human Genome Project (HGP), ethical, legal, and social issues (ELSI) related to human genomic data and related technologies were recognized to have major impacts on how genomic information would be used in biomedicine. ELSI research at the National Human Genome Research Institute (NHGRI) began in 1990 to understand the social implications of genetic and genomic research. Its orientation has been consciously proactive, in that it seeks to identify “. . . problem areas . . . and solutions . . . before scientific information is integrated into health care practice” (NHGRI 2007). The ELSI program accounts for more than $18 million of the annual $485 million HGP budget.

In contrast, there has been little ELSI-related activity in plant genomics research. Despite initial plans to the contrary (NSTC 1998) and reemphasis (NSTC 2000), only a narrower objective was retained under broader impacts of the NPGI, stating that “research is needed to identify methods for more effective communication with the general public” (NSTC 2003). To date, there has not been significant collaborative engagement with social scientists to conduct scholarly research on the causes and resolution of ELSI issues related to plant genomics (NSTC 1999, 2000, 2001, 2003, 2004, 2005, 2006, 2007). The lack of attention to ELSI programs from the NPGI is surprising in that it comes amidst growing controversies in the agricultural, forestry, and energy sectors about genetic technologies, particularly transgenic approaches, and in a political climate where public skepticism regarding the economics associated with government-subsidized ethanol production is becoming ever more important.

Because of attendant costs and social controversies, issues that could be addressed via ELSI research have effectively removed GMO tools for translation of genomic knowledge into useful products from all but the largest commodity crops and the largest agricultural companies, and in only a subset of countries. GMOs have served as a focal point for analysis of a large number of ELSI issues that are growing in significance for agriculture (Serageldin 1999); these may logically spread to encompass all of genomics-enabled breeding in the future.

The limited attention to ELSI issues by NPGI may have impacted public perception of plant genomics and associated biotechnologies. In the acrimonious GMO debate, most of the NPGI-funded genomics research community has been conspicuously quiet, even when the debate concerns substantive genomics issues. This may have helped to create space for those with strong political views, but weak knowledge of plant science, to dominate the social discourse (Vasil 2003), promoting confusion on the part of the media and public. The plant genomics community could provide context for understanding the impacts of GMOs, in comparison to the effects of accepted practices of breeding and domestication, on plant genomes. Because fundamental advances in knowledge of plant genomes are likely to empower increasingly novel, innovative uses of genomic information, the opportunity cost to society from its limited ability to use transgenic approaches is likely to grow rapidly.

Outreach on ELSI topics is an issue that the NPGI needs to confront. A next generation of teachers and scientists who are trained in both plant genomics and ELSI issues could contribute to resolution of genomics-related social issues, and thus play a valuable role in guiding the development of scientifically sound regulations. The outcome could have profound consequences for deployment of the products of plant genomics, and on laws that govern international trade. Potential ELSI issues of interest to NPGI include those listed in Box 3-3.

Goals for Education and Outreach (Recommendations 8 and 9)

5-year goals

• Develop evidence-based metrics to assess educational and outreach programs.
• Enhance opportunities for graduate and undergraduate students to become proficient in the theory as well as the practice of computational biology and bioinformatics, through graduate fellowships and undergraduate research experiences.
• Establish interdisciplinary graduate and postdoctoral fellowships in plant genomics with an option for international collaborations. This could be modeled on the success of the Arabidopsis 2010 Project: International Research Experience for Graduate Students and Postdoctoral Fellows that supports exchanges between U.S. and German laboratories.
• Stimulate undergraduate student interest in plant genomics, especially among populations of students who might be less aware of research career opportunities, through expanded research opportunities with trained mentors and through integrated inquiry-based activities in undergraduate and precollege courses.
• Develop well-designed educational activities that draw on the latest learning theory research and devise mechanisms for educators to share these initiatives, by creating a new class of PIs dedicated to education and by funding professional education managers to coordinate outreach activities.
• Expand the Plant Genome Research Outreach Portal (PGROP) to include a comprehensive collection of existing outreach programs, with evaluative information, and links to assessment tools.

BOX 3-3
Ethical, Legal, and Social Issues Associated with Plant Genomics
• Biological and social benefits and risks of plant modifications derived from genomics that use recombinant DNA, such as genetic engineering or GMO approaches.
• Sustainability and biodiversity in the broad biological and social senses, and the extent to which genomics-aided breeding can aggravate or mitigate these concerns.
• The appropriate role for formal social controls on intellectual property protection and regulation as related to income distribution, business development, and social cost and benefit tradeoffs.
• The increasing controls over international germplasm movement and attendant concerns about biopiracy. These regulations may seriously hinder genomics-based plant breeding and research progress in both the developed and developing worlds.
• The extent and control of unintentional contamination of germplasm. Critical translational research can be hampered, against a backdrop of stringent social intolerance, by dispersal and adventitious presence of foreign genes due to inadvertent pollen, seed, and vegetative dispersal from exotic genotypes, species, and transgenes.
• Public education and outreach about the goals and rationale for genomics and related biotechnology research. Broad social approval of plant genomics deployment will be dependent on judgments of social and personal benefit in comparison to risk (Hossain et al. 2003).

• NPGI PIs should try to forge connections with engineers and computational scientists, with the goal of attracting students in these fields to plant genomics at the graduate level.
• NPGI PIs should encourage changes in the undergraduate curriculum at their own institutions and participate in the reformation. PIs should also be encouraged to participate in similar reforms in their institutional Ph.D. programs in genomics so that two courses in statistics and competence in a modern scripting language become standard requirements for advanced degrees.
• Establish mechanisms to engage sustainable, organic, and small-scale farmers in identification of specific traits for which applications of genomic tools could lead to usable varieties with enhanced performance characteristics.

10-year goal
• Expand training in ethical, legal, and social issues pertaining to plant genomics for K-12 and undergraduate students and teachers, and for NPGI predoctoral and postdoctoral stipend recipients.

20-year achievements
• Integration of research and education in plant genomics will rival that of biomedical genomics in creativity, in public profile, and in the ability to attract new students.
• The plant genomics community will provide leadership in contributions toward public outreach on ELSI issues, including engagement in development of science-based regulatory policies at national and international levels, by NPGI-funded programs and NPGI-trained students and postdoctoral associates.
