THE NEW SCIENCE OF METAGENOMICS

Revealing the Secrets of Our Microbial Planet

Committee on Metagenomics: Challenges and Functional Applications

Board on Life Sciences

Division on Earth and Life Studies

NATIONAL RESEARCH COUNCIL OF THE NATIONAL ACADEMIES

THE NATIONAL ACADEMIES PRESS

Washington, DC
www.nap.edu



THE NATIONAL ACADEMIES PRESS 500 Fifth Street, NW Washington, DC 20001

NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.

This study was supported by Contract MCB-0544539 between the National Academy of Sciences and the National Science Foundation (NSF), Contract N01-OD-4-2139 between the National Academy of Sciences and the Department of Health and Human Services, National Institutes of Health (NIH), and Contract DE-AT01-05ER64072 between the National Academy of Sciences and the Department of Energy (DOE). The content of this publication does not necessarily reflect the views or policies of NIH, NSF, or DOE, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government.

International Standard Book Number-13: 978-0-309-10676-4
International Standard Book Number-10: 0-309-10676-1

Cover: Design by Francesca Moghari; artwork by Nicolle Rager Fuller (www.sayo-art.com).

Additional copies of this report are available from the National Academies Press, 500 Fifth Street, NW, Lockbox 285, Washington, DC 20055; (800) 624-6242 or (202) 334-3313 (in the Washington metropolitan area); Internet, http://www.nap.edu.

Copyright 2007 by the National Academy of Sciences. All rights reserved.

Printed in the United States of America

The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Ralph J. Cicerone is president of the National Academy of Sciences.

The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. Wm. A. Wulf is president of the National Academy of Engineering.

The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Harvey V. Fineberg is president of the Institute of Medicine.

The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy’s purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Ralph J. Cicerone and Dr. Wm. A. Wulf are chair and vice chair, respectively, of the National Research Council.

www.national-academies.org

COMMITTEE ON METAGENOMICS: CHALLENGES AND FUNCTIONAL APPLICATIONS

JO HANDELSMAN (Cochair), University of Wisconsin, Madison
JAMES TIEDJE (Cochair), Michigan State University, East Lansing
LISA ALVAREZ-COHEN, University of California, Berkeley
MICHAEL ASHBURNER, University of Cambridge, United Kingdom
ISAAC K. O. CANN, University of Illinois, Urbana-Champaign
EDWARD F. DeLONG, Massachusetts Institute of Technology, Cambridge
W. FORD DOOLITTLE, Dalhousie University, Halifax, Nova Scotia, Canada
CLAIRE M. FRASER-LIGGETT, University of Maryland School of Medicine, Baltimore
ADAM GODZIK, Burnham Institute for Medical Research, La Jolla, CA
JEFFREY I. GORDON, Washington University School of Medicine, St. Louis, MO
MARGARET RILEY, University of Massachusetts, Amherst
MOLLY B. SCHMID, Keck Graduate Institute, Claremont, CA

Staff

ANN H. REID, Study Director
FRANCES E. SHARPLES, Director, Board on Life Sciences
ANNE F. JURKOWSKI, Senior Program Assistant
MERC FOX, Program Assistant
NORMAN GROSSBLATT, Senior Editor

BOARD ON LIFE SCIENCES

KEITH YAMAMOTO (Chair), University of California, San Francisco
ANN M. ARVIN, Stanford University School of Medicine, Stanford, CA
JEFFREY L. BENNETZEN, University of Georgia, Athens
RUTH BERKELMAN, Emory University, Atlanta, GA
DEBORAH BLUM, University of Wisconsin, Madison
R. ALTA CHARO, University of Wisconsin, Madison
JEFFREY L. DANGL, University of North Carolina, Chapel Hill
PAUL R. EHRLICH, Stanford University, Stanford, CA
MARK D. FITZSIMMONS, John D. and Catherine T. MacArthur Foundation, Chicago, IL
JO HANDELSMAN, University of Wisconsin, Madison
ED HARLOW, Harvard Medical School, Boston, MA
KENNETH H. KELLER, University of Minnesota, Minneapolis
RANDALL MURCH, Virginia Polytechnic Institute and State University, Alexandria
GREGORY A. PETSKO, Brandeis University, Waltham, MA
MURIEL E. POSTON, Skidmore College, Saratoga Springs, NY
JAMES REICHMAN, University of California, Santa Barbara
MARC T. TESSIER-LAVIGNE, Genentech, Inc., South San Francisco, CA
JAMES TIEDJE, Michigan State University, East Lansing
TERRY L. YATES, University of New Mexico, Albuquerque

Staff

FRANCES E. SHARPLES, Director
KERRY A. BRENNER, Senior Program Officer
ANN H. REID, Senior Program Officer
MARILEE K. SHELTON-DAVENPORT, Senior Program Officer
EVONNE P. Y. TANG, Senior Program Officer
ROBERT T. YUAN, Senior Program Officer
ADAM P. FAGEN, Program Officer
ANNA FARRAR, Financial Associate
ANNE F. JURKOWSKI, Senior Program Assistant
TOVA JACOBOVITS, Senior Program Assistant
MERC FOX, Program Assistant

Acknowledgments

This report has been reviewed in draft form by persons chosen for their diverse perspectives and technical expertise in accordance with procedures approved by the National Research Council’s Report Review Committee. The purpose of the independent review is to provide candid and critical comments that will assist the institution in making the published report as sound as possible and to ensure that the report meets institutional standards of objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process. We wish to thank the following for their review of the report:

Gary Anderson, Lawrence Berkeley National Laboratory, Berkeley, CA
Jeffrey Dangl, University of North Carolina, Chapel Hill
Julian E. Davies, University of British Columbia, Vancouver, BC, Canada
Jed Fuhrman, University of Southern California, Los Angeles
Dennis Mangan, University of Southern California School of Dentistry, Los Angeles
Victor Markowitz, Lawrence Berkeley National Laboratory, Berkeley, CA
Randall Murch, Virginia Polytechnic Institute and State University, Blacksburg
Norman R. Pace (NAS), University of Colorado, Boulder
David Relman, Stanford University, Stanford, CA
Edward Rubin, Lawrence Berkeley National Laboratory, Berkeley, CA
George Weinstock, Baylor College of Medicine, Houston, TX

Although the reviewers listed above have provided constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations, nor did they see the final draft of the report before its release. The review of the report was overseen by John Wooley, University of California, San Diego. Appointed by the National Research Council, he was responsible for making certain that an independent examination of this report was carried out in accordance with institutional procedures and that all review comments were carefully considered. Responsibility for the final content of the report rests entirely with the author committee and the institution.

The committee benefited from briefings provided by several speakers. At its second meeting, on May 2, 2006, the committee was briefed by: Michael Gray (by telephone), Professor and Department Head, Canada Research Chair in Genomics and Genome Evolution, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada; Mitchell Sogin, Senior Scientist and Director of the Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, The Woods Hole Biological Laboratory, Woods Hole, MA; and Robert Edwards, San Diego State University and Burnham Institute, San Diego, CA.

At its third meeting, on July 27, 2006, the committee was briefed by: David J. Lipman, Director, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Rockville, MD; Rolf Apweiler, Head of Sequence Database Group, European Bioinformatics Institute, Cambridge, UK; Victor Markowitz, Head of Lawrence Berkeley National Lab’s Biological Data Management and Technology Center, Berkeley, CA; Paul Gilna, Executive Director, CAMERA, San Diego, CA; and Amaranth Gupta, Associate Research Scientist, Director Advanced Query Processing Lab, San Diego Supercomputer Center, University of California, San Diego.

The committee extends heartfelt thanks to Ann Reid who served as Study Director for this report. The product reflects both Ann’s attention to our charge and her ability to provoke us into addressing it thoroughly. Her outstanding editing contributed greatly to the clarity and logic of the report. We also thank Anne Jurkowski for her dedication to this report and its authors. Throughout the process, the committee relied on Anne’s administrative prowess and her willingness to do whatever was necessary to get the report done or the committee on track. Anne’s aesthetic intuition and visual acuity shaped the report as well as its derivative materials.

We thank Dr. Patrick Schloss for his assistance in building the metagenomics bibliography and Dr. Luke Moe, Snow Brook Peterson, and Dr. Ainslie Little for helpful discussion and Christina Matta for assuring historical accuracy.

Contents

SUMMARY

1 WHY METAGENOMICS?
What Is Metagenomics?
What Microbes Can Do: Four Examples
Microbes Modulate and Maintain the Atmosphere
Microbes Keep Us Healthy
Microbes Support Plant Growth and Suppress Plant Disease
Microbes Clean Up Fuel Leaks
Invisible Communities: Global Impact
Understanding Microbial Communities
The Limits of Pure Culture
The Genomics Promise
Why Genomics Is Not Enough
Most Microbes Cannot Be Cultured
Microbial Diversity and Variation Have No Limits
Metagenomics Offers a Way Forward
Metagenomics Can Contribute to Advances in Many Fields

2 A NEW LIGHT ON BIOLOGY
What Is a Genome?
What Is a Species?
What Is the Role of Microbes in Maintaining the Health of Their Hosts?
How Diverse Is Life?
How Do Microbial Communities Work?
How Do Microbial Communities React to Change?
How Do Microbes Evolve?
What Ecological and Evolutionary Roles Do Viruses Play?

3 FROM GENOMICS TO METAGENOMICS: FIRST STEPS
Sequencing Is Just One Kind of Metagenomics
Pioneering Projects in Metagenomics
The Acid Mine Drainage Project
The Sargasso Sea Metagenomic Survey and Community Profiling
The Soil-Resistome Project
The Human-Microbiome Project
Viral Metagenomics

4 DESIGNING A SUCCESSFUL METAGENOMICS PROJECT: BEST PRACTICES AND FUTURE NEEDS
Parallels with Traditional Microbial Genome Sequencing
Metagenomics Step by Step
Habitat Selection
Sampling Strategy
Macromolecule Recovery
Getting the Most Out of Metagenomics Studies
16S rRNA-Based Surveys
16S rRNA Phylogenetic and Functional Anchors: A Hybrid Approach
Generation of Large-Scale DNA Sequence
Assembling Whole Genomes
Gene-Centric Analyses
Hybridization- and Array-Based Analyses
Function-Based Analyses of Microbial Communities
Advancing the Field
Sequencing Technology
Gene-Expression Systems
Single-Cell Analyses
Methods for Culturing Uncultured Species
Basic Microbiology
Understanding Microbial Habitats and Collecting Metadata
Downstream Development of Metagenomics

5 DATA MANAGEMENT AND BIOINFORMATICS CHALLENGES OF METAGENOMICS
Genomic Data
Metagenomic Data
The Importance of Metadata
Databases for Metagenomic Data
Software
Analysis of Metagenomic Sequence Data

6 THE INSTITUTIONAL LANDSCAPE FOR METAGENOMICS: NEW SCIENCE, NEW CHALLENGES
Major Stakeholders in Metagenomics
The Scientific Community
Funding Agencies
International Coordination
Education and Training
Other Institutional Issues
Data Release
Intellectual Property
Metagenomics and the Convention on Biological Diversity
Biosafety
Outreach

7 A BALANCED PORTFOLIO: MULTI-SCALE PROJECTS IN THE “GLOBAL METAGENOMICS INITIATIVE”
The Vision
Characteristics of Successful Large-Scale Projects
Why Metagenomics Needs a “Big Science” Component
What Kind of Large-Scale Projects in the Global Metagenomics Initiative and How Many?
Expected Benefits of Large-Scale Metagenomics Projects
Theory and Principles
Understanding Specific Habitats
Technical Advancement of the Field
International Collaboration and Training
Learning from Previous Large-Scale Genomics Projects
The Human Genome Project
The Arabidopsis Genome Project
Lessons for Metagenomics
A Preliminary Road Map
Phase I: Choosing Model Communities
Phase II: Planning and Initial Data-Gathering
Phase III: Implementation
Conclusion

8 RECOMMENDATIONS

9 EPILOGUE

REFERENCES

APPENDIXES
A Statement of Task
B Committee Biographies