Getting Up to Speed: The Future of Supercomputing
The National Academies, 500 Fifth St. N.W., Washington, D.C. 20001. Copyright © National Academy of Sciences.

1
Introduction and Context

Supercomputers are used to solve complex problems, including the simulation and modeling of physical phenomena such as climate change, explosions, and the behavior of molecules; the analysis of data such as national security intelligence, genome sequencing, and astronomical observations; and the intricate design of engineered products. Their use is important for national security and defense, as well as for research and development in many areas of science and engineering. Supercomputers can advance knowledge and generate insight that would not otherwise be possible or that could not be captured in time to be actionable. Supercomputer simulations can augment or replace experimentation in cases where experiments are hazardous, expensive, or even impossible to perform or to instrument; they can even enable virtual experiments with imaginary worlds to test theories beyond the range of observable parameters. Further, supercomputers have the potential to suggest entirely novel experiments that can revolutionize our perspective of the world. They enable faster evaluation of design alternatives, thus improving the quality of engineered products. Most of the technical areas that are important to the well-being of humanity use supercomputing in fundamental and essential ways.

As the uses of computing have increased and broadened, supercomputing has become less dominant than it once was. Many interesting applications require only modest amounts of computing by today's standards. Yet new problems have arisen whose computational demands for scaling and timeliness stress even our current supercomputers. Many of those problems are fundamental to the government's ability to address important national issues. One notable example is the Department of Energy's (DOE's) computational requirements for stockpile stewardship.

The emergence of mainstream solutions to problems that formerly required supercomputing has caused the computer industry, the research and development community, and some government agencies to reduce their attention to supercomputing. Recently, questions have been raised about the best ways for the government to ensure that its supercomputing needs will continue to be satisfied in terms of both capability and cost-effectiveness.

At the joint request of DOE's Office of Science and the Advanced Simulation and Computing[1] (ASC) Program of the National Nuclear Security Administration (NNSA) at DOE, the National Research Council's (NRC's) Computer Science and Telecommunications Board convened the Committee on the Future of Supercomputing to conduct a 2-year study to assess the state of supercomputing in the United States. Specifically, the committee was charged to do the following:

- Examine the characteristics of relevant systems and architecture research in government, industry, and academia and the characteristics of the relevant market.
- Identify key elements of context, such as the history of supercomputing, the erosion of research investment, the needs of government agencies for supercomputing capabilities, and historical or causal factors.
- Examine the changing nature of problems demanding supercomputing (e.g., stockpile stewardship, cryptanalysis, climate modeling, bioinformatics) and the implications for systems design.
- Outline the role of national security in the supercomputer market and the long-term federal interest in supercomputing.
- Deliver an interim report in July 2003 outlining key issues.
- Make recommendations in the final report for government policy to meet future needs.

STUDY CONTEXT

Much has changed since the 1980s, when a variety of agencies invested in developing and using supercomputers. In the 1990s the High Performance Computing and Communications Initiative (HPCCI) was conceived and subsequently evolved into a broader and more diffuse program of computer science research support.[2] Over the last couple of decades, the government sponsored numerous studies dealing with supercomputing and its role in science and engineering research.[3]

Following the guidelines of the Report of the Panel on Large Scale Computing in Science and Engineering (the Lax report),[4] the National Science Foundation (NSF) established three supercomputer centers and one advanced prototype in 1985 and another center in 1986. Major projects on innovative supercomputing systems were funded; examples include the Caltech Cosmic Cube, the New York University (NYU) Ultracomputer, and the Illinois Cedar project. The other recommendations of the report (to increase research in the disciplines needed for an effective and efficient use of supercomputers and to increase training of people in scientific computing) had only a modest effect.

Following the renewal of four of the five NSF supercomputer centers in 1990, the National Science Board (NSB) commissioned the NSF Blue Ribbon Panel on High Performance Computing to investigate future changes in the overall scientific environment due to rapid advances in computers and scientific computing.[5] The panel's report, From Desktop to Teraflop: Exploiting the U.S. Lead in High Performance Computing (the Branscomb report), recommended a significant expansion in NSF investments, including accelerating progress in high-performance computing through computer science and computational science research. The impact of these recommendations on funding was small.

In 1991 Congress passed the High Performance Computing Act (P.L. 102-194),[6] which called for the President to establish a national program to set goals for federal high-performance computing research and development in hardware and software and to provide for interagency cooperation.

Notes:
[1] ASC was formerly known as the Accelerated Strategic Computing Initiative (ASCI). This report uses ASC to refer collectively to these programs.
[2] The proliferation of PCs and the rise of the Internet commanded attention and resources, diverting attention and effort from research in high-end computing. There were, however, efforts into the 1990s to support high-performance computing. See, for example, NSF, 1993, From Desktop to Teraflop: Exploiting the U.S. Lead in High Performance Computing, NSF Blue Ribbon Panel on High Performance Computing, Arlington, Va.: NSF, August.
[3] The committee's interim report provides a more detailed summary of several key reports.
[4] National Science Board. 1982. Report of the Panel on Large Scale Computing in Science and Engineering. Washington, D.C., December 26 (the Lax report).
[5] NSF. 1993. From Desktop to Teraflop: Exploiting the U.S. Lead in High Performance Computing. NSF Blue Ribbon Panel on High Performance Computing. Arlington, Va.: NSF, August.
[6] Bill summary and status are available online at <http://thomas.loc.gov/cgi-bin/bdquery/R?d102:FLD002:@1(102+194)>.

NSF formed a task force in 1995 to advise it on the review and management of the supercomputer centers program. The chief finding of the Report of the Task Force on the Future of the NSF Supercomputer Centers Program (the Hayes report)[7] was that the Advanced Scientific Computing Centers funded by NSF had enabled important research in computational science and engineering and had also changed the way that computational science and engineering contribute to advances in fundamental research across many areas. The task force recommended continuing to maintain a strong Advanced Scientific Computing Centers program.

Congress asked the NRC's Computer Science and Telecommunications Board (CSTB) to examine the HPCCI.[8] CSTB's 1995 report Evolving the High Performance Computing and Communications Initiative to Support the Nation's Infrastructure (the Brooks/Sutherland report)[9] recommended the continuation of the HPCCI, funding of a strong experimental research program in software and algorithms for parallel computing machines, and HPCCI support for precompetitive research in computer architecture.

In 1997, following the guidelines of the Hayes report, NSF established two Partnerships for Advanced Computational Infrastructure (PACIs), one with the San Diego Supercomputer Center as a leading-edge site and the other with the National Center for Supercomputing Applications as a leading-edge site. Each partnership includes participants from other academic, industry, and government sites. The PACI program ended on September 30, 2004. The reports did not lead to increased funding, and no major new projects resulted from the recommendations of the Brooks/Sutherland report.

In 1999, the President's Information Technology Advisory Committee's (PITAC's) report to the President, Information Technology Research: Investing in Our Future (the PITAC report), made recommendations similar to those of the Lax and Branscomb reports.[10] PITAC found that federal information technology R&D was too heavily focused on near-term problems and that investment was inadequate. The committee's main recommendation was to create a strategic initiative to support long-term research in fundamental issues in computing, information, and communications. In response to this recommendation, NSF developed the Information Technology Research (ITR) program. This program, which was only partly successful in meeting the needs identified by PITAC, is now being phased out.

In 2000, concern about the diminishing U.S. ability to meet national security needs led to a recommendation by the Defense Science Board that DoD continue to subsidize a Cray computer development program as well as invest in relevant long-term research.[11] The Defense Advanced Research Projects Agency (DARPA) launched the High Productivity Computing Systems (HPCS) program in 2002 to provide a new generation of economically viable, high-productivity computing systems for the national security and industrial user community in 2007-2010. The goal is to address the gap between the capability needed to meet mission requirements and the current offerings of the commercial marketplace. HPCS has three phases: (1) an industrial concept phase (now completed), in which Cray, Silicon Graphics, Inc. (SGI), IBM, Hewlett-Packard, and Sun participated; (2) an R&D phase, awarded to Sun, Cray, and IBM in July 2003 and lasting until 2006; and (3) full-scale development, to be completed by 2010, ideally by the two best proposals from the second phase.

In summary, while successive reports have emphasized the importance of increased investments in supercomputing and of long-term, strategic research, investments in supercomputing seem not to have grown, and the focus has stayed on short-term research, one generation ahead of products. Research on the base technologies used for supercomputing (architecture, programming languages, compilers, operating systems, etc.) has been insufficient.

Notes:
[7] NSF. 1995. Report of the Task Force on the Future of the NSF Supercomputer Centers Program. September 15.
[8] HPCCI was formally created when Congress passed the High-Performance Computing Act of 1991 (P.L. 102-194), which authorized a 5-year program in high-performance computing and communications. The goal of the HPCCI was to "accelerate the development of future generations of high-performance computers and networks and the use of these resources in the federal government and throughout the American economy" (Federal Coordinating Council for Science, Engineering, and Technology (FCCSET), 1992, Grand Challenges: High-Performance Computing and Communications, FY 1992 U.S. Research and Development Program, Office of Science and Technology Policy, Washington, D.C.). The initiative broadened from four primary agencies addressing grand challenges such as forecasting severe weather events and aerospace design research to more than 10 agencies addressing national challenges such as electronic commerce and health care.
[9] NRC. 1995. Evolving the High Performance Computing and Communications Initiative to Support the Nation's Infrastructure. Washington, D.C.: National Academy Press.
[10] PITAC. 1999. Report to the President. Information Technology Research: Investing in Our Future. February.
[11] Defense Science Board. 2000. Report of the Defense Science Board Task Force on DoD Supercomputing Needs. Washington, D.C.: Office of the Under Secretary of Defense for Acquisition and Technology. October 11.

Computenik

In the spring of 2002 the Japanese installed the Earth Simulator (ES), a supercomputer to be used for geosciences applications. For over 2 years,
the TOP500 list[12][13] has ranked it as the fastest performing supercomputer in the world. The ES was designed to use custom multiprocessor vector-based nodes and to provide good support for applications written in High Performance Fortran, technologies that were all but abandoned in the United States in favor of commodity scalar processors and message-passing libraries. The emergence of that system has fueled recent concerns about continued U.S. leadership in supercomputing. Experts have asserted that the Earth Simulator was made possible through long-term, sustained investment by the Japanese government. The U.S. Congress and several government agencies began to question what should be done to regain the supercomputing lead. While some experts have argued that maintaining an absolute lead in supercomputing (as measured by the TOP500 list) should not be an overriding U.S. policy objective, the Earth Simulator nonetheless offers important lessons about investment in, management of, and policy toward supercomputing.

The Defense Appropriations Bill for FY 2002 directed the Secretary of Defense to submit a development and acquisition plan for a comprehensive, long-range, integrated, high-end computing (IHEC) program. The resulting report, High Performance Computing for the National Security Community,[14] released in the spring of 2003 and known as the IHEC report, recommends an applied research program to focus on developing the fundamental concepts in high-end computing and creating a pipeline of new ideas and graduate-level expertise for employment in industry and the national security community. The report also emphasizes the importance of high-end computing laboratories that will test system software on dedicated large-scale platforms; support the development of software tools and algorithms; develop and advance benchmarking, modeling, and simulation for system architectures; and conduct detailed technical requirements analysis. The report suggests $390 million per year as the steady-state budget for this program. The program plan consolidates existing DARPA, DOE/NNSA, and National Security Agency (NSA) R&D programs and features a joint program office with Director of Defense Research and Engineering (DDR&E) oversight.

At the request of Congress, DOE commissioned (in addition to this study) a classified study by the JASONs to identify the distinct requirements of the Stockpile Stewardship Program and its relation to the ASC acquisition strategy. Roy Schwitters, the study leader, said that the report, released in 2003, concluded that "distinct technical requirements place valid computing demands on ASC that exceed present and planned computing capacity and capability."[15]

The 2003 Scales report, A Science-Based Case for Large-Scale Simulation,[16] presents a science-based case for balanced investment in numerous areas, such as algorithms, software, innovative architecture, and people, to ensure that the United States benefits from advances enabled by computational simulations.

The High-End Computing Revitalization Task Force (HECRTF) of the National Coordination Office for Information Technology Research and Development (NITRD) was chartered under the National Science and Technology Council to develop a plan and a 5-year roadmap to guide federal investments in high-end computing starting with FY 2005. The report, Federal Plan for High-End Computing,[17] released in May 2004, noted that the 1990s approach of building systems based on commercial off-the-shelf (COTS) components may not be suitable for many applications of national importance. It recommends research in alternative technologies to ensure U.S. leadership in supercomputing. The report also calls for an interagency collaborative approach.

Notes:
[12] The TOP500 project was started in 1993 to provide a reliable basis for tracking and detecting trends in high-performance computing. Twice a year, a list of the sites operating the 500 most powerful computer systems is assembled and released. The best performance on the Linpack benchmark is used for ranking the computer systems. The list contains a variety of information, including the system specifications and major application areas (see <http://www.top500.org> for details).
[13] When Jack Dongarra, one of the people who maintain the TOP500 list (an authoritative source of the world's 500 most powerful supercomputers), announced that the Earth Simulator was the world's fastest supercomputer, the New York Times quoted him as saying, "In some sense we have a Computenik on our hands" (John Markoff, 2002, "Japanese Computer Is World's Fastest, as U.S. Falls Back," The New York Times, April 20, pages A1 and C14).
[14] Available online at <http://www.hpcc.gov/hecrtf-outreach/bibliography/200302_hec.pdf>.
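The ranking procedure described in the TOP500 note above (each system reports its best measured Linpack performance, and systems are sorted by that number to keep the top 500) can be sketched in a few lines. This is an illustrative sketch only: the system names, sites, and performance figures below are invented, not actual list entries.

```python
# Sketch of how a TOP500-style list is assembled: rank systems by their
# best measured Linpack performance (Rmax), highest first, keeping the top N.
# All names and numbers here are hypothetical, for illustration only.

from dataclasses import dataclass

@dataclass
class System:
    site: str
    name: str
    rmax_gflops: float  # best measured Linpack performance

def top_n(systems: list[System], n: int = 500) -> list[System]:
    """Rank systems by best Linpack performance, highest first."""
    return sorted(systems, key=lambda s: s.rmax_gflops, reverse=True)[:n]

# Illustrative entries (values are made up for the example):
entries = [
    System("Site B", "ClusterX", 14_000.0),
    System("Site A", "VectorMachine", 36_000.0),
    System("Site C", "ClusterY", 7_700.0),
]

for rank, s in enumerate(top_n(entries), start=1):
    print(rank, s.name, s.rmax_gflops)
```

Note that the real list also records system specifications and application areas alongside each entry; only the ranking rule is sketched here.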

The Senate Committee on Energy and Natural Resources held a hearing in June 2004 on the High-End Computing Revitalization Act of 2004 (S. 2176).[18] This bill calls for the Secretary of Energy to implement a research and development program in supercomputing and establish a high-end software development center. On July 8, 2004, the House passed and referred to the Senate Committee on Commerce, Science, and Transportation a similar bill, the Department of Energy High-End Computing Revitalization Act of 2004 (H.R. 4516), which omits the call for the software development center.[19] The Senate passed an amended version of H.R. 4516 on October 11, 2004; the House is expected to consider the legislation in late November 2004.[20] The House also passed and sent to the Senate the High-Performance Computing Revitalization Act of 2004 (H.R. 4218),[21] which amends the High-Performance Computing Act of 1991 and directs the President to establish a program to provide for long-term research on high-performance computing, including the technologies needed to advance the capacity and capabilities of high-performance computing. It also calls for the Director of the Office of Science and Technology Policy to develop and maintain a roadmap for high-performance computing.

ABOUT THE INTERIM REPORT

An interim report was presented in July 2003, approximately 6 months after the start of the study.[22] The report provides a preliminary outline of the state of U.S. supercomputing, the needs of the future, and the factors that will contribute to meeting those needs. The report notes that the United States had the lead, on the June 2003 TOP500 list, in the use and manufacture of supercomputers.[23] However, to meet the security and defense needs of our nation and to realize the opportunities to use supercomputing to advance knowledge, progress in supercomputing must continue. An appropriate balance is needed between investments that evolve current supercomputing architectures and software and investments that exploit alternative approaches that may lead to a paradigm shift. Balance is also needed between exploiting cost-effective advances in widely used hardware and software products and developing custom solutions that meet the most demanding needs. Continuity and stability in the government funding of supercomputing appear to be essential to the well-being of supercomputing in the United States.

Notes:
[15] Presentation to the committee on December 3, 2003.
[16] DOE, Office of Science. 2003. "A Science-Based Case for Large Scale Simulation." Scales Workshop Report, Vol. 1. July. Available online at <http://www.pnl.gov/scales/>.
[17] Available online at <http://www.hpcc.gov/pubs/2004_hecrtf/20040702_hecrtf.pdf>.
[18] See <http://thomas.loc.gov/cgi-bin/query/z?c108:S.2176:>.

ORGANIZATION OF THE REPORT

In this report the committee first examines the requirements of different classes of applications and the architecture, software, algorithm, and cost challenges and trade-offs associated with these application classes. The report addresses not only present-day applications and technology but also the context provided by history, by institutions and communities, and by international involvement in supercomputing. Chapter 2 defines supercomputing. Chapter 3 outlines a brief history of supercomputing. Chapter 4 describes many compelling applications that place extreme computational demands on supercomputing. Chapter 5 discusses the design of algorithms, computing platforms, and software environments that govern the performance of supercomputing applications. The institutions, computing platforms, system software, and the people who solve supercomputing applications can be thought of collectively as an ecosystem; Chapter 6 outlines an approach to supercomputing ecosystem creation and maintenance. Chapter 7 discusses the international dimension of supercomputing. Chapter 8 offers a framework for policy analysis. Chapter 9 describes the role of the government in ensuring that supercomputing appropriate to our needs is available both now and in the future. Chapter 10 contains the committee's conclusions and recommendations for action to advance high-end computing.

Notes:
[19] See <http://thomas.loc.gov/cgi-bin/bdquery/z?d108:HR04516:>.
[20] See <http://thomas.loc.gov/cgi-bin/query/R?r108:FLD001:S61181>.
[21] See <http://thomas.loc.gov/cgi-bin/query/D?c108:3:./temp/~c108qnbgq9::>.
[22] NRC. 2003. The Future of Supercomputing: An Interim Report. Washington, D.C.: The National Academies Press.
[23] Based on the June 2003 TOP500 list at <http://www.top500.org>.