The Potential Impact of High-End Capability Computing on Four Illustrative Fields of Science and Engineering

2 The Potential Impact of HECC in Astrophysics

INTRODUCTION

It is impossible to gaze at the sky on a clear, moonless night and not wonder about our Universe and our place in it. The same questions have been posed by mankind for millennia: When and how did the Universe begin? What are the stars and how do they shine? Are there other worlds in the Universe like Earth? The answers to many of these questions are provided by scientific inquiry in the domains of astronomy and astrophysics.

Astronomy is arguably the oldest of all the sciences. For most of history, observations were made with the naked eye and therefore confined to the visible wavelengths of light. Today, astronomers use a battery of telescopes around the world and on satellites in space to collect data across the entire electromagnetic spectrum—that is, radio, infrared, optical, ultraviolet, X-ray, and gamma-ray radiation. Astrophysics emerged as a discipline in the last century as scientists began to apply known physical laws to interpret the structure, formation, and evolution of the planets, stars, and galaxies, and indeed the Universe itself. Astronomical systems have long challenged our understanding of physics; for example, research in nuclear physics is intimately tied to our desire to understand fusion reactions inside stars. Similarly, some of the most important discoveries made by astronomical observers have resulted from the predictions of theoretical physics, such as the cosmic microwave background radiation that is a signature of the Big Bang.

The rate of discovery in astronomy and astrophysics is rapid and accelerating. It is only within the past 100 years that we have understood the size and shape of our own Milky Way galaxy, and that there are billions of other galaxies like the Milky Way in the Universe.
It is only in the past 50 years that we have realized the Universe had a beginning in what is now called the Big Bang. It is only within the past decade that we have narrowed the uncertainties surrounding the age of the Universe. Some of the most important discoveries made in the past decade are these three:

Dark matter and dark energy. It has long been known that the motions of galaxies in clusters, and their internal rates of rotation, require the existence of dark matter—that is, material in the
Universe that interacts gravitationally but not with electromagnetic radiation. Precise measurements of the expansion history of the Universe, made using distances determined from Type Ia supernovae, have recently indicated the existence of dark energy, which is responsible for an acceleration with time in the expansion rate of the Universe that counteracts the deceleration produced by gravity. In the past year, the Wilkinson Microwave Anisotropy Probe (WMAP), a satellite launched by NASA in 2001, has provided the most accurate observations of the fluctuations in the cosmic microwave background radiation that were first discovered by the Cosmic Background Explorer (COBE), another NASA mission, launched in 1989. These fluctuations are the imprint of structure in the Universe that existed about 300,000 years after the Big Bang, structure that eventually collapsed into the galaxies, stars, and planets we see today. These observations not only challenge astrophysicists to explain how these fluctuations grew and evolved into the structure we see today, but they also provide firm evidence that verifies the reality of both dark matter and dark energy. Discovering the nature of dark matter and the nature of dark energy are perhaps the two most compelling challenges that face astrophysicists today.

Exosolar planets. In the past 10 years, over 200 new planets have been discovered orbiting stars other than the Sun. Most of these planets are dramatically different from those in our solar system. For example, a significant fraction consist of large, Jupiter-like bodies that orbit only a few stellar radii from their host stars. There is no analog to such planets in our own solar system. Indeed, our current theory of planet formation predicts that Jupiter-like planets can form only in the outer regions of planetary systems, at distances several times the Earth-Sun separation.
Thus, the existence of these new planetary systems challenges our understanding of planet formation and the dynamical evolution of planetary systems. As our techniques for detecting exosolar planets (also known as exoplanets) improve, astronomers are finding more and more planets closer in size to and with properties similar to those of Earth. (As of the time of this report, several planets with masses only a few times that of Earth have been found.) Of course, this raises the question of whether any of them harbor life.

Unambiguous evidence for the existence of black holes. It has long been known that there is a compact object at the center of our galaxy responsible for producing powerful nonthermal emission at a variety of wavelengths. One way to delimit the nature of this object is to measure its size and mass: If the inferred mass density is sufficiently high, then it must be a black hole. Using diffraction-limited images in the infrared to measure the motions of stars within 1 light-year of this object over the past 10 years, two research groups have been able to establish that its mass must be at least 3 million solar masses. There is no known object in the Universe, other than a supermassive black hole, that could contain this much mass in such a small volume. Similar observations of the motions of stars and gas near the centers of several other galaxies, notably NGC 4258 (also known as Messier 106), have also provided unambiguous evidence for the existence of supermassive black holes.

The purpose of this chapter is to assess the potential impact of HECC on the progress of research in astrophysics.
Using the discoveries of the past decade, along with sources such as prior decadal surveys of the field and input from outside experts, the committee formulated a list of the major challenges facing astrophysics today, identified the subset of challenges that require computation, and investigated the current state of the art and the future impact of HECC on progress in facing these challenges.
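The enclosed-mass argument sketched above for the Galactic Center can be made quantitative with a one-line estimate: for a star on a roughly circular orbit of radius r and speed v, the enclosed mass is about v^2 r / G. The fiducial orbit below is an illustrative assumption chosen to match the scales quoted in the text, not data from this report.

```python
import math

# Order-of-magnitude enclosed mass from a stellar orbit: M ~ v^2 * r / G.
# The orbit parameters are illustrative assumptions, not report data.
G = 6.674e-8            # gravitational constant, cm^3 g^-1 s^-2
M_SUN = 1.989e33        # solar mass, g
PC = 3.086e18           # parsec, cm

r = 0.01 * PC           # orbital radius ~0.01 pc, well inside 1 light-year
v = 1.0e8               # orbital speed of 1,000 km/s, in cm/s

m_enclosed = v**2 * r / G
print(f"enclosed mass ~ {m_enclosed / M_SUN:.1e} solar masses")
```

With these fiducial values the estimate comes out at a few million solar masses, consistent with the lower limit quoted in the text.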
MAJOR CHALLENGES IN ASTROPHYSICS

Several reports published recently in consultation with the entire astronomy and astrophysics community have identified key questions that confront the discipline in the coming decade. These made the task of identifying the major challenges in astrophysics much simpler for the committee. In particular, the NRC decadal survey of astronomy and astrophysics, Astronomy and Astrophysics in the New Millennium, published in 2001 (often referred to as the McKee-Taylor report); the NRC report Connecting Quarks with the Cosmos: Eleven Science Questions for the New Century, published in 2002 (also called the Turner report); and the National Science and Technology Council (NSTC) report The Physics of the Universe: A Strategic Plan for Federal Research at the Intersection of Physics and Astronomy, published in 2004, were instrumental in developing the list of questions summarized in this section. At its first meeting, the committee heard a presentation from Chris McKee, coauthor of the McKee-Taylor report, and at a later meeting it heard presentations from Tom Abel, Eve Ostriker, Ed Seidel, and Alex Szalay on topics related to the identification of the major challenges and the potential impact of HECC on them. Committee members found themselves in complete agreement with the consistent set of major challenges identified in each of these three published reports.

The challenges take the form of questions that are driving astrophysics and that are compelling because our current state of knowledge appears to make them amenable to attack:

1. What is dark matter?
2. What is the nature of dark energy?
3. How did galaxies, quasars, and supermassive black holes form from the initial conditions in the early Universe observed by WMAP and COBE, and how have they evolved since then?
4. How do stars and planets form, and how do they evolve?
5. What is the mechanism for supernovae and gamma-ray bursts, the most energetic events in the known Universe?
6. Can we predict what the Universe will look like when observed in gravitational waves?

Observations, Experiment, Theory, and Computation in Astrophysics

As in other fields of science, astrophysicists adopt four modes of investigation: observation, experiment, theory, and computation. Astronomy is characterized by its reliance on observation over experimentation, and this clearly affects the information available to astrophysicists. Virtually all that we know about the Universe beyond the solar system comes from electromagnetic radiation detected on Earth and in space. To push the frontiers of our knowledge, astronomers build ever-larger telescopes that operate over ever-wider bands of the electromagnetic spectrum and equip them with more efficient and more sensitive digital detectors and spectrographs. This is leading to an explosion of data in digital form, a point to which we return below.

Experimentation has a long and distinguished history in astrophysics. Although it is of course impossible to build a star in the laboratory and perform experiments on it, it is possible to measure basic physical processes important in stars and other astrophysical systems in the laboratory. For example, the cross sections for nuclear reactions of relevance to astrophysics have been the subject of laboratory measurements for many decades, as have the cross sections for the interaction of astrophysically abundant ions with light. More recently, the construction of high-energy-density laser and plasma fusion devices has enabled experiments on the dynamics of plasmas at the pressures and temperatures relevant to a variety of astrophysical systems.

Theory is primarily concerned with the application of known physical laws to develop a mathematical
model of astrophysical phenomena. The goal is to identify the most important physical effects and to formulate the simplest mathematical model that adequately describes them. For example, the equations of stellar structure, which can be used to describe the evolution of stars, result from combining mathematical models of processes in nuclear physics, gas dynamics, and radiation transfer. Comparing the predictions of such a model to observations allows investigators to develop insight about the details of those component processes and to infer the degree to which those processes contribute to and account for known data on stellar evolution.

Finally, computation has emerged as a powerful means to find solutions to mathematical formulations that are too complex to solve analytically. Computation also allows “numerical experimentation”—that is, systematic exploration of the effects of varying parameters in a mathematical model. This is particularly important in astrophysics, because so many phenomena of interest cannot be replicated in the laboratory. In addition, data analysis and modeling are critical in observational astronomy, where it is important to find interesting and unusual objects out of billions of candidates (for example, the most distant galaxies) or where information can be extracted only from a statistical model of very large data sets (for example, patterns in the distribution of galaxies in the Universe that reveal clues to the nature of dark matter and dark energy).

Computation has a long and distinguished history in astrophysics. The development of many numerical algorithms has been driven by the need to solve problems in astrophysics. Similarly, astrophysicists have relied heavily on HECC since the first electronic computers became available in the 1940s.
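The equations of stellar structure mentioned above are a concrete example of such a mathematical model. For reference, in their time-independent, spherically symmetric form (with $m$ the mass enclosed within radius $r$, $\epsilon$ the nuclear energy generation rate per unit mass, $\kappa$ the opacity, and the last relation assuming radiative energy transport) they read:

```latex
\begin{align}
\frac{dm}{dr} &= 4\pi r^{2}\rho, &&\text{(mass continuity)}\\
\frac{dP}{dr} &= -\frac{Gm\rho}{r^{2}}, &&\text{(hydrostatic equilibrium)}\\
\frac{dL}{dr} &= 4\pi r^{2}\rho\,\epsilon, &&\text{(energy generation)}\\
\frac{dT}{dr} &= -\frac{3\kappa\rho\,L}{64\pi\sigma r^{2}T^{3}}. &&\text{(radiative transport)}
\end{align}
```

Closed with an equation of state $P(\rho,T)$ and expressions for $\epsilon$ and $\kappa$, these four coupled ordinary differential equations determine the structure of a star of a given composition.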
To identify those challenges that are critically reliant on HECC, the committee considered the importance of each of these four modes of investigation to the solution of the six major challenges listed above. The sections below discuss in more detail the major challenges in astrophysics that require HECC, while also describing the current state of the art in algorithms and hardware and what is needed to make substantial progress.

MAJOR CHALLENGES THAT REQUIRE HECC

It seems likely that the bulk of the progress on Major Challenges 1 and 2 will come from more advanced astronomical observations. Even then, it is important to emphasize that some of the data sets required to inform our understanding of dark matter and dark energy are unprecedented in size and will require HECC for their analysis. The need for HECC to exploit large data sets is discussed separately in a later section. For the remaining four challenges, the committee concluded that while further observation will also play a role in answering the questions, it is unlikely that any significant advance in understanding will be achieved without a major, if not dominant, reliance on HECC. That so many of the major challenges in astrophysics require HECC should not be surprising, given the complexity of astrophysical systems, the need for numerical experimentation as a proxy for laboratory experiments, and the long tradition of the use of HECC in astrophysics, which suggests that the field is comfortable with cutting-edge computation.

Major Challenge 3. Understanding the Formation and Evolution of Galaxies, Quasars, and Supermassive Black Holes

Precise measurements of anisotropies in the cosmic microwave background over the past decade have reduced uncertainties in the fundamental cosmological parameters to a few percent or less and have provided a standard model for the overall properties of the Universe. According to this model,
the Universe is geometrically flat and consists presently of about 4 percent ordinary matter, 22 percent nonbaryonic dark matter, and 74 percent dark energy. It is thought that small-amplitude density fluctuations, initially seeded at early times by a process like inflation, grew through gravitational instability to produce stars, galaxies, and larger-scale structures over billions of years of evolution. Thus cosmic microwave background measurements provide the initial conditions we need, in principle, to understand the structure of the present-day Universe. However, while the gravitational dynamics of dark matter are reasonably well understood, the physics attending the formation of stars and galaxies is highly complex and involves an interplay between gravity, hydrodynamics, and radiation. The initially smooth material in slightly overdense regions collapsed, shock heating the gas and leaving behind halos of dark and ordinary matter. In some cases, the baryons in these halos cooled radiatively, producing the dense, cold gas needed to form stars. Winds, outflows, and radiation from stars, black holes, and galaxies established a feedback loop, modifying the intergalactic medium and influencing the formation of subsequent generations of objects. Numerical simulation makes it possible to follow the coupled evolution of dark matter, dark energy, baryons, and radiation so that the physics of this process can be inferred and the properties of the Universe can be predicted at future epochs.

It is instructive to consider the current state of the art in the mathematical models and numerical algorithms for simulating galaxy formation. As indicated above, most of the mass in the Universe is in the form of dark matter, which in the simplest interpretation interacts only gravitationally and is collisionless.
The equations describing the evolution of this component are just Newton’s laws (or Einstein’s equations on larger scales) supplemented by Poisson’s equation for the gravitational interaction of the dark matter with itself or with baryons. Stars in galaxies also interact gravitationally and can be approximated as a collisionless fluid except in localized regions, where the relaxation time may be relatively short. The gas that cooled and formed stars and galaxies evolved according to the equations of compressible hydrodynamics, and this material evolved under the influence of gravity (its own self-gravity, as well as that arising from dark matter and stars), pressure gradients, shocks, magnetic fields, cosmic rays, and radiation. Because the gas can cool and be heated by radiation, the equations of radiative transfer also must ultimately be solved along with the equations of motion for stars, gas, and the inferred behavior of dark matter. But to date there have been few attempts to dynamically couple radiation and gas owing to the complexity of the physics and the extreme numerical resolution needed. Various numerical algorithms have been implemented to solve the equations underlying galaxy formation. Dark matter and stars are invariably represented using N-body particles because it is impractical to solve Boltzmann’s equation as a six-dimensional partial differential equation. The most computationally challenging aspect of the simulations is solving for the gravitational field in an efficient fashion with a dynamic range in spatial scales sufficiently large to capture the distribution of galaxies on cosmological scales as well as to describe their internal structure. State-of-the-art codes for performing these calculations typically employ a hybrid scheme for solving Poisson’s equation. For example, in the TreePM approach, the gravitational field is split into short- and long-range contributions. 
Short-range forces are computed with a tree method, in which particles are grouped hierarchically and their contribution to the potential is approximated using multipole expansions. Long-range forces are determined using a particle-mesh technique by solving Poisson’s equation using fast Fourier transforms. An advantage of this hybrid formulation is that long-range forces need to be updated only infrequently. This approach has been used to perform very large N-body simulations describing the evolution of dark matter in the Universe, ignoring the gas. For example, the recent Millennium simulation (Springel et al., 2005) employed more than 10 billion particles to study the formation of structure in a dark-matter-only Universe, making it possible to quantify the mass spectrum of dark-matter halos and how
this evolves with time. An example of such a calculation, in a smaller volume with fewer particles, is shown in Figure 2-1.

In fact, the problem of understanding galaxy formation is even more complex than solving a coupled set of differential equations for dark matter, stars, gas, and radiation, because some of these components can transform into one another. Stars, as well as supermassive black holes, form from interstellar gas, converting material from collisional to collisionless form. As stars evolve and die, they recycle gas into the interstellar medium and intergalactic medium, enriching these with heavy elements formed through nuclear fusion. It is also believed that radiative energy produced during the formation of supermassive black holes, possibly during brief but violent quasar phases of galaxy evolution, can impart energy and/or momentum into surrounding gas, expelling it from galaxies and heating material in the intergalactic medium. Star formation and the growth of supermassive black holes are not well understood, and they occur on scales that cannot be resolved in cosmological simulations. But since these processes are essential ingredients in the formation and evolution of galaxies, simplified descriptions of them are included in the mathematical model of galaxy formation, usually incorporated as subgrid-scale functions. An important goal for the field in the coming decades will be to formulate from first principles physically motivated models of star formation and black hole growth that can be incorporated into cosmological simulations.

Because current simulations are limited, many aspects of galaxy formation and evolution are not well understood. For example, in the local Universe, the vast majority of galaxies are classified as spirals.
FIGURE 2-1 Snapshot from a pure N-body simulation showing the distribution of dark matter at the present time (light colors represent the greater density of dark matter). This square is the projection of a slice through a simulation box that is 150 megaparsecs (Mpc) across. One billion particles were employed.
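The TreePM force split described above can be illustrated with a short sketch of its particle-mesh (long-range) half: particles are deposited on a grid, and Poisson's equation is solved in Fourier space with a Gaussian-filtered Green's function. This is a minimal sketch assuming NumPy; the function name, the nearest-grid-point deposit, and the split scale rsplit are illustrative choices, and a production TreePM code would add the short-range tree walk and a higher-order deposit scheme.

```python
import numpy as np

def pm_longrange_potential(pos, mass, ngrid, boxsize, rsplit):
    """Particle-mesh solve of the long-range part of Poisson's equation.

    Particles are deposited on a periodic grid (nearest grid point, for
    simplicity), the density is Fourier transformed, multiplied by the
    long-range Green's function -4*pi*G/k^2 * exp(-k^2 rsplit^2) (Gaussian
    force split, with G = 1 in code units), and transformed back.
    """
    cell = boxsize / ngrid
    rho = np.zeros((ngrid,) * 3)
    idx = np.floor(pos / cell).astype(int) % ngrid
    np.add.at(rho, tuple(idx.T), mass / cell**3)   # unbuffered accumulation

    rho_k = np.fft.rfftn(rho)
    k = 2 * np.pi * np.fft.fftfreq(ngrid, d=cell)
    kz = 2 * np.pi * np.fft.rfftfreq(ngrid, d=cell)
    kx, ky, kz = np.meshgrid(k, k, kz, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    with np.errstate(divide="ignore", invalid="ignore"):
        green = np.where(k2 > 0, -4 * np.pi * np.exp(-k2 * rsplit**2) / k2, 0.0)
    return np.fft.irfftn(rho_k * green, s=(ngrid,) * 3)
```

Differentiating this potential on the grid and interpolating back to the particles gives the long-range force; the complementary short-range force, handled by the tree, decays rapidly beyond a few times rsplit, which is why the mesh part need only be updated infrequently.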
Similar to the Milky Way and unlike elliptical galaxies, which are spheroidal and are supported against gravity by the random motions of their stars, spiral galaxies have most of their mass in thin, rotationally supported disks. It is believed that most (perhaps all) galaxies were born as spirals and that some were later transformed into ellipticals by collisions and mergers between them, a process that is observed to occur often enough to explain the relative abundances of the two types of galaxies. Indeed, simulations of mergers between individual, already-formed spirals show that the remnants are remarkably similar to actual ellipticals, strengthening the case for this hypothesis. Yet in spite of the obvious significance of spiral galaxies to our picture of galaxy evolution, simulations of galaxy formation starting from cosmological initial conditions have invariably failed to produce objects with structural and kinematic properties resembling those of observed galactic disks. The cause of this failure is presently unknown but probably involves an interplay between inadequate resolution and a poor representation of the physics describing star formation and associated feedback mechanisms.

Cosmological simulations that incorporate gas dynamics, radiative processes, and subgrid models for star formation and black-hole growth have employed a variety of algorithms for solving the relevant dynamical equations. In applications to cosmology and galaxy formation, state-of-the-art simulations solve the equations of hydrodynamics using either a particle-based approach (smoothed-particle hydrodynamics, or SPH) or a finite-difference solution on a mesh.
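The particle-based branch just mentioned, SPH, represents the fluid with particles and estimates local quantities as kernel-weighted sums over neighbors. The sketch below (assuming NumPy) shows a density estimate with the standard cubic-spline kernel; the direct O(N^2) neighbor sum and the fixed smoothing length h are simplifications for illustration, since real codes use tree-based neighbor finding and adaptive per-particle smoothing lengths.

```python
import numpy as np

def w_cubic(r, h):
    """Cubic-spline SPH kernel in 3D with compact support of radius h."""
    q = r / h
    sigma = 8.0 / (np.pi * h**3)          # 3D normalization constant
    w = np.where(q < 0.5, 1 - 6 * q**2 + 6 * q**3,
        np.where(q < 1.0, 2 * (1 - q)**3, 0.0))
    return sigma * w

def sph_density(pos, mass, h):
    """Density at each particle by direct kernel summation over all pairs
    (including the self-contribution), rho_i = sum_j m_j W(|r_i - r_j|, h)."""
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    return (mass[None, :] * w_cubic(d, h)).sum(axis=1)
```

On a uniform particle lattice this estimator recovers the mean density; in a collapse calculation the smoothing length would shrink with the local interparticle spacing, which is what gives SPH its natural adaptivity.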
The most promising grid-based schemes usually employ some variant of adaptive mesh refinement (AMR), which makes it possible to achieve a much larger range in spatial scales than would be possible with a fixed grid. Radiative transfer effects have typically not been handled in a fully self-consistent manner but have instead been included as a postprocessing step, ignoring the dynamical coupling to the gas. An example of such a calculation is illustrated in Figure 2-2, which shows the impact of ionizing radiation from an assumed distribution of galaxies in the simulation shown in Figure 2-1. These various methods have already yielded a number of successes applied to problems in cosmology and galaxy formation. For example, simulations of the structure of the intergalactic medium have provided a new model for the absorption lines that make up the Lyman-alpha forest seen in the spectra of high-redshift galaxies, according to which the absorbing structures are the filaments seen in, for example, Figures 2-1 and 2-2 (the “Cosmic Web”; for a review, see Faucher-Giguere et al., 2008). High-resolution N-body simulations have quantified the growth of structure in the dark matter and the mass function of halos. Multiscale calculations have plausibly shown that the feedback loop described above probably plays a key role in determining the properties of galaxies and the hosts of quasars. However, in the future, simulations with even greater dynamic range and physical complexity will be needed for interpreting data from facilities such as the Giant Magellan Telescope, the Thirty-Meter Telescope, the James Webb Space Telescope, and the Square Kilometer Array, which will enable us to probe the state of the Universe when stars first began to form, ending the Cosmic Dark Ages. 
For example, the self-consistent interaction between the hydrodynamics of the gas and the evolution of the radiation field must be accounted for to properly describe the formation of the first objects in the Universe and the reorganization of the intergalactic medium. Large galaxies forming later were shaped by poorly understood processes operating on smaller scales than those characterizing the global structure of galaxies—such as, in particular, star formation, the growth of supermassive black holes, and related feedback effects—and the physical state of the interstellar medium within galaxies, which is also poorly understood. Understanding the formation and evolution of galaxies and their impact on the intergalactic medium is at the forefront of research in modern astrophysics and will remain so for the foreseeable future. The physical complexity underlying these related phenomena cannot be described using analytic methods alone, and the simulations needed to model them will drive the need for new mathematical models, sophisticated algorithms, and increasingly powerful computing resources.
FIGURE 2-2 Simulation from Figure 2-1 post-processed to demonstrate the impact of ionizing radiation from galaxies. Black regions are neutral, while colored regions are ionized. The ionizing sources are in red.

It is equally instructive to consider what scale of computation is required, in terms of both the mathematical models and the numerical algorithms, to achieve a revolutionary step forward in our understanding of galaxy formation. The challenge is associated with the vast range of spatial scales that must be covered. A volume of the local Universe containing a representative sample of galaxies will be on the order of a few hundred megaparsecs across, if not larger. Star-forming events within galaxies occur on the scale of a few parsecs or less, implying a linear dynamic range greater than 10^8. Even smaller scales (about 0.01 parsec or less) could be important if processes related to the growth of supermassive black holes in the centers of galaxies influence the dynamical state of nearby star-forming gas. If this problem were to be attacked using, say, a uniform grid, the number of cells required in three dimensions would be between 10^24 and 10^30, whereas we can currently handle only between 10^10 and 10^12 cells. More plausibly, advances in the near term will probably be made using codes with adaptive resolution, integrating different algorithms in a multiscale manner, so that the entire range of scales need not be captured in a single calculation. For example, a global simulation of a galaxy or a pair of interacting galaxies could be grafted together with a separate code following the inflow of gas onto supermassive black holes to estimate the impact of radiative heating and radiation pressure on the gas and dust in the vicinity of the black holes, to study the impact of black hole feedback on galaxy evolution.
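The grid-count estimate above follows from simple arithmetic, as this sketch shows; the specific box size and resolution are fiducial values chosen to be consistent with the ranges quoted in the text.

```python
# Back-of-the-envelope check of the dynamic-range and grid-size estimates.
PC_PER_MPC = 1.0e6          # parsecs per megaparsec
box_mpc = 300.0             # "a few hundred megaparsecs" (fiducial choice)
resolution_pc = 1.0         # "a few parsecs or less" (fiducial choice)

dynamic_range = box_mpc * PC_PER_MPC / resolution_pc   # linear dynamic range
cells_uniform = dynamic_range ** 3                     # cells in a uniform 3D grid

print(f"linear dynamic range ~ {dynamic_range:.1e}")   # ~ 3e8, i.e. > 10^8
print(f"uniform-grid cells   ~ {cells_uniform:.1e}")   # ~ 3e25, within 10^24-10^30
```

Resolving the 0.01-parsec scales mentioned above would push the cell count toward the upper end of the quoted range, which is why adaptive, multiscale methods are the plausible path forward.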
The algorithms that are required would, ideally, solve the equations of radiation hydrodynamics, including gravitational interactions between gas and collisionless matter, and be capable of handling magnetohydrodynamical effects. This would almost certainly involve a combination of adaptive mesh and particle codes that can represent the various components in the most efficient manner possible.
The algorithms would likely be running across a number of computing architectures simultaneously to maximize throughput, depending on the scalability of different aspects of the calculation. In addition to needing large numbers of fast processors operating in parallel, simulations such as these will place stringent demands on core memory and disk storage. For example, a pure N-body simulation with (10,000)^3 particles, using the most efficient codes currently available, would require on the order of 100 Tbytes of core memory and would produce data at the rate of roughly 30 Tbytes per system snapshot, or on the order of 1,000 Tbytes total for relatively coarse time sampling. A simulation of this type would make it possible to identify all sources of ionizing radiation in volumes much larger than that in, for example, Figure 2-2 and hence enable theoretical predictions for the 21-cm emission from neutral gas that would be detectable by the Square Kilometer Array.

Major Challenge 4. Understanding the Formation and Evolution of Stars and Planets

All of the stars visible to the naked eye are in the Milky Way galaxy, and they formed long after the Big Bang. Roughly a century ago, astronomers realized that the Milky Way contains not just stars but also giant clouds of molecular gas and small solid particles, which they called dust. Further observations have revealed that new generations of stars form from the gravitational collapse and fragmentation of these Giant Molecular Clouds (GMCs). The formation process occurs in two stages. At first, dynamical collapse creates a dense, swollen object, called a protostar, surrounded by a rotating disk of gas and dust. As the core cools, it shrinks, eventually evolving into a star once the central densities in the core become high enough to initiate nuclear fusion.
During the latter phase, which can take millions of years for stars of our Sun’s mass, it is thought that planets can form from the surrounding protostellar disk. Thus, the formation of stars is inextricably linked with that of planets.

Star Formation

Although astrophysicists now understand the basic physical processes that control star formation, a predictive theory that can explain the observed star formation rates and efficiencies in different environments has yet to emerge. The most basic physics that must be incorporated is the gas dynamics of the interstellar material, including the stresses imposed by magnetic fields. In addition, calculating the cooling and heating of the gas due to the emission and absorption of photons is both important and challenging. Radiation transfer is an inherently nonlocal process: Photons emitted in one place can be absorbed somewhere else in the GMC. Moreover, most of the cooling is through emission at particular discrete frequencies, and calculating such radiative transfer in a moving medium is notoriously complex. Finally, additional microphysical processes such as cosmic-ray ionization, recombination on grains, chemistry (which can alter the abundances of chemical species that regulate cooling), and diffusion of ions and neutrals can all affect the dynamics and must be modeled and incorporated. All of this physics occurs in the deeply nonlinear regime of highly turbulent flows, making analytical solutions impossible. Thus the vast majority of efforts to address the major challenges associated with star formation are based on computation.

It is instructive to consider the current state of the art in mathematical models and numerical algorithms for star and planet formation. Currently, the dynamics of GMCs are modeled using the equations of hydrodynamics or, more realistically, the equations of magnetohydrodynamics, including self-gravity and optically thin radiative cooling.
Typical simulations begin with a turbulent, self-gravitating cloud, and they follow the collapse and fragmentation of the cloud into stars. A few simulations have considered the effects of radiation feedback using an approximate method (flux-limited diffusion) for radiative transfer,
but these calculations require other simplifications (because they are not magnetohydrodynamical) to make them tractable. Either static or adaptive mesh refinement (AMR) is a very powerful technique for resolving the collapse to small scales; however, robust methods for magnetohydrodynamics and radiation transfer on adaptively refined grids are still under active development. Typical calculations involve grids with a resolution of up to 1024³ evolved for tens of dynamical times on the largest HECC platforms available today, using hundreds to thousands of processors. Figure 2-3 shows the results of one such calculation. A typical AMR calculation uses 128³ grids with up to seven levels of refinement (see, for example, Krumholz et al., 2007). Alternatively, some researchers use the smoothed particle hydrodynamics (SPH) algorithm to follow the collapse and fragmentation of GMCs; a typical calculation using about 10⁶ particles will require several months of CPU time on a cluster with hundreds of processors (Bate and Bonnell, 2004). To date, there are no calculations that follow the chemistry and ionization of the gas self-consistently with the hydrodynamics.

As a result of such calculations, an entirely new paradigm for star formation is emerging (Heitsch et al., 2001). Previously, star formation was thought to be controlled by the slow, quasistatic contraction of a gravitationally bound core supported by magnetic pressure. The new numerical results indicate that star formation is far more dynamic. Turbulence in the cloud generates transient, large-amplitude density fluctuations, some of which are gravitationally bound and collapse on a free-fall timescale (see Figure 2-3). Feedback from newly forming stars disperses the cloud and limits the efficiency of star formation.
Many of the properties of newly formed stars are now thought to be determined by the properties of the GMCs from which they form. It must be emphasized that the current simulations still lack important physics. For instance, there are no magnetohydrodynamical calculations of collapse using AMR. None of the calculations include a realistic treatment of the ionization and recombination processes that determine the degree of coupling to the magnetic field. No calculations yet include the galactic gravitational potential, including shocks induced by spiral arms, to follow the formation of GMCs from more diffuse phases of the interstellar medium. All of this physics is beyond current capabilities.

FIGURE 2-3 Structure of supersonic turbulence in a star-forming GMC, as revealed through three-dimensional hydrodynamic simulations on a 512³ grid. Colors represent the gas density on the faces of the computational volume (red is highest density, blue lowest). A complex network of interacting shocks in the turbulence generates large density fluctuations, which can collapse to form stars.

Several outstanding problems in star formation could be addressed with such calculations, including the following:

- What controls the initial mass function—that is, the numbers of stars formed at different masses?
- What determines the efficiency of star formation—that is, the fraction of gas in a GMC that ultimately is turned into stars?
- By what process are GMCs formed in galaxies in the first place?
- How are planets formed in the accretion disks that surround newly forming stars? (This is discussed further below.)

Calculations that can address these issues are vital in view of recent observational programs that have shed new light on star formation. For example, the Spitzer Space Telescope, a billion-dollar NASA mission, has launched into space the most sensitive infrared telescope ever built and is now returning unprecedented images and spectra of cold interstellar gas in the Milky Way and other galaxies. Interpreting the data from Spitzer requires more sophisticated theoretical and computational studies of star formation.

To achieve a revolutionary step forward in our understanding of star formation, future calculations need to consider nonideal magnetohydrodynamics in a partially ionized gas, including radiation transfer, self-gravity, ionization and recombination, and cosmic-ray transport. Ideally, the computational domain should include the entire galactic disk, so that the formation of GMCs in spiral density waves can be followed self-consistently. Collapse and fragmentation of the clouds will require AMR to resolve scales down to at least 0.001 pc, starting from the galactic disk on a scale of 1 kiloparsec, a dynamic range of 10⁶.
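The dynamic-range arithmetic can be checked with a few lines; the base grid size and factor-of-2 refinement ratio below are assumed for illustration.

```python
import math

# Check of the dynamic-range arithmetic for galactic-scale AMR.
# The base grid size and refinement ratio are assumed for illustration.
domain_pc = 1000.0          # galactic-disk scale: 1 kiloparsec
target_pc = 0.001           # smallest scale to resolve
dynamic_range = domain_pc / target_pc          # 10^6, as quoted

base_cells = 1024           # assumed base-grid cells per dimension
levels = math.ceil(math.log2(dynamic_range / base_cells))
print(levels)               # 10 levels just to reach the target cell size

# Resolving the smallest structures with many cells each, or starting from
# a coarser base grid, pushes the count toward ~20 levels.
```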
Up to 20 levels of nested grids might be needed for this resolution. The greatest challenge to such a calculation will be load-balancing the AMR algorithms across petascale platforms consisting of millions of processors. Much of the computational effort will be associated with the broader range of physics included in the models and the attendant increase in the complexity of the simulation, as opposed to managing the AMR grids and the data exchanges through the hierarchy of grids. Substantial effort will be required to develop scalable algorithms for self-gravity, radiation transport, cosmic-ray transport, and solvers for the stiff ordinary differential equations (ODEs) associated with the chemistry and microphysics.

Planet Formation

Understanding the formation and evolution of planetary systems in the gas disks that surround protostars is a problem related to understanding star formation. Current theory suggests that planets are formed by one of two routes: either gravitational fragmentation if the disk is massive and cold enough, or coagulation of solid dust particles into meter-sized “planetesimals,” followed by collisional aggregation of planetesimals into protoplanets. The largest protoplanets can capture large amounts of gas from the disk and grow into Jupiter-like gas giant planets. Both processes are very poorly understood. Moreover, in both alternatives the dynamics of the gas disk play a crucial role. It is thought that weakly ionized disks around protostars have complex dynamics and structure. Turbulence can be driven by magnetohydrodynamical instabilities in regions where the gas is sufficiently ionized (most likely the surface layers). Large-amplitude spiral waves can be launched, and gaps cleared, by the interaction with growing planets embedded in the disk. Finally, the central protostar can affect the temperature and thermodynamics of the disk via outflowing matter in winds and irradiation by photons. To date, only the most
OCR for page 26
basic theoretical questions have been addressed. With new observational data from Spitzer and from the Atacama Large Millimeter Array (ALMA), a new radio telescope being constructed in Chile with funding from the National Science Foundation (NSF), and with new planet-finding missions such as NASA’s Kepler and Terrestrial Planet Finder (TPF) missions, the need to understand planet formation has never been more pressing. Computations to address these questions will probably consist of three-dimensional magnetohydrodynamic simulations on very large grids (at least 1024³) that capture the entire disk. Since protostellar disks are very weakly ionized, additional microphysics must be included to capture nonideal magnetohydrodynamic effects. Finally, self-gravity will be needed to follow the fragmentation of the disk, and radiation transport to model the thermodynamics realistically. The requisite mathematical models and numerical algorithms are similar to those needed for the related problem of star formation: magnetohydrodynamics, self-gravity, and radiation transport on very large grids, probably with AMR.

The Sun and Stellar Evolution

How do stars evolve? Stellar evolution is a mature field that is now encountering a stiff set of challenges from rapid improvements in astronomical data. Understanding of stellar evolution is also one of the foundations for inferring the evolution of star clusters and galaxies, through age estimates and element production by nucleosynthesis. One-dimensional simulations have been exploited to the limit; three-dimensional simulations including rotation and magnetic fields are beginning to be feasible, thanks to steady increases in computing power.

What is the true solar composition?
Understanding the Sun is intrinsically interesting, but the Sun is also the best-observed star and therefore the most precise test of stellar evolution calculations. Helioseismological observations use the character of sound waves resonant in the solar volume to constrain the run of sound speed and density through the deeper layers, almost to the center; the level of precision is better than 1 percent through much of the Sun and far better than for any other star. The internal rotation field of the Sun has been partially traced, and the behavior of magnetic fields in the solar convection zone is being probed. Present theory does not stand up to these challenges: the standard solar model (SSM) of Bahcall and collaborators, the helioseismological inferences for sound speed and density structure, and the best three-dimensional simulations of the stellar atmosphere and photosphere do not agree. One suggestion is that the measured solar abundances of neon and argon are significantly in error. The abundances from the new three-dimensional atmospheres have been challenged. The SSM is a spherically symmetric and static model of the Sun and may itself be the weak link. Three-dimensional simulations of the solar convection zone produce features not represented in the SSM (such as entrainment, g-mode waves, and mixing).

Are dynamic effects significant in the evolution of ordinary stars like the Sun? Significant extensions of three-dimensional simulations, which include rotation and magnetic fields, are needed to answer this question, which may be important for understanding both solar physics and stellar evolution.

The First Stars

What followed the Big Bang? After expansion and cooling, the Universe began to form stars and galaxies and in the process yielded the first elements beyond hydrogen and helium. What were these first stars like? Did they produce gamma-ray bursts? Did the first stars make the elemental pattern we observe today in the oldest stars?
To simulate them accurately requires a predictive theory of stellar evolution. The elemental yields from such stars are especially sensitive to mixing because there are almost no elements but H and He, so that any newly synthesized element is important. Three-dimensional simulations of turbulence, rotation, binary interaction, and convection, combined with a three-dimensional simulation of thermonuclear burning, are not currently feasible, but they are needed.

If astrophysicists can develop an accurate and predictive theory of star formation that agrees with observations in the current epoch, it is likely they will be able to infer the properties of the first stars that formed in the early Universe. Arguably, understanding the formation of the first stars is easier than understanding star formation today, because the initial conditions were much simpler and because the first stars formed in almost pure H and He gas, in which the heating and cooling processes were much simpler and magnetic fields were probably not dynamically important. However, it is unlikely that we will ever be able to observe the formation of the first stars directly (although we may detect their impact on the surrounding medium). To be confident of the theoretical predictions, we must therefore be certain our numerical models are correct. This requires validation of the theory against current-epoch data.

Major Challenge 5. Understanding Supernovae and Gamma-Ray Bursts and How They Explode

Our understanding of the birth, evolution, and death of stars is the foundation of much of astrophysics. The most violent end points of stellar evolution are supernova explosions, resulting in the complete disruption of the star or in the collapse of the stellar core to form neutron stars or black holes, accompanied by gamma-ray bursts (GRBs). It is believed that these are the processes that created the chemical elements.
Supernovae that are not of Type Ia result from the collapse of the core of a massive star. Despite great progress, core-collapse supernovae remain mysterious: there is no clear understanding of why they explode, how they reach the point of explosion, or how the observed explosion properties are related to their history.

Type Ia Supernovae

One of the most spectacular insights from astrophysics is that most of the content of the Universe, being either dark matter or dark energy, is invisible and can be detected only by its effect on the curvature of space-time (that is, by its gravitational effects). This insight comes from our understanding of a particular type of exploding star, a Type Ia supernova. These objects are relatively uniform in behavior, so that their apparent brightness is a crude indicator of their distance from Earth. Careful study of nearby events allowed Mark Phillips to discover that the brightest events last longest; this modest effect allowed the mapping between apparent brightness and distance to be significantly refined. Type Ia supernovae are among the brightest observed events, so this distance scale reaches the limits of observation. Data on Type Ia supernovae, coupled with this distance scale, showed the expansion of the Universe to be accelerating and allowed the amounts of dark matter and dark energy to be determined. The argument assumes that the nearby supernovae used to derive the Phillips relations have properties identical to those of the most distant supernovae. Direct examination of the spectral light from the explosions gives results consistent with this assumption. To improve our understanding, we need to know more about what Type Ia supernovae really are. It is currently thought that they are white dwarf stars, of mass close to the Chandrasekhar limit (about 1.4 solar masses, the maximum mass that can be supported without a continuing energy input), that ignite and explosively burn.
The explosion is thought to be triggered by accretion of mass from a binary companion.
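The mapping from apparent brightness to distance that underlies this argument is the standard distance-modulus relation; the fiducial absolute magnitude used below is an assumed illustrative value, not a figure from this report.

```python
# Distance-modulus relation: m - M = 5 log10(d / 10 pc). The fiducial
# peak absolute magnitude M = -19.3 for a Type Ia supernova is an
# assumed illustrative value.
def distance_pc(m_apparent, M_absolute=-19.3):
    """Distance in parsecs from apparent and absolute magnitude."""
    return 10 ** ((m_apparent - M_absolute + 5) / 5)

# A Type Ia supernova observed at apparent magnitude 19 would lie at
# roughly 460 Mpc.
d_mpc = distance_pc(19.0) / 1e6
```

Standardizing M via the Phillips brightness-decline correction is what turns this crude indicator into a precision distance scale.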
How do the progenitors of Type Ia supernovae evolve to ignition? What are the properties of the star at ignition? What is the exact mechanism of the explosion? Answering these questions depends, in a critical way, on a variety of studies that require HECC. Much of the understanding of Type Ia supernovae is based on computationally intensive simulations of the radiation and spectra. The identification of specific systems as progenitors of Type Ia supernovae depends on our ability to connect the observed properties to theoretical models of the evolution of the stellar interior by plausible simulations. Fully three-dimensional simulations of turbulent, self-gravitating, thermonuclear-burning plasma are required; simulations with zoning sufficient to resolve the star down to the inertial range of turbulence are not yet feasible, so a combination of large-scale simulations with subgrid modeling and direct numerical simulation of subsections of the star is needed. Laboratory experiments with high-energy-density lasers and Z-pinch devices can be used to test some of the physics of the simulations; extensive computing is also needed to model the experimental results.

Core Collapse in Massive Stars

Much is unknown about core collapse in massive stars. There is no clear understanding of the nature of the explosion mechanism(s), of how collapse proceeds, or of the expected fluxes of neutrinos and gravitational waves. However, two recent sets of observations have revealed details that, if incorporated into new three-dimensional computational models, seem likely to increase our understanding of this fundamental process.
The first set of these observations—of supernova 1987A, the brightest supernova observed since the invention of the telescope—showed striking agreement with theoretical models in some respects (the neutrino flux, for example), but its appearance with three independent rings, ejected during the presupernova evolution, was unexpected. These rings indicate obvious rotational symmetry, and they have been the subject of much speculation. What is the effect of rotation on core collapse? Since such supernovae are not spherically symmetric, they are poorly modeled by one-dimensional treatments. Shear flow in stellar plasma generates magnetic fields, so a three-dimensional magnetohydrodynamic simulation is required for realism. These simulations are very demanding computationally, but they have great promise. There is a wealth of observational data (from pulsars, magnetars, X-ray sources, and supernova remnants) with which such simulations could be compared.

The second set of observations was the discovery that at least some GRBs are related to supernova explosions. These explosions are at cosmological distances, and they require energy supplies much larger than are provided by thermonuclear explosions. However, matter degrades gamma-ray energy, so the fact that GRBs are seen at all suggests that both a core collapse (for energy) and a near vacuum (to protect the gamma rays) are involved. A homogeneous mass distribution would degrade the GRBs and result in (only) a supernova. How does this shift come about? Can it be simulated?

Figure 2-4 illustrates some of the surprises in store as three-dimensional simulations of stars become common. The stage is set for the core collapse of a massive star by the thermonuclear burning of oxygen nuclei. The simulations show new, previously ignored phenomena: the burning is wildly fluctuating and turbulent, and it mixes in new fuel from above and ashes from below the burning layer.
Including such effects will change the theory of presupernova evolution. The implications go far beyond this important example. How stars evolve is intimately connected with how much mixing occurs in their interiors, so that all the theoretical understanding of stellar evolution and supernovae will be affected. It is finally possible to make progress with direct three-dimensional simulation; further progress will require the inclusion of rotation and magnetohydrodynamics.
FIGURE 2-4 The evolution of abundance gradients in the convective shell of a 23-solar-mass star shortly before core collapse and supernova explosion. This plot shows the abundance gradient (light colors indicate the highest gradient) as a function of radius and time. It was generated from a three-dimensional simulation using a compressible fluid dynamics code that includes multiple nuclear species and their thermonuclear burning. Regions of steep gradient in composition are indicative of turbulent mixing that brings entrained fuel (¹⁶O nuclei) into hot regions to be burned. Waves are generated in stably stratified regions above and below the convective shell, and they may be seen beyond the entrainment layers at the convective boundaries. Such oxygen-rich material will be explosively burned by the supernova explosion to form elements from silicon to iron, which will then be ejected into interstellar space, later to be seen in young supernova remnants like Cassiopeia A. Such time-dependent, multiscale, multifluid, three-dimensional simulations are changing the way astrophysicists think about how stars behave. SOURCE: Meakin and Arnett (2007).

Major Challenge 6. Predicting the Spectrum of Gravitational Waves from Merging Black Holes and Neutron Stars

A prediction of Einstein’s General Theory of Relativity is that accelerating masses should produce distortions in space-time called gravitational waves. For example, two stars in tight orbit around one another will produce a spectrum of these waves with a frequency determined by their orbital period. Such waves are detectable at Earth as extremely small distortions of space—that is, as a change in the distance between objects as the waves pass by.
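The link between orbital period and wave frequency can be sketched in a few lines: for a circular binary, the dominant (quadrupole) gravitational-wave frequency is twice the orbital frequency given by Kepler's third law. The masses and separation below are assumed illustrative values.

```python
import math

# Gravitational-wave frequency of a circular binary: twice the Keplerian
# orbital frequency. Masses and separation are assumed illustrative values.
G = 6.674e-11               # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30            # solar mass, kg

m1 = m2 = 1.4 * M_SUN       # two neutron stars
a = 100e3                   # orbital separation: 100 km, shortly before merger

f_orb = math.sqrt(G * (m1 + m2) / a ** 3) / (2 * math.pi)   # Kepler's third law
f_gw = 2 * f_orb
print(f"{f_gw:.0f} Hz")     # ~194 Hz, within LIGO's sensitive band
```

As the orbit decays and the separation shrinks, this frequency sweeps upward, producing the characteristic "chirp" signal.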
The United States has invested considerable resources to build the Laser Interferometer Gravitational-Wave Observatory (LIGO), an instrument designed to detect gravitational waves from astronomical sources such as close binary stars. The strongest signals will come from compact objects (such as black holes or neutron stars) in binary systems undergoing mergers. Binary black holes can be formed from the evolution of a binary system containing massive stars. When the stars die they become core-collapse supernovae (see Major Challenge 5), and if their cores are massive enough they will form black holes. The gravitational radiation emitted as the black holes orbit each other removes energy and angular momentum from the system, causing the orbits to decay, so that the stars move ever closer together. The closer they are, the stronger
the gravitational radiation and the faster the orbits decay. Finally, the stars merge into a single object in a final burst of gravitational radiation. The sensitivity required to detect this burst of radiation from merging black holes is staggering; it amounts to detecting changes in the distance between two points several kilometers apart that are smaller than the diameter of a neutron. The sensitivity of LIGO can be greatly increased if the spectrum of the expected radiation is known, so that the observed signal can be correlated with the expected waveform to test for a match. This requires computing the gravitational radiation signal from merging black holes. In principle, computing this waveform simply requires solving Einstein’s equations, a set of coupled partial differential equations that describe the evolution of space-time. However, physicists and mathematicians have struggled with this task for more than 30 years. Given the complexity of the equations, numerical methods are the only possible means of solving the problem. But developing accurate, stable, and reliable numerical algorithms to solve Einstein’s equations in three dimensions as a binary black hole merges has proved to be a considerable challenge. Recently, there was an enormous breakthrough in the field: Several groups have described numerical algorithms that work, and for the first time these groups have followed the merger of black holes in all three dimensions for many orbital periods.
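The sensitivity figure quoted above can be put in numbers: a dimensionless strain h stretches an interferometer arm of length L by dL = h × L. The strain value below is an assumed order-of-magnitude figure for a strong source, not taken from this report.

```python
# Arm-length change produced by a gravitational wave of strain h over an
# interferometer arm of length L: dL = h * L. The strain is an assumed
# order-of-magnitude figure for a strong source.
h = 1e-21                   # typical target strain amplitude
L_arm = 4000.0              # LIGO arm length, m
dL = h * L_arm
neutron_diameter = 1.7e-15  # m, approximate

print(dL)                        # 4e-18 m
print(dL / neutron_diameter)     # a few thousandths of a neutron diameter
```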
The fundamental components of the algorithm include solution of the partial differential equations (PDEs) using simple centered differencing, AMR to capture a range of scales near the merging black holes, constraint damping to limit the accumulation of truncation error, excision of singularities using internal boundary conditions over patches of the grid inside the event horizon, and adoption of a coordinate system that asymptotically extends to infinity to minimize reflection of outgoing waves from the boundaries. Current simulations typically use a grid of 256³ cells with 10 to 20 levels of refinement, running on distributed-memory parallel machines with up to several hundred processors. The first gravitational waveforms produced by these mergers have now been reported. Such results are precisely what LIGO and future gravitational wave observatories need to maximize their sensitivity.

With this breakthrough, a whole new field of research is opening up in numerical relativity. First, a large parameter space of black hole mergers must be computed to understand how the outcome is affected when the mass and spin of the initial objects are varied. Next, both matter and electromagnetic fields must be incorporated into the numerical methods so that mergers of neutron stars can be followed. This will require substantial advances in the numerical methods because, although the mathematical properties of Einstein’s equations and the equations of magnetohydrodynamics have some important similarities, they are nonetheless quite different. (For example, the equations of magnetohydrodynamics admit discontinuities as solutions, whereas Einstein’s equations do not. Accurate and stable numerical algorithms for shock capturing are therefore required for magnetohydrodynamics.)
Finally, as the simulation tools mature, they can provide the foundation for an enormous effort to extract a physical understanding of high-energy phenomena where relativistic effects are important (which goes well beyond the merger of binary systems to include core-collapse supernovae). Physics is at the beginning of a renaissance in the study of general relativity. The first generation of numerical tools required to solve Einstein’s equations is now available. Much work remains to be done, and all of it relies on computation.

METHODS AND ALGORITHMS IN ASTROPHYSICS

A wide variety of mathematical models, numerical algorithms, and computer codes are used to address the compelling problems in astrophysics. This section discusses some of the most important, organized by the mathematics.
OCR for page 31
N-body codes. Required to investigate the dynamics of collisionless dark matter, or to study stellar or planetary dynamics. The mathematical model is a set of first-order ODEs for each particle, with acceleration computed from the gravitational interaction of each particle with all the others. Integrating particle orbits requires standard methods for ODEs, with variable time stepping for close encounters. For the gravitational acceleration (the major computational challenge), direct summation, tree algorithms, and grid-based methods are all used to compute the gravitational potential from Poisson’s equation.

PIC codes. Required to study the dynamics of weakly collisional, dilute plasmas. The mathematical model consists of the relativistic equations of motion for particles, plus Maxwell’s equations for the electric and magnetic fields they induce (a set of coupled first-order PDEs). Standard techniques are based on particle-in-cell (PIC) algorithms, in which Maxwell’s equations are solved on a grid using finite-difference methods and the particle motion is calculated by standard ODE integrators.

Fluid dynamics. Required for strongly collisional plasmas. The mathematical model comprises the standard equations of compressible fluid dynamics (the Euler equations, a set of hyperbolic PDEs), supplemented by Poisson’s equation for self-gravity (an elliptic PDE), Maxwell’s equations for magnetic fields (an additional set of hyperbolic PDEs), and the radiative transfer equation for photon or neutrino transport (a high-dimensional parabolic PDE). A wide variety of algorithms for fluid dynamics are used, including finite-difference, finite-volume, and operator-splitting methods on orthogonal grids, as well as particle methods that are unique to astrophysics—for example, SPH.
To improve resolution across a broad range of length scales, grid-based methods often rely on static and adaptive mesh refinement (AMR). AMR methods greatly increase the complexity of the algorithm, reduce the scalability, and complicate effective load balancing, yet they are absolutely essential for some problems.

Transport problems. Required to calculate the effects of the transport of energy and momentum by photons or neutrinos in a plasma. The mathematical model is a parabolic PDE in seven dimensions. Both grid-based (characteristic) and particle-based (Monte Carlo) methods are used. The high dimensionality of the problem makes first-principles calculations difficult, and so simplifying assumptions (for example, frequency-independent transport, or the diffusion approximation) are usually required.

Microphysics. Necessary to incorporate nuclear reactions, chemistry, and ionization/recombination reactions into fluid and plasma simulations. The mathematical model is a set of coupled nonlinear, stiff ODEs (or algebraic equations if steady-state abundances are assumed) representing the reaction network. Implicit methods are generally required if the ODEs are solved. Implicit finite-difference methods for integrating realistic networks with dozens of constituent species are extremely costly.

For all of these methods the main computational challenges are the enormous number of particles (10¹⁰ to 10¹¹ at present) and the large grids (as big as 2048³, or about 10¹⁰ cells, at the moment, with larger calculations planned for the future). Complex methods can require 10³ or 10⁴ flops per cell per time step and generate hundreds of gigabytes of data in a single snapshot. Floating-point performance is often limited by cache access and the speed of the on-chip bus. Some algorithms (for example, grid-based fluid dynamics) scale very well to tens of thousands of processors, while others (for example, those for elliptic PDEs, such as Poisson’s equation) require global communication, which can limit scaling.
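The direct-summation approach mentioned above for N-body codes can be sketched in a few lines: every particle feels the summed attraction of all the others, an O(N²) computation. This is an illustrative sketch, with G set to 1 (code units) and an assumed softening length that regularizes close encounters.

```python
import numpy as np

# Direct-summation gravitational accelerations for an N-body sketch.
# G = 1 (code units); `soft` is a softening length for close encounters.
def accelerations(pos, mass, soft=1e-3):
    """pos: (N, 3) positions; mass: (N,) masses; returns (N, 3) accelerations."""
    d = pos[np.newaxis, :, :] - pos[:, np.newaxis, :]    # d[i, j] = r_j - r_i
    r2 = (d ** 2).sum(axis=-1) + soft ** 2               # softened squared distance
    np.fill_diagonal(r2, np.inf)                         # exclude self-interaction
    inv_r3 = r2 ** -1.5
    return (d * (mass[np.newaxis, :, np.newaxis] * inv_r3[:, :, np.newaxis])).sum(axis=1)

# Two unit masses one length unit apart attract each other equally and
# oppositely along the line joining them.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
acc = accelerations(pos, np.array([1.0, 1.0]))
```

Production codes replace this O(N²) sum with the tree and grid-based methods the text describes, but the direct sum remains the reference against which those approximations are checked.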
HECC FOR DATA ANALYSIS

As indicated earlier, four of the six questions identified in this chapter as the major challenges facing astrophysics in the coming decade require HECC to test theoretical models against observational data. Without HECC, there might be little or no progress in these areas of astrophysics. For the other two major challenges identified earlier (the nature of dark matter and dark energy), the most productive mode of investigation will likely be observation, which will collect massive amounts of data. For example, the largest survey of the sky to date, the Sloan Digital Sky Survey, has generated about 2.4 TB of data over the past 5 years. Over the next 5 years, the Pan-STARRS survey will generate 20-200 TB of data, with an image archive that may grow to 1.5 petabytes (PB). The Large Synoptic Survey Telescope, an 8-meter-class telescope that will provide the deepest survey of the sky, will generate 30 PB of data over the next 10 years. In addition, the scale of numerical computation envisioned for the future will also generate massive data sets. Current simulations of galaxy formation already generate 30 TB of data per run. This will grow to 1 PB per simulation in the near future. Management, analysis, mining, and knowledge discovery from data sets of this scale are a challenging problem in HECC in their own right. For data-intensive fields like astronomy and astrophysics, the potential impact of HECC is felt not just in the power it can provide for simulations but also in the capabilities it provides for managing and making sense of the data. Because the amount, complexity, and rate of generation of scientific data are all increasing exponentially, even some research challenges that are primarily addressed through observation are also critically dependent on HECC.
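The I/O pressure that data at this scale creates can be made concrete with a rough estimate; the per-disk bandwidth below is an assumed illustrative figure.

```python
# Rough estimate of how many disks must stream in parallel to scan a
# 1 PB data set in an hour. The per-disk bandwidth is an assumed figure.
dataset_bytes = 1e15        # 1 PB
disk_bw = 200e6             # 200 MB/s sustained per disk (assumed)
scan_seconds = 3600.0       # target: one full pass per hour

aggregate_bw = dataset_bytes / scan_seconds      # ~278 GB/s required
disks_needed = aggregate_bw / disk_bw
print(round(disks_needed))  # on the order of 1,400 disks
```

Tighter time budgets, or repeated passes during analysis, quickly push the count from thousands toward the hundreds of thousands of disks mentioned below.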
Systems that deal with data on this scale at present usually need to access thousands, perhaps even hundreds of thousands, of disks in parallel; otherwise the system’s performance would be limited by the rate of input/output (I/O). Thus scalable parallel I/O software is extremely critical in such situations. There are many specific challenges within this domain and barriers to progress, including the following:

- Are there data management tools available that can manage data at the petascale while allowing access by scientists all over the world? For example, data obtained from sky surveys currently range from 1 to 2 PB a year, with the amount likely to increase as more sophisticated instruments with much higher resolutions are deployed. Similarly, as petascale simulations become feasible, many astrophysics calculations will produce multiple petabytes of data. What data management models are good on these scales? A traditional model assumes that the data are moved (for example, as the result of a query) to the place of their end use, but that model does not seem to be scalable. New models and architectures are needed that enable the querying and analysis of data to be distributed to the systems that store and manage the data.

- Data volumes produced by simulations or observations are rising very fast, while the computational and analytical resources and tools needed to perform the analysis are lagging far behind in terms of performance and availability. Ever-more-sophisticated algorithms and computational models result in larger data sets, but the bandwidth available to deal with these data is not keeping up. Thus, parallel and scalable techniques for analyzing and handling data are needed. Online parallel algorithms that can guide the analysis and steer simulations have the potential to help. Scalable parallel file systems and middleware that can effectively aid scientists with these massive volumes of data are critical.
- Traditional analysis techniques such as visualization do not scale, nor are they suitable for knowledge discovery at the petascale, although they may help the user guide the knowledge-discovery process. Are there statistical and data-mining tools that can help scientists make discoveries from massive amounts of data, and can those tools be scaled appropriately? Data-mining methods such as clustering, neural networks, classification trees, association-rule discovery, and Bayesian networks allow scientists to automatically extract useful and actionable patterns, representing information and knowledge, from data sets. These methods need to be adapted or developed for scientific applications so that they can enhance scientists' ability to analyze and learn from their simulations and experiments.

- The size of simulations in astrophysics, as described earlier, is reaching the petascale regime. Storing and retrieving the data and results for subsequent analysis is as big a challenge as performing the computations. The challenge is to extract and share knowledge from the data without being overwhelmed by the task of managing them. How can scientists interact with their data under these conditions?

- Traditionally, scientists have run simulations or experiments, analyzed them, and published the results, which are subsequently used by others. An emerging model relies more on teams working together to organize new data, develop derived data, and produce analyses based on the data, all of which can be shared with other scientists. More and larger scientific databases are thus being created. What are the exemplars for such sharing, and how can other scientists use these data sets to accelerate their knowledge discovery? Ontologies and taxonomies must be developed to enable that knowledge discovery.

- There are no guidelines for building or configuring systems that balance computation against I/O requirements, and there are no benchmarks (analogous to those used to rank, say, the "top 500" most powerful supercomputers) for balanced systems that incorporate I/O, storage, and analysis requirements.
Would it be possible to build such benchmarks for balanced systems?

REALIZING THE POTENTIAL IMPACT OF HECC ON ASTROPHYSICS

Astrophysics is a computationally mature discipline with a long history of using computing to solve problems. To a large extent, the astrophysics community writes its own codes, and many astrophysicists are knowledgeable about programming issues on modern HECC architectures. Most computation in astrophysics uses a mix of numerical methods, including grid-based methods for hydrodynamics and magnetohydrodynamics, adaptive mesh refinement (AMR) schemes, particle-based methods for N-body and plasma dynamics, and methods for radiation transport using both grids and Monte Carlo algorithms. There are not enough researchers in the community to support a true community-code model, as exists in the atmospheric sciences, for example. Instead, small groups of researchers develop alternative algorithms, and comparing the results of different groups is very beneficial to the research. Students in astrophysics are trained in computation at both the undergraduate and the graduate levels, and researchers trained in computation are heavily recruited. Most large-scale research computing in astrophysics is performed at the NSF-funded supercomputing centers or in university-owned, medium-sized facilities; a few individual departments are fortunate enough to provide small- to medium-sized facilities to their members. Because such a large fraction of astrophysical research uses HECC, there never seem to be enough resources available, and astrophysics would certainly benefit from access to much larger facilities. Such access should come with strong support for code and algorithm development efforts to add new physics and exploit new architectures. For all these reasons, progress in astrophysics is intimately tied to progress in HECC.
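The particle-based N-body methods mentioned above can be illustrated with a deliberately minimal sketch: direct-summation gravity (O(N^2) in the particle count) combined with a kick-drift-kick leapfrog integrator, which is second order and symplectic. This is a toy in normalized units (G = 1) with Plummer softening, not a production code; real astrophysics codes replace the direct sum with tree, mesh, or fast-multipole force solvers and run in parallel across thousands of processors.

```python
import numpy as np

G = 1.0  # gravitational constant in normalized code units (assumption)

def accelerations(pos, mass, soft=1e-3):
    """Direct O(N^2) gravitational accelerations with Plummer softening."""
    acc = np.zeros_like(pos)
    for i in range(len(pos)):
        d = pos - pos[i]                        # vectors from particle i
        r2 = (d ** 2).sum(axis=1) + soft ** 2   # softened squared distances
        r2[i] = np.inf                          # suppress self-force
        acc[i] = G * (mass[:, None] * d / r2[:, None] ** 1.5).sum(axis=0)
    return acc

def leapfrog(pos, vel, mass, dt, steps):
    """Kick-drift-kick leapfrog time integration."""
    acc = accelerations(pos, mass)
    for _ in range(steps):
        vel += 0.5 * dt * acc   # half kick
        pos += dt * vel         # drift
        acc = accelerations(pos, mass)
        vel += 0.5 * dt * acc   # half kick
    return pos, vel

def total_energy(pos, vel, mass, soft=1e-3):
    """Kinetic plus softened pairwise potential energy, for diagnostics."""
    ke = 0.5 * (mass * (vel ** 2).sum(axis=1)).sum()
    pe = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            r = np.sqrt(((pos[j] - pos[i]) ** 2).sum() + soft ** 2)
            pe -= G * mass[i] * mass[j] / r
    return ke + pe

# Two equal masses on a circular orbit (separation 1, speed 0.5 each).
pos = np.array([[-0.5, 0.0], [0.5, 0.0]])
vel = np.array([[0.0, -0.5], [0.0, 0.5]])
mass = np.array([0.5, 0.5])

e0 = total_energy(pos, vel, mass)
pos, vel = leapfrog(pos, vel, mass, dt=1e-3, steps=1000)
e1 = total_energy(pos, vel, mass)
```

Energy conservation over many steps is the standard sanity check for such integrators; the symplectic leapfrog bounds the energy error rather than letting it drift, which is why it is a common building block in collisionless N-body work.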
On the hardware side, the components of HECC infrastructure most necessary for progress in astrophysics are (1) an increase in processing capability (flops), whether through faster processors or, for many computations, increased parallelism; (2) I/O performance that is matched to the floating-point performance; and (3) hardware that can handle massive data sets. It is worth noting that hardware is in some ways the easiest need to satisfy. On the software side, the progress-limiting steps for astrophysics are (1) better methods for non-ideal magnetohydrodynamics, radiation transfer, and relativistic gas dynamics with general relativity (to name a few) and (2) better visualization tools and better optimization and profiling tools that can make codes run faster. Certain common components, such as interfaces to AMR tools, file formats, and visualization tools, would benefit from common community standards. For many reasons (the wide variety of physics involved in astrophysical research, from N-body dynamics to magnetohydrodynamics; the small size of the astrophysics community in comparison with communities such as the atmospheric sciences; and the evolving nature of the mathematical models used to describe astrophysical systems, from hydrodynamics to magnetohydrodynamics and, ultimately, weakly collisional plasma dynamics), the development of true community codes is not likely in astrophysics. Nonetheless, the need for community standards is pressing. The astrophysics research community needs support for porting its own codes to petascale environments. No one knows which of the current algorithms and software will scale, because only a few astrophysics codes have scaled beyond a few hundred processors. Even when the mathematical models are known, moving them to larger numbers of processors or to new processor architectures presents challenges, some of them very difficult. Finally, it is important to emphasize that the most critical resource in the research enterprise, and the one that is always rate limiting, is the number of highly qualified personnel trained in HECC.
Better training means the community can make better use of existing resources, so education in computational science must be emphasized at every level. There needs to be support for people whose expertise lies at the boundary of computer science and the application discipline: people who really know how to exploit HECC capabilities are scarce and essential. To conclude, the committee identified the following likely ramifications of inadequate or delayed support of HECC for astrophysics:

- The rate of discovery would be limited. The foregone discoveries would probably have enriched our fundamental understanding of the Universe and could have provided tangible benefits. For example, understanding how the atmospheres of Venus, Mars, and Titan (the largest moon of Saturn) affect the global climates of those bodies would be expected to help us model the changing climate of Earth.

- Inadequate support for HECC would lead to a shortage of training for highly qualified personnel.

- Inadequate support for HECC would limit our ability to capitalize on investments in expensive facilities. Major observational facilities, especially those in space, can cost billions of dollars, and guidance from theoretical and computational modeling on how best to observe systems can dramatically increase their success rate. For example, LIGO requires templates of expected gravitational waveforms to distinguish signals from the noise; an accurate library of waveforms, which can only come from computation, could mean the difference between LIGO detecting a signal and missing it.

- Some data are likely to be underexploited. Without enough HECC, much of the data from big surveys is unlikely to be processed in any meaningful way, which means losing information from which new discoveries might come. For example, detecting near-Earth asteroids requires high-time-resolution surveys over a large area of the sky. Although the likelihood of an asteroid impacting Earth in the foreseeable future is extremely small, it does not seem wise to reduce our chance of detecting such a threat by failing to process that survey data.
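The LIGO point, that detection hinges on having the right waveform templates, comes down to matched filtering: sliding a known waveform across noisy data and asking where the correlation is improbably high. The sketch below is a toy illustration only. The "chirp" is an arbitrary stand-in waveform, the noise is white Gaussian, and the SNR normalization is simplified; real gravitational-wave searches whiten the data, filter in the frequency domain against large template banks, and use detector-specific noise models.

```python
import numpy as np

def matched_filter_snr(data, template):
    """Correlate a zero-mean, unit-norm copy of the template against every
    offset in the data; report the result in units of the data's standard
    deviation (a toy signal-to-noise ratio)."""
    t = template - template.mean()
    t /= np.sqrt((t ** 2).sum())                    # unit-norm template
    return np.correlate(data, t, mode="valid") / data.std()

rng = np.random.default_rng(1)
m = 256
tt = np.linspace(0.0, 1.0, m)
# Arbitrary stand-in "chirp": rising frequency, rising amplitude.
template = np.sin(2 * np.pi * 8 * tt ** 2) * np.exp(4 * (tt - 1))

data = rng.normal(0.0, 1.0, 4096)                   # toy detector noise
inject_at = 1500                                    # bury a weak copy of the waveform
data[inject_at:inject_at + m] += 0.5 * (template - template.mean()) / template.std()

snr = matched_filter_snr(data, template)
peak = int(np.argmax(snr))                          # best-matching offset
```

Even though the injected signal has half the amplitude of the noise and is invisible by eye, the filter output spikes near the injection point, which is exactly why a computed library of accurate templates can decide whether a buried signal is found or missed.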