2 Cyberinfrastructure
Pages 14-40

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 14...
... A successful combustion CI will require the following:
• Deep engagement of the leading scientists in the field, who will supply the models, algorithms, software, data, and other tools that are to be shared through the CI and who will exploit those shared resources to advance combustion research and development (R&D);
• A critical mass of information technology and scientific domain professionals, ideally all at a single location, to manage and guide the CI;
• Resources not only for doing calculations but also for implementing and sustaining a detailed and long-range plan to store and curate the product data so that they can be mined for insight by others in the community;
From page 15...
... In fact, the primary value that a CI can provide to the combustion community is to bridge the disparate subcommunities (kineticists, fluid dynamicists, industrial designers, and so on), and so, by definition, it must be broad and encompassing.
From page 16...
... The CI of the combustion community cannot operate in a vacuum. It will have to interact with other CIs and leverage cyber tools and methodologies developed by other communities.
From page 17...
... ; and • The use of collaboration and Web technology to create domain data portals for education and outreach (discussed in the subsection below entitled "Science Gateways")
From page 18...
... could help address these issues. In addition, the combustion community should agree on a set of community codes to be included in the CI and maintained by CI staff.
From page 19...
... Because of the publication value of their data, some scientists would rather "share their toothbrush than share their data." However, there is a great deal of community data which, if made available in an easily accessible manner, could accelerate scientific progress. The National Institutes of Health has policies that require that some data from funded research be made public. Although this is an enlightened policy, there are no standard mechanisms to provide the metadata to make the data reusable; and building repositories that are robust enough to keep the data alive and available is a substantial CI challenge.
From page 20...
... These closely aligned organizations are persistent and provide both computation and data resources shared by the whole community. Other large groups, such as the Storm Prediction Center, consist of important specialists who augment the strength of the central facilities; other individual groups work with resources provided by the entire network of resources.
From page 21...
... munity of researchers through a multilevel network of data providers. The National Radio Astronomy Observatory provides a similar central organization for the widely distributed radio astronomy community.
From page 22...
... Instead, combustion scientists use a range of distinct computational tools along with experimental data, each appropriate to a particular regime, to build models that can be used for simulations at larger scales. Chemical reactions are determined from the synthesis of a broad range of disparate data and simulation tools based on quantum mechanical methodologies such as models for Schrödinger's equation.
From page 23...
... For exascale computing systems, an international planning activity is currently underway to assess the short-term, medium-term, and long-term software and algorithmic needs of applications for petascale and exascale systems, and to develop a roadmap for software and algorithms on extreme-scale systems. The opportunities provided by petascale and exascale computers for advancing combustion science and engineering are great; however, major investments in software will be required to take full advantage of these new capabilities.
From page 24...
... entitled Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century provide a framework for considering how digital repositories for combustion research may evolve. • Research collections are the most localized, usually associated with a single investigator or a small laboratory, with limited application of standards, perhaps no intention to archive data over time, and little funding for the management of data.
From page 25...
... Data‑Curation Aims and Challenges
Accessible and functional data collections are essential research assets that, if developed through systematic curation principles and methods, will grow not just in size but also in value. Here, "data curation" is defined as the active and ongoing management of data through its life cycle of value to society.
From page 26...
... TRANSFORMING COMBUSTION RESEARCH THROUGH CYBERINFRASTRUCTURE
TABLE 2.1 Data Sharing Frameworks and Implications
Columns: Aspect; Approach; Characteristics; Implications for Data Producers
Structure, Centralized: Single host location; Deposition services; Normalized format; Coordinated acquisition location; Normalized format
Structure, Federated: Single access point; Limited to participants; Enforced data standards
Structure, Distributed: Individual points of access; Local responsibility for storage; Control of format
Access, Open: Unlimited access and reuse; Not an option for sensitive data sets
Access, Hybrid: Access control as needed; Ability to restrict access and use; Management of sensitive data
Access, Controlled: Registration and permissions required; Controlled sharing; Minimized risks
Management, Local: Case-by-case decisions; Control retained; Potential inconsistency; Maintenance and distribution burden
Management, Central: Governance by committee or central authority; Policy-driven options for control
SOURCE: Adapted from Pinowar et al.
From page 27...
... Columns: Implications for Data Consumers; Considerations for Developers, Service Providers, and Stakeholders
High visibility; Easier coordination of development and maintenance than with other approaches
Optimized retrieval; Best for common data types
Enabled browsing; Requires sustained funding and personnel
Same attributes as with centralized approach, but with more complex oversight; More complex infrastructure than for centralized structure; Requires proactive work with participants; Requires sustained funding and personnel
Low visibility; No control or formal coordination
More difficult retrieval than for other approaches; Rarely maintained for the long term
Complications with interpretation, consistency, integration, and sustainability
No barriers to participation; Maximizes potential reuse
Some barriers to access; Requires coordination
Access can be complex, time-consuming; Accommodates needs for privacy and security
Ad hoc, inequitable access; Can support gradual transition to more open sharing over time
Guidelines for access and use; Enables consistency; Potential for building community consensus and standards
From page 28...
... Much enthusiasm is voiced for the "open data" movement in the flurry of reports from scientific agencies and in the popular scientific press, but studies of data practices point to some of the key obstacles that would be faced in the development of CI capabilities for combustion research.
From page 29...
... Scientists rarely have the time or skills needed to prepare data for public sharing (Research Information Network, 2008), resulting in the need for investment in metadata production, preferably at the point of data generation, and a high level of resources directed to supporting functions during the acquisition and ingestion stages of curation (Beagrie et al., 2008)
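As an illustration of what producing metadata "at the point of data generation" might look like in practice, the following minimal Python sketch builds a small descriptive record alongside a freshly generated data set. The field names, the `build_metadata` and `to_sidecar` helpers, and the sample data are all hypothetical and are not taken from the report or from any community standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_metadata(data: bytes, creator: str, description: str, units: dict) -> dict:
    """Return a minimal descriptive record for a freshly generated data set.

    The field names here are illustrative assumptions, not a standard schema.
    """
    return {
        "creator": creator,
        "description": description,
        "units": units,
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "size_bytes": len(data),
        # A checksum lets a repository verify the data on ingestion.
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def to_sidecar(record: dict) -> str:
    """Serialize the record as JSON text suitable for a sidecar file."""
    return json.dumps(record, indent=2, sort_keys=True)


# Hypothetical sample data standing in for an instrument or simulation output.
sample = b"time_s,temperature_K\n0.0,300.0\n0.1,1450.0\n"
record = build_metadata(
    sample,
    creator="example-lab",
    description="Hypothetical ignition trace",
    units={"time_s": "s", "temperature_K": "K"},
)
print(to_sidecar(record))
```

Capturing the record in the same step that writes the data keeps the cost to the scientist low, which is the concern the passage raises; a real repository would define its own required fields.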
From page 30...
... Some high-value functions for the combustion community include the registration and certification of data sets and an awareness of research production trends, features that have often emerged as unintended outcomes of repository development rather than by design. There is a real opportunity in CI planning to exploit these capabilities to the fullest.
From page 31...
... Participation can be most readily extended through existing social networks, but attention to incentives for encouraging contributions from more disconnected groups can bring in valuable data, technologies, techniques, resources, and expertise for solving combustion research problems. As linkages and complexity increase, there is a continual need for the translation of requirements and contributions across the fields represented in the growing user base.
From page 32...
... Below, two of the most prominent platforms that the combustion community might consider as components of its CI are briefly examined: science gateways, represented by the nanoHUB, and cloud computing. The development effort will also require leveraging and link ...
15 Available at http://www.nsf.gov/pubs/2007/nsf07601/nsf07601.htm.
From page 33...
... Features common to most gateways include the following: tools for work flow management, so that applications can be rapidly composed from existing components; the ability to personalize the gateway; extensive documentation; and help-desk processes. The TeraGrid organization has a small staff of professionals available to help set up a portal for a science gateway.
From page 34...
... The underlying technology platform, HUBzero, now powers eight different "hubs" in various scientific and engineering disciplines and is being readied for an open-source release. At its core, a "hub" is a Web site built with familiar open-source packages: a Linux system running an Apache Web server with a Lightweight Directory Access Protocol for user logins, PHP Web scripting, the Joomla! content-management system, and a MySQL database for storing content and usage statistics.
From page 35...
... 22 The Rappture (Rapid APPlication infrastrucTURE) toolkit provides the basic infrastructure for the development of a large class of scientific applications, allowing scientists to focus on their core algorithm.
From page 36...
... It does well at the dissemination of new research methods -- especially those that are concretely instantiated in a simulation tool that often meets the needs of experimentalists and promotes collaboration with them. As the field matures, computational demands are increasing, and the nanoHUB is being challenged to make cloud computing work for a community of people focused on solving problems and exploring ideas rather than on computational science per se.
From page 37...
... Cloud Computing Science gateways and portals provide a Web interface to scientific data and data-analysis tools. These tools are traditionally hosted on small servers that sit in a researcher's laboratory.
From page 38...
... This is especially important when the same sequence of tasks must be accomplished for hundreds of different input data collections. In the early 2000s, a number of scientific work flow tools were developed (Taylor et al., 2007)
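The idea of applying one fixed sequence of tasks to many input data collections can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the step functions (`drop_negative`, `normalize`), the `run_workflow` helper, and the sample collections are hypothetical and do not represent any of the work flow tools cited above.

```python
from typing import Callable, Dict, List

# A step takes a list of values and returns a new list of values.
Step = Callable[[List[float]], List[float]]


def run_workflow(steps: List[Step], data: List[float]) -> List[float]:
    """Apply each step in order, feeding the output of one into the next."""
    for step in steps:
        data = step(data)
    return data


def drop_negative(values: List[float]) -> List[float]:
    """Hypothetical cleaning step: discard negative readings."""
    return [v for v in values if v >= 0.0]


def normalize(values: List[float]) -> List[float]:
    """Hypothetical scaling step: divide by the largest value, if positive."""
    peak = max(values, default=0.0)
    return [v / peak for v in values] if peak > 0.0 else list(values)


# The work flow is defined once and reused for every input collection.
WORKFLOW: List[Step] = [drop_negative, normalize]

collections: Dict[str, List[float]] = {
    "run_001": [1.0, -2.0, 4.0],
    "run_002": [0.5, 0.25],
}

results = {name: run_workflow(WORKFLOW, values) for name, values in collections.items()}
for name, output in results.items():
    print(name, output)
```

Because the sequence is data rather than hand-run commands, the same definition can be applied to two collections or to hundreds without change, which is the benefit the passage describes.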
From page 39...
... 2010. "Data Sharing, Small Science, and Institutional Repositories." Philosophical Transactions of the Royal Society A
From page 40...
... Available at http://eprints.erpanet.org/82/01/DCC_Vision.pdf. Accessed December 19, 2010.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.