Summary
The National Center for Science and Engineering Statistics (NCSES)
of the National Science Foundation (NSF) communicates its science
and engineering information to data users in a very fluid environment.
That environment is modernizing so quickly that data producers’
dissemination practices, protocols, and technologies, on the one hand, and user
demands and capabilities, on the other, are changing faster than the agency
has been able to accommodate.
NCSES asked the Committee on National Statistics and the Computer
Science and Telecommunications Board of the National Research Council
to form a panel to review the NCSES communication and dissemination
program that is concerned with the collection and distribution of informa-
tion on science and engineering and to recommend future directions for the
program according to its statement of task (see Box S-1).
The Panel on Communicating National Science Foundation Science
and Engineering Information to Data Users reviewed NCSES’s existing
approaches to communicating and disseminating statistical information,
including the division’s information products, website, and database sys-
tems; examined existing NCSES data on websites, information gathered by
and from NCSES staff, volunteered comments of users, and input solicited
by the panel from key user groups; assessed the varied needs of different
types of users in the NCSES user community; considered the impact that
current federal and NSF website guidance and policies have on the design
and management of the NCSES online (Internet) communication and dis-
semination program; considered current research and practice in collecting,
storing, and utilizing metadata, with particular focus on specifications for
social science metadata; and considered the impact of government-wide
activities and initiatives (such as FedStats, Data.gov) and the emerging user
capability for online retrieval of government statistics.
BOX S-1
Statement of Task
An ad hoc panel will review the communication and dissemination
program of the National Science Foundation (NSF) National Center for
Science and Engineering Statistics (NCSES) that is concerned with the
collection and distribution of information on science and engineering and
recommend future directions for the program. Specifically, the panel will
1. Review NCSES’s existing approaches to communicating and
disseminating statistical information, including the division’s
information products, website, and database systems. [This review will be
conducted in the context of both current “best practices” and new
and emerging techniques and approaches.]
2. Examine existing NCSES data on websites, information gathered
by and from NCSES staff, volunteered comments of users, and
input solicited by the panel from key user groups, and assess
the varied needs of different types of users within NCSES’s user
community.
3. Consider the impact current federal and NSF website guidance
and policies have on the design and management of the NCSES’s
online (Internet) communication and dissemination program.
4. Consider current research and practice in collecting, storing, and
utilizing metadata, with particular focus on specifications for social
science metadata developed under the Data Documentation
Initiative (DDI).
5. Consider the impact of government-wide activities and initiatives
(such as FedStats, Data.gov) and the emerging user capability for
online retrieval of government statistics.
The panel will facilitate its review by conducting a 2-day public
workshop that will feature invited presentations and discussions. The panel
will subsequently prepare an interim letter report that will focus on
issues regarding transition from current approaches and a final report with
specific recommendations, including a discussion of related technical,
staffing, and funding issues.
In accomplishing this review, the panel conducted two workshops to
gather information from data users and experts on various aspects of data
storage, retrieval, dissemination, and archiving. An interim report issued
early in 2011 addressed data content and presentation, meeting changing
storage and retrieval standards, understanding data users and their emerg-
ing needs, and data accessibility. The analysis and recommendations from
the interim report are carried into this final report, along with the results
of subsequent analysis by the panel.
These are exciting and challenging times for federal government statistical
agencies responsible for disseminating their data products to their user
communities. The times are especially challenging for NCSES, which
is finding the importance of its data magnified manyfold by the growing
recognition of the role that science and engineering investment plays as
a source of economic growth. The vision of a data dissemination program
for NCSES is also in flux. The agency is confronting new roles
and missions, as directed in the America COMPETES Reauthorization Act
of 2010, which changed more than the agency’s name. Technology is also opening
the door to significant leaps in the ability of NCSES to communicate data
and analytical products to data users. The promise of such services as
Data.gov and the emergence of such private-sector solutions as the Google Public
Data Explorer are just becoming recognized. The semantic web (Web 3.0)
holds promise of communicating data to users in entirely new ways, much
to the advantage of users and the federal agencies themselves (Berners-Lee
and Hendler, 2001). These technological advances open the way to new
opportunities, but they are also problematic in that they are rapidly
promulgated and often rapidly become obsolete. The panel suggests
that NCSES adopt an approach to modernization that stresses the
basics of data provision (common formats with appropriate metadata) and
partnerships with the private sector as opportunities become available, so
that NCSES will avoid the rapid obsolescence associated with rapid
change in the particular tools and systems offered by the private sector.
In the face of these environmental and technological forces, we make
a number of recommendations to the National Center for Science and
Engineering Statistics to improve its dissemination program. The first set
of recommendations has to do with how the survey-based data are received
and input into the NCSES database, managed once there, and preserved for
posterity. (The recommendations are numbered as they appear in the body
of this report.)
Recommendation 3-1. The National Center for Science and Engineer-
ing Statistics should incorporate provisions in contracts with data
providers for the receipt of versioned microdata, at the level of detail
originally collected, in open machine-actionable formats.
Recommendation 3-2. The National Center for Science and Engineer-
ing Statistics should transition to a dissemination framework that
emphasizes database management rather than data presentation and
strive to use auditable machine-actionable means, such as version con-
trol, to ensure integrity of the data and make the provenance of the
data used in publications verifiable and transparent.
Recommendation 3-3. The National Center for Science and Engineer-
ing Statistics (NCSES) should require that data received from contrac-
tors be accompanied by machine-actionable metadata so as to allow
for automated production of NCSES publications, comparability with
previous analysis, and efficient access for third-party visualization,
integration, and analysis tools.
Recommendation 3-4. The National Center for Science and Engineer-
ing Statistics should proceed to make its data available through open
interfaces and in open formats compatible with efficient access for
third-party visualization, integration, and analysis tools.
Recommendation 3-5. The National Center for Science and Engineer-
ing Statistics should develop a plan for redesign of its retrieval tools
utilizing the emerging, sustainable capabilities of other government and
private-sector resources.
Recommendation 3-6. The National Center for Science and Engineer-
ing Statistics (NCSES) should work with the National Archives and
Records Administration (NARA) to ensure long-term access and preser-
vation of all of its publications and all data necessary to replicate these
publications. As a necessary step, NCSES should review and update
the request for disposition authority that is filed with NARA to ensure
prompt and complete disposition of records and should regularly
review the status of compliance with the records retention directive.
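As a rough illustration of what Recommendations 3-1 through 3-3 could mean in practice, the following Python sketch attaches a version label, a content checksum, and machine-actionable variable descriptions to a small data delivery. It is only a sketch under stated assumptions: the sample records, the field names, the version string, and the `build_manifest` and `verify` helpers are all hypothetical, and the layout is not taken from any NCSES system or from the DDI specification.

```python
import csv
import hashlib
import io
import json

# Hypothetical microdata delivery: two invented records in an open format (CSV).
SAMPLE_CSV = (
    "respondent_id,field,rd_expenditure_usd\n"
    "1001,engineering,250000\n"
    "1002,life_sciences,410000\n"
)

# Hypothetical variable-level metadata. A real delivery might instead
# follow a published standard such as DDI.
VARIABLES = [
    {"name": "respondent_id", "type": "integer", "label": "Anonymized respondent identifier"},
    {"name": "field", "type": "string", "label": "Field of science or engineering"},
    {"name": "rd_expenditure_usd", "type": "integer", "label": "R&D expenditure in U.S. dollars"},
]


def build_manifest(data_text, version, variables):
    """Return a JSON-serializable manifest describing one versioned data file."""
    digest = hashlib.sha256(data_text.encode("utf-8")).hexdigest()
    rows = list(csv.DictReader(io.StringIO(data_text)))
    return {
        "version": version,        # label assigned by the data provider (assumed)
        "sha256": digest,          # checksum used to detect any later change
        "row_count": len(rows),
        "variables": variables,
    }


def verify(data_text, manifest):
    """Check the data against the checksum and column names in the manifest."""
    digest = hashlib.sha256(data_text.encode("utf-8")).hexdigest()
    header = next(csv.reader(io.StringIO(data_text)))
    declared = [v["name"] for v in manifest["variables"]]
    return digest == manifest["sha256"] and header == declared


manifest = build_manifest(SAMPLE_CSV, "2011.1", VARIABLES)
print(json.dumps(manifest, indent=2))
print(verify(SAMPLE_CSV, manifest))
```

Because the checksum and the variable list travel with the data, a publication could cite the manifest version it was built from, and anyone holding the file could confirm that it is the same file.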
Engaging with its data users is an essential activity for NCSES. There is
much that can be done to make that engagement more productive.
Recommendation 4-1. The National Center for Science and Engineering
Statistics (NCSES) should analyze the results of its initial online con-
sumer survey and refine it over time. Using input from other sources,
such as regular structured user focus groups and panel-based periodic
user surveys, NCSES should regularly and systematically collect and
analyze patterns of data use by web users in order to develop a typol-
ogy of data users and to identify usability issues.
Recommendation 4-2. The National Center for Science and Engineer-
ing Statistics should educate users about the data and learn about the
needs of users in a structured way by reinstating the program of user
workshops and instituting user webinars.
Recommendation 4-3. The National Center for Science and Engineer-
ing Statistics should employ user-focused design and user analysis,
starting with an initial heuristic evaluation and continuing as a regular
and systematic part of its website and tool development.
Recommendation 4-4. The National Science Foundation should spon-
sor research and development on accessible data visualization tools
and approaches and potential other means for browsing and explor-
ing tabular data that can be offered via web, mobile, and tablet-based
applications, or browser-based ones.
The implementation of this report’s recommendations should be
undertaken within an overall framework that accords priority to the basic
quality of the data and the fundamentals of dissemination, then to signifi-
cant enhancements that are achievable in the short term, while laying the
groundwork for other long-term improvements. The framework could be
organized along the following lines (highest priority first):
1. Focus on collecting the right data (by contractor or otherwise);
using appropriate change management and version control to estab-
lish data provenance, flag data errors and correct them; annotating
those data with sufficient machine-actionable metadata to establish
a process for interpreting the data, enabling efficient access to
third-party data and to automated NCSES publications; and pub-
lishing the data in formats with web-accessible open interfaces for
all to use.
2. Publish methods for combining old data and new data that have
been collected under different assumptions or categories or that are
disseminated in ways that make them difficult to reintegrate—this
is especially necessary for the data from the old and new industry
research and development expenditure surveys that will populate
the Industrial Research and Development Information System.
3. Provide the essential data reductions and visualizations that NSF’s
mission requires; for example, when Congress asks for authoritative
data on a certain topic, a trusted group must be able to use the
data and derived publications to calculate answers.
4. Provide a growing array of visualizations and printed products
tailored for the many different uses and users.
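Item 2 in the list above calls for published methods for combining data collected under different categories. One common way to document such a method is a crosswalk table that allocates each old category to one or more new categories by share. The Python sketch below illustrates the idea only: the category codes, shares, and amounts are invented, the `validate_crosswalk` and `reclassify` helpers are hypothetical, and nothing here reflects the actual industry research and development surveys.

```python
from collections import defaultdict

# Hypothetical crosswalk: old category -> list of (new category, share).
# Shares for each old category must sum to 1 so that totals are preserved.
CROSSWALK = {
    "OLD_A": [("NEW_1", 1.0)],
    "OLD_B": [("NEW_1", 0.25), ("NEW_2", 0.75)],
}


def validate_crosswalk(crosswalk, tolerance=1e-9):
    """Raise ValueError if any old category is not fully allocated."""
    for old_code, targets in crosswalk.items():
        total = sum(share for _, share in targets)
        if abs(total - 1.0) > tolerance:
            raise ValueError(f"shares for {old_code} sum to {total}, not 1")


def reclassify(old_totals, crosswalk):
    """Re-express totals reported under old categories in the new categories."""
    validate_crosswalk(crosswalk)
    new_totals = defaultdict(float)
    for old_code, amount in old_totals.items():
        for new_code, share in crosswalk[old_code]:
            new_totals[new_code] += amount * share
    return dict(new_totals)


old_series = {"OLD_A": 100.0, "OLD_B": 200.0}  # invented amounts
new_series = reclassify(old_series, CROSSWALK)
print(new_series)  # {'NEW_1': 150.0, 'NEW_2': 150.0}
```

Publishing the crosswalk itself alongside the data, rather than only the converted series, would let users see and challenge the allocation assumptions.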
Not every recommendation made in this report can or should be imple-
mented immediately. Some recommendations must build on the implemen-
tation of others; for example, development of a database structure that
can support accessibility through the semantic web requires that NCSES
obtain data from its contractors in different formats than are now received
and that it define metadata to accompany the data elements. We therefore
suggest a time-phased approach to improving data dissemination, focusing
on five major initiatives:
1. Change the means and content of the data received from contractors
and actively participate in the development and implementation
of the Data.gov-compatible metadata standard now being
explored by W3C and the SCOPE project.
2. Gain a better understanding of the needs of users of the data—
those primary, secondary, and tertiary blocks of users—and then
use the information to engage them in an effort to educate them
and otherwise meet their needs.
3. Conduct a continuous usability evaluation program, akin to
the program of continuous improvement that is part and parcel of
any total quality management program.
4. Provide data in retrievable formats and encourage private-sector
providers and individual users to import the data into their visual-
ization tools.
5. Ensure full short- and long-term access to the data by updating its
retrieval tools and ensuring proper archiving of its publications and
database.