COMMUNICATING SCIENCE AND ENGINEERING DATA
1
The Changing Data Dissemination Landscape
The National Center for Science and Engineering Statistics (NCSES) of
the National Science Foundation (NSF) communicates its science and
engineering (S&E) information to data users in a very fluid environ-
ment in which data dissemination practices, protocols, and technologies,
on the one hand, and user demands and capabilities, on the other, are changing
faster than the agency has been able to accommodate. In this chapter, we
discuss how strong forces are driving changing expectations on the part of
users of S&E resource and workforce data, as well as how technology and
a changing policy and analytical environment in the federal government are
forcing NSF to rethink and modernize the manner in which NCSES com-
municates information to the public.
To help understand how NCSES can respond to the driving forces
that we document, we also discuss the environment that NCSES, like federal
statistical agencies in general, faces. For NCSES, this environment is
determined by policies established by the Office of Management and Budget
(OMB) and NSF and by its own policies and procedures that have evolved over
the years.
S&E INVESTMENT AND ECONOMIC GROWTH
Much of the pressure that NCSES faces to modernize the way it dis-
seminates information stems from the subject matter itself. It has become
increasingly understood that investment in research and development
(R&D) creates a platform for innovation and that innovation is a major
determinant of national economic competitiveness and growth. It has likewise
been increasingly apparent that an associated major determinant is
human capital, represented in the output of programs of education and
training for the S&E workforce.
The relationship of innovation and science, technology, engineering,
and mathematics (STEM) education has been recognized in several major
reports, and these reports have formed the basis for major program initiatives.
The report Rising Above the Gathering Storm: Energizing and
Employing America for a Brighter Economic Future concluded that a primary driver
of the future economy and concomitant creation of jobs will be innova-
tion, largely derived from advances in science and engineering (National
Academy of Sciences, National Academy of Engineering, and Institute of
Medicine, 2007). Underscoring the case for R&D investment is the conclu-
sion by the National Science Board that “while only four percent of the
nation’s work force is composed of scientists and engineers, this group
disproportionately creates jobs for the other 96 percent” (National Science
Board, 2010a, Figure 3-3).
The 2010 follow-up report to the Gathering Storm report further con-
cluded that “substantial evidence continues to indicate that over the long
term the great majority of newly created jobs are the indirect or direct result
of advancements in science and technology, thus making these and related
disciplines assume what might be described as disproportionate impor-
tance” (National Academy of Sciences, National Academy of Engineering,
and Institute of Medicine, 2010, p. 18).
The conclusions of these reports are based on analysis that relies heav-
ily on the data that are produced by NCSES. Indeed, the need for good
data on science and engineering was recognized as a principle for com-
petitiveness in another recent report, which concluded that “benchmarking
national competitiveness across a set of established and forward looking
metrics—measuring both inputs such as education, R&D spending, patents
and outputs such as job creation, new industries and products, gross domes-
tic product growth and quality of life—is necessary to drive the successful
development and implementation of appropriate competitiveness policies”
(Global Confederation of Competitiveness Councils, 2010, p. 3).
The three pillars on which the White House Strategy for American
Innovation is built (education, research, and private-sector innovation1)
are topics on which NCSES now collects data. The White House strategy
focuses on educating the next generation with 21st century skills, creating
a world-class workforce, and strengthening and broadening American
leadership in fundamental research. In order to measure progress in educating
the next generation, data are needed on progress in STEM education and
its outcomes. Assessing the place of American leadership in fundamental
research requires data on investments in fundamental science by the public
and private sectors, as well as information on the nature and benefits of
federally funded investments in research. The White House strategy also
requires measuring private-sector innovation expenditures (via the Business
Research and Development and Innovation Survey).
1 See http://www.whitehouse.gov/innovation/strategy/executive-summary [November 2011].
Recent legislation also underscored the importance of NCSES data.
The America Creating Opportunities to Meaningfully Promote Excellence
in Technology, Education, and Science (America COMPETES) Reauthori-
zation Act of 2010 requires “a comprehensive study of the economic com-
petitiveness and innovative capacity of the United States.” This law, among
other initiatives, changed the name and mission of NCSES (see below). It
strongly emphasized the need for improving the current competitive
and innovation performance of the U.S. economy relative to other countries
that compete economically with it; coming to grips with regional issues
that influence the economic competitiveness and innovation capacity of the
United States; and evaluating the effectiveness of the federal government in
supporting and promoting economic competitiveness and innovation. All
of these initiatives require access to the kind of information that NCSES
produces in its data collections.
A BROADER MISSION FOR NCSES
The new emphasis on innovation and competitiveness has been
reflected in the new mission statement for NCSES. Not only was the Science
Resources Statistics Division (SRS) renamed the National Center for Sci-
ence and Engineering Statistics by Section 505 of the America COMPETES
Act, but also new roles and missions were assigned. Several words in the
new mission statement signal this new direction: serve as a “central Federal
clearinghouse” for the collection, interpretation, analysis, and “dissemina-
tion” of objective data on science, engineering, technology, and research
and development. NCSES expects to use the findings and recommendations
in this report in determining how best to implement its new dissemination
mandate.
According to the America COMPETES Act, the dissemination func-
tion is to cover “data related to the science and engineering enterprise in
the United States and other nations that is relevant and useful to practitio-
ners, researchers, policymakers, and the public, including statistical data
on—(A) research and development trends; (B) the science and engineering
workforce; (C) United States competitiveness in science, engineering, tech-
nology, and research and development; and (D) the condition and progress
of United States STEM education.” Data collections related to U.S. com-
petitiveness and STEM education are part of these new responsibilities.
We note that these new roles and responsibilities came without additional
resources in terms of budget or staff.
The next two sections present examples of the role that NCSES data
play in supporting initiatives to develop federal R&D indicators.
SCIENCE OF SCIENCE POLICY
The need for science and engineering metrics has been embedded in
the NSF Science of Science and Innovation Policy (SciSIP) program, as originally
articulated by John H. Marburger III, the former director of the Office of
Science and Technology Policy (OSTP) and presidential science adviser.
According to the agency’s description, “the SciSIP program underwrites
fundamental research that creates new explanatory models, analytic tools
and datasets designed to inform the nation’s public and private sectors
about the processes through which investments in S&E research are trans-
formed into social and economic outcomes. Or, put another way, SciSIP
aims to foster the development of relevant knowledge, theories, data, tools,
and human capital. SciSIP’s goals are to understand the contexts, structures
and processes of S&E research, to evaluate reliably the tangible and intan-
gible returns from investments in R&D, and to predict the likely returns
from future R&D investments within tolerable margins of error and with
attention to the full spectrum of potential consequences” (National Science
Foundation, 2008).
The STAR METRICS (Science and Technology for America’s Reinvest-
ment: Measuring the EffecTs of Research on Innovation, Competitiveness
and Science) program is led by an interagency consortium consisting of the
National Institutes of Health (NIH), NSF, and OSTP (Lane and Bertuzzi,
2010). The goal of the program is to create a data infrastructure that will
permit the analysis of the impact of science investments using administra-
tive records as well as other electronic sources of data. The program will
have two phases. The first phase will use university administrative records
to calculate the employment impact of federal science spending through the
American Recovery and Reinvestment Act and agencies’ existing budgets.
The second phase will measure the impact of science investment in four
key areas:
• Economic growth will be measured through such indicators as
patents and business start-ups.
• Workforce outcomes will be measured by student mobility into the
workforce and employment markers.
• Scientific knowledge will be measured through publications and
citations.
• Social outcomes will be measured by the long-term health and
environmental impact of funding.
The metrics derived from the NCSES surveys are essential inputs to
such science, innovation, and competitiveness metrics. The emphasis on
metrics has been adopted and codified as a key element in the NSF Strategic
Plan for 2011 to 2016 (National Science Foundation, 2011, p. 9).
RESEARCH AND DEVELOPMENT DASHBOARD
As this report was being prepared, OSTP further underscored the
importance of innovation to the economy by announcing the launch of an
online tool that permits tracking of U.S. progress in innovation. The R&D
Dashboard is a website that demonstrates the impacts of federal invest-
ments in R&D (Koizumi, 2011).
The initial R&D Dashboard website presents data on federal R&D
awards to research institutions and links those inputs to outputs
(specifically publications, patent applications, and patents produced by
researchers funded by those investments) from two agencies, NIH and NSF,
over the decade from 2000 to 2009. These two agencies play a significant role
in funding basic research in the United States; together they provide, for
example, more than 80 percent of the federal government’s support of
university-based research. The site gathers information from two federal sites,
USASpending.gov and IT.USASpending.gov, and has information on R&D
investments at the state, congressional district, and research institution lev-
els. Information that feeds the Dashboard from these two sites, however, is
not being updated because of funding cuts.2
The OSTP R&D Dashboard is designed to answer questions of the
following kind: Which institutions by state are performing federally funded
research? What fields of science are emphasized locally? Where are the
hot spots for robotics, for example, or optical lasers, or advanced textiles
resulting from federally funded research? How are federal research grants
contributing to the scientific literature by field of science?
The Dashboard is regarded as a first step. OSTP plans to explore fundamental
changes in how data on R&D are made available to the public. As
in other areas included in the push for greater transparency, the emphasis
will be on testing models for making R&D-related data from contributing
agencies available in ways that are secure, interoperable, and usable by a
wide array of potential users. The initial emphasis will be to coordinate
further development with coordinating bodies supported by OSTP, including
the National Nanotechnology Initiative and the National Coordination
Office (NCO) for Networking and Information Technology Research and
Development (NITRD).
2 See http://www.washingtonpost.com/blogs/federal-eye/post/new-cios-role-will-be-belt-tightening/2011/03/23/gIQAdTjbuI_blog.html [August 2011].
INTERNET TRANSFORMS THE DISSEMINATION ENVIRONMENT
In the realm of information dissemination, the Internet has been chang-
ing everything for some time. The ongoing radical transformation in the
modes of data dissemination has profound implications for NCSES.
More than 15 years ago, the OMB’s Federal Committee on Statisti-
cal Methodology (FCSM) recognized the growing presence of electronic
options for data dissemination in a report entitled Electronic Dissemination
of Statistical Data (Office of Management and Budget, 1995). The authors
of this report were quite prescient in noting that the rapid expansion of
computer technology had “led to vast changes in the supply of and demand
for Federal statistical data. Technology is no longer the primary barrier
between users and information.” The authors forecast even further changes
with the advent of a national information infrastructure that would have
even greater impact. The report concluded that statistical agencies would
need “to adopt new methods of disseminating statistical information and
data to replace the traditional means that used to serve as the principal
source of statistical information” (p. 1).
The day foretold by the FCSM committee has long since arrived. The
current choices are no longer between paper publications and electronic
dissemination, but between various modes of and options for electronic dis-
semination. Like many other statistical agencies, NCSES has, except for a
few special publications, largely abandoned hard-copy publication of its data.
Now there are a multitude of choices among electronic means of retrieving
reports and data elements—the most prominent of these choices for the fed-
eral statistical agencies today are FedStats and Data.gov, which are discussed
in Chapter 2.
From a handful of interconnected government and university research
computers, the Internet has grown to near ubiquity, and today’s users
search the web for more information than was available in the past.3 More-
over, with the increased availability of broadband and high-speed Internet
access, dynamic, multimedia-laden websites are replacing formerly static
web pages, with the consequence that users have the expectation of being
able to interact with the information for which they are searching.
3 See http://www.pewinternet.org/~/media//Files/Reports/2008/PIP_Search_Aug08.pdf [November 2011].
In addition, a recent poll by the Pew Internet and American Life Project
showed that access to the Internet is quickly becoming “untethered”4 and
users are turning to smartphones and other mobile devices for access to
the World Wide Web, social networking, and email. As a consequence,
those who disseminate information will need to react to these changes by
continuing to leverage the newest means to access and interact with
information on the web.
The U.S. government has made a number of mobile applications avail-
able on the USA.gov website. Several agencies have developed a mobile
edition of their website, an abridged version available to users of smartphones,
tablets, personal digital assistants (PDAs), and other mobile handheld
devices. Taking into account the information that it is beginning to collect
in the online survey of data users (described in Chapter 4), NCSES could
profitably consider mobile versions of its web presence, perhaps beginning
with the development of a mobile application for its announcements of
product releases and the InfoBrief series.
FEDERAL GOVERNMENT DATA DISSEMINATION POLICIES
As a federal statistical agency, NCSES operates within a set of OMB
guidelines that cover a wide variety of statistical practices, from survey
design to data collection to dissemination. The federal government’s poli-
cies regarding dissemination of information to the public are promulgated
by OMB under the authority of the Paperwork Reduction Act (PRA) of
1980, Public Law 96-511, as amended by the Paperwork Reduction Act
of 1995, Public Law 104-13 (44 USC Chapter 35). The PRA mandate is broad,
calling on agencies to “perform their information activities in an efficient,
effective, and economical manner” (Office of Management and Budget,
2000).
Under policies issued under this authority and published in OMB Circular A-130, NCSES is
required to (a) disseminate information in a manner that achieves the best
balance between the goals of maximizing the usefulness of the information
and minimizing the cost to the government and the public; (b) distribute
information dissemination products on equitable and timely terms; (c) take
advantage of all dissemination channels, federal and nonfederal, includ-
ing state and local governments, libraries, and private-sector entities, in
discharging agency information dissemination responsibilities; and (d) help
the public locate government information maintained by or for the agency.
NCSES is also called on to maintain and implement a management
system for all information dissemination products that ensures that
members of the public with disabilities, whom the agency has a responsibility
to inform, have a reasonable ability to access those products and that
the products are provided to the U.S. Government Printing
Office for distribution to depository libraries. Electronic information
dissemination is encouraged.
4 See http://www.pewinternet.org/Commentary/2010/September/Technology-Trends-Among-People-of-Color.aspx [November 2011].
These broad guidelines of Circular A-130 are further detailed in the
OMB standards and guidelines for statistical surveys (Office of Manage-
ment and Budget, 2006). The standards suggest that, when information
products are disseminated, NSF should provide users with access to the
following information:
1. definitions of key variables;
2. source information, such as a survey form number and description
of methodology used to produce the information or links to the
methodology;
3. quality-related documentation, such as conceptual limitations and
nonsampling error;
4. variance estimation documentation;
5. time period covered by the information and units of measure;
6. data taken from alternative sources;
7. point of contact to whom further questions can be directed;
8. software or links to software needed to read/access the information
and installation/operating instructions, if applicable;
9. date the product was last updated; and
10. standard dissemination policies and procedures.
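As an illustration only, the ten documentation items listed above could be tracked for each information product as a simple checklist. The sketch below is a hypothetical example, not an NCSES or OMB tool: the class name, field names, and the choice to treat only item 8 (software, which the list marks "if applicable") as optional are all assumptions made for the example.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DisseminationMetadata:
    """Hypothetical record of the ten documentation items (names are illustrative)."""
    variable_definitions: Optional[str] = None    # item 1
    source_information: Optional[str] = None      # item 2
    quality_documentation: Optional[str] = None   # item 3
    variance_estimation: Optional[str] = None     # item 4
    time_period_and_units: Optional[str] = None   # item 5
    alternative_sources: Optional[str] = None     # item 6
    point_of_contact: Optional[str] = None        # item 7
    software_requirements: Optional[str] = None   # item 8, "if applicable"
    last_updated: Optional[str] = None            # item 9
    dissemination_policies: Optional[str] = None  # item 10


# Assumption: only item 8 is conditional, because the list says "if applicable".
OPTIONAL_ITEMS = {"software_requirements"}


def missing_items(record: DisseminationMetadata) -> List[str]:
    """Return the names of required items that are empty or not provided."""
    return [
        f.name
        for f in fields(record)
        if f.name not in OPTIONAL_ITEMS and not getattr(record, f.name)
    ]


if __name__ == "__main__":
    # A partially documented product: only two of the nine required items are filled in.
    record = DisseminationMetadata(
        variable_definitions="See the product's glossary",
        point_of_contact="Survey manager (placeholder)",
    )
    for name in missing_items(record):
        print("missing:", name)
```

A check of this kind could run before a product is posted, so that gaps in documentation are caught at release time rather than discovered by users.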
NATIONAL SCIENCE FOUNDATION GUIDELINES
As an operating organization in NSF, NCSES must adhere to the NSF
guidelines regarding the quality of data disseminated to the public. These guide-
lines were developed to comply with OMB-issued government-wide guidelines
under Section 515 of the Treasury and General Government Appropriations
Act for Fiscal Year 2001 (P.L. 106-554), which were designed to ensure and
maximize the quality, objectivity, utility, and integrity of information dissemi-
nated by federal agencies.
Under NSF guidelines, utility is achieved by staying informed of both
internal and external information needs and by developing new data or
information products when appropriate. This is a multifaceted process,
involving keeping abreast of information needs by conducting internal
analyses of information requirements, convening and attending confer-
ences, working with advisory committees and committees of visitors, and
sponsoring outreach activities. The NSF guidelines require review of ongo-
ing publication series and other information products on a regular basis to
ensure that they remain relevant and address current information needs.
Integrity guidelines cover the security of information. They are
designed to ensure that information is protected from unauthorized
access or revision, so that the information that is disseminated is
not compromised through corruption or falsification.
NSF also includes objectivity in its guidelines. This is a focus on ensur-
ing that information that is disseminated is accurate, reliable, and unbiased
and that information products are presented in an accurate, clear, complete,
and unbiased manner. Objectivity is achieved by presenting the information
in the proper context, identifying the sources of the information (to the
extent possible, consistent with confidentiality protections), using reliable
data and sound analytical techniques, and preparing information prod-
ucts that are carefully reviewed. These guidelines call for the inclusion of
metadata (information about the data), in that all original and supporting
data sources used in producing statistical data products should be clearly
identified and documented, either in the publication or on each individual
table. The metadata will generally include specification of variables used,
definitions of variables when appropriate, coverage or population issues,
sampling errors, disclosure avoidance rules or techniques, confidentiality
constraints, and data collection techniques.
DATA RELEASE POLICY
A decade and a half ago, the predecessor agency to NCSES issued a
policy on data release that was based on a consumer survey of data rel-
evance and quality that led to a review by an internal Customer Service
Task Force (National Center for Science and Engineering Statistics, 1994).
This was the first of two consumer surveys; a second, conducted in 1996,
was summarized in Measuring the Science and Engineering Enterprise:
Priorities for the Science Resources Studies Division (National Research
Council, 2000, p. 42). The consumer studies have not been repeated since.
The 1994 data release policy statement declared that its objectives were
to encourage the timely release of SRS (NCSES) survey data, ensure that the
released data meet SRS standards for “releasability,” and ensure that NSF
management knows when the data are to be released.
According to this policy statement, the main vehicle for release of
timely data was the Data Brief, which was designed to publicize the data and
provide a targeted group of users with some understanding of the
implications of the data. The goal was to produce timely and accurate data,
with accuracy defined as being free from such flaws as gross typographical
errors or methodological mistakes and appearing plausible. Procedures for
internal clearance were also outlined.
OTHER GUIDELINES
Finally, the panel suggests that NCSES consider, in conducting its dis-
semination program, the dissemination guidelines outlined in Principles
and Practices for a Federal Statistical Agency (National Research Council,
2009). In regard to dissemination, this volume states that a statistical
agency should strive for the widest possible dissemination of the data it
compiles. Data dissemination should be timely and public. Furthermore,
measures should be taken to ensure that data are preserved and accessible
for use in future years. Elements of an effective dissemination program
include the following:
• An established publications policy that describes, for a data col-
lection program, the types of reports and other data releases to be
made available, the audience to be served, and the frequency of
release.
• A variety of avenues for data dissemination, chosen to reach as
broad a public as reasonably possible. Channels of dissemination
include, but are not limited to, an agency’s Internet website, gov-
ernment depository libraries, conference exhibits and programs,
newsletters and journals, email address lists, and the media for
regular communication of major findings.
• Release of data in a variety of formats, including printed reports,
easily accessible website displays and databases, public-use micro-
data5 and other publicly available computer-readable files, so that
the information can be accessed by users with varying skills and
needs for data retrieval and analysis. All data releases should be
suitably processed to protect confidentiality, with careful and com-
plete documentation.
• For research and other statistical purposes, access to relevant
information that is not publicly available through restricted access
modes that protect confidentiality. Such modes include protected
research data centers, remote monitored online access for special
tabulations and analyses, and licensing of individual researchers
to allow them to use confidential data on their desktop computers
under stringent arrangements to ensure that no one else can access
the information.
• Procedures for release of information that preclude actual or
perceived political interference. In particular, the content and timing of
the public release of data should be the responsibility of
the statistical agency, and the agency or unit that produces the data
should publish in advance and meet release schedules for important
indicators to prevent even the appearance of manipulation of
release dates for political purposes.
• Policies for the preservation of data that guide what data to retain
and how they are to be archived for future secondary analysis.
5 In the National Research Council report, and throughout this report, the term “microdata”
is defined in the statistical sense, that is, microdata are data on the characteristics of units of a
population, such as individuals, households, or establishments, collected by a census, survey,
or experiment (U.S. Bureau of the Census, 1998).