Committee on National Statistics
Act (H.R. 109) to tighten restrictions on exchange of information in such
fields as computer technology (Kolata, 1981).
(3) It has also been proposed to have scientific work reviewed by federal
agencies on a voluntary basis prior to publication. Such a voluntary review
system is now in effect in the field of cryptanalysis.
Although published unclassified data are exempt, researchers fear restraint of
scholarly inquiry, and professional societies, among others, are objecting,
since information presented at scientific meetings may not be exempt
(Marshall, 1981).
The conflicting pressures of national security and open science have recently
aroused much interest in the general press as well as in scientific circles.
The National Academy of Sciences announced in March 1982 the appointment
of a broadly based panel of senior policy makers and researchers to examine
the relationship between university research and national security in
light of the growing concern that foreign nations are gaining military advan-
tages from American research. The panel's September 1982 report recom-
mended guidelines that would allow government-funded, academically based
scientific research to be performed without restriction, except for research in
narrowly defined areas of technologies that could not justifiably be either
classified or completely open (Committee on Science, Engineering, and Public
Policy, 1982). In an assessment of policy developments 18 months after the
panel report was issued, Wallerstein (1984) concluded that "the reach of res-
trictions either proposed or in force go considerably beyond the panel's
recommendation." Since then, the Department of Defense has indicated that it
would not further restrict publication of militarily sensitive but unclassified
research: control of fundamental research in science and engineering at
universities and federal laboratories is to be achieved through classification.
Some scientists fear, however, that more research will be classified
(Goodwin, 1984).
CONCLUSIONS AND RECOMMENDATIONS
". . . the best security for the fidelity of mankind is to make their interest
coincide with their duty."
Alexander Hamilton
The Federalist Papers, No. 72
Most scientific advances are not solely the result of separate, individual
efforts. As society turns to science with ever more problems, solutions are
interdisciplinary and require the contribution of many investigators. At the
same time, scientists are becoming more specialized. Sharing data can pro-
vide opportunities for interdisciplinary approaches to problems and, even
Sharing Research Data
within the same discipline, the sometimes synergistic result of different peo-
ple thinking about the same or similar problems.
Because of the promise for eventual solutions to important problems, as
well as the benefits of increased knowledge and understanding, society sup-
ports science. Sharing data offers efficient use of research funds by allowing
further discoveries to be recovered from data that have already been collected
at great expense and that otherwise would not be used further. There are
many other important benefits to science from sharing data. A primary one is
that sharing data provides for further theories, methods, and results. Sharing
data also tends to correct inadvertent error and to discourage fraud.
But there are potential costs for an investigator who provides data to others:
costs of time, money, and inconvenience; fears of possible criticism, whether
justified or not; possible violations of trust by a breach of confidentiality; and
forgoing recognition or profit from possible further discoveries.
In some circumstances initial investigators are required to share data in
accordance with the rules of their employing institutions or the terms of their
grants. In many cases, however, whether data are shared and the extent to
which they are shared depend on the decisions of individual scientists.
Professional societies, organizations that publish scholarly journals, research
institutions, and foundations and other organizations that fund research can
encourage, facilitate, and even reward the sharing of data, although they
seldom prescribe the behavior of individual scientists.
These considerations led the Committee on National Statistics to make the
following general recommendations.
Recommendation 1. Sharing data should be a regular practice.
The advantages of data sharing are sufficient to warrant considerable
attention to ways to share data without imperiling privacy or breaching the
confidentiality promised to data providers. We share the views of Jowell
(1981:14):
Flaherty (1979, p. 307), in his definitive international survey of measures to
enhance the confidentiality of microdata, concludes that an "ultimate goal of public
policy in every country should be to encourage custodians to disseminate data and
researchers to use it." As long as the individual is adequately protected, wider
access to data will surely serve rather than threaten the interests of civil liberties and
open government.
The Committee recommends a number of guidelines for researchers, for
funding agencies, for professional journals, for research training institutions,
and for other participants in research that should facilitate and encourage
sharing data for research purposes.
Recommendations for Initial Investigators
When to Share Data
Data are collected in a variety of circumstances: in controlled laboratory
experiments, by observation in the field, through interviews, from accumulations
of records, or by combinations of these methods. In some cases, data to
which access is desired may have developed through one investigator's efforts
and be entirely at his or her disposal to share. In other cases, the nature of the
data, promises of confidentiality, laws or regulations, contractual requirements,
or proprietary rights may preclude or at least militate against sharing.
In still other cases, raw data may be available to all (for example, from public
records or from public-use tapes, which are samples of anonymous statistical
data specifically designed for widespread research use), and the researcher's
contribution may be in the compilation procedures and methods of analysis.
In the latter instance, it is the edited and categorized data, an explanation of
the analytical methods used, and documentation of how the data were handled
to which access may be requested.
Analyzing data and reporting discoveries are clearly more glamorous tasks
to many scientists than collecting data. The motivation of possible discover-
ies is needed even to contemplate data collection, and science is served well
by this motivation. Thus, initial investigators are entitled to be the first to ex-
amine, summarize, and analyze their data. There may, however, be excep-
tions, for example, when data collection is a joint effort or when public funds
are used to pay for data collection with the intent that the data be available to
many in a timely manner. Although scientists surely deserve, in most cases,
first claim to data compiled under their direction, the practice of withholding
data until all possible analyses are exhausted is unnecessarily restrictive and
too self-serving to advance science. A balance is needed.
Recommendation 2. Investigators should share their data by the time
of publication of initial major results of analyses of the data except in
compelling circumstances.
It should also be noted that, if data are made available when the results of
research are submitted for publication, the submitted manuscript can be more
carefully and more fully reviewed. The benefits of sharing data appreciably
increase upon publication, since other researchers can then test the same and
other theories and methods. We encourage researchers to make every effort
to share data as soon as it is feasible.
Data Relevant to Public Policy
Scientists have a special responsibility to share data as quickly and as widely
as possible when the data are or will become relevant to public policy.
Withholding such data risks the use of wrong results or of ineffective analysis
of important issues.
Recommendation 3. Data relevant to public policy should be shared as
quickly and widely as possible.
This recommendation is not intended to support the public release of ana-
lyses prior to appropriate review.
Planning for Data Sharing as Part of Research
Researchers can more effectively share data if they keep that objective in mind
in all stages of their research. Planning to share data from the outset not only
helps achieve the goal of data sharing but also may improve the quality of the
research. For example, adequate documentation of data helps initial investi-
gators as well as subsequent analysts. Data files should include the unedited
raw data as well as documentation on edits, handling of nonresponse, and
similar problems (see Straf, 1981; Madow et al., 1983).
Not all data can be shared in a situation in which confidentiality must be
preserved. For example, photographs, oral histories, detailed notes on inter-
views of well-known people, and some types of proprietary information are
data that could not be shared if confidentiality is to be maintained. Some
persons or organizations may be unique or come from such a small group that it
may be impossible to share data and not identify them. There are, however,
ways to share many types of data and still maintain confidentiality (see
Campbell et al., 1975).
Recommendation 4. Plans for data sharing should be an integral part
of a research plan whenever data sharing is feasible.
Researchers might benefit by first considering whether they could be
subsequent analysts: data might already have been collected that are sufficiently
useful to warrant forgoing new data collection.
Keeping Data Available
Part of a research plan should include maintaining the data for a reasonable
period following the completion of research for possible use by subsequent
analysts. Some data collections may be small or so specialized that only lim-
ited use by others can be expected, and the initial investigator can handle re-
quests without undue burden. Other data sets may be of such general purpose
and in such demand over a considerable period that the initial investigator may
find it difficult or impossible to handle the requests of subsequent analysts.
Particularly in the latter case, researchers might consider submitting data to an
appropriate archive that not only would assume responsibility for much of the
handling of data to be shared, but also would encourage further use of the data
by bringing them to the attention of a wider community of researchers.
Cataloging of machine-readable data files and citing such data in a standard
way (Dodd, 1982) would also encourage further use.
Recommendation 5. Investigators should keep data available for a
reasonable period after publication of results from analyses of the data.
Recommendations for Subsequent Analysts
It is neither practical nor equitable to expect initial investigators to pay all
costs of transferring their data to others. It is reasonable to expect subsequent
analysts to reimburse initial investigators at least for the extra costs involved
in data transfer.
Recommendation 6. Subsequent analysts who request data from others
should bear the associated incremental costs.
Recommendation 7. Subsequent analysts should endeavor to keep the
burdens of data sharing on initial investigators to a minimum and
explicitly acknowledge the contribution of the initial investigators.
Explicit acknowledgment of the initial investigators and their contributions
would encourage data sharing.
Subsequent analysts who discover errors in data should inform the data
collectors or the appropriate archive so that the data may be corrected for the use
of others. Criticism of a data collection or analysis should be made in a
professional manner. With few exceptions, it is desirable that subsequent
analysts also inform initial investigators or data archives promptly of the results
of new analyses, even those that are unrelated to the original analysis. This
scientific courtesy may also help to avoid future duplications of efforts.
Recommendations to Institutions that Fund Research
A scientist is recognized and rewarded through the scientific community and
its institutions. Researchers will have greater incentive to share data if the
community and its institutions foster the idea that the practice advances
science and is part of what is recognized as necessary and proper scientific be-
havior. We suggest that foundations, federal agencies, and other organiza-
tions that fund research provide encouragement and rewards for sharing data.
In many instances, funding organizations would be justified in requiring
that data be shared. Government funding agencies, in particular, should
require applicants to guarantee data sharing or to justify explicitly in their
proposals why sharing would be inappropriate. Unless data sharing is a
condition of a grant or contract, whether of public or private funds, applicants
who have budgeted to share are at a disadvantage when costs are compared
with the budgets of those who have not.
If plans to share data are given as much weight as the sample design,
methods of analysis, and other aspects of proposed research in deciding on an
award, researchers would then plan for sharing data at an early stage. A re-
searcher might request funds to make important data available to others. In
any case, he or she could be encouraged to describe in the application how the
content and structure of the data would be documented, how invitations for
subsequent analysis would be extended, and how requests for data could be
honored at minimal cost. The referees of the research proposal could judge
the importance of support for making the data available to others.
For research projects involving large data sets, investigators could request
funds for a person with responsibility to document data files; update and cor-
rect data entries; produce data files for those who request them; consult with
users on interpretations, limitations, and other important aspects of the data;
and preserve the confidentiality of respondents. Even for small data sets,
however, a funding organization that encourages reasonable standards for
documentation will aid not only subsequent analysts, but also the initial inves-
tigators.
Funding organizations that require, in rules or by contracts, unnecessarily
excessive protection of privacy and confidentiality hinder the sharing of data.
Society benefits from the accessibility of data as well as from the protection of
privacy and confidentiality. A reasonable balance between these often con-
flicting values cannot be achieved by exclusive attention to one.
When funding agencies anticipate that research results will be directly rele-
vant to public policy, the agencies should be alert to the need for sharing data
so that conclusions can be verified or contested through reanalysis. Federal
funding organizations can ensure the availability of data for such uses by in-
cluding in original contracts or grants a requirement that, on completion of re-
search, data will be delivered to the sponsoring agency. The data would then
be subject to the Freedom of Information Act.
Recommendation 8. Funding organizations should encourage data
sharing by careful consideration and review of plans to do so in
applications for research funds.
Initial investigators whose data sets prove to be of wide interest to subse-
quent analysts may not be in a position to manage and disseminate data to
many others for a long time. Even if initial investigators are paid for the addi-
tional time and other costs involved, sharing data may impinge too severely
on other scientific activities. Intermediate research archives have been deve-
loped in some fields to meet this problem (see Clubb, in this volume, for more
details). Organizations funding large data collections that are expected or lat-
er found to be of considerable general interest should be alert to this problem.
If existing data archives are not suitable or are inadequately funded, funding
organizations should consider supporting appropriate ones.
Recommendation 9. Organizations funding large-scale, general-
purpose data sets should be alert to the need for data archives and con-
sider encouraging such archives where a significant need is not now be-
ing met.
Recommendations to Editors of Scientific Journals
The editorial policies of scientific journals have a significant effect on scien-
tific practice, since the publication of research results in respected, refereed
journals is one of the principal rewards of scientific research. Journal editors
should adopt editorial policies designed to encourage data sharing.
Providing Access to Data for Peer Review
Access to data during the review process, a practice already in use by some
journals, provides reviewers an opportunity to replicate the analysis and dis-
cover possible errors. Reviewers can use alternate assumptions or analytic
models to test the robustness of authors' conclusions.
Recommendation 10. Journal editors should require authors to pro-
vide access to data during the peer review process.
Publishing Reanalyses and Secondary Analyses
If researchers know that reports of replications, whether confirmatory or not,
and of secondary analyses will be welcomed under journal editorial policies,
such research would be encouraged.
Recommendation 11. Journals should give more emphasis to reports of
secondary analyses and to replications.
Giving appropriate credit to data collectors should serve to encourage others
to share data as a matter of good scientific practice. Criticism of the
original data collection should be factual, temperate, and made in the light of
reasonable standards of data collection.
Recommendation 12. Journals should require full credit and appropri-
ate citations to original data collections in reports based on secondary
analyses.
Encouraging Accessibility to Data
It should be standard practice for small data sets to be published with the
research reports that use them. For larger sets, the availability might be
announced in the research report with an explanation of where the data may be
obtained: from the journal editor, from an intermediate archive, from the
original investigator, or elsewhere.
Recommendation 13. Journals should strongly encourage authors to
make detailed data accessible to other researchers.
Recommendations to Other Institutions
Other participants in the scientific research process can promote data sharing.
Academic institutions can exercise leadership in encouraging data sharing
both in training future scientists and by example. Professional associations
can also play a part, as can funding agencies and archives.
Providing Training for Sharing Data
Instruction and training on data-sharing policies and practices should be in-
cluded in the education of many research scientists. Professional societies
might organize meeting sessions or workshops on data sharing. The technical
aspects of data sharing, especially documentation and archiving methods,
should be taught in specialized courses either as a part of academic curricula
or in continuing education programs. Instruction in data sharing should also
include how to find and adapt existing data for research (Myers and Rockwell,
1984) and how to prepare data for secondary analysis (Fortune and McBee,
1984). In some disciplines, emphasis on sharing data could be a recognized
part of graduate training.
Recommendation 14. Opportunities to provide training on data sharing
principles and practices should be pursued and expanded.
Researchers should be encouraged to use data collected by others for schol-
arly research when appropriate. Actual data should be used in teaching
whenever practical, a practice that depends on data being shared.
Reference Service for Social Science Data
A centralized reference service for computer-readable social science data
would promote the use of data already collected. A start can be made with
existing archives and with some federal statistical agencies. The Social
Science Research Council (1983) has recently issued a compendium of brief
descriptions of about 100 national data bases available for use in social
science research. By allowing sufficient funds for adequate documentation
of original studies and by funding research based on the use of shared data,
funding agencies could foster the growth and efficient use of such a service.
The National Science Foundation might take a leading role in promoting it.
Recommendation 15. A comprehensive reference service for
computer-readable social science data should be developed.
Providing Recognition for Data Sharing
The scientific reward structure could be strengthened to achieve more sharing
of data and more innovative subsequent analyses. In addition to our
recommendations to journal editors, we suggest that academic institutions
encourage data sharing by granting appropriate professional recognition to the
data-sharing activities of teaching and research staff members in such matters as
salary and promotion policies.
Recommendation 16. Institutions and organizations through which
scientists are rewarded should recognize the contributions of appropri-
ate data-sharing practices.