Copyright © 2009. National Academy of Sciences. All rights reserved.
Committee on National Statistics
large. Data organized in complex file structures may need to be converted to
simpler structures by the subsequent analyst. The data-base dictionary may
be tied to an incompatible software package and require conversion. The ori-
ginal data collectors may not have used standard data preparation and docu-
mentation practices. The data documentation may be inadequate; the codes
may be undocumented, inconsistent, or erroneous. Undiscovered errors are
inevitable.
These costs can be reduced if data sharing is recognized as a goal by initial
data collectors. And the costs may be shared if data tapes are transferred to
an intermediate archive that takes responsibility for editing and documenting
them.
Sharing Costs
One strategy for encouraging data sharing is to impose a cost for not sharing
data. A public statement that a researcher was withholding data may encour-
age the researcher and others to share their data. Reinforcing data shar-
ing as a scientific obligation may be fruitful in promoting data sharing more
widely.
The practice of data sharing probably will become more widespread if the
costs are not borne exclusively by the initial researcher. Data sharing, then,
must also be cost sharing; subsequent analysts should contribute appropriately
to the costs of documentation and pay the costs to transfer data.
Sharing data primarily benefits science and society; the costs are borne
mostly by the initial investigators. Yet most scientists are willing to share
their data to some extent despite this relationship. One reason is that recognition
of the initial investigator usually is provided by subsequent analysts.
Another reason is that scientific institutions do foster data sharing through
peer recognition of altruistic behavior that advances science.
THE CHANGING ENVIRONMENT FOR DATA SHARING
Developments in computers and software, changes in research practices, the
different rewards and incentives for research, and new laws and regulations
may all affect the sharing of data. This section describes how a few of these
changing circumstances may affect the propensity of researchers to share their
data.
Sharing Research Data
Use of Computers
The widespread use of computers for recording, summarizing, and analyzing
research data facilitates sharing data. The use of computers avoids time-
consuming clerical work and permits the transfer of large data bases that
would not have been feasible in the past. Large machine-readable data files
are a research resource in the social sciences analogous to large-scale instru-
mentation in the physical sciences.
Transfer of machine-readable data is hindered by incompatibility of com-
puter equipment and software. Help to overcome such technical problems
may come from the acceptance of common conventions for the internal stor-
age and representation of data, from the development of standard analytic
packages, and the development of conversion capabilities to move from one
system to another. More burdensome to an initial investigator are the time-
consuming tasks of file cleaning, preparation of data-base dictionaries and
other appropriate documentation, and dissemination. As the importance of
these activities has become more widely recognized, some aids have been
developed; more are expected in the future. The literature on computer file
management, standards for file documentation, and similar matters is grow-
ing. Moreover, institutions have been organized that specialize in the collec-
tion, maintenance, and dissemination of machine-readable data files. Some
of these institutions are international in scope. Both the technical guidelines
for data documentation and the number of institutions that serve as intermedi-
aries to transfer data are growing (see Clubb, in this volume, for a further dis-
cussion on using computers for data sharing).
Privacy and Confidentiality
Confidentiality refers to not disclosing responses to questions that could be
identified as belonging to an individual organization or person. Privacy ref-
ers to the right of an individual not to make personal information available to
another. Confidentiality is obviously relevant to data sharing. Privacy is
also relevant: as the public has become more concerned about invasion of pri-
vacy, researchers have attempted to overcome respondent hesitation by mak-
ing stronger promises of confidentiality. Legal protections for privacy at-
tempt to protect privacy by maintaining confidentiality of records, and in
many cases, restricting their use to the agency to which the respondent pro-
vided information.
Growing concerns about confidentiality and the protection of privacy have
affected research involving information about individuals and the conditions
under which data may be shared, especially if the research is undertaken under
federal contract. As a result, more attention is paid to maintaining the con-
fidentiality of records, whether legally required or not; to removing identifi-
able information from records before data are shared; and to using other dis-
closure avoidance techniques.
Paralleling the burgeoning use of computers in business and government,
public awareness of issues of privacy and confidentiality has increased during
the past two decades. Respondents express concern over invasion of privacy
and are skeptical of assurances that confidentiality will be protected (see, for
example, National Research Council, 1979). Also, the public is apprehensive
of the growth of large-scale computerized data banks that contain personal,
individually identifiable information. Investigators have become more
sensitive to issues of privacy and confidentiality because of this public discus-
sion and respondent reactions.
The public concerns have led to enactment of statutes designed to protect
privacy and ensure the confidentiality of data concerning individuals (see
Cecil and Griffin, in this volume). A major federal statute is the Privacy Act
of 1974. Designed to protect the confidentiality of records collected and
maintained by the federal government, it provides, with certain exceptions,
that identifiable information about individuals may not be disclosed outside
the agency that collected the information unless the prior consent of the individuals
concerned is obtained.5 A key characteristic of this statute is that it
does not distinguish between data for administrative purposes and data for research
or statistical purposes. The provisions of the law apply directly to investigators
whose research or surveys are undertaken under a contract with a
federal agency, as are, for example, most evaluations of federal programs.
Such investigators must observe the provisions of the Privacy Act in sharing
data by deleting identifying names and numbers from individual records;
sometimes, other disclosure-avoidance techniques are used.
These rules may hamper and at times prevent the matching or linking of
data files. In some research requiring access to federal data, identification of
individuals is essential. In epidemiological studies, for example, it may be
necessary to know the names of persons exposed to certain suspected hazards
over long periods in order to match these with records of death or disease at a
later time. Unless such epidemiological research is considered "routine use"
under the terms of the Privacy Act, access to this information may be restricted.
Biomedical researchers in particular are affected by federal regulations governing
research on humans that require review of research plans by institutional
review boards. In some cases, such boards may go beyond the requirements
of the Privacy Act and so have an effect on the ability of researchers to
share data.
5 In addition to federal law, several states have enacted statutes to protect privacy that may also
affect research.
The Privacy Protection Study Commission, called for by the Privacy Act of
1974, urged among other recommendations that the Act be revised to distin-
guish between data for research purposes and those maintained for administra-
tive purposes (Privacy Protection Study Commission, 1977: especially pages
567-604). If the law is changed, investigators might find fewer restrictions
on access to individually identifiable federal data for research purposes. It is
certain, however, that there would still be strong injunctions and safeguards
calling on researchers to protect the confidentiality of data.
Freedom of Information
Another federal statute, the Freedom of Information Act, enacted in 1966,
which provides for greater public access to many kinds of federal data, has
had the opposite effect of the Privacy Act (see Cecil and Griffin, in this vol-
ume). There are two specific exemptions to access in the Freedom of
Information Act that are most relevant to research data: "personnel and medical
and similar files the disclosure of which would constitute a clearly unwarranted
invasion of privacy" and "trade secrets and commercial or financial information
obtained from a person and privileged or confidential." An investigator
whose contract with a federal agency calls for transfer to the agency of
microdata that do not qualify for these exemptions should expect that the data
may be shared with others, researchers or not, under the Freedom of
Information Act. The act does not appear to apply, however, to data main-
tained solely under the control of the investigator. Even investigators
working on funds from private sources may be subject to the Freedom of
Information Act should they submit data to a federal agency for advice or
checking. For example, a privately sponsored survey that used computer assistance
from the federal Centers for Disease Control was ruled subject to the
Freedom of Information Act (Dickson, 1980).
Patents, Profits, and Proprietary Data
The possibility that a research effort may lead to the development of a patentable
product or process may affect the willingness of investigators to share
their data. Patent laws may also delay publication of research results and,
therefore, may delay data sharing. A recent change in the U.S. patent law,
for example, led the Office of Management and Budget to suggest that federal
agencies require notification of any potentially patentable results at least three
months before research reports are submitted for publication. The rule would
apply to federally sponsored research in universities and small businesses and
is intended to allow time to apply for patent rights in certain European coun-
tries. In the United States, patents can be applied for up to one year following
publication of research results, but in some European countries patent rights
may be forfeited by publication. In commenting on these developments,
Dickson (1981:501) noted: "The proposed rule has already created a storm of
protest from the U.S. research community, which claims that, by threatening
to deny a scientist patent rights to a discovery if the procedure is not followed,
it could seriously impede scientific communication."
The Copyright Act is also relevant to data sets developed by researchers.
Under that act, the proprietary rights of a person who has developed information
are balanced against the public benefits from distribution of the information.
Interpretations of the Copyright Act, which was significantly amended
in 1976, may affect the extent to which data are shared. The doctrine of fair
use, which limits the exclusive rights of copyright owners in order to permit
reasonable use by others for purposes such as criticism, news reporting,
teaching (including multiple copies for classroom use), or research, was expanded
in the Copyright Act amendments (see Cecil and Griffin, in this volume).
Scholarly journals that insist on copyrighting all articles may impede
reanalysis of previously published information by requiring secondary ana-
lysts to obtain copyright releases from original researchers, although the fair
use provision makes this requirement unnecessary.
Recent applications of research on DNA have drawn dramatic attention to
the potential profitability of some research. Academic research scientists and
private firms engaged in developing profitable applications have sometimes
found themselves with very different interests. A report in Science of a dispute
between the University of California and the pharmaceutical firm of
Hoffmann-La Roche concerning a human gene containing the genetic information
for the synthesis of interferon earned the following headline:
"University and Drug Firm Battle Over Billion-Dollar Gene: A lawsuit over
interferon may change the informal ways by which researchers exchange
materials" (Wade, 1980). Donald Kennedy, president of Stanford
University, commented: "Scientists who once shared prepublication information
freely and exchanged cell lines without hesitation are now much more reluctant
to do so" (Roark, 1981). And the New York Times (1981) editorialized:
"The values of the marketplace have so invaded the campus that on several
occasions researchers have refused to share with their colleagues the exact
details of how they did their experiments. Such attitudes are incompatible
with the ethos of a scholarly community." Similar views were expressed in a
Nature (1980) editorial. Potentially lucrative applications of scientific research
are not widespread, but, in the scientific disciplines in which they occur,
the effect on data sharing is significant.
At a recent meeting of university and company officials, the need for facul-
ty freedom to report research was discussed, and it was agreed that research
contracts or licensing agreements between universities and private companies
should avoid secrecy (Chronicle of Higher Education, 1982:12). The joint
statement included, under the heading "Open Communication Encouraged,"
the following:
The traditions of open research and prompt transmission of research results
should govern all university research, including research sponsored by industry.
Those traditions require that universities encourage open communication about re-
search in progress and research results. However, it is appropriate for institutions
to file for patent coverage for inventions and discoveries that result from university
research. This action may require brief delays in publication or other public disclo-
sure.
Receipt of proprietary information from a sponsor may occasionally be desirable
to facilitate the research. Such situations must be handled on a case-by-case basis
in a manner which neither violates the principles stated above nor interferes with the
educational process. Any other restrictions on control of information disclosure by
institutions are not appropriate as general policy.
Restrictions on International Sharing of Data
Restrictions on the sharing of data across national boundaries are likely to
fluctuate with international political tensions and changes in perceived nation-
al interests. Such restrictions may apply not only to defense-related technolo-
gy, but more broadly to research that is deemed to be of advantage to other na-
tions. The Export Administration Act of 1979, administered by the U.S.
Department of Commerce, requires that export controls be used where necessary
"to restrict the export of goods and technology which would make a significant
contribution to the military potential of any other country or combination
of countries which would prove detrimental to the national security of the
United States."
In the United States, restrictions on sharing data with other countries appar-
ently are being tightened. Examples include:
(1) Proposed revisions in the 1972 International Traffic in Arms
Regulations, published in preliminary form in the Federal Register
(December 19, 1980), require that an export license be obtained for transfer to
a foreigner of technical data that may have a defense application.
(2) During 1981, an amendment was proposed to the Arms Export Control