PUTTING PEOPLE ON THE MAP
Executive Summary
Precise, accurate spatial data are contributing to a revolution in some
fields of social science. Improved access to such data about individuals,
groups, and organizations makes it possible for researchers to examine
questions they could not otherwise explore, gain better understanding of
human behavior in its physical and environmental contexts, and create
benefits for society from the knowledge that flows from new types of scientific
research. However, to the extent that data are spatially precise, there is a
corresponding increase in the risk of identification of the people or organi-
zations to which the data apply. With identification comes a risk of various
kinds of harm to those identified and the compromise of promises of confi-
dentiality made to gain access to the data.
This report focuses on the opportunities and challenges that arise when
accurate and precise spatial data on research participants, such as the loca-
tions of their homes or workplaces, are linked to personal information they
have provided under promises of confidentiality. The availability of these
data makes it possible to do valuable new kinds of research that links
information about the external environment to the behavior and values of
individuals. Among many possible examples, such research can explore
how decisions about health care are made, how young people develop
healthy lifestyles, and how resource-dependent families in poorer countries
spend their time obtaining the energy and food that they need to survive.
The linkage of spatial and social information, like the growing linkage of
socioeconomic characteristics with biomarkers (biological data on
individuals), has the potential to revolutionize social science and to significantly
advance policy making.
While the availability of linked social-spatial data has great promise for
research, the locational information makes it possible for a secondary user of
the linked data to identify the participant and thus break the promise of
confidentiality made when the social data were collected. Such a user could
also discover additional information about the research participant, without
asking for it, by linking to geographically coded information from other sources.
Open public access to linked social and high-resolution spatial data
greatly increases the risk of breaches of confidentiality. At the same time,
highly restrictive forms of data management and dissemination carry very
high costs, either by making it prohibitively difficult for researchers to gain access
to data or by restricting or altering the data so much that they are no longer
useful for answering many types of important scientific questions.
CONCLUSIONS
CONCLUSION 1: Recent advances in the availability of social-spatial
data and the development of geographic information systems (GIS) and
related techniques to manage and analyze those data give researchers
valuable new ways to study important social, environmental, economic,
and health policy issues and are worth further development.
CONCLUSION 2: The increasing use of linked social-spatial data has
created significant uncertainties about the ability to protect the confi-
dentiality promised to research participants. Knowledge is as yet inad-
equate concerning the conditions under which and the extent to which
the availability of spatially explicit data about participants increases
the risk of confidentiality breaches.
Various new technical procedures involving transforming data or creat-
ing synthetic datasets show promise for limiting the risk of identification
while providing broader access and maintaining most of the scientific value
of the data. However, these procedures have not been sufficiently studied to
realistically determine their usefulness.
CONCLUSION 3: Recent research on technical approaches for reduc-
ing the risk of identification and breach of confidentiality has demon-
strated promise for future success. At this time, however, no known
technical strategy or combination of technical strategies for managing
linked spatial-social data adequately resolves conflicts among the ob-
jectives of data linkage, open access, data quality, and confidentiality
protection across datasets and data uses.
CONCLUSION 4: Because technical strategies will not be sufficient
in the foreseeable future for resolving the conflicting demands for data
access, data quality, and confidentiality, institutional approaches will
be required to balance those demands.
Institutional solutions involve establishing tiers of risk and access and
developing data-sharing protocols that match the level of access to the risks
and benefits of the planned research. Such protocols will require that the
authority to decide about data access be allocated appropriately among
primary researchers, data stewards, data users, institutional review boards
(IRBs), and research sponsors and that those actors are very well informed
about the benefits and risks of the data access policies they may be asked to
approve.
We generally endorse the recommendations of the 2004 National Re-
search Council report, Protecting Participants and Facilitating Social and
Behavioral Sciences Research, and the 2005 report, Expanding Access to
Research Data: Reconciling Risks and Opportunities, regarding restricted
access to confidential data and unrestricted access to public-use data that
have been modified so as to protect confidentiality, expanded data access
(remotely and through licensing agreements), increased research on ways to
address the competing claims of access and confidentiality, and related
matters. Those reports, however, have not dealt in detail with the risks and
tradeoffs that arise with data that link the information in social science
research with spatial locations. Consequently, we offer eight recommenda-
tions to address those data.
RECOMMENDATIONS
Recommendation 1: Technical and Institutional Research
Federal agencies and other organizations that sponsor the collection
and analysis of linked social-spatial data—or that support data that
could provide added benefits with such linkage—should sponsor re-
search into techniques and procedures for disseminating such data while
protecting confidentiality and maintaining the usefulness of the data
for social-spatial analysis. This research should include studies to adapt
existing techniques from other fields, to understand how the publica-
tion of linked social-spatial data might increase disclosure risk, and to
explore institutional mechanisms for disseminating linked data while
protecting confidentiality and maintaining the usefulness of the data.
Recommendation 2: Education and Training
Faculty, researchers, and organizations involved in the continuing pro-
fessional development of researchers should engage in the education of
researchers in the ethical use of spatial data. Professional associations
should participate by establishing and inculcating strong norms for the
ethical use and sharing of linked social-spatial data.
Recommendation 3: Training in Ethical Issues
Training in ethical considerations needs to accompany all method-
ological training in the acquisition and use of data that include geo-
graphically explicit information on research participants.
Recommendation 4: Outreach by Professional Societies and Other Organizations
Research societies and other research organizations that use linked
social-spatial data and that have established traditions of protection of
the confidentiality of human research participants should engage in
outreach to other research societies and organizations less conversant
in research with issues of human participant protection to increase
attention to these issues in the context of the use of personal, identifi-
able data.
Recommendation 5: Research Design
Primary researchers who intend to collect and use spatially explicit data
should design their studies in ways that not only take into account the
obligation to share data and the disclosure risks posed, but also provide
confidentiality protection for human participants in the primary re-
search as well as in secondary research use of the data. Although the
reconciliation of these objectives is difficult, primary researchers should
nevertheless assume a significant part of this burden.
Recommendation 6: Institutional Review Boards
Institutional Review Boards and their organizational sponsors should
develop the expertise needed to make well-informed decisions that bal-
ance the objectives of data access, confidentiality, and quality in re-
search projects that will collect or analyze linked social-spatial data.
Recommendation 7: Data Enclaves
Data enclaves deserve further development as a way to provide wider
access to high-quality data while preserving confidentiality. This devel-
opment should focus on the establishment of expanded place-based
enclaves, “virtual enclaves,” and meaningful penalties for misuse of
enclaved data.
Recommendation 8: Licensing
Data stewards should develop licensing agreements to provide increased
access to linked social-spatial datasets that include confidential infor-
mation.
The promise of gaining important scientific knowledge through the
availability of linked social-spatial data can only be fulfilled with careful
attention by primary researchers, data stewards, data users, IRBs, and re-
search sponsors to balancing the needs for data access, data quality, and
confidentiality. Until technical solutions are available, that balancing must
come through institutional mechanisms.