5
Weaving a Strong Trust Fabric
INTRODUCTION
Building trust among all stakeholders of the digital infrastructure—in
particular the patient population—is vital to progress and constitutes the
focus of this chapter. Included are considerations of the most effective
ways to engage stakeholders through demonstration of the value of health
information exchange in improving outcomes and efficiency, building confi-
dence in security and privacy safeguards, and examining the learning health
system–specific challenges posed in these areas. Examinations range from
a focus on the sociotechnical components of privacy and the risk–benefit
calculation in health information exchange to technical approaches to en-
suring data privacy and security.
Edward Shortliffe of the American Medical Informatics Association
addresses the need to build a strong fabric of trust among stakeholders
by communicating and demonstrating value. Dr. Shortliffe states that in
order for health information technology (HIT) to meet its full potential,
patient and provider participation must be secure. This sense of security
depends on an appreciation of the value presented by the HIT used as well
as on the creation and maintenance of proper security safeguards. Sharing a
personal anecdote about a provider who admitted that only patient demand
would motivate him to adopt an electronic health record (EHR) system,
Dr. Shortliffe observes that sufficient patient demand could even obviate
the need for federal incentives. Using electronic banking as an example, he
suggests that educational programs are necessary to inform stakeholders
about the risks and benefits of EHRs, and predicts that with the establishment
of an environment of trust, the value of increased convenience and
quality offered by EHRs and data sharing will overcome concerns about
privacy. Currently, however, the risks of adopting an EHR system are better
understood and communicated, so the focus of stakeholder engagement
activities going forward should be on communicating the benefits—most
importantly, better care and lower costs.
The implementation of fair information practices to ensure privacy
and security is the focus of the Center for Democracy and Technology’s
Deven McGraw. Citing surveys showing that while individuals desire
electronic access to their health information, they have significant privacy
concerns, she suggests that providing individuals with meaningful choices
around privacy is an important approach to addressing these concerns.
Ms. McGraw points to a comprehensive approach to patient privacy
and data security based on the Markle Common Framework for Secure
and Private Health Information Exchange. Key elements of the frame-
work include an open and transparent process, specification of purpose,
individual participation and control, and accountability and oversight.
Closing with a warning that overreliance on consent leads to weak pro-
tection—shifting the burden of privacy protection to the individual—and
that existing regulations are insufficient to cover the emerging issues of a
learning health system, she notes the need for a trust fabric based on fair
information practices.
Since its passage in 1996, and through its recent modifications, the Health
Insurance Portability and Accountability Act (HIPAA) has served as the legal
and policy framework for health information privacy. Bradley Malin of
Vanderbilt University describes the current state of play around health
data de-identification and highlights some of the relevant learning health
system–related issues posed by HIPAA. Included among these are identity
resolution while maintaining privacy and concern that de-identification
could cause modifications to patient information that influence the mean-
ing of clinical evidence. He asserts, however, that most of these challenges
are not insurmountable, and that efforts to quantify risk are an important
first step to mitigation. Dr. Malin suggests that two things will be important
in moving toward a privacy-assured learning health system: use cases that
better define how health information will be used, and progress in the area of
distributed query-based research.
Ian Foster of Argonne National Laboratory addresses the technical
components surrounding trust in the digital infrastructure for the learning
health system. Dr. Foster lays out a number of challenges facing the
establishment of a secure digital platform. He points to the fact that a
learning health system requires data sharing on an unprecedented scale, and
that the purpose of this sharing be extended beyond individual patient care
support to include research and population health. Identifying the challenge
as one of a highly complex system with an unclear definition of security,
Dr. Foster suggests some basic principles and technology solutions that can
form a basis for progress: auditability (information can be mapped to an
individual and data can be mapped to its origin); scalability; and transpar-
ency in terms of data usage, policies, and enforcement. Methods to achieve
these principles include attribute-based authorization, distributed attribute
management, and end-to-end (scalable) security.
DEMONSTRATING VALUE TO SECURE TRUST
Edward H. Shortliffe, M.D., Ph.D.
American Medical Informatics Association
There is a widely acknowledged need for individuals to trust the
use of EHRs in the management of their health and health care. People
must believe that their personal data are being protected, and used con-
sistently in their best interest. Formal studies in scientific journals that
document the positive influences of electronic records on quality, safety,
and efficiency—typically poorly communicated to the lay public—will not
counter a deep concern that individual privacy can be compromised or
that personal data will be used for nefarious purposes. Thus all the laud-
able goals we seek with the use of health information technology (HIT)
that are under discussion at this workshop are dependent on a “fabric
of trust”—the willingness of individuals and, by extension, society to
contribute personal data and clinical experiences to the development of a
learning healthcare system.
Individuals in the healthcare community bring a deep understanding of
the health policy, financing, and quality issues that can be enhanced by the
empowering use and effective implementation of HIT. We see strong ad-
vantages to society in the use of electronic health records (EHRs) and their
adaptation to support a learning health system. Yet the individuals in our
communities—and I fear this includes many members of the media—have
a limited understanding of such issues and would find most of our work
difficult to follow. What they can easily understand, however, are news
stories that emphasize the way in which EHRs may threaten their privacy,
the confidentiality of personal data, and general security issues (such as
lost or stolen laptop computers containing private medical data regarding
thousands of patients). We need to understand that the public’s support for
EHRs depends on their sense that their care is improved or their life is sim-
plified when their provider uses the technology. The public needs to believe
that all prudent measures are being taken to ensure that their personal data
are protected from loss or inappropriate access.
Anecdotal Evidence of the Current Challenges
Like everyone else attending this workshop, I am a patient as well as
a health professional. Long ago I made the personal decision, based on
my understanding of the trade-offs, that I would greatly prefer to be cared
for by a health system and by individual clinicians who had embraced the
use of EHRs. When I recently moved to a new city and had to identify a
primary care provider, I decided to rule out any physician or provider orga-
nization that lacked the infrastructure or philosophy that would allow me
to communicate through e-mail with my physician and his office staff. Frus-
trated by my recent experience in another city, I swore that I would never
again subject myself to a healthcare environment or physician who had not
adopted modern electronic means of communication, data management,
and information dissemination. I wanted to be sure it would be simple for
me to book appointments online, to request prescription refills, to check lab
results, and to review other aspects of my personal record. I also wanted to
have reasonable faith in the authentication and authorization procedures
that were in place before I or others could access my information online.
I recognize that I am an early adopter of new information technologies by
nature, but as I looked at the plethora of smart phones, Facebook pages,
and laptops in airport security lines that surround me every day, I sus-
pected that I was not alone in using such “digital literacy” criteria to guide
my choice of physician and healthcare system. I have subsequently been
pleased to find a suitably rigorous, electronically sophisticated physician
and healthcare environment in my new city and realize that I personally
associate such capabilities with quality of care, safety, and cost contain-
ment. Furthermore, I have minimal fear that my personal data are being
indiscriminately accessed by others or being handled in ways that would
make it easy for them to be lost or stolen.
It is natural to ask whether I am typical of patients with regard to my
search for a physician who chooses to use EHRs. One indication that I
am atypical was the conversation that I had with my previous physician
when I asked him whether he had any plans to automate the practice in
which he worked. He was surprised that any patient cared about such an
esoteric topic. He told me that I was the first patient who had ever que-
ried him on the matter, asserting that there was no demand from patients
for him to use an EHR. Additionally, he was personally uninterested in
the expense or the retraining that would be required. He noted that he
would be retiring in 6–8 years and asked why he should go through this
kind of transformation at the very end of his career. He had no interest
in using an EHR and did not care what incentives were being offered by
the government.
He did acknowledge that if all his patients were telling him that they
really cared about automating the office, accepting e-mail, and providing
EHR access for patients, then he might feel differently about the topic. One
wonders whether federal incentives and the meaningful use criteria would
have even been necessary if the average citizen were enamored of EHRs and
warned their doctors that they would change providers if the practice did
not implement electronic records. Under the current circumstances, how-
ever, he viewed the CMS incentives as a conspiracy in Washington, trying
to force unproven technology upon him and his patients.
Public Use of HIT
Conversations with others have convinced me that my former physi-
cian is not atypical but that I, as a patient requesting that my providers
use an EHR, am quite unusual. Seeking to better understand the public’s
attitudes toward EHRs, I was fascinated to come across a recent book that
provides extensive survey data about the public and their access to and
use of electronically available health information. Written by researchers
at Brookings Institution and Brown University, Digital Medicine summa-
rizes and interprets the results of many national e-health public opinion
surveys. The emphasis is not on the technology per se but on current
trends in adoption, acceptance, and pursuit of e-health solutions. Docu-
menting relatively low use of information technology for health purposes
by certain segments of society, the authors state a motivating argument
that “in order to achieve the promise of health information technology,
digital medicine must overcome the barriers created by political divisions,
fragmented jurisdiction, the digital divide, the cost of technology, ethical
conflicts, and privacy concerns” (West and Miller, 2009). I have described
this volume in more detail elsewhere, noting that education—both of the
public and of current and future health professionals—is viewed as a key
element in any solution. There is evidence that this issue has been too
often overlooked when others have assessed approaches to making better
use of information technology in health care (Shortliffe, 2010). Given the
economic determinants of e-health use and the digital divide, low-cost
technologies and improved access through publicly available means con-
tinue to be key requirements.
Yet public familiarity with technology, and personal use of information
resources in managing one’s own health care, is not the same as having a
society that understands and supports the use of EHRs by physicians and
other health professionals. If we need educational programs to enhance the
public’s capabilities in the use of the electronic media for accessing health
information, we also need to help them understand the risks and benefits
of EHR use.
The Value Proposition: Convenience vs. Risk
I believe that convenience, quality, and perceived value of EHRs will
trump concerns about privacy or other risks—but only if there is a climate
of trust. The financial system has helped to demonstrate this social phe-
nomenon to us. Consider, for example, the use of one ubiquitous finan-
cial technology, the automated teller machine (ATM). When ATMs were
introduced, it rapidly became obvious to the public that there were huge
advantages in using these machines rather than relying on the traditional
interaction with a bank teller or the use of travelers’ checks. We all know
there are risks associated with electronic banking and ATMs—fraud, sto-
len PIN numbers, lost cards, and the like—but convenience and universal
access to one’s funds have clearly outweighed those concerns. In fact, indi-
viduals are even willing to pay for the convenience of an ATM, given the
surcharges that are typically absorbed by the user. We perceive the value to
be high, and the risks to be low—and most banks have explicit assurances
about maximum losses in the case of documented fraud or theft. There is
a climate of trust that, on balance, our funds are protected by the system
with which we choose to interact.
But the acceptance of such trade-offs in the use of electronic banking
clearly requires that the public appreciate the positive value of the innova-
tion offered to them. The value proposition for EHR use is much less well
understood by the public, and what they do know has tended to focus more
on potential negatives (loss of privacy, government intrusion, etc.) rather
than the benefits. Stories about threats to the safety and confidentiality of
online health data have tended to dominate in the press; even when most
organizations are taking measures to protect against the described threats,
the public largely focuses on the negatives.
Engaging the Public
In educating the public about the ways in which the use of EHRs can be
positive, the emphasis needs to be on aspects of their implementation that
create a sense of value for individual patients or their families. The greater
good—for public health, research, or a learning health system—must be
viewed as secondary. Since we know that patients tend to trust their own
doctors, one crucial source of trust in the health system is the individual’s
own physician. Thus, there is an important potential interaction between
physicians and their patients that can help to inform the public about the
clinical value of EHRs, and to assist in the creation of a climate of trust.
That outcome, of course, requires that physicians themselves perceive the
value of EHRs and believe that it outweighs the costs associated with
adoption.
We know that the public appeal of EHRs will grow when they are
viewed as convenient for patients, empowering them as partners in their
own management, and providing a way to deal with the opacity of tradi-
tional healthcare interactions. Their consent for data use—and the sub-
sequent steps toward a learning health system—will follow if there is a
strong trust in the data stewardship that occurs when EHR data are shared,
anonymized, pooled, and reused.
POLICIES AND PRACTICES TO BUILD PUBLIC TRUST
Deven McGraw, J.D.
Center for Democracy and Technology
Health information technology (HIT) and electronic health information
exchange are engines of health reform and have tremendous potential to im-
prove health, reduce costs, and empower patients. While some progress has
been made on resolving the privacy and security issues raised by e-health,
significant gaps remain and implementation challenges loom.
Many surveys show that people want to have electronic access to their
health information, but these same surveys also demonstrate that people
have significant privacy concerns about how their data will be used and
protected. For example, a 2005 study by the California HealthCare Foun-
dation revealed that a majority of the respondents (67%) have significant
concerns about the privacy of their medical records (CHCF, 2005). More
recent surveys by the Agency for Healthcare Research and Quality confirm
these findings (AHRQ, 2009).
While most people acknowledge the importance of ensuring patient
privacy in health information systems, many assume that providing a simple
“opt-in” or “opt-out” option fully addresses the issue. Providing individu-
als with some meaningful choices is an integral part of any privacy system,
but relying solely on a check box or blanket consent will not allay consumer
fears or, more importantly, provide adequate safeguards against misuse of
patient data.
The consequences of not ensuring privacy adequately can include fail-
ing to collect complete or adequate patient data. Without privacy protec-
tions, people may engage in “privacy-protective behaviors” to avoid having
their information used inappropriately. A 2007 Harris Interactive survey
revealed that one in six adults withhold information from providers due to
privacy concerns (Harris Interactive, 2007). The frequency increases among
people with poor health and among racial and ethnic minorities who report
higher levels of concern and are more likely to engage in privacy-protective
behaviors (CHCF, 2005).
A Comprehensive Strategy for Fair Information Practices
To counter these tendencies and to facilitate the collection of the most
complete patient data possible, a comprehensive approach to patient pri-
vacy and data security is needed. It is important to note that privacy and
security protections are not themselves obstacles to achieving these goals.
Rather, enhanced privacy and security can enable higher levels of patient
participation in health data collection and facilitate HIT and health infor-
mation exchange.
The core elements of such a comprehensive strategy include commonly
used fair information practices, such as those articulated in the Markle
Common Framework for Secure and Private Health Information Exchange
(Markle Foundation, 2006). The principles outlined seem so straightfor-
ward that, based on common sense, it would seem that everyone employs
them. Unfortunately, this is often not the case. However, a serious application
of these practices should serve as the linchpin for building a trusted
information-sharing infrastructure.
Some of the key elements of fair information practices include: open-
ness and transparency, purpose specification and minimization, collection
and data use limitation, individual participation and control, data integrity
and quality, security safeguards and controls, accountability and oversight,
and remedies. Perhaps the most important element of a comprehensive
approach is to develop an open and transparent process. Taking the time
to educate patients about the purpose, uses, and goals of collecting their
health information can go a long way toward building public trust. Such
openness and transparency can reap higher rewards than simply present-
ing a consent form with little or no explanation and a vague guarantee of
security and privacy.
Some elements of this framework are reflected in the Health Insurance
Portability and Accountability Act (HIPAA) privacy and security rules,
which provide important baseline protections for patient information. The
recent rules added by the Health Information Technology for Economic
and Clinical Health Act offer improvements, but existing regulations re-
main insufficient to cover all of the emerging issues in this new and rapidly
evolving environment. For instance, there are now many entities involved
in the health information infrastructure that are not covered by HIPAA and
other federal regulations. There is also still some ambiguity on the roles,
rights, and responsibilities of the various entities involved. For example,
a prominent finding in the IOM study on HIPAA and medical research
indicates that lack of clarity of the rules and their inconsistent interpreta-
tion often pose as much of an obstacle to research as the rules themselves
(IOM, 2009).
Limitation of the Informed Consent Model
In this approach, consent is still important but, as noted, is only one
element of a comprehensive approach. Indeed, it may not even be the most
important component necessary to ensure data security and patient privacy,
since too much emphasis on consent can often lead to weak privacy protection
in practice (CDT, 2009). Overreliance on consent shifts the burden of privacy
protection to the individual, rather than requiring data holders to be
good stewards of the patient information that they use and maintain. The
evidence is clear that individuals pay little attention to consent forms and too
often do not understand the full implications of what they have agreed to.
To ensure the highest level of privacy and security, we need fair in-
formation best practices to govern the digital infrastructure for a learning
health system. Individual participation and control (consent) should play
a role, but other principles (transparency; data minimization; collection,
use, and disclosure limitations; and accountability and oversight) are equally
important in building trust.
HIPAA AND A LEARNING HEALTHCARE SYSTEM
Bradley Malin, Ph.D.
Vanderbilt University
In order to function efficiently and effectively, a learning health system
requires reliable access to several critical pieces of information. First, it
needs to be informed through knowledge that is derived from the healthcare
system. This information must flow continually, so that the system can be
updated through current patient experiences. The importance of this
information is greater than simply ensuring the accuracy of a patient’s EHR.
Rather, the provision of this information enables the evolution toward a
system that is flexible and able to continually evolve. Second, a learning
health system needs to access, and analyze, health information on large
populations to inform decision support models that allow for personalized
approaches to care.
HIPAA and Data De-Identification
The Health Insurance Portability and Accountability Act (HIPAA)
defines protected health information as information that is explicitly linked
to a particular individual or could reasonably be expected to allow individ-
ual identification. The HIPAA Privacy Rule permits health information to
be shared without patient consent for “secondary” purposes in two ways.
First, HIPAA permits data to be shared without oversight or contrac-
tual use agreements provided the data are “de-identified”—which is not the
same as “anonymous.” Rather, the regulation is designed to mitigate risk
while facilitating the sharing of health information. De-identification can be
achieved in two different ways: safe harbor and expert determination. Safe
harbor is satisfied when the data are stripped of 18 enumerated features.
These include explicit identifiers (such as the individual’s name and Social
Security number), as well as potential quasi-identifiers (such as the date of
birth, gender, and zip code). In contrast, expert determination (sometimes
referred to as the statistical standard) states that health information is de-
identified if an expert uses generally acceptable scientific principles and
methods to certify that the risk of identifying an individual is sufficiently
small. In doing so, the expert must document the methods and the results
of any analysis used to justify this determination. Additionally, the covered
entity is prohibited from revealing any mechanisms generated in the process
that would allow an individual to be re-identified.
If a covered entity believes that de-identification would hamper the
ability to support a learning system, then it could opt for an alternative:
the HIPAA limited dataset. Under this model, the covered entity continues
to be prohibited from sharing explicit patient identifiers, but can provide
dates and geographic information. The caveat, however, is that the recipient
of such information must enter into a data use agreement that states the
recipient cannot use the information in a way that would harm, or attempt
to identify, the corresponding individuals.
De-Identified Data in a Learning Health System
What is easy? One thing that is relatively easy to do is to build automated
approaches to find and suppress patients’ identifiers from structured health
information. At present, there are no standards for representing
identifiers, but there are various terminologies and message-based
standards that we use to represent medical information. It would be fruitful
to extend such languages to define types of identifiers.
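
As a minimal sketch of what such automated suppression could look like for structured data, the Python fragment below drops any field that has been tagged as an identifier. The field names and the contents of `IDENTIFIER_FIELDS` are hypothetical and chosen only for illustration; they are not the regulatory list of safe harbor features, and no standard vocabulary of identifier types is assumed to exist.

```python
from typing import Any

# Hypothetical identifier tags, for illustration only. This is NOT the
# regulatory list of safe harbor features.
IDENTIFIER_FIELDS = {"name", "ssn", "date_of_birth", "zip_code"}


def suppress_identifiers(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record with every tagged identifier field removed."""
    return {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}


record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "date_of_birth": "1970-01-01",
    "zip_code": "37203",
    "diagnosis": "type 2 diabetes",
    "hba1c": 7.2,
}

cleaned = suppress_identifiers(record)
print(cleaned)  # only the clinical fields remain
```

The approach works only because each field carries a known name. That is why a shared way of declaring which fields are identifiers would make this step easier to automate across systems.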
What is not so easy? When repurposing an electronic medical record sys-
tem, such as for clinical phenotyping of patients, we use natural language
text. As a result, it is more challenging to guarantee the de-identification of
this information. Software exists to automatically detect and suppress
identifiers within natural language, but none of it is guaranteed to find all of
the identifiers, all of the time. Even if the software were completely effective,
there would still be no guarantee that the residual information would protect the
corresponding individual from re-identification.
There are, however, alternatives to simply handing health information
over to any interested recipient. For instance, we could construct an
environment in which the clinical text is housed in a secure environment
where an abstract programming interface allows users to submit programs
to the system and retrieve aggregate statistics. This model has already been
adopted by various statistical agencies around the world for providing ac-
cess to sensitive governmental information.
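
One way to picture this model is the Python sketch below, in which the records never leave the service and callers receive only aggregate results. The class name `AggregateQueryService`, the method `count_and_mean`, and the minimum cell size of 5 are all assumptions made for illustration; they are not drawn from any particular agency's system.

```python
from statistics import mean

MIN_CELL_SIZE = 5  # assumed threshold, for illustration only


class AggregateQueryService:
    """Holds records privately and answers only aggregate queries."""

    def __init__(self, records):
        self._records = list(records)  # row-level data is never returned

    def count_and_mean(self, predicate, field):
        """Return the count and mean of `field` over records matching `predicate`.

        Returns None when too few records match, so that a query cannot
        single out a very small group.
        """
        values = [r[field] for r in self._records if predicate(r)]
        if len(values) < MIN_CELL_SIZE:
            return None
        return {"count": len(values), "mean": mean(values)}


# Synthetic records: ages 40..49 with systolic blood pressure 120..129.
records = [{"age": 40 + i, "sbp": 120 + i} for i in range(10)]
service = AggregateQueryService(records)

large = service.count_and_mean(lambda r: r["age"] >= 42, "sbp")
small = service.count_and_mean(lambda r: r["age"] >= 48, "sbp")
print(large)  # {'count': 8, 'mean': 125.5}
print(small)  # None, because only two records match
```

A real deployment would need considerably more than this, for example sandboxing of submitted programs and protection against combining several permitted queries to isolate one person. The sketch shows only the basic shape of a query-response interface.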
What is hard? De-identification, and even aggregation, is not devoid of
risks. The HIPAA safe harbor standard, for instance, leaves a certain por-
tion of the population unique with respect to the residual demographics.
Latanya Sweeney provided an example in her testimony before the National
Committee on Vital and Health Statistics several years ago, where she
reported that 0.04% of the U.S. population is expected to be unique on
residual demographics (NCVHS, 2007). The concern here is that such
demographics have been linked to public resources that contain explicit
identifiers to accomplish “re-identification.” Moreover, when considering
the expert determination approach for de-identification, there is no clear
designation of what the statistical threshold should be or who can be designated
as an expert. It would help greatly if there were a certification process,
something similar to the Certified Information Systems Security Professional
program. Furthermore, and perhaps most challenging, is the fact that de-
identification tools could suppress potentially useful clinical information.
This is a great concern if it influences the meaning of clinical evidence. For
example, if the evidence is changed from “no evidence of myocardial infarc-
tion” to “evidence of myocardial infarction,” the statistics upon which the
learning system is built could be subject to noise.
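
The notion of being unique on residual demographics can be made concrete with a short calculation. The Python sketch below, which uses synthetic records and hypothetical field names, computes the share of records whose combination of quasi-identifiers appears exactly once in a dataset. It illustrates the measurement only and is not the method behind the figure cited above.

```python
from collections import Counter


def fraction_unique(records, quasi_identifiers):
    """Share of records whose quasi-identifier combination occurs exactly once."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    if not keys:
        return 0.0
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(keys)


# Synthetic records with hypothetical residual demographics.
records = [
    {"birth_year": 1970, "sex": "F", "zip3": "372"},
    {"birth_year": 1970, "sex": "F", "zip3": "372"},
    {"birth_year": 1985, "sex": "M", "zip3": "372"},
    {"birth_year": 1990, "sex": "F", "zip3": "606"},
]

share = fraction_unique(records, ["birth_year", "sex", "zip3"])
print(share)  # 0.5: two of the four records are unique
```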
Common Challenges and Next Steps
Let us return to HIPAA from the perspective of challenges. At the present
time, HIPAA does not make it easy to support longitudinal studies. If
a patient’s records were distributed across multiple covered entities, it would be
difficult to resolve the patient’s presence without access to identifiers. In the
healthcare domain, we can execute some record linkage techniques without
revealing patient identifiers through certain cryptographic mechanisms,
but the interpretation of HIPAA is such that we are not allowed to apply
those encryption technologies even though the keys never get revealed.
This is somewhat strange, because it could be guaranteed with very strong
evidence that a recipient of such information could not determine who the
corresponding patient is.
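
To illustrate the general idea of linking records without revealing identifiers, the Python sketch below derives a keyed hash (HMAC-SHA256) from normalized identifiers, so that two sites holding the same secret key produce the same token for the same patient. This is one simple mechanism chosen for illustration and is not necessarily the one referred to above; the key, the function name `linkage_token`, and the choice of name plus date of birth as inputs are assumptions.

```python
import hashlib
import hmac


def linkage_token(secret_key: bytes, name: str, date_of_birth: str) -> str:
    """Derive a token from normalized identifiers using a keyed hash."""
    normalized = f"{name.strip().lower()}|{date_of_birth.strip()}"
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key shared by the participating sites and never given to the
# recipient of the linked data.
key = b"shared-secret-held-by-the-sites"

# Each site replaces the identifiers with the token before sharing.
site_a = {linkage_token(key, "Jane Doe", "1970-01-01"): {"site": "A", "hba1c": 7.2}}
site_b = {linkage_token(key, " jane doe ", "1970-01-01"): {"site": "B", "ldl": 130}}

# The recipient joins on tokens alone.
linked = {t: (site_a[t], site_b[t]) for t in site_a.keys() & site_b.keys()}
print(len(linked))  # 1: the two records are matched without exposing the name
```

Because the token depends on a key the recipient does not hold, the recipient cannot rebuild it by hashing guessed names. Exact matching of this kind does fail when the underlying identifiers are recorded differently at each site.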
One notion that I wish to make clear is that the challenges I have al-
luded to are not necessarily insurmountable. In particular, many of the risks
that various studies have promoted (such as the risk of re-identification)
may be less of a concern than initially anticipated. We can, and have, quan-
tified risks prior to disclosing health information. Once such measurements
are in hand, we can mitigate the risks. These are things we should do. Ad-
ditionally, we must recognize that not every dataset of health information
is susceptible to re-identification in the same way. In a study conducted by
Latanya Sweeney, it was shown that one could use publicly available voter
registration lists, for instance, to re-identify patients in a de-identified data-
set because they shared common demographics (Sweeney, 2002). However,
in 2008 we went back and surveyed all the state electoral commissions
to see what you would actually get if you purchased or found their voter
registration lists. In our investigation we found that the cost of conducting
identification is completely different across the states. For instance, in Wis-
consin it costs almost $13,000 to purchase such a list, whereas in the state
of Minnesota it only costs $46. But it is equally, if not more, important to
recognize that the information available in such resources varies. Date of
birth is provided in voter lists in the states of Tennessee, Washington, and
Illinois, but not in the list published by the state of Wisconsin. Additionally,
in the state of Minnesota, only the year of birth is shown. There are always
ways of intelligently suppressing, generalizing, or perturbing information
such that you preserve the aggregate statistics or the statistics that a learn-
ing health system requires.
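As a rough illustration of quantifying risk before disclosure, the sketch below counts how many records share the same combination of demographic fields, then repeats the count after generalizing date of birth to year of birth. The records, field choices, and function names are hypothetical and are not drawn from the study described above; the smallest group size is only one simple risk indicator among many.

```python
from collections import Counter

# Hypothetical de-identified records: (date_of_birth, sex, zip_code).
records = [
    ("1970-03-14", "F", "37203"),
    ("1970-08-02", "F", "37203"),
    ("1970-11-23", "F", "37203"),
    ("1985-01-09", "M", "37212"),
    ("1985-06-30", "M", "37212"),
]

def smallest_group(rows):
    """Return the size of the smallest group of rows with identical values.

    A group of size 1 means some record is unique on these fields and could
    be linked to a single entry in an outside list carrying the same fields.
    """
    return min(Counter(rows).values())

def generalize_dob_to_year(rows):
    """Replace the full date of birth with the year of birth only."""
    return [(dob[:4], sex, zip_code) for dob, sex, zip_code in rows]

risk_before = smallest_group(records)
risk_after = smallest_group(generalize_dob_to_year(records))
```

In this toy dataset every record is unique on full date of birth, while after generalization no record stands alone, and counts by birth year and sex are unchanged.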
Conclusion
I will conclude with three parting statements on HIPAA, privacy, and
the learning health system. First, as a society we must recognize that pri-
vacy risks are context dependent. There is no silver bullet ensuring that
if a covered entity de-identifies data according to a particular recipe it is
sufficiently protected. Second, the healthcare community must define use
cases for the health information to be utilized. If there are no use cases,
technologists will not know how the learning system should look, and
will be unable to design protections for health information that support
a learning system. We probably will not be able to develop methods that
support all possible needs in healthcare within the next several years, but
we may be able to orient technologies that address some of the bigger chal-
lenges first. Moreover, when providing such use cases, it needs to be made
clear who needs access to the data. Is it the public? Is it the employees of
covered entities? The amount of trust we have in the anticipated recipient
influences the amount of health information that can be reported and the
way in which it is reported. Finally, we need to determine if the system can
learn from the health data remotely. Do we really need to share all of the
data with all of the recipients? Or can we enable an environment that is
built upon query-response systems? The more control we have over where
health information goes and when, the better chance we have of ensuring
that it is appropriately secured.
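One way to picture a query-response environment is sketched below: the records stay with the data holder, and a recipient receives only an aggregate answer. The records, the function name, and the minimum cell size of 3 are assumptions chosen for illustration, not a recommended policy.

```python
# Hypothetical records held by the data custodian; rows are never returned.
_RECORDS = [
    {"age": 54, "diagnosis": "diabetes"},
    {"age": 61, "diagnosis": "diabetes"},
    {"age": 47, "diagnosis": "diabetes"},
    {"age": 39, "diagnosis": "asthma"},
]

MIN_CELL_SIZE = 3  # assumed threshold; a real policy would set this value

def count_query(diagnosis):
    """Answer "how many patients have this diagnosis?" without releasing rows.

    Returns the count, or None when the count is below MIN_CELL_SIZE and the
    answer is withheld.
    """
    n = sum(1 for r in _RECORDS if r["diagnosis"] == diagnosis)
    return n if n >= MIN_CELL_SIZE else None
```

Here a query for "diabetes" is answered with a count, while a query for "asthma" is withheld because only one record matches.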
BUILDING A SECURE LEARNING HEALTH SYSTEM
Ian Foster, Ph.D.
Argonne National Laboratory
A learning health system is “designed to: generate and apply the best
evidence for the collaborative healthcare choices of each patient and pro-
vider; drive the process of discovery as a natural outgrowth of patient care;
and ensure innovation, quality, safety, and value in health care” (IOM,
2007). The security challenge is to ensure that the wrong people do not
learn the wrong things!
A learning health system requires data sharing on a far larger scale
than today. This sharing must occur within a highly fragmented environ-
ment: most of the ~6,000 hospitals in the United States have restrictive and
idiosyncratic data policies and practices, focused on avoiding risk rather
than enabling learning. In this context, secure data sharing is as much a
political as a technological challenge, and will require political as well as
technological solutions. These comments are restricted to technology issues,
and speak to the following questions: What can technology do and not
do? What can we learn from other large-scale distributed systems in which
sensitive data are shared on a large scale? What principles can guide us as
we work to create systems that are sufficiently flexible to encompass not
only today’s applications but those of the future; scalable to a large number
of participants; and robust to various threats, including not only malicious
acts but also human error and the challenges of complexity?
Defining the Problem
Often the hardest step in building a secure system is characterizing what
the system is and what we mean by security. In the case of the U.S. health-
care system, we are dealing with thousands of hospitals, millions of patients,
and tens of millions of visits. Participants differ in their institutional struc-
tures, cost structures, incentives, capabilities, and regulatory environments.
Information technology is often deployed and operated with a view to risk
mitigation or avoidance rather than to enable a learning health system. Data
sharing is needed not only for individual patients, but also for population
health and research studies. Additionally, sharing needs evolve over time, as,
for example, an individual patient moves from one caregiver to another or
a research project is established linking different organizations. The overall
situation is one of complexity, diversity, and constant change.
Further complicating the problem is the fact that the security needs of
this system are not well defined. Policy statements tend to speak in gen-
eralities, stating, for example, that we should ensure security and privacy,
offer patients options, maintain appropriate levels of privacy and security,
and build in security and privacy from the outset (IOM, 2007). None of
these prescriptions is precise. HIPAA regulations try to be specific, but
are open to interpretation and can depend on statistical tests (Jajosky and
Groseclose, 2004). We also have political and social considerations, such as
objections to universal identifiers and different views on opt in vs. opt out.
Principles for Building Secure Systems
Overall, we have a system that is highly complex and a definition of
security that is far from clear. Designing technical solutions to achieve
security in this context is a challenging and, perhaps in some sense, impos-
sible task. Nevertheless, there are basic principles that, if followed, can help
improve the quality of security solutions.
Auditability means that all actions are mapped to individuals and the
origin of all data is unambiguous. Any healthcare security and privacy
solution must inevitably combine technical protections with appropriate
regulatory frameworks (including penalties for release of data). Thus, we
need to build in auditing at a foundational level so that any action per-
formed on healthcare information can be mapped to the individual who
performed that action. Equally important, both for research purposes and
to protect from other sorts of attacks—for example, delivery of incorrect
data—is to ensure that all data can be mapped unambiguously to their
origin. This latter requirement becomes increasingly important as patients
become more mobile.
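A minimal sketch of auditing at a foundational level follows. Each log entry names the individual, the action, the record, and the record's origin. Chaining entries by hash so that later alteration is detectable is an added assumption, not something prescribed in the text, and the class and field names are hypothetical.

```python
import hashlib
import json

def _digest(body):
    """Hash an entry body in a stable, key-sorted form."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

class AuditLog:
    """Append-only log; each entry includes the hash of the previous entry."""

    def __init__(self):
        self.entries = []

    def record(self, actor, action, record_id, origin):
        """Log who did what to which record, and where that record came from."""
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        body = {"actor": actor, "action": action, "record_id": record_id,
                "origin": origin, "prev_hash": prev_hash}
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self):
        """Return True if no entry has been altered, removed, or reordered."""
        prev_hash = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True
```

A caller would invoke `record` on every read or write of healthcare information and run `verify` during review; changing the actor on an earlier entry causes `verify` to fail.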
Scalability means that the cost of adding participants—whether new in-
stitutions or new individuals—is small. Without this property, technological
obstacles too easily impede the new connections required to support patient
mobility and research studies.
Transparency is important from two perspectives. First, we require
transparency with respect to what is done with data and where they are stored.
Second, we need transparency with respect to the policies that are being
enforced and the consequences of those policies. If multiple policies are
being applied, it should be easy to work out what that actually means for
an individual’s data.
These principles may appear obvious, but it is striking how often
systems deployed in healthcare settings ignore them. For example, we fre-
quently see hospitals using virtual private networks (VPNs) to enable secure
remote access. VPN technology is effective in protecting against snooping
of messages transmitted between two points. However, it does not provide
for scalability (every new participant requires an additional point-to-point
VPN), auditability (there is no immediate control over who sees data when
they are received from the remote location), or transparency (the policies
that are enforced in this way are unclear, and the risks of information leak-
age hard to quantify). If, as is often the case, scaling is handled by adding
more VPNs in an ad hoc manner, the result can easily become a complex
system in which both usability and security are compromised.
Technology Success Stories
There are, fortunately, simple and well-understood methods that we
can apply to help achieve auditability, scalability, and transparency. I de-
scribe three such methods here: attribute-based authorization, distributed
attribute management, and end-to-end security. Each has been deployed
and used on a large scale—for example, within grid systems such as the
cancer Biomedical Informatics Grid (caBIG®), Biomedical Informatics Re-
search Network, TeraGrid, and Open Science Grid—albeit for sharing
either scientific data or clinical data for research purposes (Oster et al.,
2008; Pordes et al., 2007). Many of these systems use technologies imple-
mented within the Globus Toolkit (Foster, 2006).
Attribute-based authorization addresses the frequent (and fundamen-
tal) requirement in healthcare security to be able to control who can ac-
cess a piece of data, software program, or other resource. This problem
is often solved by associating an access control list—a list of authorized
individuals—with each resource. However, the cost of change is then high.
If Dr. X joins the team, Dr. X must be added to all relevant access control
lists: a potentially complex and error-prone process.
Using attribute-based authorization, we express access control poli-
cies in terms of the properties that an individual must have in order to be
allowed access. Properties can include the individual’s identity, but more
commonly will be properties such as “has Institutional Review Board (IRB)
approval for participating in study 123” or “is a faculty member in the
department of surgery.” Attribute-based authorization provides scalability,
because a single rule can govern any number of people that satisfy that rule.
In addition, we end up with greater transparency. Instead of having to work
out what Alice, Bob, and Chris have in common, we can read the access
control rule to determine what condition applies. An important technology
here is the eXtensible Access Control Markup Language, frequently used to
express access control policies.
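The contrast between an access control list and an attribute-based rule can be sketched as follows. The resource name, the attribute strings, and the choice to require both attributes at once are hypothetical; a production system would more likely express the rule in a policy language such as the one named above rather than in Python sets.

```python
# Access control list: every authorized individual is named on every resource.
acl = {"study-123-dataset": {"alice", "bob", "chris"}}

def acl_allows(user, resource):
    """Allow access only if the user is listed by name on the resource."""
    return user in acl.get(resource, set())

# Attribute-based rule: one rule covers anyone holding the required attributes.
policy = {"study-123-dataset": {"irb_approved:study-123", "faculty:surgery"}}

def abac_allows(user_attributes, resource):
    """Allow access if the user holds every attribute the rule requires."""
    required = policy.get(resource)
    if required is None:
        return False  # no rule means no access
    return required <= set(user_attributes)

# Dr. X joins the team: no list needs editing, only the attributes matter.
dr_x = {"irb_approved:study-123", "faculty:surgery"}
```

With the list, Dr. X is denied until someone edits every relevant entry; with the rule, Dr. X is admitted as soon as the attributes hold, and the rule itself states the condition for access.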
Distributed attribute management is an important adjunct to attribute-
based authorization. The idea is that we rely on authoritative sources for
all attributes. For example, an institution is likely the authoritative source
for attributes concerning employment status and qualifications; the IRB for
attributes concerning IRB approvals; and the National Institutes of Health
for membership of study sections. Then, when an individual attempts to
access a resource, the security system requests the relevant attributes from each
required authoritative source; each source takes responsibility for ensuring that its
attributes are issued correctly. With the attributes in hand, the security system can then
enforce appropriately the policies that apply at the individual resource. An
important technology here is the Security Assertion Markup Language,
which defines protocols and representations for requesting and communi-
cating attribute assertions.
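The flow described here can be sketched with plain function calls standing in for the authoritative sources. The source functions, user names, and attribute strings are hypothetical; in a real deployment each source would return a signed assertion over the network, not an in-memory set.

```python
# Each hypothetical authority answers only for attributes it is responsible for.
def institution_source(user):
    staff = {"dr_x": {"faculty:surgery"}}
    return staff.get(user, set())

def irb_source(user):
    approvals = {"dr_x": {"irb_approved:study-123"}}
    return approvals.get(user, set())

# Maps an attribute kind to the source that is authoritative for it.
AUTHORITIES = {"faculty": institution_source, "irb_approved": irb_source}

def collect_attributes(user, required):
    """Ask the authority for each required attribute whether the user holds it."""
    held = set()
    for attribute in required:
        kind = attribute.split(":", 1)[0]
        source = AUTHORITIES.get(kind)
        if source is not None and attribute in source(user):
            held.add(attribute)
    return held

def authorize(user, required):
    """Grant access only if every required attribute was confirmed by its source."""
    return collect_attributes(user, required) == set(required)
```

The resource never stores who is on the faculty or who holds which approval; it asks the party that knows, so a change at the institution or the IRB takes effect on the next request.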
End-to-end security is a scalable, more capable alternative to VPNs.
As we extract data from databases and move them to remote locations,
there will typically be a set of things that we want to ensure happen: that
the data are anonymized, that their provenance is documented, that they
are not modified en route, and that privacy is preserved. We can achieve
many of these things by wrapping the data in a cryptographic envelope
that can then be processed appropriately as data move from one location
to another. By thus packaging data in a manner that maintains key proper-
ties independent of context, we enhance our ability to achieve auditability,
scalability, and transparency.
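A minimal sketch of such an envelope, covering only provenance and protection against modification en route, is given below. It assumes a secret key shared by sender and receiver and uses a keyed hash from the Python standard library; the function names are hypothetical. Confidentiality would additionally require encryption, and a deployed system would more likely use public-key signatures than a shared key.

```python
import hashlib
import hmac
import json

def seal(payload, origin, key):
    """Wrap a payload and its origin in an envelope with an integrity tag."""
    body = json.dumps({"origin": origin, "payload": payload}, sort_keys=True)
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}

def open_envelope(envelope, key):
    """Return the body as a dict if the tag verifies; otherwise raise ValueError."""
    expected = hmac.new(key, envelope["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("envelope was modified or the key is wrong")
    return json.loads(envelope["body"])
```

Because the tag travels with the data, any location that holds the key can check the origin and integrity of what it receives, regardless of how many hops the envelope has taken.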
Summary
Security is a systems problem. Without clarity on the nature of the
system we are securing, and what we mean by security, we will likely fail
to create secure systems. We need to spend more time studying these issues
within the context of a learning health system. Auditability, scalability, and
transparency are all properties that we should seek to realize as we design
a secure learning health system. In architecting security solutions, we can
leverage attribute-based authorization, distributed attribute management,
and end-to-end security—three methods that have been proven to scale and
that tend to support these desirable properties.
REFERENCES
AHRQ (Agency for Healthcare Research and Quality). 2009. Consumer engagement in
developing electronic health information systems.
http://healthit.ahrq.gov/portal/server.pt/gateway/PTARGS_0_9442_909189_0_0_18/09-0081-EF.pdf
(accessed January 31, 2011).
CDT (Center for Democracy and Technology). 2009. Rethinking the role of consent in
protecting health information privacy. http://www.cdt.org/files/pdfs/20090126Consent.pdf
(accessed January 31, 2011).
CHCF (California HealthCare Foundation). 2005. National Consumer Health Privacy
Survey 2005. http://www.chcf.org/publications/2005/11/national-consumer-health-privacy-survey-2005
(accessed January 31, 2011).
Foster, I. 2006. Globus Toolkit version 4: Software for service-oriented systems. Journal of
Computational Science and Technology 21(4):523-530.
Harris Interactive. 2007. Many U.S. adults are satisfied with use of their personal health
information. http://www.harrisinteractive.com/vault/Harris-Interactive-Poll-Research-Health-Privacy-2007-03.pdf
(accessed January 31, 2011).
IOM (Institute of Medicine). 2007. The learning healthcare system: Workshop summary.
Washington, DC: The National Academies Press.
———. 2009. Beyond the HIPAA privacy rule: Enhancing privacy, improving health through
research. Washington, DC: The National Academies Press.
Jajosky, R., and S. Groseclose. 2004. Evaluation of reporting timeliness of public health sur-
veillance systems for infectious diseases. BMC Public Health 4(1):29.
Markle Foundation. 2006. The common framework: Overview and principles.
http://www.markle.org/sites/default/files/Overview_Professionals.pdf (accessed February 25, 2011).
NCVHS (National Committee on Vital and Health Statistics). 2007. Enhanced protections
for uses of health data: A stewardship framework for “secondary uses” of electronically
collected and transmitted health data. http://www.ncvhs.hhs.gov/071221lt.pdf (accessed
February 25, 2011).
Oster, S., S. Langella, S. Hastings, D. Ervin, R. Madduri, J. Phillips, T. Kurc, F. Siebenlist,
P. Covitz, K. Shanbhag, I. Foster, and J. Saltz. 2008. caGrid 1.0: An enterprise grid
infrastructure for biomedical research. Journal of the American Medical Informatics
Association 15(2):138-149.
Pordes, R., D. Petravick, B. Kramer, D. Olson, M. Livny, A. Roy, P. Avery, K. Blackburn, T.
Wenaus, F. Würthwein, I. Foster, R. Gardner, M. Wilde, A. Blatecky, J. McGee, and R.
Quick. 2007. The Open Science Grid. Paper presented at Scientific Discovery Through
Advanced Computing (SciDAC) Conference.
Shortliffe, E. 2010. Tracking e-health. Issues in Science and Technology (Spring):92-95.
Sweeney, L. 2002. k-Anonymity: A model for protecting privacy. International Journal of
Uncertainty, Fuzziness, and Knowledge-Based Systems 10(5):557-570.
West, D. M., and E. A. Miller. 2009. Digital medicine: Health care in the Internet era. Wash-
ington, DC: Brookings Institution Press.