Read "Protecting Data Privacy in Health Services Research" at NAP.edu

Suggested Citation:"1 Introduction." Institute of Medicine. 2000. Protecting Data Privacy in Health Services Research. Washington, DC: The National Academies Press. doi: 10.17226/9952.

1 Introduction

Health services research (HSR), through the analysis of large databases of health information, offers the potential to improve the quality of health care delivery and the effectiveness of health care policies. At the same time, the analysis of personally identifiable health information from many individuals raises concerns about privacy and confidentiality. We need to protect the individual subjects of study (where participation in the study may, but will not necessarily, benefit these subjects) by taking measures that are reliable, but are also compatible with good research that can benefit society as a whole. Ensuring both values is particularly important at this time because of policy debates about health privacy and the confidentiality of computerized health information, and recent criticisms about the effectiveness of institutional review boards (IRBs) in protecting research subjects, although much of the recent criticism has actually focused on clinical trials.¹

This project charged the Institute of Medicine (IOM) with gathering information on current practices and principles followed by IRBs that review HSR, both under the federal regulations and in privately sponsored studies. In addition, the IOM was asked to recommend, if appropriate, best practices for safeguarding the confidentiality of personally identifiable health information in HSR.

¹	Regarding policy and confidentiality, see for example Applebaum, 2000; IOM, 1994; NRC, 1997; Etzioni, 1999; Gostin and Hadley, 1998; Hanken, 1996; GHPP, 1999; Goldman, 1998. Regarding IRB effectiveness, see for example Brown (OIG), 1998b, 2000; Brainard, 2000; GAO, 1996; Edger and Rothman, 1995.

This introductory chapter summarizes the context of the issue of privacy and confidentiality in health services research, including the background of the study, IRBs, HSR and privacy, and the scope and limitations of the current project. This chapter closes with an overview of the remaining chapters of the report. The remaining chapters describe some current and best practices that the committee learned of pertaining to the protection of confidentiality through the application of technology, implementation of informed policies, and training and support of personnel. Finally, the report suggests further steps that would lead to additional improvements in protection of confidentiality in HSR, while at the same time making oversight by IRBs (or other review boards) more effective and efficient. In this report, “effective oversight” includes the idea that the oversight will be reliable and trusted throughout our diversified society and, thus, able to balance societal benefit and individual privacy. Effective oversight will therefore be an efficient means toward allowing valuable HSR to proceed.

PRIVACY AND RESEARCH

Federal policies on the protection of human subjects in all types of research rest on IRB review of the research proposals and protocols, and on obtaining the informed consent of subjects. Both apply somewhat differently in HSR than in clinical research, which increases the scope and complexity of research oversight in general. IRB review is complicated because HSR studies often have characteristics that mean they do not require full IRB review and discussion. On the other hand, independent review of these studies may help ensure that confidentiality is adequately protected. The regulations allowing IRBs to exempt studies from full review are described in more detail in Chapter 2. “Exemption” is a formal term in the regulations applied to studies that have such minimal impact on the subjects that no further oversight by an IRB is needed. For situations of somewhat more, but still small, impact, the proposal might receive expedited review from just one or a few members rather than the entire review board. In general, an IRB representative makes the determination of whether a project might be eligible for exemption or expedited review. Informed consent is complicated because many HSR projects involving analysis of personal health data collected previously for another purpose are eligible for waiver of informed consent. Indeed, obtaining informed consent is not feasible for many HSR projects.

The methods of HSR are varied and may include not only secondary analysis of previously collected data, but also primary data collection through surveys and interviews. This report focuses on the secondary analysis of data, including personal health information, that have already been collected for some other purpose, because this type of analysis raises the most challenging ethical issues. In research where investigators collect primary data through surveys and interviews, the subject knows that research is being conducted, can find out more about the research, and has an opportunity to decline to participate. By contrast, in secondary analyses of the type described, individuals may not know that they are subjects of research and may not have the opportunity to decline to participate. The researchers also may be unable to identify subjects individually and, thus, unable to contact them for consent. Some people may, however, object if researchers have access to their health information without their knowledge or consent.

The committee recognized that important privacy and confidentiality concerns also arise in other forms of research using previously collected data (e.g., research using archival tissue specimens) and in many types of research in which new data are collected. Each of these areas merits careful study and the dissemination and adoption of best practices for protecting confidentiality. Indeed, the committee affirms that all personally identifiable health information, no matter how it was collected or for what purpose, should be treated so as to respect privacy and maintain confidentiality. This report reflects the committee's specific charge to focus on the analysis of existing data used in HSR after collection for another purpose.

Privacy and Confidentiality

Justice Louis Brandeis' reference to “the right to be let alone” (Olmstead v. U.S., 1928) stands as a vivid and succinct definition of privacy in general, but for the purposes of this study, definitions more focused on information should be considered (Box 1-1).²

For the purposes of HSR, privacy can be understood as a person's ability to restrict access to information about himself or herself. Privacy is valued because respecting privacy in turn respects the autonomy of persons, protects against surveillance or intrusion, and allows individuals to control the dissemination and use of information about themselves. Privacy fosters and enhances a sense of self and also promotes the development of character traits and close relationships (IOM, 1994). The federal regulations governing human research (45 CFR 46.102(f)) discuss privacy in the following terms:

Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (for example, a medical record). Private information must be individually identifiable (i.e., the identity of the subject is or may readily be ascertained by the investigator or associated with the information) in order for obtaining the information to constitute research involving human subjects.

The regulations thus characterize privacy in terms of the expectations of the persons whose personally identifiable health information is being discussed and stipulate that the information must be specifically associated with the individual in order for the individual to have a legitimate interest in protecting it. Individuals may, however, be harmed or wronged by information associated with them probabilistically as well as by specifically identifiable information.

²	Lowrance, 1997; NRC, 1997; Buckovich, et al., 1999; OPRR, 1993; Bradburn, 2000.

Confidentiality refers to controlling access to the information that an individual has already disclosed, for example, a patient to a treating physician or to an insurance company paying for care. Confidentiality is a major expression of respect for persons: the person has trusted the health care provider with private information in the belief that the information will be guarded appropriately and used only for that person's benefit. Maintaining confidentiality is considered important also because it encourages patients to seek needed care and to discuss sensitive topics candidly with their physicians. If patients do not believe they can trust their health care providers to maintain confidentiality, they may withhold information to the detriment of the best medical judgment and care they might receive. Confidentiality is violated if the person or institution to whom information is disclosed fails to protect it adequately or discloses it inappropriately without the patient's consent. The dilemma about HSR is that personally identifiable health information that is disclosed or collected for one purpose (clinical care, billing, etc.) is then used without consent for a different purpose (improving the state of knowledge to benefit future and current patients).

Confidentiality is also important to the continued success and vitality of the HSR effort. Just as in the case of medical treatment, research subjects may withhold information if they do not have confidence that what they disclose will be protected. Further, it is crucial to the HSR effort that researchers design studies so that the risk of harm to subjects is minimal, in order to allow the protocol to qualify for a waiver of the informed consent requirement. HSR projects often apply methods to large databases of previously collected information where individual informed consent would be impracticable or impossible. The effect of losing the population's trust in confidentiality may have serious repercussions both for the effective quality of medical care and for the quality of medical records research. A 1999 poll by the California HealthCare Foundation (CHCF, 1999) found that approximately one in five respondents believed their personal medical information to have been improperly disclosed by a health care provider, insurance plan, government agency, or employer. Approximately one in six respondents said they had taken some extra precautions to make sure that medical information about them remained confidential, including paying out of pocket, giving false information, and avoiding care. These figures have been interpreted both as alarmingly large and as reassuringly small. In either case, the numbers do suggest that there is significant potential for the reliability of personally identifiable health information to decrease if the population's trust that the confidentiality of such information will be maintained decreases.

BOX 1-1 Definitions about Privacy

Informational privacy: the right of individuals to control access to, and the use of, information about themselves.
Data privacy: informational privacy, especially when the information in question is stored in a database.
Health information privacy: informational privacy, especially when the information in question pertains to the health or medical condition of the individual in question.
Confidentiality: the manner of treating private information, which has been disclosed by the individual subject of the information to a particular person or persons for a specific purpose, such that further disclosure of the information will not be allowed to occur without authorization.

Benefits and Risks of Harm in Research

All research on human subjects raises ethical concerns because participants in research undergo risks of potential harm primarily, if not solely, for the benefit of others. Balancing benefit and risk of harm is an essential part of the design of any human subjects research. Physicians are familiar with the ethical obligation to balance benefits and risks when providing clinical care. However, in clinical care the patient both directly benefits from interventions and directly accepts the risks. Research, on the other hand, is not intended to benefit the subjects directly, because we do not actually know which treatment is best. It is therefore even more important in research to ensure that the risks are acceptable in proportion to the likely benefits and that the risks are minimized. Indeed, these ethical principles are at the core of federal regulations on research involving human subjects.

Federal Regulations

Federal regulations govern human subjects research when the research is federally supported or regulated (e.g., by the Food and Drug Administration). The body of federal regulations about human subjects protection (45 CFR 46 Subpart A) is called the Common Rule, since it has been adopted “in common” by many federal departments and agencies that are involved in research with human subjects. The Food and Drug Administration (FDA) has adopted similar regulations tailored to its functions (21 CFR 50 and 56). This report uses the general term “federal regulations” to refer to all CFR sections dealing with human subjects protection. In addition, organizations that carry out many projects that are federally funded and involve human subjects can negotiate multiple project assurances (MPAs) with the Office for Human Research Protections (OHRP, formerly the Office for Protection from Research Risks or OPRR). Most organizations holding MPAs agree to carry out all their research according to federal regulations, regardless of whether all the research is intrinsically subject to Common Rule regulation.

In the federal regulations, the IRB of a particular organization is charged with reviewing and approving all research covered by the regulations that is proposed under the auspices of the organization (note, however, that the responsibility for ensuring compliance falls to the organization, not upon the IRB itself). In order to approve research, the IRB must be satisfied that, among other requirements (45 CFR 46.111):

risks to subjects are minimized and are reasonable in relation to anticipated benefits,
selection of subjects is equitable,
informed consent is obtained to the extent required, and
provisions to protect the privacy of subjects and to maintain the confidentiality of data are adequate.

HEALTH SERVICES RESEARCH

Health services research is the study of the effects of different modes of organization, delivery, and financing of health care services (see Box 1-2). HSR includes studies of the effectiveness of health care interventions in real-world settings, as contrasted with studies of the efficacy³ of interventions under controlled settings such as a clinical trial.

HSR raises particular issues regarding the protection of human subjects that differ from the problems of clinical research, just as the methods of HSR differ from the methods of clinical research.⁴ First, many HSR projects involve minimal risk of harm to subjects, so they may qualify for a waiver of informed consent, and individual informed consent is often impractical or impossible in HSR projects.⁵ For example, an HSR project may carry out secondary analyses of data previously collected in the delivery of patient care or the payment for such care. If the subjects whom the project will involve are enrollees in the federal Medicare program, the number of subjects may be as many as several million individuals. Further, many HSR projects use data that are already public and de-identified, so they may qualify for exemption from IRB review or for expedited review. Finally, many private organizations do HSR (or run programs such as quality improvement that use similar data and methods) that is not covered by the federal regulations. These organizations may not have IRBs.

³	The term “efficacy” refers to how reliably an intervention brings about a given result under ideal, controlled conditions. The term “effectiveness” refers to how an intervention performs in the complex and variable context of real-world use and practice.

⁴	There are other fields of research, such as epidemiology, however, that share with HSR similar methods and databases but evaluate different public health questions (e.g., the frequency of rare medication side effects). Although not examined here, the practices reviewed by the committee for HSR would likely apply in these other fields.

⁵	Informed consent is not always feasible for clinical trials either: FDA regulations at 21 CFR 50.24 allow for the use of investigational drugs without informed consent under certain conditions such as cases where the subject's condition is immediately life-threatening, the subject is not able to participate in giving consent, time does not permit seeking proxy consent, and no alternative approved or generally recognized therapy is available.

Since the object of HSR includes the study of health care operations and HSR uses many of the same methods used in health care operations units to assess their own performance, HSR is fundamentally connected to nonresearch investigations within health care organizations.

The committee heard one account describing the situation as a continuum, with HSR at one end of the scale and operations at the other end (see Figure 3 in Appendix B). Some HSR projects are clear examples of research: applying scientific methods to test hypotheses and produce new, generalizable knowledge. Other projects are clear examples of internal exercises to assess the quality of the operations of the specific organization with no intention of producing generalizable knowledge. At the same time, quality assessment and quality improvement (QA and QI) exercises sometimes reveal interesting and important data that the organization recognizes to be of general interest, and that therefore ought to be published. In addition, both scientific research in health services and investigations into the internal operation of a health services organization use many of the same methods (e.g., chart review, database analysis and linkage).

In fact, many projects may start out as operations assessment and then become more like research, and many research projects involve doing very much what would be done in an internal operations assessment. This continuum is one of the interesting, if problematic, features of HSR. The committee proceeded with a view to the clearer cases of research in health services, always mindful of the less clear cases and closely related operations assessment exercises. From the point of view of the patient or subject (the person whose personally identifiable health information may be reviewed or used), the continuum appears more like a widening circle of disclosure. At the center is the individual and health information not yet shared with anyone; then, according to Etzioni's description (Etzioni, 1999), comes the inner circle of those with whom the individual shares information because they will use the information directly in the care of that individual. Next comes the intermediate circle of payers, and finally the widest circle of everyone else who may have an interest in the individual's health information, but with whom the individual may or may not have an interest in sharing the information (Figure 1-1).

FIGURE 1-1 Circles of disclosure. SOURCE: Adapted from Etzioni, 1999.

BOX 1-2 Definitions of Health Services Research

Institute of Medicine (IOM):

Health services research is a multidisciplinary field of inquiry, both basic and applied, that examines the use, costs, quality, accessibility, delivery, organization, financing, and outcomes of health care services to increase knowledge and understanding of the structure, processes, and effects of health services for individuals and populations.

Association for Health Services Research:

Health Services Research is a field of inquiry using quantitative or qualitative methodology to examine the impact of the organization, financing and management of health care services on the access to, delivery, cost, outcomes and quality of services.

The clearest examples of QA and QI occur in organizations involved in health care delivery or payment. In many such assessments, individual patient cases have to be reviewed. For example, if an organization is trying to reduce drug errors in the hospital or shorten the length of stay after coronary bypass surgery, it may need to review the medical records of individual patients to get a clear idea of how the process of care might be improved. Furthermore, when a health care organization is investigating a “critical incident” in which an error occurred, the QA committee will have to review the individual case in detail. HSR studies generally do not require investigators to know all personal information about each individual subject, but they often do require preservation of linkages (via consistent code numbers) across data files for individuals.
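As an illustration only, the idea of preserving linkages through consistent code numbers can be sketched in a few lines of Python. The sketch assumes a keyed hash (HMAC-SHA256) as the coding method, which is one possible technique and not one the report prescribes; the key, the field names, and the sample records are all hypothetical. Replacing direct identifiers in this way does not by itself guarantee that individuals cannot be re-identified from the remaining fields.

```python
import hmac
import hashlib

# Hypothetical secret held only by the data custodian (an assumption for
# this sketch); researchers would receive coded files, never the key.
SECRET_KEY = b"replace-with-a-randomly-generated-key"


def study_code(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Return the same code number every time for the same patient ID."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def pseudonymize(records, id_field="patient_id", drop_fields=("name",)):
    """Swap the direct identifier for a study code and drop named fields."""
    coded_records = []
    for rec in records:
        coded = {
            k: v for k, v in rec.items()
            if k != id_field and k not in drop_fields
        }
        coded["study_code"] = study_code(rec[id_field])
        coded_records.append(coded)
    return coded_records


# Two hypothetical data files that mention the same individual.
claims = [{"patient_id": "A123", "name": "Jane Doe", "drug": "beta-blocker"}]
discharges = [{"patient_id": "A123", "name": "Jane Doe", "stay_days": 4}]

coded_claims = pseudonymize(claims)
coded_discharges = pseudonymize(discharges)

# The records can still be linked across files by study code alone.
linked = coded_claims[0]["study_code"] == coded_discharges[0]["study_code"]
```

Because the same key yields the same code for the same person in every file, an analyst can join the coded files without seeing names or original identifiers, while the custodian who holds the key retains the ability to recompute the codes if follow-up is authorized.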

BENEFITS OF HSR

HSR can lead to improvements in the delivery and organization of health care, which may in turn improve health outcomes and the cost-effectiveness of care for patients. It addresses large-scale systemic effects of health care delivery changes that are difficult if not impossible to understand at the level of the individual citizen, consumer, or patient. This kind of information is important for planners and policy developers in both government and the health care industry. It is also increasingly important for both corporate and individual consumers of health care services (Clancy and Eisenberg, 1998; Eisenberg, 1998). Our health care system is incorporating greater reliance on market decisions to improve quality and control costs, but these decisions can be made well only with access to good information about different health care services options. HSR provides objective data on questions about the effectiveness of institutional variables, just as clinical trials assess the efficacy of interventions in individuals. HSR projects aim at a variety of levels of the health care system. Some examples follow.

Policy Assessment. Well-intentioned public policies may have unanticipated adverse consequences or fail to fulfill their goals. However, whether they are meeting or missing their goals, it is often difficult to assess their outcomes without a systematic examination. HSR generates data on the outcomes of public policies and provides an empirical base for modification or refinement of these policies. For example, Gross et al. (1998) compared previously collected data from several sources (including the Medicare Current Beneficiary Survey) to estimate out-of-pocket health care spending by lower-income Medicare beneficiaries. The authors found that although Medicaid provides significant protection for some lower-income Medicare beneficiaries, out-of-pocket health care spending continues to be a substantial burden for most of this population. This fact may be important in considering policies that would depend on further cost shifting to increase out-of-pocket expenditure. In another example, Cromwell et al. (1998) compared data over a four-year period on Medicaid anti-ulcer drug claims, Medicaid eligibility, and acute care nonfederal hospital discharges to assess what effect a policy of restricting reimbursement for Medicaid anti-ulcer drugs had on the use of these drugs and on peptic ulcer-related hospitalizations. Following implementation of the policy, reimbursements for the drugs decreased 33 percent but there was no associated increase in the rate of Medicaid peptic ulcer-related hospitalizations. These results opened further research questions because there may have been quality-of-life implications for some patients that the study did not address. Addressing these questions has important public policy implications.

Outcome Predictors. The question of whether it makes a difference to have a procedure done in a hospital that has a high case load of similar procedures is important to policy makers and to individuals who may need an operation. HSR studies have demonstrated in several cases that centers performing a greater volume of procedures have better patient outcomes. Norton et al. (1998) examined the effects of case volume on outcomes for knee replacement surgery using Medicare claims data and found the results so striking that they recommended against expanding knee replacement surgery to new centers generally and instead recommended concentrating on developing hub centers. Other groups of investigators have examined the relationship between volume and outcomes in coronary interventions, where the inverse relation between volume and mortality has been known for two decades. This does not mean that new information is not important in changing policy, however. Sollano et al. (1999) investigated the outcomes in several New York hospitals of three types of operations and found that although the relationship (high volume associated with lower mortality) persisted in two, it no longer held true for the third, coronary artery bypass grafting. The authors attribute the disappearance of the relationship in this operation to a recent quality improvement program in bypass operations in the region, with important implications for the effects of QI programs in general. On a similar note, Malenka et al. (1999) showed that for one type of surgery (percutaneous coronary interventions), the operator did not have to do as many procedures per year to maintain top performance as had previously been believed, which they attributed to changes in practice due to new devices and drugs.

Provider Practices. HSR develops data on physician behavior and practices. Zito et al. (2000) recently demonstrated a significant overall increase in the rate of prescriptions of psychotropic drugs for preschool-aged children (using data from two state Medicaid programs and a salaried group-model health maintenance organization [HMO]), which is of note in part because there are few controlled clinical trials of the safety and efficacy of such drugs in this young age group.

HSR studies also help identify factors that may predict underuse of services that are known to be beneficial. Patients who have survived a heart attack are known to have lower mortality if they receive medications such as beta-blockers, aspirin, and cholesterol-lowering drugs, yet these drugs are underused. Recent research confirms this and supports the use of beta-blockers even in diabetic patients, a group for whom some physicians had been reluctant to prescribe them (Chen et al., 1999). Other HSR studies have sought to address questions of the adoption or lack of adoption of clinical practice guidelines. Katz (1998), analyzing the AHRQ guideline for unstable angina, found several barriers to adoption, including physician variability, incomplete specification of exceptions, and unexpected increases in demand for care. Recognition of these barriers can then be incorporated in the development of future guidelines to ease their adoption.

Effects of Business Practices and Law on Health Product Delivery. Brooks et al. (1998) used HSR techniques to demonstrate that independent pharmacies are at a significant disadvantage compared to chains when negotiating with insurers. Collective bargaining by pharmacies might mitigate this disadvantage but is currently prevented by antitrust law. Such information would be impossible for a consumer to obtain and therefore impossible to act on, yet the consumer certainly feels the effects on the pocketbook of more or less negotiating power.

RISKS OF HARM FROM HSR

The risks of HSR are primarily violations of privacy and confidentiality, not physical risks. HSR thus differs from clinical research in which patients are at risk for physical harms because they undergo invasive medical procedures or receive unproven new therapies. Potential risks of violations of privacy or breaches of confidentiality are by no means limited to research, but can occur anytime personally identifiable health information has been collected. Potential risks include the following:

Risk of public (or private) disclosure of protected health secrets, which can lead to stigmatization or discrimination in employment or insurance, and/or shame: this is the fundamental issue and, for most people, probably the most serious.
Risk of disruption of, or interference in, patterns within families, which may result from unexpected and unauthorized communication of secrets within the family.
Risk that individuals may recognize (correctly or not) their own health history or anecdotes in results and interpretations of a study or may suffer anxiety simply from knowing that personal data may be in a database, without knowing whether adequate privacy protections are in place: this subjects the person to the perception of the first risk, even if it is not actually present.
Risk of future contact. Privacy is “the right to be let alone.” Yet some HSR studies permit follow-up investigations that include contacting the individual whose data are studied. In this case, a stranger to the person or (perhaps less alarming but still disruptive) a care provider from long ago can suddenly intrude upon the subject's right to be left alone.
Risk of loss of trust in the health care system and/or scientific research, and thus loss of willingness to participate in future studies or perhaps even to seek needed health care.⁶

These psychosocial harms can be avoided or mitigated if the research data are coded or encrypted in such a way that individual subjects cannot be identified. In addition, some harms can be prevented by strong antidiscrimination laws. However, subjects may be wronged by violations of privacy and confidentiality, even if they suffer no tangible harm. That is, even if persons do not suffer employment difficulties or can be compensated by law if they do, this does not change the fact that the subjects did not receive the respect due them as persons. The federal regulations on research on human subjects explicitly require IRBs to consider wrongs as well as harms in assessing the benefits and risks of research.

Breaches in the confidentiality of previously collected data can occur in a variety of ways. For example, an employee who has a legitimate need to access part of the database to carry out his or her job may make unauthorized use of that access: a clerk in charge of determining insurance eligibility, or a nurse who is not providing direct care to the patient, may review the records of an acquaintance or a celebrity just for the sake of satisfying curiosity. The great majority of occasions for data transfer and access occur not through research (or malfeasance), but in standard health care operations. The great majority of occasions for breach of confidentiality likewise occur in daily operations.⁷ Some instances of breaches of confidentiality are unintentional, for example, leaving a record that includes a patient's name out in the view of a visitor or discussing a patient by name in the hearing of other parties in an elevator or cafeteria (Ubel et al., 1995).⁸ Also, some breaches are not accidental, but are oversights. The committee heard, for example, of one incident in which the names of employees tested for HIV were displayed with the test results on a slide at a presentation. The aim of the presentation was simply to describe a database of in-house health records. Some of the employees whose records were listed in the section displayed were actually attending the meeting. In this case the breach of confidentiality could have been avoided through more attention or training on the part of the research team and by the use of coded identifiers rather than direct identifiers such as names.

⁶	For a full discussion of this problem see, for example, Goldman (1998).

⁷	The recent NRC report For the Record (1997) discussed the increasing complexity of health information flow in detail (see especially Figure 3.1, p. 73). See also Goldman and Hudson (1999).

As our health care system becomes more complex, information flow is likewise increasingly complicated and the potential occasions for either a breach, or perception of a breach, of confidentiality are correspondingly multiplied. For example, a database marketing firm received patient prescription records from two large pharmacies in the Washington, D.C. metro area (Lo and Alpers, 2000). The firm then created mailings targeted to consumers of certain prescription drug products on behalf of the pharmacies (using the letterhead of the pharmacies), informing them of new products with similar indications. The project was sponsored by the manufacturers of the new products, though the manufacturers did not have access to patient data. Many of the recipients were disturbed at receiving the letters, since the action seemed to straddle or even cross the line between standard prescription medication compliance letters that are often sent by pharmacies to patients and product marketing.⁹

⁸	In some cases the disclosure may be intentional: a particularly famous example of the improper disclosure of personally identifiable health information occurred with the unauthorized release of the HIV status of Mr. Arthur Ashe (mentioned widely, e.g., in Shalala, 1997). It is important in the context of this report, however, to note that this disclosure was not made in the course of research, and it was accomplished with paper records. Of course, such disclosures are also not part of normal health care operations.

⁹	Perceptions of breach of confidentiality can also include cases where an individual has (knowingly or unknowingly) provided information in the course of responding to a consumer survey or calling a product hotline, either of which often results in the individual receiving marketing materials including disease-related product advertising information. On receiving such information, some individuals may assume that private health information was shared by their health care providers, not realizing that they themselves had provided the information for the marketing effort. A different type of concern, again not in research but in operations, was described in a previous IOM report, Health Data in the Information Age (IOM, 1994). The report noted that increasing the fringe benefits offered by employers also increases pressure on the employers to control costs and that information about an employee's health may be shared through the company to tailor plans so as to reduce liability (the report referred to a case that upheld the right of an employer to reduce benefits, in which an employer became self-insured and established a limit on AIDS-related expenses after a current employee was diagnosed with AIDS [p. 159]).

Despite the potential for misuse, there are important and legitimate reasons to maintain some identifiability in personal health information databases. Much of the value of retrospective data-based research comes from the ability to draw inferences from data derived from different sources. For example, health care organizations are often interested in identifying episodes of illness in a patient, which may be manifest in records of emergency room visits, ambulance services, hospital stays, operative records, bills from independent medical providers, rehabilitation services, pharmacies and pharmacy benefit managers, and so forth. In order to recognize that the data drawn from these various sources refer to the experiences of a single individual, it is important that researchers be able to identify the same patient in each set of records. This identification allows joining these various datasets into a single (logical) database that contains all relevant data about the patient. Such identification and joining is often difficult, and is one of the motivations for keeping identifiers. The actual identity of any individual is not really necessary to support the linkage between databases that have been joined; all that is required is a unique identifier, which might (at least in principle) be difficult to re-associate with the actual patient.
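The point that linkage needs only a unique identifier, not the patient's actual identity, can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a trusted unit holds a secret key and replaces each direct identifier with a keyed hash before the datasets are joined; the key, the field names, and the sample records are all hypothetical and do not describe any practice reported to the committee.

```python
import hashlib
import hmac

# Hypothetical secret held only by a trusted linkage unit (an assumption of this sketch).
LINKAGE_KEY = b"example-secret-key-not-for-real-use"


def pseudonym(patient_id: str, key: bytes = LINKAGE_KEY) -> str:
    """Return a stable coded identifier derived from a direct identifier."""
    # Normalize so that trivial formatting differences do not break the linkage.
    normalized = patient_id.strip().upper()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link(*sources):
    """Join records from several sources on the coded identifier only."""
    linked = {}
    for source_name, records in sources:
        for record in records:
            code = pseudonym(record["patient_id"])
            # Drop the direct identifier before the record enters the research file.
            payload = {k: v for k, v in record.items() if k != "patient_id"}
            linked.setdefault(code, []).append((source_name, payload))
    return linked


# Invented sample data from two hypothetical sources.
er_visits = [{"patient_id": "a123", "event": "ER visit"}]
pharmacy = [
    {"patient_id": "A123 ", "event": "prescription filled"},
    {"patient_id": "B456", "event": "prescription filled"},
]

episodes = link(("er", er_visits), ("pharmacy", pharmacy))
for code, events in episodes.items():
    print(code[:8], events)
```

Because the same input always yields the same code, records from different sources fall together without the name or record number traveling with them. Anyone who holds the key can regenerate the codes, so this approach only makes re-association difficult, as the text notes, not impossible.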

Even when research data are recorded in coded or encrypted format, however, it may be possible to identify individual subjects at least with good probability. Records are directly identifiable when individual identifiers such as names or Social Security numbers are collected or retained (also called “manifest identifiability”). Yet individuals might be identified, at least probabilistically, by linking otherwise de-identified data so that the resulting record effectively identifies a particular individual. In this latter case, the information is said to be indirectly identifiable (or “identifiable by inference”). For example, race may not be a direct or manifest identifier in the general population, but when combined with the zip code of a relatively homogeneous area, a person of contrasting race could be identified.

In one example of identification by inference, Latanya Sweeney showed that three data fields (e.g., birth date, sex, and zip code) were sufficient to create a linkage in databases, locating, with good probability, the records pertaining to a single individual who was employed by the state. She was able to do this in a matter of hours using data that had been made publicly available only after all the (known) identifiers had been removed (L. Sweeney, personal communication, 2000; Sweeney, in press; see also Sweeney, 1997, for further discussion). This example shows, first, that supposedly de-identified data may still be personally identifiable when combined with other available data that either are complete or do identify individuals. Second, it shows that the ability to manipulate databases to locate individual subjects has increased due to advances in computing. Even if the information collected is no more invasive now than previously, it is now feasible for others to glean personally identifiable health information from such data where it would have been much more difficult before.
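Identification by inference of the kind described above can be sketched in a few lines of Python. The example is not a reconstruction of Sweeney's method or data; it assumes a hypothetical de-identified health file and a hypothetical named roster that share the three fields mentioned in the text, and every record in it is invented.

```python
from collections import Counter

# The three fields named in the text, treated here as quasi-identifiers.
QUASI_IDENTIFIERS = ("birth_date", "sex", "zip_code")


def qi_key(record):
    """Return the combination of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(deidentified, identified):
    """Pair a de-identified record with a name when its quasi-identifier
    combination occurs exactly once in both files."""
    deid_counts = Counter(qi_key(r) for r in deidentified)
    named_counts = Counter(qi_key(r) for r in identified)
    names = {qi_key(r): r["name"] for r in identified}
    matches = []
    for record in deidentified:
        key = qi_key(record)
        if deid_counts[key] == 1 and named_counts[key] == 1:
            matches.append((names[key], record["diagnosis"]))
    return matches


# Invented health file with names removed.
health = [
    {"birth_date": "1950-01-01", "sex": "F", "zip_code": "00001", "diagnosis": "condition A"},
    {"birth_date": "1950-01-01", "sex": "F", "zip_code": "00002", "diagnosis": "condition B"},
    {"birth_date": "1962-07-15", "sex": "M", "zip_code": "00001", "diagnosis": "condition C"},
    {"birth_date": "1962-07-15", "sex": "M", "zip_code": "00001", "diagnosis": "condition D"},
]

# Invented roster that does carry names.
roster = [
    {"name": "Person One", "birth_date": "1950-01-01", "sex": "F", "zip_code": "00001"},
    {"name": "Person Two", "birth_date": "1962-07-15", "sex": "M", "zip_code": "00001"},
    {"name": "Person Three", "birth_date": "1971-03-09", "sex": "F", "zip_code": "00002"},
]

matches = reidentify(health, roster)
print(matches)
```

In this invented data only one record is re-identified, because the other combinations are either shared by several records or absent from the roster. The larger the share of records with a unique combination, the weaker the protection offered by removing names alone.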

Sweeney's demonstration should be a reminder that with the increasing technical ease of identification by inference, there can be no guarantee of absolute confidentiality of records. This fact in turn raises the question of how much effort and expense ought reasonably to be invested in privacy protection. There are various approaches to minimizing breaches of confidentiality. Some call for strong measures at the point of disclosure, such as increasing the types of disclosures where explicit informed consent would be required (Norsigian and Billings, 1998; Woodward, 1999); others emphasize strong sanctions against violations of confidentiality by the data holders or users who release or receive secondary data; and finally, still others argue that the best course would be to stop worrying about it entirely and instead turn to developing ways to live in society without informational barriers (as suggested by the now-well-known aphorism of Sun Microsystems' chief executive officer Scott McNealy, “There is no privacy, get over it”).

There are several important points to keep in mind about the risk of breaches in confidentiality: the risk is neither new nor research specific, and some level of risk is inevitable. First, the improper identification and disclosure of health information about individuals is not a unique risk from HSR, nor is it a new result of the widespread adoption of computer-based patient records, governmental or health care industry databases or the Internet. Most instances occur outside of research, in operations. Also, breaches occurred with paper records as well. It is the case, however, that with the development of computing and communications technology, both intentional and unintentional identification and disclosure of electronic personally identifiable health information potentially involve more types of information and more individuals than were possible with paper records. At the most basic level, confidentiality always depends on conscious efforts by human agents to treat other human beings with respect and restraint, whether the activity is research or not, and whatever the state of the technology.

The protection of confidentiality is impossible to guarantee: some level of risk is inevitable.¹⁰ It is possible to make breaches less likely and to increase the probability that confidentiality will be maintained, but the protection of confidentiality is a matter of shifting the probabilities; it cannot be an absolute (see also GHPP, 1999, pp. 15–16). The question really is what measures can be taken to enhance confidentiality protection, and thus retain public trust in HSR, and still allow research to proceed. Since it is not possible to guarantee the confidentiality of records in general, it is also not possible to guarantee absolute confidentiality in HSR. The measures we can take to increase the protection of privacy and confidentiality are varied, some simple and some complex, and the range of measures will change as computational and communications technologies develop. The committee argues that with appropriate safeguards for confidentiality, it is acceptable to consider a great deal of HSR as minimal risk and appropriate to carry out without requesting consent for each reanalysis of data.

¹⁰	The probabilistic nature of confidentiality has been recognized elsewhere; for instance, a 1998 working group convened by the National Cancer Institute of the National Institutes of Health to examine the creation of informed consent documents had the following recommendation regarding informing potential subjects about confidentiality and its limitations:

Confidentiality: The confidentiality section of the informed consent document should state that although measures will be taken to protect the privacy and security of personally-identifiable data, absolute confidentiality cannot be guaranteed. The consent document should list the organizations that will have access to personally-identifiable data and that personally identifiable information may be disclosed as required by law. When listing organizations that will have access to research records, describe for what purposes the information will be disclosed to these organizations. (From Recommendations for the Development of Informed Consent Documents for Cancer Clinical Trials, by the Comprehensive Working Group on Informed Consent in Cancer Clinical Trials for the National Cancer Institute, October 1998, posted at the following address, http://cancertrials.nci.nih.gov/researchers/safeguards/consent/recs.html#Confidentiality).

BACKGROUND AND POLICY CONTEXT

In recent years, public concern about privacy and maintaining the confidentiality of personally identifiable health information has increased. Legislators have responded to worried constituents by introducing a variety of privacy bills over several sessions of Congress. Currently, there is no comprehensive federal law that protects privacy for all health-related information. There are some federal, and varying state, statutes that protect certain types of personally identifiable health information under certain circumstances (see, e.g., Gostin et al., 1996; O'Brian and Yasnoff, 1999; also Pritts et al., 1999). One state action that has generated considerable interest of late was the Minnesota Access to Health Records Law (McCarthy et al., 1999), which required informed consent from patients to use of their medical records for research. There has been disagreement as to the actual intent and effect of this law (Melton, 1997; Norsigian and Billings, 1998). Whatever the law's actual impact, it expresses public concerns about privacy in research. The committee felt that these concerns were important to address through effective privacy and confidentiality protections and also believed that good protection could be implemented so as to be compatible with future research. The committee hopes that this report will help address these concerns (see Box 1-3 for more definitions).

In 1996, Congress enacted the Health Insurance Portability and Accountability Act (HIPAA), directing the Secretary of Health and Human Services to create detailed recommendations on standards with respect to the privacy of personally identifiable health information. The Secretary's recommendations were delivered to Congress in September 1997 (Shalala, 1997), and several privacy bills have been introduced in Congress since. Both the Secretary's recommendations and most of the privacy bills introduced in the 105th Congress would permit research using personally identifiable health information without the subject's explicit permission if the research project were approved by an institutional review board.

The HIPAA further directed the Secretary of Health and Human Services to create regulations by February 2000, unless the Congress had taken legislative action at least six months earlier. Congress did not take further action, so the

BOX 1-3 More Definitions

Common Rule—the central federal policy adopted “in common” by 16 federal departments and agencies (and concurred, with some modifications, by the FDA) that support and/or conduct research involving human subjects. The adoption of the federal policy in 1991 implements a recommendation of the President's Commission for the Study of Ethical Problems in Medicine and Biomedical and Behavioral Research that all federal departments and agencies “adopt as a common core the regulations governing research with human subjects issued by the Department of Health and Human Services (codified at 45 CFR 46, Subpart A), as periodically amended or revised, while permitting additions needed by any department or agency that are not inconsistent with these core provisions” (OPRR Guidebook, Chapter 2).

(Federal) Regulations—Regulations are the rules that departments or agencies issue to provide specific guidance to themselves and others about how they will implement pertinent laws. In this particular report, “regulations” refers to federal regulations on human subjects protection implementing the Common Rule. This report usually provides the citation to the DHHS regulations, since that agency sponsors most of the relevant research and this project, but the report applies similarly to other departments and agencies.

Personally Identifiable Health Information—Health or medical data or information that can be linked manifestly or inferentially to an individual.

The terms “anonymized” and “de-identified” are commonly used to refer to health information where some attempt has been made to provide confidentiality protections by making it difficult to link a record to a specific individual. An additional difficulty, as discussed in the text, is that the ability to re-identify an individual from a dataset depends not only on the degree to which identity is hidden or removed in that dataset, but also on access to other datasets that may facilitate probabilistic identification of the individual. It is thus very difficult for anyone to assure anonymity of a dataset because critical factors in re-identifiability depend on conditions outside the dataset. The terms “anonymized” and “de-identified” exist along a spectrum of properties of data and are not guaranteed endpoints (GHPP, 1999). Since people do not always specify what they mean by these terms, individuals may define them in different ways. The committee suggests:

De-Identified—Refers to information or data where direct identifiers such as name and address have been removed. In common use the term refers to data where it may still be possible to identify individuals by inference or through codes held by the investigator or a third party. Therefore data that is de-identified may not be anonymized because it may still permit at least probabilistic re-identification when analyzed in conjunction with other datasets.

Anonymized—Refers to information or data where identifiers (and codes that are linked to identifiers) have been removed, as well as other values that would enable individuals to be identified by inference. Close correlation with values in additional datasets, or unique values, or cells containing few data points, for instance, could support such inferences. A dataset, therefore, must at least have been thoroughly de-identified in order to be anonymized. For all practical purposes, anonymized data cannot be linked to the individual. Examples of anonymized data are public use files made available by the Bureau of the Census.
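One common safeguard against the small-cell problem mentioned in the definition of anonymized data is to suppress counts below a minimum size before a table is released. The Python sketch below shows the idea under stated assumptions: the threshold of 5 and the field names are hypothetical choices for illustration, not values recommended by the committee, and suppression alone does not make a dataset anonymized.

```python
from collections import Counter

# Hypothetical minimum cell size; real thresholds are a policy decision.
MIN_CELL_SIZE = 5


def tabulate_with_suppression(records, fields, min_cell_size=MIN_CELL_SIZE):
    """Count records per combination of field values, replacing any count
    below the threshold with None so that small cells are not published."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return {
        cell: (n if n >= min_cell_size else None)
        for cell, n in counts.items()
    }


# Invented records: six women and two men in the same zip code.
records = (
    [{"zip_code": "00001", "sex": "F"}] * 6
    + [{"zip_code": "00001", "sex": "M"}] * 2
)

table = tabulate_with_suppression(records, ("zip_code", "sex"))
print(table)
```

The cell with only two records is withheld, while the larger cell is reported as counted. A reader could still recover a suppressed value if row or column totals are published alongside the table, which is one reason the box describes anonymization as a matter of degree.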

The DHHS proposal would create new requirements for privacy protection for all health care providers and health plans, and would establish research standards and oversight for all research. The proposed regulations suggest that the review function be performed by boards that are equipped to deal with data privacy and by organizational privacy officers who will ensure system-wide compliance with new privacy rules. The proposed regulations contemplate that IRBs might conduct privacy review in some circumstances, but the DHHS proposal does not suggest that IRBs are the only or even the best mechanism for privacy review with respect to data studies. The proposed rule would permit the use and disclosure of personally identifiable health information for research without authorization by the subject, as long as the research protocol had been approved by an IRB established in accord with the Common Rule (or FDA regulations) or by a privacy board. The proposed rule then specifies that a privacy board would have to have members with varying backgrounds but appropriate professional competence, at least one member not affiliated with the organization doing the research, and no members with conflicts of interest (DHHS, 1999, p. 60058).¹¹ As this report was being written, DHHS was analyzing and responding to the approximately 52,000 comments that the proposed rule elicited.

Recent studies of IRBs are another important policy context for this report: several have questioned whether IRBs adequately fulfill their role of protecting research subjects and whether they have sufficient resources to do so.

¹¹	The preamble to the proposed rule further specifies a privacy board as a body equivalent to an IRB. During the comment period, various parties disagreed about whether the privacy board as specified would actually be equivalent to an IRB and able to provide the degree of oversight necessary.

Historically, the focus of IRBs has been on protecting human subjects from potential harm associated with participation in clinical research that involves invasive medical procedures or new drugs. Little is known about IRB practices in the area of HSR projects, though DHHS regulations at 45 CFR 46 have always applied to non-clinical, as well as to clinical, research. Furthermore, much HSR using large databases is undertaken with private funding and, consequently, falls outside the purview of IRBs.

PROJECT AND SCOPE

This report is the product of a project sponsored by two agencies within the DHHS, the Agency for Healthcare Research and Quality (AHRQ) and the Office of the Assistant Secretary for Planning and Evaluation (ASPE).

This report is intended for all types of professionals and organizations that use or disclose data on health services. For organizations that have IRBs whose research is subject to federal regulation, the recommendations highlight practices already in place in some IRBs and suggest additional support for IRB activities. For organizations that use or disclose data but do not have an IRB or whose work is not subject to federal regulation, the practices and recommendations emphasize that the protection of human subjects from risks, including nonphysical risks from use of data, is of concern to anyone who uses or discloses data.

Although not all organizations have IRBs, all human subjects should be treated with the same high standards. The committee urges organizations that do not have IRBs to adopt practices of reviewing proposed investigations to ensure that data confidentiality will be maintained. The committee likewise urges organizations that have, as well as those that do not have, IRBs to adopt system-wide confidentiality procedures and policies to protect nonresearch and research data.

The purpose of this project was to provide information and advice to the sponsors on the current and best practices of IRBs in protecting privacy in HSR. The charge to the committee was given in three parts as shown below.

To gather information on the current practices and principles followed by IRBs to safeguard the confidentiality of personally identifiable health information used for health services research purposes, in particular, to identify those IRB practices that are superior in protecting the privacy, confidentiality, and security of personally identifiable health information.
To gather information on the current practices and principles employed in privately funded health services research studies (that are generally not subject to IRB approval) to safeguard the confidentiality of personally identifiable health information, and to consider whether and how IRB best practices in this regard might be applied to such privately sponsored studies.
If appropriate, to recommend a set of best practices for safeguarding the confidentiality of personally identifiable health information that might be voluntarily applied to health services research projects by IRBs and private sponsors.

FIGURE 1-2 Scope of the Institute of Medicine study.

In order to address these tasks, the IOM assembled a 12-member committee with expertise in medical ethics, health services research, IRB function, statistics, computer science, law, and database management. The committee met by telephone conference in January 2000. The committee and the IOM then convened a public workshop in March 2000. The committee invited testimony from IRB chairs and administrators, health services researchers, and other officers of academia, government and private industry (see Appendix B). The workshop also featured presentations of the drafts of two commissioned papers, one addressing special considerations of health services research and confidentiality when the data pertain to minors (see Appendix C) and the other presenting an international comparison of health information privacy standards (see Appendix D). In addition to the workshop, the committee posted an invitation on a listserv and on the National Academies' website to IRBs to contribute information (see Appendix A). The committee collected further information informally by e-mail and telephone. The committee deliberated by telephone and e-mail, and in a closed meeting in April 2000, about the practices described to it. Finally, in this report the committee has summarized the practices it heard about that seemed to be most effective.

The committee addressed privacy and confidentiality pertaining to data used in health services research that had already been collected for another purpose. There are many other aspects of the privacy of electronic medical records that were beyond the charge to the committee (Figure 1-2). The committee focused its work on secondary analyses of data that had already been collected for other uses, because such studies pose the most difficult ethical issues regarding HSR.

Although HSR that utilizes surveys and interviews (including the qualitative HSR mentioned in Box 1-2) also raises ethical issues, the contact between researchers and subjects allows the subjects to learn about the research and decline to participate if they so choose. The committee recognized the strong connections between these related matters and the question of protecting data privacy in health services research that uses existing data. The committee therefore asks the reader to bear in mind that such related matters were not in its charge, were not addressed by the committee, and in particular, were not discussed at the workshop.

OUTLINE OF REPORT

Chapter 2 summarizes the federal regulations as they apply to HSR studies. Chapter 3 presents the committee's recommendations and findings based on the available information from IRBs working under federal regulations. Chapter 4 presents the committee's recommendations and findings based on available information from health care services and products companies that may not have IRBs or be subject to federal regulations. The committee holds the conviction that studies involving human subjects should be reviewed similarly whether the study is subject to Common Rule provisions or not. As a result, the committee makes similar recommendations regarding research that falls under the Common Rule and research that does not. The committee considered combining Chapter 3 and Chapter 4, but decided to keep them separate both because the implications of the recommendations might be different for different types of organizations, and because the separate structure seemed to reflect the committee's charge more clearly. Finally, Chapter 5 returns to the topic of the limited scope of this project in discussing research and steps for the future. As was mentioned in the workshop, the end of this study must not be the end of studying these important questions, and the final chapter suggests some directions for further work.

The committee gathered information through a public workshop (summarized in Appendix B), a general information request posted through the internet (see Appendix A), and various unstructured interviews in the course of study operations. Although the committee received just a few responses to the posted call for information, those received were very informative. The committee noted that all the providers of information, including respondents to the call for information, those who briefed the staff by telephone, and participants in the workshop, are a self-selected group of professionals committed to the IRB process. Information collection was thus not systematic and random, but particularly targeted. The committee also commissioned two background papers, one to examine HSR and minors and the other to compare privacy and confidentiality standards across national borders, which appear as Appendix C and Appendix D. Finally, biographical sketches of the committee members are included as Appendix E.
