Protecting Data Privacy in Health Services Research

3
Best Practices for IRB Review of Health Services Research Subject to Federal Regulations

Research with human beings is subject to federal regulations if it is federally supported or regulated for some other reason (e.g., will be submitted to the FDA as part of a new drug application). In addition, organizations that hold a multiple project assurance (MPA) from the Office for Human Research Protections (OHRP, formerly OPRR) usually, as a condition of the MPA, require all research at the institution to be subject to federal regulations, including research that is not federally supported. Furthermore, some organizations that do not hold such an MPA may also require all research to be conducted in accordance with federal regulations as a matter of organizational policy. This chapter presents the recommendations and findings of the committee regarding the practices of institutional review board (IRB) review for health services research (HSR) that is done according to the federal regulations (whether the organization follows the federal regulations by requirement or by policy choice). The committee collected information from some universities, health centers, and private research foundations, hearing testimony at a public workshop and collecting materials and statements from participating IRBs (see Appendix A and Appendix B). The committee was not able to conduct a comprehensive survey of IRB practices. The recommendations and findings that follow are based on the available data from a limited number of organizations and may not be representative of the entire IRB system. The committee presents these recommendations and findings in the hope that they may be helpful to some organizations, and may inform and stimulate further discussion about how IRBs can better fulfill their important role.
RECOMMENDATIONS

Recommendation 3-1. Organizations should work with their IRBs to develop specific guidance and examples on how to interpret key terms in the federal regulations pertinent to the use in HSR of data previously collected for other purposes. Such terms include generalizable knowledge, identifiable information, minimal risk, and privacy and confidentiality. Organizations and their IRBs should then make such guidance and examples available to investigators submitting proposals for review.

The committee found that several topics cause considerable worry to investigators and IRBs because federal regulations are open to varying interpretations, with divergent implications. The first of these topics is what activities are considered research and what criteria are used to operationalize the distinction between research and other activities. A key feature of the federal definition of research is whether the activity contributes to generalizable knowledge. In trying to distinguish research from activities such as quality assessment (QA) or quality improvement (QI) that use similar techniques to analyze personal health information in databases, however, both the federal regulations and the interpretations of these regulations by OHRP contain insufficient practical guidance for investigators and IRBs. A second important issue is what constitutes identifiable information as defined in the federal regulations. Once again, the federal regulations provide little direction to investigators and IRBs on how to operationalize these terms, for example, whether or how it would be determined that data were unidentifiable if they were coded in such a way that the investigator would have access to the data but also have great difficulty in reestablishing the identity of subjects.
A third issue is what constitutes minimal risk in HSR and, in particular, what steps to protect the confidentiality of data in HSR suffice to allow a project to be considered minimal risk. The issues of identifiable information and minimal risk have important implications for whether a project may be exempt from IRB review or receive expedited review, or whether informed consent of research participants may be waived. On all these issues, IRBs should communicate more directly with investigators and give local examples more specific than the guidance currently available in federal regulations and clarifications by OHRP. Clearer guidance would make IRB review more efficient as well as enhance the protection of subjects by helping to ensure that HSR projects incorporate confidentiality protections that the IRB finds important. The committee found that IRBs vary in how they interpret federal guidelines pertaining to whether a project is intended to yield “generalizable knowledge” and thus should be subject to IRB review.
In the federal definition, research is “designed to develop or contribute to generalizable knowledge” (45 CFR 46.102(d)). The concept of generalizable knowledge seems to include both scientific rigor—to avoid error and to assure that findings can be widely applied—and an intent to disseminate the findings of the investigation. The IRB representatives participating in the workshop agreed that an activity would be considered research if the investigator plans to publish the findings. IRBs differ, however, in how they interpret other situations, particularly activities that might be considered QA or QI. Some organizations take an inclusive view of research, considering a project to be research if the findings will be disseminated outside the division or department that carried out the project. In this view, if the findings will be presented at a scientific meeting or to administrators from other organizations (e.g., other teaching hospitals), they will contribute to generalizable knowledge even if they are not published. Dr. James Kahn of the University of California, San Francisco, for instance, suggested at the workshop that if data are collected systematically, the project should be reviewed by the IRB, since it is reasonably likely that the investigator will publish the results if the findings are interesting (see Appendix B). Ms. Angela Khan, IRB administrator from the University of Texas Health Sciences Center in San Antonio (UTHSCSA), explained at the workshop (see Appendix B) that her institution's IRB considers a number of issues in deciding whether a project is research rather than QA or QI.
In assessing whether certain studies (generally only those directed toward internal QA) should be exempt from review, the IRB would consider whether the findings of the study will be disseminated beyond the department proposing to carry out the study; whether the protocol includes any change in clinical care or clinical processes that will affect other patients; whether the data to be collected would be available to the investigator only through the study (i.e., the investigator would not have access to such data in normal practice); and whether there is any risk of harm or wrong to patients or staff. If the answer to all of these questions is “no,” then the UTHSCSA IRB would consider the protocol exempt as a QA activity. Other research may be exempt under the regulations but probably also would be reviewed at least by a subcommittee of the IRB, and informed consent might still be required. Ms. Khan also noted that generally the first consideration alone is sufficient to classify a project as research, since most investigators do in fact wish to publish their findings, even from projects that were planned as internal investigations, if they should prove interesting. On the other hand, other IRBs use a narrower view of research. Dr. Robert Amdur of the University of Florida, also a presenter at the workshop (see Appendix B), suggested a contrasting approach. Starting from the premise that activities best characterized as QA or QI should be excluded from IRB review,
he argued that publication (i.e., contributing to a lasting collection of generalizable knowledge) is a necessary condition for an activity to be considered research. Therefore, if researchers say that they would not carry out the project if the results could not be made public, the project must be considered research. By contrast, for nonresearch activities such as QA or QI, there would still be sufficient internal organizational motivation for collecting the data, even though the activity would never increase the store of generalizable knowledge (Amdur et al., in press). The committee found that another common dilemma occurs when the investigator does not initially intend to publish, and therefore does not ask for IRB review, but afterwards discovers the findings to be so interesting that they ought to be published. IRBs apparently vary in the way they handle such situations. Because the intentions of the investigators may change, other authors have suggested additional criteria for research, similar to those that some IRBs are already using. Casarett et al. (2000) suggest considering a QA or QI project as research if most of the subjects would not be expected to benefit directly from the knowledge generated and if the subjects would incur risks beyond those of normal practice. The committee also heard that the determination of whether an activity is research, and hence how the observed individuals are to be protected, is particularly problematic in small organizations. Small organizations wishing to study their outcomes to improve their operations may not have access to resources for developing formal protocols and may not have an IRB that can review the project. Thus, an inclusive definition of research could preclude important projects in small organizations. In the workshop, participant Dr.
Joanne Lynne of RAND gave the example of small hospices and home health care organizations that want to improve their own services but also to share their findings with similar organizations, perhaps as part of a multisite study. The committee noted, however, that the likelihood of identifying individuals and the difficulty of maintaining confidentiality are both greatly increased in small organizations. Furthermore, in hospice care, information may be recorded about such sensitive topics as family disputes, emotional problems, or even illegal activities such as physician-assisted suicide. Hence, individuals who are patients in small organizations, and who are the subjects of projects that fall in the ambiguous zone between research and QA/QI, may be in need of the protection due human subjects; indeed, they may be very vulnerable populations in need of strong protections. The committee concluded that, in light of these different viewpoints of various IRBs, investigators may be unclear how federal guidelines define research and how their own IRB will interpret those guidelines with regard to HSR. The committee found that some IRBs have specific and detailed criteria for determining whether information is identifiable.
In the federal guidelines, research on existing data, documents, or records is exempt from IRB review “if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects” (45 CFR 46.101(b)(4)). Thus, the concept of identifiable information is crucial in determining whether an HSR project is exempt from IRB review. As mentioned in Chapter 1, the question of whether a record is identifiable is difficult because identifiability is not a property solely of the record itself but may be an inferential result of the record plus a linkage with some as-yet-unspecified database by an as-yet-undefined algorithm. How the question is answered has profound implications for the way the research in question will be regulated (Lo, 1999). Ms. Khan noted that, for projects using data from computer databases, the UTHSCSA IRB asks the investigator to list all the fields to be collected and to indicate who will actually collect the data, how respect for confidentiality by any personnel involved will be ensured, and how further dissemination of the information will be prevented (e.g., storing data on computers that are not networked, storing codes identifying individuals separately from data, using passwords and/or key requirements to restrict access both to computers for data storage and to computers housing identifying codes). Another workshop presenter, Dr. Tora Bikson of RAND, suggested a general rule that RAND uses: if sorting data according to any variables produces subsets with ten or fewer members, these individuals will be at risk for identifiability by inference. The committee did not test or corroborate this cutoff point, which would require more theoretical work, but noted that rules of this type are good examples of useful practices.
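A rule of this type is straightforward to operationalize: count how many records share each combination of quasi-identifying fields and flag the combinations shared by few records. The following sketch illustrates the idea; the field names, sample records, and the cutoff of ten are illustrative assumptions, not part of RAND's actual procedure.

```python
from collections import Counter

# Hypothetical records: each is a dict of quasi-identifying fields.
records = [
    {"zip3": "787", "birth_year": 1948, "sex": "F"},
    {"zip3": "787", "birth_year": 1948, "sex": "F"},
    {"zip3": "941", "birth_year": 1930, "sex": "M"},
]

def small_cells(records, fields, threshold=10):
    """Return each combination of quasi-identifier values shared by
    `threshold` or fewer records; individuals in these small cells
    are at elevated risk of identification by inference."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return {key: n for key, n in counts.items() if n <= threshold}

risky = small_cells(records, ["zip3", "birth_year", "sex"])
# Both combinations in this tiny sample fall at or below the cutoff,
# so both would be flagged for the investigator's attention.
```

In practice the check would be run over every subset of variables an outsider might plausibly know, since identifiability arises from combinations rather than single fields.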
The committee concluded that it is desirable for several reasons to have such explicit criteria on the identifiability of information. Explicit criteria improve the quality of HSR by promoting more careful consideration of the issue of whether information can be linked to identifiable individuals. Furthermore, explicit criteria promote consistency in the IRB review and allow more efficient review. If investigators know how the IRB determines whether information is identifiable, they can use that knowledge in study design to avoid problems such as building in unintentional identifiability. At the same time, it is important to remember that identifiability is a dynamic property, so it will never be possible to rely on a list of steps or an algorithm—the investigator and the IRB will have to think critically and exercise judgment in every case. The committee found that IRBs vary in how they handle projects that may qualify as exempt from IRB review and in the formality of procedures for expedited review. The committee heard that some organizations require any investigator to notify the IRB of all projects, including projects that might qualify for one of the exemptions in the federal regulations, and to submit annual status reports. By notifying the IRB of a project, the investigator would at least have the benefit of
some external review of the protocol. Several organizations provide investigators with interactive on-line or at least printable forms that “walk” the investigator through a short deliberation about whether the protocol really qualifies for exemption from IRB review (see Box 3-1 and Figure 2 in Appendix B). On the other hand, other organizations allow investigators to decide for themselves whether a protocol is exempt from IRB review, and do not attempt to determine whether the investigator's decision is consistent with federal regulations. Still other organizations may allow the department head or chair to certify that a research project qualifies for exemption. The sense of the committee is that any project benefits from at least some review from a party external to the project. In many projects, a review by an IRB chair or member alone may be sufficient, but even this quick review provides the project, the investigator, and most of all, the potential subjects with the benefit of an outside check that human subjects are adequately protected. Likewise, clarifying the institutional procedures for expedited IRB review in HSR would have several salutary effects. It would call the attention of investigators to ethical issues regarding HSR. Furthermore, such clarification would encourage health services researchers to consider in a standardized way the issues of IRB review, patient consent, and protecting confidentiality. Clearer and more standardized procedures would make IRB operations more efficient, first, by allowing IRB members to focus their attention on difficult cases and, second, by giving investigators suggestions that IRBs currently request only after a protocol is reviewed, so that the investigators would be likely to submit proposals that incorporated these elements the first time.
Finally, such standardization increases compliance with IRB policies and federal regulations that are intended to protect subjects in HSR.

Recommendation 3-2. IRBs should develop and disseminate principles, policies, and best practices for investigators regarding privacy and confidentiality issues in HSR that makes use of personal health data previously collected for other uses.

Confidentiality in handling health information is important for its own sake and for the enhancement of public trust in research. The committee heard several innovative and feasible ways to facilitate the maintenance of confidentiality. The committee found that the identifiability of data in HSR is a continuum, such that absolute guarantees of confidentiality are impossible. Even when investigators have made reasonable and good-faith efforts to deidentify data, to restrict access to a need-to-know basis, and to maintain confidentiality, the identity of an individual can sometimes still be inferred. The committee heard of examples in which individuals could be probabilistically identified from supposedly de-identified public use files. The committee also heard about the increased chances of identification within small populations (see
Appendix B). These probabilistic identifications by inference result from unforeseen links between the de-identified data in one database and complete or identifiable data from another source. It is also, of course, always possible for a human employee mistakenly to allow data to become available.

BOX 3-1 Excerpt from RAND's Human Subjects Research Screening Form

Will your project use either primary or secondary data collected from or about a living individual?

Are EXISTING DATA to be obtained or accessed, including official records, previously collected survey data, or other extant information? (Specify as many as apply.)
- Anonymous unrestricted public use data.
- Anonymous but restricted use data.
- De-identified private data (neither RAND nor a primary contractor or subcontractor for the research [if any] can link data to individuals, although a third party [e.g., an HMO, a government agency, a school district, or other data provider] may be able to do so).
- Identifiable data (data that can, in principle, be linked to individuals by RAND or by a primary contractor or subcontractor for the research [if any], either through direct identifiers or arbitrary code numbers associated with direct identifiers).
- Other:

[Additional questions address types of interactions with subjects and types of interventions.]

NOTE: See also Figure 2 in Appendix B.

Many health services researchers and IRBs have developed practical and specific procedures for protecting privacy and confidentiality in HSR projects that involve analyses of previously compiled databases. For example, researchers in health services may need to identify individual subjects to combine data from different datasets or to compare follow-up information with baseline data.
Such projects can still protect subjects by using computer-generated identifiers or by encrypting the data, rather than identifying individuals by name, hospital record number, or Social Security number. When the project requires definite linking, the researcher will have to use unique individual identifiers, such as Social Security number, Medicare Health Insurance Claim number, health record number, or some unique code generated within the project for this purpose, to establish that records in different datasets belong to the same person. However, relying on a single linking variable can lead to errors if the number is entered incorrectly for a particular transaction (hospital stay, doctor's office visit). Therefore, prudent investigators would, if possible, use some other attribute as a corroborating linking variable, such as sex or date of birth. Probabilistic linking, in contrast, reflects the fact that people can share identifying attributes such as names and birthdates, so that the investigator is not certain that the linked records belong to the same person. In other cases the identifying information needed for accurate data merging may not be specific to the patient. For instance, a study on hospital characteristics might require the names of hospitals to merge Medicare Part A claims files with the American Hospital Association survey database, but would not have to identify specific patients. Researchers can also take additional steps to prevent the identification of individuals with unusual characteristics (see also Table 3-1 for further detail). There may be only a few individuals of a given age with a rare diagnosis in a certain zip code who were hospitalized between certain dates, and such individuals may be readily identified by inference because there are so few persons with these characteristics. Researchers can, however, change the recording of the data so that there are more records in each data cell. For instance, it is harder to identify individuals if the investigator records the year of birth rather than the exact birthdate, the first three digits of the zip code rather than the entire zip code, and the number of hospitalized days rather than the exact dates of hospitalization. Furthermore, researchers can reduce the number of outliers on a scale by collapsing categories at the extremes of the scale. For instance, the researcher can set the highest value for a cost-of-hospitalization data field at something greater than a certain dollar amount, rather than retaining the exact figures for high-cost hospitalizations. Although there can be no absolute guarantees of confidentiality, these measures reduce the likelihood that individual subjects would be identified.
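The coarsening steps just described can be expressed as a simple transformation of each record. The sketch below shows one way this might look; the field names and the $100,000 top-code are illustrative assumptions, not values taken from the text.

```python
from datetime import date

def generalize(record, cost_cap=100_000):
    """Coarsen a hypothetical hospitalization record so that more
    individuals share each combination of values: keep only year of
    birth, the 3-digit zip prefix, and length of stay rather than
    exact dates, and top-code extreme cost values."""
    return {
        "birth_year": record["birth_date"].year,        # year, not exact birthdate
        "zip3": record["zip"][:3],                       # first three zip digits
        "los_days": (record["discharged"] - record["admitted"]).days,
        "cost": min(record["cost"], cost_cap),           # collapse high-cost outliers
    }

raw = {
    "birth_date": date(1931, 7, 4),
    "zip": "78229",
    "admitted": date(1999, 3, 2),
    "discharged": date(1999, 3, 9),
    "cost": 250_000,
}
coarse = generalize(raw)
```

Each transformation trades analytic precision for larger data cells; which fields to coarsen, and how far, depends on the needs of the particular study.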
If adopted more widely where appropriate, these procedures would enhance protections for subjects of HSR. At the same time, it is important for IRBs to bear in mind that different techniques are appropriate for different research projects, so different subsets of these techniques might be applicable or not usable in different studies. The committee found that IRBs were able to suggest many ways in which protocols could better protect confidentiality with simple measures. The committee heard of cases that illustrated problems or potential problems with confidentiality that IRBs had averted. In some studies, investigators planned to record identifiers with the data even though there was no need to maintain the identifiers. In fact, when the IRB questioned the necessity for using identified data, the investigators realized they simply had never considered whether their research required the identifiers, and they immediately removed them. In another instance, already mentioned, a workshop participant described finding his own HIV results projected at a meeting with personally identifying information, for no purpose other than showing an example of records in the database. In both cases, the basic problem appeared to be that the persons collecting data had not considered the confidentiality implications of their methods. These examples demonstrate the usefulness of a review independent of those involved in the protocol. The committee also found that some IRBs give detailed attention and clear advice to investigators on how better to protect confidentiality at various steps of an HSR project. Ms. Khan said that for any protocol involving particularly sensitive data, the UTHSCSA IRB requires the investigator to obtain a federal certificate of confidentiality. The certificate of confidentiality is a legal mechanism (described in the Public Health Service Act, Section 301(d)) designed to protect certain types of sensitive data from subpoena (see Wolf and Lo, 1999). The committee also heard, however, from organizations that store such sensitive data in facilities outside the United States to protect them from discovery processes (the committee did not, however, seek legal opinions as to whether such a strategy would provide effective legal protection from discovery). Colonel Anderson, IRB chair at the United States Army Medical Research Institute of Infectious Diseases, reported that the Army's procedures specify that an investigator may request that research records be maintained under special coded identification numbers, with a linkage to the individual's Social Security number. The key linking the study identification number and the Social Security number is then stored separately under extremely limited access. In general, networked, distributed, and backed-up digital information environments pose new types of threats to privacy. Some researchers, for instance, may not realize that taking a diskette with backup files home to work on a personal computer that is connected to a DSL line (that is on all the time) creates a serious security risk. Such examples suggest that the role of technical experts may yet be underappreciated.
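The separation of a key table from the study records, as in the Army procedure described above, is a standard pseudonymization pattern. The sketch below illustrates the idea in general terms; the field names and the use of random hexadecimal study IDs are illustrative assumptions, not the Army's actual scheme.

```python
import secrets

def pseudonymize(records, id_field="ssn"):
    """Replace a direct identifier with a random study ID.
    Returns the de-identified records plus a key table mapping
    study IDs back to the original identifier; the key table
    would be stored separately, under restricted access."""
    key_table = {}
    deidentified = []
    for rec in records:
        study_id = secrets.token_hex(8)      # random, unguessable study ID
        key_table[study_id] = rec[id_field]
        clean = {k: v for k, v in rec.items() if k != id_field}
        clean["study_id"] = study_id
        deidentified.append(clean)
    return deidentified, key_table

records = [
    {"ssn": "123-45-6789", "diagnosis": "J18.9"},
    {"ssn": "987-65-4321", "diagnosis": "I10"},
]
deidentified, key_table = pseudonymize(records)
```

The protection depends entirely on how the key table is stored: anyone with access to both files can relink every record, which is why the procedure described above keeps the key under extremely limited access.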
The committee found that violations of privacy and confidentiality might occur in HSR studies that have small numbers of subjects in cells, but that careful IRB review, including appropriate consultation with persons knowledgeable about any specific community norms, may help investigators to revise protocols to reduce such risks. The committee heard that small, isolated minority communities and their individual members might be particularly vulnerable to breaches in confidentiality, often unintended. The risk is increased in situations where the number of individuals is small and the individuals are readily recognized by others in the community. If a project targets a particular rural county or Indian Reservation, for instance, there may be only one or a few individuals with a particular characteristic (e.g., giving birth to twins) and these individuals would be readily identified in the local situation even if their identity were effectively hidden from strangers. At the same time, the risk of loss of the confidentiality veil and exposure to stigma is increased for the individuals and for the community as a whole if the
community is relatively small and its members are readily recognized as members by the majority society. It would be important for those IRBs that review such research to consider concern for the community as a whole, although such concern might have to be effected through protection of individual members, since U.S. legal tradition contemplates privacy as belonging to individuals, not groups. For example, a study designed to assess the need for certain health services can at the same time have the effect of identifying the community with a negatively valued characteristic (such as underuse of prenatal care or, more strongly, drug or alcohol abuse during pregnancy as evidenced by neonatal symptoms). In this case, all members of the community where the need has been shown may suffer stigma even if not involved in the study or not possessing the characteristic in question. The risk may be increased still further by the presence of culturally significant identifiers that are not recognized as sensitive private information by researchers who are not familiar with the community and therefore do not mask all the sensitive data fields. For example, specific locations, occupations, or other characteristics may indicate a very small subgroup even within a minority community (e.g., a few members of a particular tribe on a reservation inhabited primarily by another tribe), so that those individuals could be unintentionally identified. Similarly, among some Native Americans, revealing the name of a particular lodge or other immediate grouping could be considered an invasion of privacy. Dr. Freeman of the Indian Health Service IRB pointed out that many such mistakes could be avoided by consulting with the community for unanticipated risks to privacy (see Appendix B). For example, the name of the lodge need not be disclosed in a publication; the site of the study might be identified simply as a tribe in a certain state. Dr.
Freeman also noted that a minority community may be particularly apt to worry about mistreatment from researchers, and any perceived mistreatment at the hands of one researcher will have a negative impact on the ability of future researchers to gain community cooperation. Members of the committee also noted that there may be no generally agreed-upon spokesperson to represent the community, but even if there are multiple overlapping groups involved, the IRB could ensure that the investigator had consulted several representatives and had at least obtained some community input, even if it is not possible to have a definitive or comprehensive statement. The committee found that the particular issues of the use of minors as subjects are connected mainly with informed consent and assent by the subjects. Specific cases in which children are at elevated risk, however, such as when they have been removed from their parents due to domestic violence, demand additional care and protection for these vulnerable subjects. The committee found that when children are the subjects of HSR that makes secondary use of previously collected data, there are situations in which the risks of breach of confidentiality may go beyond the risks existing for adult subjects. In such situations, investigators and IRBs should take special care to ensure that
these vulnerable subjects are protected. In the type of HSR addressed in this study, including analysis of data collected for some other purpose, informed consent for each study is generally not practicable, and the challenge for researchers is to build in appropriate confidentiality protections so that the risk to subjects will truly be minimal. As discussed in Appendix C, “minors” is not a homogeneous class, and the potential for psychosocial risks such as embarrassment varies with age within the class. In many cases, when the risk of confidentiality breach in general has been minimized effectively, the committee sees no greater risk to minors, by virtue of their being minors, than to any other subjects. In specific cases, such as perhaps research on domestic violence and foster care, individual children might be identifiable because they are in a relatively small group. Furthermore, if subjects are identified, they may be at risk for being removed from a parent or guardian even though better placement options may not be available. Thus, the confidentiality of these subjects' identity should receive extra scrutiny. It is also true that records research might reveal patterns of injury, perhaps allowing abuse of a child to be detected and stopped. The committee also found that certain variables, such as hospitalizations, are so much rarer among minors than among older adults that special consideration for protecting the confidentiality of these variables as potential identifiers is warranted. As with the previous finding regarding subjects who are members of small minority communities, protecting the confidentiality of data on minors will be enhanced by an IRB whose members or consultants are knowledgeable about the particular issues of a study and about the relevant developmental changes of the minor subjects involved, and can help highlight variables of unusual identifying potential.

Recommendation 3-3.
IRBs should redesign applications and forms (paper and electronic) so that they are tailored to HSR that analyzes data originally collected for other purposes, and should then distribute them widely (e.g., post them on-line) to assist investigators in writing the human subjects sections of their HSR proposals and in preparing applications for IRB review. IRBs should be knowledgeable about the differences between HSR and clinical research, and any forms developed should reflect these differences.

A checklist or logical series of questions lays out the criteria that the institution has adopted to determine, for example, what constitutes research. These instruments are useful in several ways: they call investigators' attention to ethical issues arising in HSR, and they help investigators think through systematically the specific issues regarding IRB review, patient consent, and protection of confidentiality. Interactive forms and checklists can make IRB review more effective and efficient. Investigators can see whether their projects might meet criteria for waiver of informed consent, expedited review, or exemption from review. If necessary, the investigators can revise their study without the delays involved in resubmitting a revised hard copy of the protocol to the IRB and waiting for review by the IRB
staff or a board member. Such forms would allow IRBs to focus their review, again by drawing attention to difficult cases. Overall, interactive forms would enhance compliance with IRB policies and federal regulations and would make review less burdensome for investigators and IRB members alike.

The committee found that some organizations make use of interactive on-line forms to help investigators determine whether a project should be considered research and whether it qualifies for expedited review. In general, the committee was favorably impressed with organizations such as RAND that had devised interactive on-line forms to minimize investigator time and paperwork requirements. RAND has implemented an on-line system to help ensure that there is appropriate IRB review of all protocols. The brief on-line questionnaire in Box 3-1 initially helps the investigator determine whether the project might require IRB review. If the questionnaire so indicates, then a more detailed questionnaire helps the investigator explore the alternatives of exemption from IRB review, expedited review, or full review (see Figure 2, Appendix B). The on-line system may indicate that a project would not fall into the category of full IRB review if it uses only anonymous or public use datasets, or de-identified datasets if neither RAND nor any other party on the contract has access to the identifiers. In addition, the IRB is notified whenever a project receives an internal funding account number; in fact, assigning such a number automatically triggers a message to the investigator to start the questionnaire. The system is designed to be inclusive, that is, to send any borderline cases to IRB members for specific attention. In more difficult situations, the IRB chair and/or selected members would have to decide whether the particular project could be exempt.
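The triage embodied in such a questionnaire can be sketched as a small decision function. The criteria below are illustrative assumptions distilled from the description above, not RAND's actual rules, and any suggested outcome would still be confirmed by the IRB chair or designee:

```python
def suggested_review_path(uses_only_anonymous_or_public_data,
                          any_party_has_identifiers,
                          direct_interaction_with_subjects,
                          minimal_risk):
    """Illustrative triage of a protocol into a suggested IRB review path.

    The system is deliberately inclusive: anything borderline falls
    through to full review, and a human reviewer confirms every outcome.
    """
    if (uses_only_anonymous_or_public_data
            and not any_party_has_identifiers
            and not direct_interaction_with_subjects):
        return "candidate for exemption (confirm with IRB chair or designee)"
    if minimal_risk and not any_party_has_identifiers:
        return "candidate for expedited review"
    return "full IRB review"

# A study of a de-identified public use dataset, with no party on the
# contract holding the identifiers:
path = suggested_review_path(True, False, False, True)
```

As in the RAND system, the value of such a sketch lies less in the particular rules than in forcing every protocol through the same explicit questions before a human reviewer sees it.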
Examples of situations in which an IRB member would have to become involved to decide whether further IRB review might be needed include projects that will use anonymous or nonsensitive primary data gathered through surveys, interviews, or other methods requiring direct interaction with subjects; projects that gather data from public officials or candidates; and intervention research that is anonymous and without risk.

The committee did not hear of comparable automation in a university setting but concluded that automating paperwork as much as possible would increase compliance and reduce burdens on both investigators and IRBs. Just as increased protections come at the price of increased investment in equipment and expertise, so would it be necessary for organizations to invest in improvements to IRB operations to increase their efficiency. As has been mentioned, the committee concluded that principal investigators intending to involve human subjects should not be in the position of exempting themselves, with or without forms and guidance documents; rather, the protocol should receive at least some outside review. At the same time, forms such as those developed by RAND would be helpful in facilitating prompt and high-quality review. Their design and dissemination help the institution, the IRB, and
the investigator systematically approach questions such as whether an activity is research.

In another example, a university-based IRB showed how the university was working to help its investigators ascertain whether projects would be classified as research or as QA and/or QI. There, the investigator is asked to consider whether the proposed study contains any of the following elements, which the institution has recognized as potentially associated with research rather than QA or QI.

Characteristics of projects using HSR methods that are research, not QA or QI:
- Exploring any previously unknown phenomena
- Collecting information beyond that routinely collected for the patient care in question
- Comparing alternative treatments, interventions, or processes
- Manipulating a current process

To which might be added:
- Being intended for publication if possible

Although the committee would wish the IRB chair or designee to corroborate an investigator's assessment of a project as QA or QI rather than research, the preparation of systematic materials such as the above list would facilitate review.

Recommendation 3-4. IRBs should have expertise available (either on the committee or through consultants) to evaluate the risks to confidentiality and security in HSR involving data previously collected for some other purpose, including the risks of identification of individuals and the physical and electronic security of data.

The committee urges IRBs and investigators to consult information technology and data security experts about protecting confidentiality in their specific situations. It is not the intent, nor would it be possible, for this committee or this report to provide an adequate basis for a data security program. The committee found that IRBs would probably benefit from guidance on how confidentiality can be protected, so that IRB members have more background on what to look for in a protocol.
The committee followed the lead of previous IOM and National Research Council (NRC) reports regarding the question of how to protect confidentiality and considered protecting access to the data per se, as well as protecting individual subjects by manipulating the data after they have been collected.
BOX 3-2 Summary of Technical Protections for Sensitive Data

Individual authentication of users. To establish individual accountability, every individual in an organization should have a unique identifier (or log-on ID) for use in logging onto the organization's information systems. Strict procedures should be established for issuing and revoking identifiers. Where appropriate, computer workstations should be programmed to log off automatically if left idle for a specified period of time.

Access controls. Procedures should be in place for ensuring that users can access and retrieve only that information that they have a legitimate need to know.

Audit trails. Organizations should maintain, in retrievable and usable form, audit trails that log all accesses to clinical information. The logs should include the date and time of access, the information or record accessed, and the user ID under which access occurred. Organizations that provide health care to their own employees should enable employees to conduct audits of accesses to their own health records. Organizations should establish procedures for reviewing audit logs to detect inappropriate accesses.

Physical security and disaster recovery. Organizations should limit unauthorized physical access to computer systems, displays, networks, and medical records; they should plan for providing basic system functions and ensuring access to medical records in the event of an emergency (whether a natural disaster or a computer failure); and they should store backup data in safe places or in encrypted form.

Protection of remote access points. Organizations with centralized Internet connections should install a firewall that provides strong, centralized security and allows outside access to only those systems critical to outside users. Organizations with multiple access points should consider other forms of protection to protect the host machines that allow external connections. Organizations should also require a secure authentication process for remote and mobile users, such as those using home computers. Organizations that do not implement either of these approaches should allow remote access only over dedicated lines.

Protection of external electronic communications. Organizations should encrypt all patient-identifiable information before transmitting it over public networks, such as the Internet. Organizations that do not meet this requirement should either refrain from transmitting information electronically outside the organization or do so only over secure dedicated lines. Policies should be in place to discourage the inclusion of patient-identifiable information in unencrypted e-mail.

Software discipline. Organizations should exercise and enforce discipline over user software. At a minimum, they should install virus-checking programs on all servers and limit the ability of users to download or install their own software. These technical practices should be supplemented with organizational procedures and educational campaigns to provide further protection against malicious software and to raise users' awareness of the problem.

System assessment. Organizations should formally assess the security and vulnerabilities of their information systems on an ongoing basis. For example, they should run existing “hacker scripts” and password “crackers” against their systems on a monthly basis.

SOURCE: Excerpted from NRC, 1997; pp. 8–9, Box ES.1.
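The audit trail practice described in Box 3-2 (logging the date and time of access, the record accessed, and the user ID, in retrievable and reviewable form) can be sketched as follows. This is a minimal illustration with hypothetical names, not a production security component; a real system would also need tamper-evident storage, retention policies, and integration with the access-control layer.

```python
import datetime

class AccessAuditLog:
    """Minimal sketch of the Box 3-2 audit trail: every access to
    clinical information is logged with its date/time, the record
    accessed, and the user ID under which the access occurred."""

    def __init__(self):
        self._entries = []  # in practice: append-only, tamper-evident storage

    def record_access(self, user_id, record_id):
        self._entries.append({
            "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "user_id": user_id,
            "record_id": record_id,
        })

    def accesses_to(self, record_id):
        """Support audits of all accesses to a given record, e.g. by an
        employee reviewing accesses to his or her own health record."""
        return [e for e in self._entries if e["record_id"] == record_id]

log = AccessAuditLog()
log.record_access("jdoe", "MRN-0001")
log.record_access("asmith", "MRN-0002")
log.record_access("jdoe", "MRN-0002")
hits = log.accesses_to("MRN-0002")  # two logged accesses to this record
```

The `accesses_to` query is what makes the Box 3-2 recommendation on employee self-audit and routine log review practical: the log is useful only if it can be searched per record and per user.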
Protections based on controlling access to sensitive data include procedural disciplines, such as making data available only under licensure agreements and training personnel not to transfer data by insecure means such as fax or Internet transmission. Other ways of protecting data from unauthorized access include technical means, such as installing software that requires user authentication, and physical protections, such as guarding laptops containing sensitive data while traveling and storing sensitive data where they would be safe from access (which may include storage outside the country for protection from subpoena, although the committee did not ascertain the reliability of this strategy).

Previous reports from the National Academies have discussed technical means of protecting data privacy and maintaining confidentiality. For the Record (NRC, 1997) included a detailed list of technical and organizational measures for immediate adoption (see pp. 8–9, Box ES.1) to enhance confidentiality protection. The technical protections are shown in Box 3-2. The feasibility of these measures was demonstrated in a proof-of-concept project at a large medical center (Halamka et al., 1997) by a team including a member of the earlier NRC committee (Peter Szolovits, also a member of this committee). The committee emphasizes that this report is not the place to recommend a detailed data security program but suggests that IRBs consider the protective measures already described and implement them if they have not done so. The committee also emphasizes that increased protection comes at an increased cost, which requires investment, generally in both equipment and expertise, by the organizations conducting research.

Protections based on manipulating the form of the data after collection have also received detailed examination in previous NRC reports.
Private Lives and Public Policies (NRC, 1993) addressed the confidentiality and accessibility of government-held statistics generally and recommended confidentiality measures, including data-masking techniques such as topcoding (setting an upper limit on a range of values [e.g., age 70 and over] so as to avoid reporting increasingly rare outlying values in ranges where they would be isolated; see Box 3-3 for other examples). Many of these measures are also feasible for handling data in general, not only government-held databases. As with any manipulation, however, each technique has disadvantages as well as advantages, so the committee emphasizes that it is important for the investigator to have the flexibility to apply the techniques best suited to the particular research question and dataset(s) of the protocol. Again, the committee emphasizes that this study could not undertake a detailed presentation of data-masking techniques, but suggests that investigators and IRBs consider the protective measures already described and implement them where possible if they have not done so. Ideally, these technical and data-masking safeguards for confidentiality would be implemented in the context of policies and procedures adopted by the organization for all uses of personal health information, including clinical care, business activities, and research. The chair of the Private Lives and Public Policies committee (George Duncan, also
a member of this committee) noted that many government agencies, including the Bureau of the Census and the National Center for Health Statistics, have significant experience with the release of data with confidentiality protections and should be consulted in future work. This committee notes that it may also be helpful for investigators and IRBs to have access to specific lists of potential direct identifiers for removal. Such lists of procedures and specific identifiers may, however, never be exhaustive and, as stated in the previous finding, a set of guaranteed conditions may not be possible.*

*As an example of a good beginning of a list of identifiers to be wary of, the committee referred to 164.506(d)(2)(ii)(A) of the proposed rule (DHHS, 1999). The list includes name; address; names of relatives or employers; birth date; telephone or fax number, or e-mail address; medical record number; health plan beneficiary, account, or certificate/license number; vehicle or other device serial numbers; Web universal resource locator; Internet protocol address number; voice or fingerprints; photographs; and any other unique identifying number, characteristic, or code that the covered entity has reason to believe may be available to an anticipated recipient of the information. The confidentiality of data that were de-identified to this extent would be better protected, but as noted, the data might still allow the probabilistic identification of persons by inference with other data sources. The proposed rule accounts for this possibility with an additional condition stipulating that if data are to be disclosed, the covered entity must have no reason to believe that any anticipated recipient of such information could use the information, alone or in combination with other information, to identify an individual. As has been noted, however, this condition may be impossible to satisfy.
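A first de-identification pass based on a list like the one in the footnote, combined with topcoding of rare extreme values as described above, can be sketched as follows. The field names are hypothetical, and, as the text stresses, no such list is exhaustive: removing direct identifiers does not by itself rule out probabilistic re-identification through linkage with other data sources.

```python
# Illustrative subset of direct identifiers drawn from the proposed
# rule's list; real field names vary by data source, and no list of
# identifiers is ever exhaustive.
DIRECT_IDENTIFIERS = {
    "name", "address", "relatives", "employer", "birth_date", "telephone",
    "fax", "email", "medical_record_number", "health_plan_number",
    "account_number", "license_number", "vehicle_serial", "url",
    "ip_address", "fingerprint", "photograph",
}

def strip_direct_identifiers(record):
    """Return a copy of the record with known direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

def topcode(value, cap):
    """Report values at or above the cap as the cap itself (e.g., ages 70
    and over reported as 70), so rare outlying values are not released."""
    return min(value, cap)

record = {"name": "A. Subject", "birth_date": "1923-01-01",
          "age": 87, "diagnosis": "hypertension"}
masked = strip_direct_identifiers(record)
masked["age"] = topcode(masked["age"], 70)
```

Note that even the masked record retains quasi-identifiers (an exact age, a diagnosis) that could support the kind of inferential linkage the footnote warns about; field filtering is a necessary first step, not a guarantee.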
The committee found that some organizations provide IRBs and investigators access to experts in information technology. RAND has installed a three-person privacy team as part of its IRB. The team includes an information resource specialist (who specializes in security measures such as encryption and creating codes to substitute for identifying data), a data librarian (who specializes in rules and practices for dealing with large datasets acquired from other organizations), and a networks specialist (who specializes in conditions and limitations of safe data transfer over the network). These professionals help design and implement data-safeguarding plans commensurate with the level of risk for various protocols.

The committee concluded that, in light of rapid developments in information technology (IT), such access to expertise in information technology is highly desirable. Most IRBs do not, however, have the power or resources to implement data security programs on their own, and their time must be devoted to reviewing research proposals, protocols, and annual reports. What IRBs can do is reject studies that do not have acceptable data security measures, while at the same time working to understand the value of reductions in the incidence and severity of security breaches relative to the cost of increased security precautions. The host
organization and research sites are usually the loci of data security programs. These organizations determine their own level of investment in IT and levels of affordable data security. The committee therefore concluded that IRBs should obtain consulting services from data security experts to gain a better understanding of how much alternative security programs can be expected to reduce the likelihood of break-ins to a secured data system.

Recommendation 3-5. Institutions that carry out HSR and train health services researchers should require that trainees, investigators, and IRB members receive education, with updates as technology changes, regarding the protection of privacy and confidentiality when using data previously collected for another use.

Education is critical not only for IRB members, but also for researchers, technicians, and any other employees who may come in contact with health information. Better education about how to protect confidentiality and about possible sources of risk will help investigators design better confidentiality protection into their proposed studies from the start. Better education of all employees who may come in contact with the data will help raise the level of understanding and alertness throughout the organization. The committee found that organizations vary in how they educate IRB members about research ethics and federal guidelines, and that learning on the job may be inadequate preparation for IRB members. The committee heard at the workshop that some IRBs have apparently not had the opportunity to gain experience with HSR and may ask for incongruous changes. Some organizations provide training for IRB members in formal courses or seminars, or by providing orientation materials (OPRR, 1993).
Several organizations send members to professional meetings and seminars, such as those sponsored by the organization Public Responsibility in Medicine and Research (PRIM&R). Informal education from more experienced IRB members and administrators certainly provides continuing training during IRB meetings. As noted earlier, the OIG has already observed that IRBs are facing greatly expanded workloads, including new types of research, and do not always have access to either the expert personnel or the training they would need in order to deal effectively with some of this research. At the same time, IRBs often face serious resource limitations, which in turn affect training. The question of OHRP's role in IRB member education was raised at the workshop: the office might not only disseminate regulations, as it currently does, but also collect and disseminate information about the best practices of IRBs. Dr. Puglisi, representing OHRP, said that information and guidance are posted on the OHRP website, and that the OHRP is actively expanding its educational activities.
BOX 3-3 Sample “Restricted Data” Tools for Providing Data While Protecting Confidentiality

De-identification: removing personal identifiers such as names, addresses, telephone numbers, e-mail addresses, and Social Security numbers.
Advantages: Clearly necessary so that a person is not immediately linked to a record.
Disadvantages: Data may still be at risk of disclosure. A data snooper may be able to re-identify a record (uncover the identity of a record) by linking certain key variables in the record to those in some separate and complete identified record held by the snooper. For data analysis, this makes it impossible to add additional variables or to guarantee for future retrieval the ability to contact individuals about adverse findings.

Coarsening: releasing the data in categories or broadening the size of categories; includes topcoding and bottomcoding (creating categories at the extremes of the data).
Advantages: Makes re-identification more difficult, since several subjects may have the same values in the records.
Disadvantages: Decreases the information content of the data and makes certain statistical techniques, such as regression, more difficult to apply.

Dropping unnecessary variables.
Advantages: Makes re-identification more difficult because there are fewer key variables present for linkage and because fewer records are unique in their values.
Disadvantages: Dropped variables may later prove useful in analysis of the data.

Dropping records that are easy to identify, usually those with unusual combinations of values.
Advantages: Lowers disclosure risk.
Disadvantages: These records may be of particular interest to a researcher.

Geocoding: replacing person-specific information with the average from a geopolitical unit in which the person resides.
Advantages: Makes re-identification more difficult.
Disadvantages: Decreases information about unusual persons in the geopolitical unit. The geopolitical unit may be inappropriate for a specific research project.

Aggregating units to a higher-level group (e.g., providing average values for all patients with the same health care provider).
Advantages: Values released do not correspond to a particular individual.
Disadvantages: Loss of information when the within-group variability is nonnegligible.
Random error injection (e.g., adding independent zero-mean random perturbations to each data value).
Advantages: Makes record linkage through key variables more difficult.
Disadvantages: Loss of information; can also make data analysis more complicated.

Winsorizing (e.g., removing the highest values and replacing them with the mean of the values removed).
Advantages: Preserves the mean of the group and hides unique upper values.
Disadvantages: Loss of information, especially on upper-value cases.

The committee found that some organizations require training of investigators in research ethics and IRB procedure. Many investigators in HSR are initially trained in a variety of disciplines, including clinical medicine, pharmacy, epidemiology, and health administration, but rarely in programs specific to HSR. Training investigators as well as IRB members may greatly enhance human subjects protection and speed the initiation of good research. Educational activities must, however, be designed to target the needs and time constraints of adult learners who are also busy researchers. In particular, training should be tailored to the type of research methods that researchers use; the ideal training for clinical trials investigators would not be helpful for health services researchers. Several organizations require, or are planning to require, that investigators pass a course on human subjects protection before their protocols can be reviewed. NIH already requires that intramural investigators pass an on-line course on research ethics and regulations. The committee believes such education should be encouraged and expanded, provided that this is feasible given the already heavy schedule of most investigators. The committee heard some promising ideas about how to provide this training, such as on-line tutorials, but several members noted that there could be resistance at some institutions to making any training a requirement, because of the heavy workload that many investigators already carry.
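The last two masking tools in Box 3-3, random error injection and winsorizing, can be sketched as follows. This is an illustrative sketch only; real disclosure-limitation programs tune the perturbation scale and the number of values winsorized to the particular dataset and research question.

```python
import random

def inject_random_error(values, scale, seed=None):
    """Box 3-3, random error injection: add an independent zero-mean
    perturbation (here, uniform on [-scale, scale]) to each data value,
    making record linkage through key variables more difficult."""
    rng = random.Random(seed)
    return [v + rng.uniform(-scale, scale) for v in values]

def winsorize_top(values, k):
    """Box 3-3, winsorizing: replace the k highest values with the mean
    of the values removed, preserving the group mean while hiding
    unique upper values."""
    ordered = sorted(values)
    kept, removed = ordered[:-k], ordered[-k:]
    replacement = sum(removed) / len(removed)
    return kept + [replacement] * k

charges = [120, 180, 200, 950, 12000]
masked = winsorize_top(charges, 2)
# The two highest charges are replaced by their mean (6475.0): the group
# mean is unchanged, but the unique top value is no longer released.
```

As the box notes for both tools, the cost is analytic: perturbed values complicate analysis, and winsorized data lose information precisely on the upper-value cases a researcher may care about most.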
In addition to formal courses, IRBs play an important role in educating investigators through individual discussions regarding specific projects. IRB administrators and chairs participating in the workshop reported that their organizations function more effectively as collaborative educators than as enforcers, and that collaboration also effectively reduces the need for enforcement.

Recommendation 3-6. Health care or other organizations that disclose or use personally identifiable health information for any purpose, including research or other activities using HSR methods, should have comprehensive policies, procedures, and other structures to protect the confidentiality of health information and should have in place appropriate, strong, and enforceable sanctions against breaches of health information confidentiality.

Access to specific expertise and enhanced general education are important, but the committee also observed that the human element of the research enterprise necessarily includes the human potential for error and even malfeasance. Therefore, organizations should complement and support the proactive strategies of expertise and education for better confidentiality protection with deterrents to wrongdoing. Such sanctions ought to be graded according to the offense (e.g., whether the incident was a simple mistake or an intentional violation) and should apply not only to researchers but to all employees of the organization.