2
Health Databases and Health Database Organizations: Uses, Benefits, and Concerns

No one engaged in any part of health care delivery or planning today can fail to sense the immense changes on the horizon, even if the silhouettes of those changes, let alone the details, are in dispute. 1 Beyond debate,  

1  

The Clinton administration's proposed Health Security Act (HSA, 1993) gives appreciable attention to information systems and related matters. It calls for the establishment of a National Health Board to oversee the creation of an electronic data network consisting of regional centers that collect, compile, and transmit information (Sec. 5103). The board will, among other duties, provide technical assistance on (1) the promotion of community-based health information systems and (2) the promotion of patient care information systems that collect data at the point of care or as a by-product of the delivery of care (Sec. 5106).

The types of information collected would include: enrollment and disenrollment in health plans; clinical encounters and other items and services provided by health care providers; administrative and financial transactions and activities of participating states, regional alliances, corporate alliances, health plans, health care providers, employers, and individuals; number and demographic characteristics of eligible individuals residing in each alliance area; payment of benefits; utilization management; quality management; grievances, and fraud or misrepresentation in claims or benefits (Sec. 5101).

The HSA further specifies the use of (1) uniform paper forms containing standard data elements, definitions, and instructions for completion; (2) requirements for use of uniform health data sets with common definitions to standardize the collection and transmission of data in electronic form; (3) uniform presentation requirements for data in electronic form; and (4) electronic data interchange requirements for the exchange of data among automated health information systems (Sec. 5002). It also calls for a national health security card that will permit access to information about health coverage although it will contain only a minimum amount of information (Sec. 5105) (Health Security Act. Title V. Quality and Consumer Protection. Part 1. Health Information Systems).







however, is the need for much more and much better information on use of health care services and on the outcomes of that care. The needs are quite broad: health care reform; evaluation of clinical care and health care delivery; administration of health plans, groups, and facilities; and public health planning.

Policymakers, researchers, health professionals, purchasers, patients, and others continue to be frustrated in their attempts to acquire health information. They may not be able to determine with confidence the outcomes, quality, effectiveness, appropriateness, and costs of care for different segments of the population, for different settings, services, and providers, and for different mechanisms of health care delivery and reimbursement. When this is so, they can say little, with confidence, about the value of the investment in health care for population subgroups, regions, or the nation as a whole.

In principle, this information can be acquired through numerous avenues, such as surveys, electronic financial transactions for health insurance claims, computer-based patient records (CPRs), and disease registries. In practice, no one system will suit every need or produce information appropriate for every question. As introduced in Chapter 1, however, health database organizations (HDOs) hold considerable promise as a reasonably comprehensive source of the information needed to: assess the health of the public and patterns of illness and injury; identify unmet regional health needs; document patterns of health care expenditures on inappropriate, wasteful, or potentially harmful services; find cost-effective care providers; and improve the quality of care in hospitals, practitioners' offices, clinics, and various other health care settings.

The latter half of this chapter outlines these and other benefits of HDOs, the databases they access or control, and the analytic and information dissemination activities they undertake. It also discusses the applications that user groups might have for different types of databases. The committee advances some views on how major concerns about these databases, chiefly relating to the quality of their data, might be addressed, and it makes two recommendations. In preparation for those sections, the chapter next offers some definitions of key concepts and terms, explores the basic construct of HDOs (which the committee sees as the administrative and operational structure for regional health databases), and provides some examples of the variety of entities that now exist, were being implemented as this report was written, or are envisioned for the future.

Definitions

Even among experts, terms such as database and network are not used in the same manner. For this report, the committee advances the following working definitions for certain major concepts, building to its view of an HDO.

Database

The term database embraces many different concepts: from paper records maintained by a single practitioner to the vast computerized collections of insurance claims for Medicare beneficiaries; from files of computerized patient encounter forms maintained by health plans to discharge abstract databases of all hospitals in a given state; from cancer and trauma registries maintained by health institutions and researchers to major national health survey data of federal agencies. As commonly used and meant in this report, a database (or, sometimes, data bank, data set, or data file) is "a large collection of data in a computer, organized so that it can be expanded, updated, and retrieved rapidly for various uses" (Webster's New World Dictionary, 2nd ed.).

Although databases may eventually be linked (or linkable) to primary medical records held by health care practitioners, this report addresses databases composed of secondary records.2 Secondary files are generated from primary records or are separate from any patient encounter (as in the case of eligibility or enrollment files for health plans and public programs). They are not under the control of a practitioner or anyone designated by the practitioner, nor are they under the management of any health institution (e.g., the medical records department of a hospital). Furthermore, they are not intended to be the major source of information about specific patients for the treating physician. Secondary databases facilitate reuse of data that have been gathered for another purpose (e.g., patient care, billing, or research) but that, in new applications, may generate new knowledge.

2   According to the IOM (1991a, p. 11): "A primary patient record is used by health care professionals while providing patient care services to review patient data or document their own observations, actions, or instructions. A secondary patient record is derived from the primary record and contains selected data elements to aid nonclinical users (i.e., persons not involved in direct patient care) in supporting, evaluating, or advancing patient care." At present, most medical records are maintained on paper, not in computers, and the U.S. General Accounting Office provides the following startling figures on the equivalent volume of paper: "We estimate that the 34 million annual U.S. hospital admissions and 1.2 billion physician visits could generate the equivalent of 10 billion pages of medical records" (GAO, 1993a, p. 2).

The committee distinguishes between databases composed of secondary records and CPRs or CPR systems (IOM, 1991a; Ball and Collen, 1992), but its broader vision of computer-based health information systems includes direct ties to CPR systems. Many experts argue that until CPR systems are linked in some fashion to such data repositories or networks, neither will be complete or reach its full health care, research, or policymaking potential.3

This chapter cites several examples of health databases used today for many purposes, but the ones noted are highly selective and intended to illustrate particular applications or kinds of data maintained. To understand the range of databases that HDOs might access and why there might be concern about protection of personal data, readers are referred to the many inventories of health databases. Publications from the National Association of Health Data Organizations (NAHDO) describe state and insurance databases (NAHDO, 1988, 1993). For databases related to federal programs supported by the Department of Health and Human Services (DHHS), readers can consult publications and manuals from the Health Care Financing Administration (HCFA, for Medicare and Medicaid), the Public Health Service (PHS, for surveys conducted by the National Center for Health Statistics; see also Gable, 1990; IOM/CBASSE, 1992; NCHS, 1993; Smith, 1993), and the Agency for Health Care Policy and Research (AHCPR, for the National Medical Expenditure Surveys and Patient Outcome Research Teams [PORTs]; AHCPR, 1990a). Major research databases include those developed for the RAND Corporation's Health Insurance Experiment (a large-scale social experiment conducted in the late 1970s and early 1980s on the utilization, expenditures, and outcomes effects of different levels of cost sharing [Newhouse and Insurance Experiment Group, 1993]), which were turned into a large number of carefully documented public-use tapes.

3   One major hurdle to the development of CPRs involves standards for vocabulary, structure and content, messaging, and security, according to GAO reports (1991, 1993a); without standards for uniform electronic recording and transmission of medical data, effective automated medical record systems will be delayed. This committee did not examine these technical issues, although they pertain as well to large-scale regional HDOs; arguably, the government and the private sector will need to move more forcefully on development of such standards (perhaps moving beyond near-total reliance on voluntary efforts) if CPRs, CPR systems, and regional health databases and networks are to succeed.

Key Attributes of Databases

In reviewing the considerable variation in databases that might be accessed, controlled, or acquired by HDOs, the committee sought a simple way to characterize them by key attributes. It decided on two critical dimensions of databases: comprehensiveness and inclusiveness. (Because these terms are used with distinct meanings in this report, they are italicized whenever used.)

Comprehensiveness. Comprehensiveness describes the completeness of records of patient care events and information relevant to an individual patient (Table 2-1).4 It refers to the amount of information one has on an individual both for each patient encounter with the health care system and for all of a patient's encounters over time (USDHHS, 1991, refers to this as completeness). A record that is comprehensive contains: demographic data, administrative data, health risks and health status, patient medical history, current management of health conditions, and outcomes data. Each category is described briefly below.

Demographic data consist of facts such as age (or date of birth), gender, race and ethnic origin, marital status, address of residence, names of and other information about immediate family members, and emergency information. Information about employment status (and employer), schooling and education, and some indicator of socioeconomic class might also appear.

Administrative data include facts about health insurance such as eligibility and membership, dual coverage (when relevant), and required copayments and deductibles for a given benefit package. With respect to services provided (e.g., diagnostic tests or outpatient procedures), such data also typically include charges and perhaps amounts paid. Administrative data commonly identify providers with a unique identifier and possibly give additional provider-specific facts; the latter might include kind of practitioner (physician, podiatrist, psychologist), physician specialty, and nature of institution (general or specialty hospital, physician office or clinic, home care agency, nursing home, and so forth).

Health risks and health status. Health risk information reflects behavior and lifestyle (e.g., whether an individual uses tobacco products or engages regularly in strenuous exercise) and facts about family history and genetic factors (e.g., whether an individual has first-degree family members with a specific type of cancer or a propensity for musculoskeletal disease).

4   The discussion of comprehensiveness and inclusiveness of databases is couched in terms of what might be regarded as the traditional domain of medical care, including mental health care. Clearly, more advanced databases could include information on dental care and care provided by health professionals who practice independently, such as nurse-practitioners and nurse-midwives, acupuncturists, or alternative healers of various sorts. Even more far-reaching databases might contain information on sociomedical services provided through, for instance, day care and home care for adults or children.

TABLE 2-1 Comprehensiveness: Data Elements as a Critical Dimension of Health Care Databases

Data Elements
    Examples of Data Elements that Might Be Included in HDO Databases

Demographic
    Name
    Address of residence
    Names and other information on immediate family members and emergency information
    Age (or date of birth)
    Gender
    Race and ethnic origin
    Employment status (and employer)
    Schooling and education
    Indicator of socioeconomic class

Administrative
    Unique identifier
    Health insurance eligibility and membership
    Dual coverage when appropriate
    Required copayments and deductibles
    "Insurance claim" information, e.g., charges for diagnostic tests and procedures and amounts paid
    Provider and provider identification number
    Type of practitioner
    Physician specialty
    Type of institution
    Provider-specific authorization and date for informed consent

Health risks
    Health-related behavior, e.g., use of tobacco products and seat belts, exercise
    Genetic predisposition

Health status (or health-related quality of life)
    Physical functioning
    Mental and emotional well-being
    Cognitive functioning
    Social and role functioning
    Perceptions of health

Medical history
    Past medical problems, injuries, hospital admissions, pregnancies, births
    Family history or events (e.g., alcoholism or parental divorce)

Current management of health conditions
    Current problems and diagnoses
    Medications prescribed
    Allergies
    Health screening
    Diagnostic or therapeutic procedures performed
    Counseling

Outcomes
    General and/or condition-specific states, e.g., functional status, readmission to hospital, and unexpected medical or surgical complications of care
    Satisfaction
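As a rough illustration of how the categories in Table 2-1 might be represented in software, the following Python sketch groups one person's data elements by category and reports which categories are populated. The class name, the field names, and the idea of scoring comprehensiveness as a simple fraction of populated categories are all assumptions made for this example; the committee does not define a numerical measure.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# The seven row headings of Table 2-1, used as field names (naming is illustrative).
CATEGORIES = (
    "demographic",
    "administrative",
    "health_risks",
    "health_status",
    "medical_history",
    "current_management",
    "outcomes",
)


@dataclass
class PersonRecord:
    """Hypothetical secondary record for one person, grouped by Table 2-1 category."""

    person_id: str  # assumed unique identifier
    demographic: Dict[str, Any] = field(default_factory=dict)
    administrative: Dict[str, Any] = field(default_factory=dict)
    health_risks: Dict[str, Any] = field(default_factory=dict)
    health_status: Dict[str, Any] = field(default_factory=dict)
    medical_history: Dict[str, Any] = field(default_factory=dict)
    current_management: Dict[str, Any] = field(default_factory=dict)
    outcomes: Dict[str, Any] = field(default_factory=dict)


def populated_categories(record: PersonRecord) -> List[str]:
    """Return the categories that hold at least one data element."""
    return [name for name in CATEGORIES if getattr(record, name)]


def comprehensiveness_score(record: PersonRecord) -> float:
    """Fraction of categories populated (an illustrative measure, not the committee's)."""
    return len(populated_categories(record)) / len(CATEGORIES)


# A record built only from claims might carry just demographic and administrative data.
claims_only = PersonRecord(
    person_id="P001",
    demographic={"date_of_birth": "1950-03-14"},
    administrative={"insurance_type": "indemnity", "charges": 120.0},
)
print(populated_categories(claims_only))  # ['demographic', 'administrative']
print(round(comprehensiveness_score(claims_only), 2))  # 0.29
```

A count of populated categories says nothing about the accuracy or currency of the data within them, so a sketch like this can only flag gaps, not certify quality.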

Health status (or health-related quality of life), generally reported by individuals themselves, reflects domains of health such as physical functioning, mental and emotional well-being, cognitive functioning, social and role functioning, and perceptions of one's health in the past, present, and future and compared with that of one's peers. Health status and quality-of-life measures are commonly considered outcomes of health care, but evaluators and researchers also need such information to account in their analyses for the mix of patients and the range of severity of health conditions.

Patient medical history involves data on previous medical encounters such as hospital admissions, surgical procedures, pregnancies and live births, and the like; it also includes information on past medical problems and possibly family history or events (e.g., alcoholism or parental divorce). Again, although such facts are significant for good patient care, they may also be important for case-mix and severity adjustment.

Current medical management includes the content of encounter forms or parts of the patient record. Such information might reflect health screening, current health problems and diagnoses, allergies (especially those to medications), diagnostic or therapeutic procedures performed, laboratory tests carried out, medications prescribed, and counseling provided.

Outcomes data encompass a wide choice of measures of the effects of health care and the aftermath of various health problems across a spectrum from death to high levels of functioning and well-being; they can also reflect health care events such as readmission to hospital or unexpected complications or side effects of care. Finally, they often include measures of satisfaction with care. Outcomes assessed weeks or months after health care events, and by means of reports directly from individuals (or family members), are desirable, although these are likely to be the least commonly found in the secondary databases under consideration here.

The more comprehensive the database is, the more current and possibly more sensitive the information about individuals is likely to be. This suggests that comprehensiveness as envisioned here will have a direct correlation with concerns about privacy and confidentiality. By analogy, the Department of Defense treats information with increasingly higher levels of security as it becomes more comprehensive, even when the aggregated information is not considered sensitive (Ware, 1993).

Some patient events are unlikely to appear in databases (depending on how they originate); missing from the databases considered here are services that may have been advised but neither sought nor rendered: screening examinations not given, physician follow-up visits not advised or kept, and prescriptions given but not filled. Other reasons for missing data involve out-of-area care for an individual who is otherwise in the database; an example is medical services provided in Florida to New York residents when they are on vacation or living part of the year out of state. Yet another is when patients do not file claims against health insurance policies for services they receive (regardless of where the services are rendered); such transactions may not be recorded through any of the usual claims processing mechanisms used to generate the database.

Furthermore, databases may never be sufficiently comprehensive for research or outcomes analysis, especially if the choice of core data elements is parsimonious. Thus, when the question at hand is health status and outcomes long after health care has been rendered, HDO staff or outside researchers may need the capability and authority to contact individuals (providers and possibly patients) for information about outcomes and satisfaction with care. Such outreach activities would require some adequate funding mechanism.

Inclusiveness. Inclusiveness refers to which populations in a geographic area are included in a database. The more inclusive a database, the more it approaches coverage of 100 percent of the population that its developers intend to include. Databases that aim to provide information on the health of the community ought to include an enumeration of all residents of the community (e.g., metropolitan area, state) so that the information accurately reflects the entire population of the region, regardless of insurance category. Conversely, inclusiveness is reduced when membership is restricted to certain subgroups or when individuals expected to be in the database are missing (Table 2-2). For instance, a database that is intended to include all residents in a local area may include only those who are insured and file claims for services; it misses those not insured and those who, although insured, do not use health services. An insurance claims database that does not include members of a health maintenance organization (HMO) because no claims are filed will also not be inclusive for the geographic area.
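The notion of approaching "coverage of 100 percent" of the intended population can be made concrete with a small Python sketch. It assumes, purely for illustration, that both the intended population and the database contents can be reduced to sets of person identifiers; the function names and the sample identifiers are hypothetical.

```python
from typing import AbstractSet, Set


def inclusiveness(intended: AbstractSet[str], in_database: AbstractSet[str]) -> float:
    """Share of the intended population that actually appears in the database."""
    if not intended:
        raise ValueError("intended population must not be empty")
    # Only people who belong to the intended population count toward coverage.
    return len(intended & in_database) / len(intended)


def missing_people(intended: AbstractSet[str], in_database: AbstractSet[str]) -> Set[str]:
    """Members of the intended population who are absent from the database."""
    return set(intended) - set(in_database)


# All residents of a local area versus those who are insured and filed claims.
residents = {"r1", "r2", "r3", "r4", "r5"}
claims_filers = {"r1", "r2", "r3"}

print(inclusiveness(residents, claims_filers))           # 0.6
print(sorted(missing_people(residents, claims_filers)))  # ['r4', 'r5']
```

In this toy case the two missing residents stand in for the uninsured and for insured people who filed no claims, the groups that the text identifies as absent from a claims-based database.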
Databases may be (and often are) designed to include only subsets of the entire population of a geographic area: those eligible for certain kinds of insurance, such as enrollees (subscribers, their spouses and dependents) in commercial insurance plans; persons receiving care from specific kinds of providers or in certain settings (e.g., prehospital emergency care from emergency medical services and hospital emergency departments); persons with a given set of conditions (e.g., a cancer or trauma registry); an age group such as those age 65 and older (e.g., Medicare beneficiary files);5 residents of a defined geographic area or political jurisdiction; or scientifically selected samples of individuals, as in major health surveys. Clearly these categories are not mutually exclusive: individuals (as well as providers) can and do appear in more than one such database. The potential benefits of the database, however, will increase as the database moves toward being inclusive of the entire population of a defined geographic area.

5   This is illustrative only because Medicare files also include younger but disabled beneficiaries and persons with end-stage renal disease.

TABLE 2-2 Inclusiveness: Populations Covered as a Critical Dimension of Health Care Databases

Defined Populations
    Examples

National
    All persons physically resident in the 50 states, District of Columbia, Puerto Rico, and the Trust Territories

Geographic area
    All persons resident in a defined geopolitical or other describable area, such as an MSA

Insurance type
    HMO, indemnity, Medicaid, none

Site and care setting
    Hospital, nursing home, clinic

Disease, injury type
    Cancer, trauma registry

Age or other demographic characteristic
    Age 65 or older, belonging to a defined ethnic or racial group

NOTE: MSA = Metropolitan statistical area.

HDOs will have to be clear about what groups are missing when describing their databases and the results of their analyses. Perhaps more important, HDOs should seek ways to ensure that all relevant populations are included, so that their analyses accurately reflect the population of the region and, thereby, yield estimates of the levels of underuse of health care in their respective regions.

Table 2-3 summarizes these two attributes.6 The dummy matrix, although empty, illustrates how databases can be described, evaluated, and differentiated from each other. Cell a represents patient populations and data elements that are included in a database. Cell b depicts the individuals who are missing from a database that is otherwise fairly comprehensive. Cell c represents patient nonevents and missing data in a database that is otherwise reasonably inclusive. Cell d represents missing individuals and missing data. To the extent cells b, c, and especially d are large, the database in question will be less able to provide extensive, or unbiased, information; the sizes of cells b, c, and d are, therefore, three determinants of database quality.

6   The congressional Physician Payment Review Commission (PPRC) has been in the forefront of advocates for a national data system (PPRC, 1992, 1993). In its 1992 annual report, PPRC described an "all-patient database" [emphasis in the original], conceptualized as a "network of local or regional data processing centers ... to streamline the transfer of administrative information for payment and service-use tracking purposes" (p. 269). The report goes on to posit "parallel organizing entities ... to coordinate the use of these data [and] the data processing centers and the organizing entities would make up an all-patient data network" (p. 269). The commissioners also envisioned the network evolving into a "means to link and assimilate more detailed clinical information." Although the general thrust of the PPRC idea is consonant with the long-range views of this IOM committee, the specific understanding of what a database or network is differs. In defining an all-patient database, the commissioners appear to have in mind what this committee terms inclusiveness; what the PPRC report lays out as "core data elements" in that database approaches what the IOM report calls comprehensiveness.

TABLE 2-3 Characteristics of Databases According to Two Critical Dimensions

                                     Inclusiveness (Population)
Comprehensiveness (Data Elements)    High    Low
High                                 a       b
Low                                  c       d

Other Characteristics of Databases

The more comprehensive and inclusive databases are, the more they facilitate detailed and sophisticated uses and, in turn, entail both greater anticipated benefits and possible harms. The magnitude of either benefits or harms can depend on several other important properties of databases, however, as noted below.

Linkage over time. The ability to analyze patterns, quality, and costs of care over a period of time may be very important to users. They may want to construct episodes of care or develop other longitudinal profiles; cases in point (respectively) involve all the care provided to a specific patient for a discrete course of illness or injury, regardless of site or setting, and compilations of information on services provided by a local HMO over rolling five-year periods. Such studies require not only unique identifiers for patients and providers (see below) but also a record structure that permits analysts to link dates and times with patient care events, problems, and diagnoses.

Timeliness. Facts based on patient-provider interactions and other relevant information (e.g., employment, health plan, health status, or outcomes) should be entered or updated frequently enough to permit their timely use and analysis. If databases are to be of assistance with direct patient care, then information must be sufficiently up to date that caregivers can rely on it in all clinical decision-making situations.

Accuracy and completeness. Data used for clinical care (decision making about a given individual) must be of far greater accuracy and completeness than those required for administrative uses. Databases used for clinical decision making must, in describing an individual, describe only that individual and do so accurately. For instance, missing or out-of-date data or files that commingle data for more than one individual under a single identifier have grave potential for harm. In addition, correcting errors found at a later time must be possible; ideally, alerting past users of the database to those errors and corrections ought to be possible as well.

Control, ownership, and governance. Whether a given database has been established by the public or the private sector (or is some hybrid) will have important implications for inclusiveness and access. For instance, databases addressed in this report may be publicly supported, especially at the state level, and may be operated and administered by a private entity. Some state hospital discharge databases, such as the Health Care Policy Corporation in Iowa and the Massachusetts Health Data Consortium, are of this kind. Alternatively, they may be developed, maintained, and financed wholly in the private sector, such as those developed by professional or health care organizations, insurers, or business coalitions. A database created by state or federal law can require participation; that is, it can demand that health professionals, institutions, and patients participate in providing data. For example, Washington state has passed legislation that mandates development of a statewide data system by a health services commission that will identify a set of health care data elements to be submitted by all providers (e.g., hospitals and physicians) (Engrossed Second Substitute Senate Bill 5304, 1993). To the extent databases are developed and maintained in the public sector or are networked with public-sector databases (especially at the federal level), they will be subject to regulations that differ from those affecting databases operated purely within the private sector for the benefit of private sponsors. Given the evolving nature of state and national health care reform plans and programs, movement toward electronic data interchange (EDI), progress toward CPRs, and emergence of various hybrid arrangements for financing and delivering health care, the development of HDOs is taking place in very different (and perhaps unpredictable) environments that will likely have disparate effects over time.

Origin of data. Databases can vary widely in the source(s) of their

OCR for page 40
--> receive indicated auxillary lymph node dissection or assays for hormone receptors; in another, the study question was the percentage of women receiving breast-conserving surgery who did not receive indicated radiation therapy. In commenting on these studies, Chassin (1991) notes that they suggest problems "in the extent to which physicians fail to communicate options and outcomes data objectively" (p. 3473) and advocates routine feedback of these kinds of data to hospitals. With respect to injury, the American College of Surgeons National Trauma Registry is another example of a database that provides information on patterns of injury and their outcomes (see Table 2-4); for those concerned with emergency medical services, such sources of epidemiologic and clinical information are critical (IOM, 1993d). Other public health applications of HDO databases relate to preventive care and health behaviors. For some industries, for instance, epidemiologic information from large databases may enable analysts to identify potential safety or health-related problems in workplace environments and to suggest corrective steps. Immunization tracking systems, currently under development regionally and nationally, might be incorporated into HDO databases to simplify monitoring and recording of children's immunization status both in aggregate and individually. HDOs might also maintain information about blood type, organ donors, and tissue matching in their databases, as a means of fostering improved blood banking and organ procurement and transplant services. 
Promoting Regional and Community Health Planning, Education, and Outreach

Health Planning and Education

When HDO databases are statewide, or sponsored by state health departments, the potential uses by states and all subordinate levels of government for health planning, health care delivery, public health, and administrative responsibilities become quite extensive; they can involve the health departments and social services agencies of states, counties, and municipalities in many overlapping efforts. Planning and educational activities that could employ HDO data might be focused on improving access to, reducing costs of, and enhancing quality of care; on organizing provider systems of care; or on investigating epidemiologic patterns of injury or illness. For example, community-specific studies conducted using HDO data might examine the kinds of cases treated by local hospital emergency departments, whether use differs by hospital or patient characteristics, and whether patient outcomes differ accordingly. Such information might enable public agencies to target public funds or other resources in new ways to meet previously undetected problems or needs.

Integrating data on vital statistics, epidemiologic surveillance, and local and regional public health programs with those in the personal-health-care files of HDOs raises the possibility of more effective public health activities for monitoring health, attaining public health objectives at a population level, and targeting efforts for hard-to-reach individuals. For example, researchers in Boston have developed and operationalized a distributed health record system for a homeless population seen at many sites by many different providers (Chueh and Barnett, 1994).

Community Outreach

In addition to whatever public-sector agencies might do to monitor the public health of communities, community and consumer organizations may wish also to carry out population-based studies as a means of learning where significant health problems exist and of making elected officials and others more accountable for solving those problems. Another significant way that information held by HDOs may contribute to the work of community, voluntary, and consumer groups is in their public education and outreach programs. Here the data may suggest emerging problems that warrant increased attention (or waning problems that need reduced effort); data may also indicate where (in geographic areas or population subgroups) education initiatives might best be targeted. For example, recognition that bicycle accidents are a major source of children's head injuries could lead to community education programs in schools and neighborhood associations. Public-sector agencies, academic centers, or consumer groups might pursue such public health efforts by analyzing HDO data and developing community-specific informational materials (e.g., public information brochures on sources of care for special problems).
Charitable groups and voluntary organizations concerned with particular diseases and conditions have many roles: providing information to and support for patients with particular illnesses and for their families; sponsoring research; and lobbying for more policy attention, social acceptance, and research support for the problem. Because they are likely to be private organizations that secure their funds through donations from individuals and corporations, most must engage in aggressive fund-raising campaigns. Information from health data banks might enable them to increase their efficiency in amassing epidemiologic information and perhaps in targeting fund-raising efforts.

Other Uses for HDO Databases

The IOM committee identified a great many other potential users and uses of HDO databases, including agencies engaged in law enforcement at the federal, state, and local levels; law firms and attorneys; and various commercial entities. The more plausible are briefly described here.

Law enforcement officials can be expected to find many uses for the information held in HDO files. They may wish to trace individuals (for instance, to locate parents not paying child support). They may also need to investigate alleged illegal acts; in the health context, this might extend to abuse of illegal substances or cases of possible child abuse. Conceivably, law enforcement agencies might want genetic information to assist them in identification of a suspect. Finally, such agencies may be expected to monitor providers and patients for possible fraud.

Arguably, attorneys and law firms might identify many uses for HDO data, including malpractice litigation. Plaintiffs' lawyers, for instance, might try to access information from HDOs concerning previous quality-of-care deficiencies of a physician or hospital; defendants' counsel might seek to demonstrate, through analysis of HDO data, that the provider acted well within community standards. One important application occurs in cases where the past or current health condition of the patient is relevant to the case or is at issue in the case. Product safety litigation may also call forth requests for data from the network, especially when a medical device is in question. Finally, attorneys representing health plans, insurers, medical groups, hospitals, and other providers in their business (e.g., financial) concerns may find information contained in the databases of use in advising their clients about risk management, taxes, financing, and similar matters.

A wide array of other kinds of companies, organizations, and services might well have an interest in the information available through HDOs.
Among them are direct marketing firms, financial and credit institutions, and bill collection agencies. Such entities (especially the last named) might wish to have person-identified information, but in general many applications of the information might not be directed at patients but rather at providers or at groupings such as zip codes. Financial and credit institutions might be interested in health plan and hospital data to determine market share or estimate solvency for a given group practice or facility. In general, this committee takes an extremely negative view toward giving these groups access to HDO files, particularly any data that might conceivably identify individual persons, and thus these uses are not explored further here.

Comment

The committee emphasizes that its roster of users includes examples of current as well as potential HDO database users;16 it does not believe that HDOs necessarily ought to satisfy all such claimants. It does acknowledge, however, that the mere existence of a database creates new demands for access and new users and uses. Consequently, those who establish health databases and HDOs may be creating something for which the end uses cannot always be anticipated.

Because this study took place at a time of change in both health care infrastructure and information systems, the committee tried to anticipate the probable sources of the tension that will exist between those who create databases and wish to protect the information and those who might argue for access to those databases on grounds of anticipated benefits. Historically, the creation of large databases, such as those to administer the Social Security program and the National Crime Information Network, has been followed by modifications in the databases themselves and in the policies and legislation that regulate access to them, which more often than not results in relaxing prohibitions or barriers to access. Realism dictates that large databases such as those maintained by HDOs will be dynamic. In the committee's view, policies regarding access to these databases should, therefore, be based on firm principles but flexible enough to accommodate unavoidable changes and unanticipated uses.

The benefits of electronic patient records should not be overlooked, however. These benefits include the availability of much more powerful databases, elimination of the need for repeated requests to record subjects for the same information, and assurance that information is available when needed. Despite the privacy concerns described, it should be possible to improve privacy protection and safeguard the confidentiality of health information in HDOs through a variety of methods described in later chapters.
16 In The Computer-Based Patient Record: An Essential Technology for Health Care, the IOM examined in some depth the array of users of computer-based patient records (CPRs) and CPR systems, indicating that an "exhaustive list ... would essentially parallel a list of the individuals and organizations associated directly or indirectly with the provision of health care. Patient record users provide, manage, review, or reimburse patient care services; conduct clinical or health services research; educate health care professionals or patients; develop or regulate health care technologies; accredit health care professionals or provider institutions; and make health care policy decisions" (IOM, 1991a, p. 31). It is difficult to improve on that enumeration in the present context, even though the nature of the databases themselves (CPRs versus networks based on, e.g., insurance billing transactions or surveys) is quite different and the emphasis (e.g., patient care delivery versus health plan management) differently placed.

Moreover, information must be acted on by individuals in a position to change their own, and others', behaviors and performance. Most experts agree that getting information to people and organizations is just the first, and perhaps not the most important, step in the change process. Although this committee (in Chapter 3) places great store on information dissemination efforts by HDOs, HDOs will not be well placed to follow up the actions taken (or not taken) by recipients of that information.

Many of the challenges faced by the health care sector are essentially exogenous: for instance, the changing demographics of the U.S. population, problems of international competition in the manufacturing and information-services sectors, and increasing disintegration of social and familial structures. No amount of radical change in health care, let alone tinkering, will demonstrably affect those problems, and HDOs similarly cannot influence them.

Further, despite the promise that HDOs hold for addressing certain health policy issues, this committee emphasizes that information derived from the files of HDOs and similar entities will not be the solution to all the ills of the health care system. Information may be incomplete or untimely, lack critical variables such as health status, or otherwise be imperfect. In addition, such data may be observational, meaning that they lend themselves more to description than to causal or inferential analysis, and more to retrospective commentary than to prediction. In the terminology used earlier, HDOs and their constituent databases may be neither acceptably comprehensive nor inclusive.

Commentary on a related information activity is instructive. In 1992 the IOM, in conjunction with the Commission on Behavioral and Social Sciences and Education (CBASSE), reviewed the plans of the National Center for Health Statistics for a new National Health Care Survey.
The survey is described as having the following objective: "to produce annual data on the use of health care and the outcomes of care for the major sectors of the health care delivery system. These data will describe the patient populations, medical care provided, financing, and provider characteristics" (IOM/CBASSE, 1992, p. 6). The IOM/CBASSE report commented at some length on the ability of existing data sources (e.g., current NCHS surveys) to provide these kinds of information and noted (p. 38): [They] are rapidly becoming outdated and less comprehensive than is desirable. Often they do not cover the universe of providers and sites of health care [or] patients or potential users of health care. They lack sufficient information on exactly what services are provided and what the outcomes of those services are . . . are inexact with respect to financial data ... are not timely; and ... are inaccurate, incomplete and unreliable. These faults may well affect the data repositories and networks considered by this IOM committee; they are discussed in greater detail below.

Ensuring the Quality of Data

The above discussion has outlined the many potential users, uses, and benefits of HDOs. Ultimately, however, the real rewards of developing and operating HDOs will depend heavily on the quality of the data that they acquire and maintain. The committee considers this subject of sufficient importance that it elected to comment on it directly.

The absolute prerequisites to successful implementation of any type of database or HDO with the expansive goals implied by the foregoing discussion are reliable and valid data. Developers must ensure that the data in their systems are of high enough quality that the descriptive compilations, the effectiveness research, and the comparative analyses envisioned can be done in a credible, defensible manner. (McNeil et al., 1992, describe limitations of current data systems for profiling quality of care, especially at the individual provider level.) Mistakes, qualifications and caveats, retractions, and similar problems must be minimized, and precision about what data are actually being sought must be maximized. All this must be done from the outset so that the long-term integrity and believability of the database and work based on its information will not be undermined irretrievably.

The committee did not wish to prescribe methods that HDOs might employ for ensuring data quality, judging that approaches might differ by type of database and HDO. It did, however, consider that success in meeting this responsibility will call for attention on several fronts.

First, the committee held the view that information becomes more useful when it is used.
Although the characteristic of comprehensiveness is clearly of primary importance in considering the value of a database, HDOs need to avoid the trap of collecting everything that it is possible to collect, regardless of its reliability and completeness, and thereby end up with data elements that will be used only rarely and, worse, be of questionable value when they are used. Part of the problem is that analysts will have little experience with such data elements and may make incorrect assumptions about their reliability or about how to interpret values correctly. Another part of the problem is that some data, although currently collected routinely because an entry must be made in a box on a form, are not used for anything by anyone. Such data will likely have a very low level of accuracy. A commonly cited example relates to information on hospital diagnoses in the Medicare program; diagnoses were often doubtful before the advent of the DRG-based PPS (see Gardner, 1990). When diagnostic data began to figure in decisions about reimbursement, studies of quality of care, choices in clinical care, or analyses about productivity, the situation changed. After 1983 hospitals came to be paid on the basis of DRGs (which obviously are diagnosis based), and

diagnostic information improved markedly, although some problems persist. Similar problems of suspicious (missing, wrong, or even fraudulent) information on insurance claims forms for outpatient care exist to this day; the underlying problem is that payment mechanisms do not depend so heavily on outpatient diagnostic data (that is, the information is not used in the same way as inpatient data), so little incentive exists to record diagnoses accurately.17 The least that can happen in these instances is that those data elements consume computer memory; the worst is that the data will be used in ways that contaminate an entire study or cause unwarranted harm to individuals, groups, or practitioners.

17 One example of the problem of diagnostic coding for insurance claim purposes was provided during a study site visit. A member of an internal medicine group noted that he used essentially six outpatient (office visit) diagnoses because "they work" and because he would otherwise be questioned or second-guessed too much by insurers if he recorded more, or more detailed, diagnostic codes. Because reimbursement is keyed to length and complexity of a visit, rather than to diagnosis, he had a clear conscience about this practice.

Second, data must be accurate and analyzable. Sometimes these points are couched in terms of reliability and validity of data.18 More generally, the accuracy and completeness of data elements that will be used extensively must be guaranteed if they are to be useful. Among the problems one must guard against are the following: missing data; out-of-range values for quantitative data (e.g., age of patient; charges; even laboratory values in the most advanced databases of the future); unrealistic changes in parameters over time (e.g., the doubling of a patient's weight between office visits); clearly erroneous information (e.g., wrong sex); and miscoded information on diagnostic tests, actual diagnoses, surgical procedures, medications, and the like. Analysts must also be cautious about their interpretation of patient care events, for example, by not misconstruing the reasons for or timing of a particular diagnostic procedure when interpreting events in the course of treatment of a life-threatening emergency.

18 Reliability in this context relates to the need for data to be reasonably accurate and complete, that is, essentially free of missing values, systematic bias in what data are captured or recorded and how those data are coded, and random errors. Validity concerns relate to the issue of whether analyses done on a given database are appropriate for the questions being asked and whether those analyses will provide defensible answers that are internally consistent and externally generalizable. According to Palmer and Adams (1993), measures of quality can be reliable if the rate of random error is low, although they may still contain systematic error (meaning that some attribute is being captured but that it may not be the one intended); for quality measures to be valid, both random and systematic error must be low. These considerations of random and systematic error mean that the level of reliability of a measure (or the underlying data) places a ceiling on the level of validity that can be attained; unreliable measures or information can never be valid.

Third, the committee also believes that structural aspects of health databases should be emphasized as conducive to high-quality data and information. Databases should be built around a core of uniformly reported (or translatable) data that are relevant and can be shown to be accurate and valid for the HDO's intended analyses (in keeping with the comments just above). In addition, HDOs should have an easily implemented capacity to supplement core data elements. The committee and other experts agree on the significant tension that exists between the desire for comprehensive databases and the consequently broad uses to which HDO data might be put and the wisdom of a certain parsimony in the actual gathering of person-identifiable information. Although the committee realizes that the federal government may have to take the lead in standards development and improved coding systems, the committee urges HDOs to foster, encourage, and work toward national standards for coding and definitions for (at least) core data elements.19 Government leadership is indispensable in matters of coding and data uniformity, but widespread input from the private sector is desirable. The reason is that the costs of momentous or frequent changes (in terms of money, loss of comparability of data, potential incompatibility of clinical and payment coding, and incentives for fragmentation and upcoding of services) can be significant; consultation between the public and private sectors can help avert excessive or unnecessary costs of these types.

Fourth, the committee takes the position that the basic structure and content of these databases ought to be carefully designed from the beginning, but they must have sufficient capacity for expansion and change as health care reform, effectiveness and outcomes research, and other dynamic aspects of the health care sector evolve in coming years. This requirement implies that due attention will be paid to the quality of new categories of data that may become available for HDOs in the future.
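
The kinds of record-level problems the committee lists under its second point (missing data, out-of-range values, implausible changes over time, and invalid codes) lend themselves to routine automated screening. The following Python sketch is only an illustration of that idea, not a method the committee prescribes: the `Visit` record layout, the field names, the code set, and every threshold are hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# All field names, codes, and thresholds below are illustrative assumptions.
VALID_SEX_CODES = {"F", "M"}
AGE_RANGE_YEARS = (0.0, 120.0)
MAX_WEIGHT_RATIO = 1.5  # flag a change of more than 50% between visits


@dataclass
class Visit:
    patient_id: str
    visit_day: int                # days since an arbitrary reference date
    age_years: Optional[float]
    weight_kg: Optional[float]
    sex: Optional[str]


def check_visit(v: Visit) -> List[str]:
    """Return the problems found in a single visit record."""
    problems = []
    # Missing data
    for name in ("age_years", "weight_kg", "sex"):
        if getattr(v, name) is None:
            problems.append(f"missing {name}")
    # Out-of-range quantitative values
    low, high = AGE_RANGE_YEARS
    if v.age_years is not None and not (low <= v.age_years <= high):
        problems.append("age_years out of range")
    if v.weight_kg is not None and v.weight_kg <= 0:
        problems.append("weight_kg out of range")
    # Codes outside the assumed code set
    if v.sex is not None and v.sex not in VALID_SEX_CODES:
        problems.append("invalid sex code")
    return problems


def check_weight_changes(visits: List[Visit]) -> List[Tuple[str, int, int]]:
    """Flag consecutive visits of one patient with an implausible weight change.

    Returns (patient_id, earlier_visit_day, later_visit_day) tuples.
    """
    flagged = []
    last_valid = {}  # patient_id -> most recent visit with a usable weight
    for v in sorted(visits, key=lambda r: (r.patient_id, r.visit_day)):
        if v.weight_kg is None or v.weight_kg <= 0:
            continue
        prev = last_valid.get(v.patient_id)
        if prev is not None:
            ratio = v.weight_kg / prev.weight_kg
            if ratio > MAX_WEIGHT_RATIO or ratio < 1 / MAX_WEIGHT_RATIO:
                flagged.append((v.patient_id, prev.visit_day, v.visit_day))
        last_valid[v.patient_id] = v
    return flagged
```

Checks of this sort only surface suspect records; deciding whether a flagged value is truly an error (and detecting miscoded diagnoses or procedures) would still require comparison with the source record.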
19 AHCPR has explored the feasibility of linking administrative databases for effectiveness research and urged the development of uniform messages and vocabulary standards (USDHHS, 1991). See also Aronow and Coltin (1993).

RECOMMENDATION 2.1 ACCURACY AND COMPLETENESS

To address these issues, the committee recommends that health database organizations take responsibility for assuring data quality on an ongoing basis and, in particular, take affirmative steps to ensure: (1) the completeness and accuracy of the data in the databases for which they are responsible and (2) the validity of data for analytic purposes for which they are used.

Part 2 of this recommendation applies to analyses that HDOs conduct. They cannot, of course, police the validity of data when used by others for purposes over which the HDOs have no a priori control. Until HDOs can demonstrate the quality of their data, the committee cautions that their proponents must guard against promising too much in the early years, particularly in the area of improving quality of care and conducting research on the appropriateness and effectiveness of health services. The committee returns to this point in Chapter 4 in a discussion of data protection and data integrity.

As many investigators have pointed out, the absence of sufficient clinical information in most databases today (and likely for tomorrow) is a critical limitation (Roos et al., 1989; Hannan et al., 1992; Chassin, 1993b; Krakauer and Jacoby, 1993). Efforts to acquire such information through manual abstraction of relevant information in hospital records, which is the basis of various patient classification programs (e.g., MedisGroups or HCFA's proposed Uniform Clinical Data Set), are costly and time-consuming. Some means of obtaining such information more directly from patient records will be needed.

Clinical data should be obtained, whenever practical, to validate analyses. The committee does not regard the clinical data found in medical records, whether computerized or not, as always sufficiently comprehensive, accurate, or legible to characterize them as a "gold standard," but they are a valuable, and sometimes indispensable, touchstone against which to judge the less rich administrative data on which many types of health policy and health services research are and must be based. The validity of elements in a database must be matched with the kinds of inferences that can be drawn.
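
One way to use clinical records as a touchstone is to compare, patient by patient, whether an administrative file and a chart review agree that a condition is present. The Python sketch below illustrates that comparison with standard agreement statistics (sensitivity, positive predictive value, and Cohen's kappa). It is an assumption-laden illustration rather than a procedure described in the report: the function name `agreement` and the boolean-list inputs are hypothetical, and the chart review is treated as the reference even though, as noted above, it is not a true gold standard.

```python
from typing import Dict, List, Optional


def agreement(admin: List[bool], chart: List[bool]) -> Dict[str, Optional[float]]:
    """Compare administrative flags with chart-review flags for the same patients.

    admin[i] and chart[i] say whether patient i has the condition according
    to each source. The chart review is used as the reference.
    """
    if len(admin) != len(chart) or not admin:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(admin, chart))
    tp = sum(1 for a, c in pairs if a and c)          # both sources say yes
    fp = sum(1 for a, c in pairs if a and not c)      # admin only
    fn = sum(1 for a, c in pairs if not a and c)      # chart only
    tn = sum(1 for a, c in pairs if not a and not c)  # both say no
    n = len(pairs)

    # Share of chart-confirmed cases that the administrative data captured
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    # Share of administrative cases that the chart confirmed
    ppv = tp / (tp + fp) if (tp + fp) else None

    # Cohen's kappa: agreement beyond what chance alone would produce
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected != 1 else None

    return {"sensitivity": sensitivity, "ppv": ppv, "kappa": kappa}
```

Low sensitivity or low kappa for a data element would suggest that inferences resting on that element in the administrative file should be drawn cautiously.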
The committee believes that the best method of enhancing the comprehensiveness of HDO databases and the accuracy and completeness of data elements is to move toward CPRs in which the desired variables themselves, rather than high-level abstraction and proxy coding systems, could be accessed. This committee does not wish to convey the impression that the transition to CPR systems is anything but an extraordinarily difficult task. Although the progress made in establishing a CPR Institute is laudable, much remains to be done for that organization to realize even the main objectives set forth for it in the IOM report on CPRs and CPR systems. In addition, planning efforts by the Computer Science and Telecommunications Board (a unit of the Commission on Physical Sciences, Mathematics, and Applications of the National Research Council) on the national information infrastructure and its role in health care (and health care reform) make clear that both the health care and the computer and information sciences communities have a considerable way to go even in agreeing on details about the directions that policies and

technical advances should take in addressing major issues in this critical area. In its April 1993 report to the Secretary of DHHS, the Work Group on Computerization of Patient Records supported the development of national standards for documenting and sharing patient information. It also called on the American National Standards Institute Healthcare Information Standards Planning Panel to coordinate the development, adoption, and use of national information standards for patient data definitions, codes and terminology, intersystem communication, and uniform patient, provider, and payer identifiers.

RECOMMENDATION 2.2 COMPUTER-BASED PATIENT RECORD

Accordingly, the committee recommends that health database organizations support and contribute to regional and national efforts to create computer-based patient records.

The committee acknowledges the importance of computer-based patient records with uniform standards for connectivity, terminology, and data sharing if the creation and maintenance of pooled health databases is to be efficient and their information accurate and complete. The committee urges HDOs to anticipate the development of CPRs and to contribute to the development and adoption of these standards. HDOs should take a proactive stance, by joining efforts by the CPR Institute and other organizations working to facilitate implementation of CPRs, helping in standards-setting efforts, and otherwise becoming full participants in the multidisciplinary effort that is now under way.

Summary

Much of the thrust of this report concerns how to maximize the benefits that this committee believes can be realized from the construction and operation of inclusive and comprehensive health databases. In examining these questions, the committee has focused on what it calls health database organizations.
HDOs are emerging entities of many different characteristics in states and other geographic regions of the country; the committee made two key assumptions about them: (1) HDOs have access to and possibly control considerable amounts of person-identifiable health data outside the care settings in which those data were originally generated and (2) the chief mission of HDOs is public release of data and results of studies about health care providers or other health-related topics.

The broad-based value of HDOs and their databases might be said to be the provision of reliable and valid information in a reasonably timely manner to address all the major questions in health care delivery (access, costs, quality, financing and organization, health resources and personnel, and research) facing the nation today and in the coming years. The chapter also details the narrower benefits that might accrue to a variety of potential users, including patients and their families, health care providers, purchasers and payers, employers, and many other possible clients in the public and private sectors.

In assembling the data that will go into products for all such users and uses, the committee had sobering concerns about the quality of those data. Thus, it recommends that HDOs take responsibility for assuring data quality on an ongoing basis, and in particular take affirmative steps to ensure: (1) the completeness and accuracy of the data in the databases for which they are responsible and (2) the validity of data for analytic purposes for which they are used [by HDOs] (Recommendation 2.1). The committee also recommends that HDOs support and contribute to the regional and national efforts to create CPRs and CPR systems (Recommendation 2.2).

Initially, HDOs will attempt to provide data for particular users and uses to answer particular kinds of questions. Nevertheless, advances in the creation and operation of computer-based databases, whether centralized or far-flung, can be expected in the coming years. The committee believes that thoughtful appreciation of their potential and anticipation of their potential limitations will hasten that progress. The development of HDOs (their structure, governance, and policies on disclosure as well as on protection of data) must be designed for the achievement of these long-term goals.

The next chapter takes up the major responsibilities of HDOs in carrying out a critical mission: furnishing information to the public on costs, quality, and other features of health care providers in a given region or community.
The committee adopted two strong assumptions as it began to consider this topic. The first is that considerable benefits will accrue to interested consumers and to the public at large from having access to accurate and timely information on these aspects of the health care delivery system with which they deal; this has been the thrust of the present chapter. The other assumption is that HDOs supported by public funds ought to have a stated mission of making such information available, and this will be a core element of several committee recommendations. The committee also assumes, however, that harms can arise from some uses of the information in such databases. For this reason, in the next chapter the committee considers administrative and other protections that it believes HDOs should put in place.