3


Current Data Collection Methods and Sources

Summary of Key Findings

  • There is a lack of comparable, standardized data (due in part to a lack of consistent definitions) in the measurement of health status and quality of health care for children and adolescents.
  • Many health conditions and health care processes that are important to children appear in rates/numbers that are too small to be adequately represented in survey data sets.
  • Improving linkages among administrative record systems and between those systems and population-based survey data sets would facilitate comprehensive assessment of child and adolescent health and health care quality.
  • The use and interoperability of electronic health records are expected to increase dramatically over the next 5 years, creating a robust source of data that can be readily analyzed and acted upon.

Imagine that you are driving a complex piece of machinery. You want to know the direction in which you are headed, your rate of speed, how much fuel you have, the engine temperature (and possibly the external temperature as well), and whether the engine is performing as it should. If
you are flying a plane, you want to know more details, such as your altitude and the wind speed. If you are under water, you want to know other things. The display that signals whether you are on track is derived from hundreds of intricate gauges, sensors, computer chips, and monitoring devices. Each mechanism is designed to collect certain types of performance data; these data are then compared against standard specifications, and the results are analyzed to determine whether the data are signaling a problem that requires the operator’s attention. Some gauges are large and dominate the operator’s routine field of vision; others are more peripheral and show alerts only when significant problems arise.

The above analogy is useful in considering the monitoring systems that are used in determining the quality of child and adolescent health and health care services. The clinician examines an individual child and collects data from numerous sources (temperature, heart rhythm, height, weight, sleeping and eating habits, and so forth) before concluding whether the child is “healthy” or requires attention for some specific reason. In much the same way, health professionals and policy makers examine data from a variety of population surveys and administrative data sets in making judgments about the health and health care of children and adolescents. Yet the data system used to measure the quality of child and adolescent health and health care services is not as finely developed as the instrumentation in the above analogy or the collection of clinical data. Indeed, it may be inappropriate even to refer to the existing data sets on child health and health care services as a “system,” since these data sets consist of multiple, independent efforts that are largely uncoordinated and unrelated to each other.
In many cases, data sets were designed for specific objectives without regard to how they fit within the larger landscape of child health measures. Furthermore, child and adolescent health data sets are not harmonized or coordinated with efforts that collect data about other aspects of development, education, or family and social contexts. The result is a tremendous wealth of data about many different specific dimensions of child and adolescent health and well-being, significant gaps with respect to important areas of health and selected populations, and the absence of an analytic framework that can provide routine guidance for general or even specific areas of concern.

The remainder of this chapter begins with a brief review of current methods used to collect data on health and health care. It then describes existing sources of these data for children and adolescents. Next, the chapter examines the limitations of these data sources. The final section argues for the need for a coordinated approach to integrate measures of child and adolescent health and health care quality.

DATA COLLECTION METHODS

Methods used to collect data on health and health care can be characterized by the following features:

  • Sample versus census: Some data are collected for the entire population to which they apply; such data are sometimes referred to as census data. One example is the actual decennial census, which aims to obtain counts by geographic location and basic demographic characteristics for the entire resident population of the United States. However, the term census may be used to refer to any data collection aimed at collecting data for every unit in the population of interest (i.e., a subset of a larger population of emphasis). Conversely, many data cannot be collected for the entire population without excessive cost and/or a burden on respondents. Instead, the data are collected from a subset of the population, or a sample, that is selected (usually by randomization) in a way that makes it representative of the entire population; thus, estimates can be calculated from the sample that approximate those for the entire population.
  • Based on administrative records versus respondents: Some data are extracted from records that already exist because they are necessary for the administration of a program or intervention. Examples are government records (tax files, social security and Medicaid enrollment, school enrollment, accident reports), commercial records (health plan enrollment files, medical claims), and medical records (from physicians’ offices, hospitals, and other providers of health care). Other data are collected directly from respondents, for example, by interviewing individuals about their experiences. The line between the two may not be entirely distinct; for example, a physician might be asked to provide data derived from the medical records she uses in her practice; thus the data collection is respondent based, but the data are ultimately derived from administrative records. In the case of children, most respondent-based data are collected from proxy respondents (e.g., parents and caregivers). A third category to consider is that pertaining to clinical data, such as observational studies.
  • Population- versus service-based: Some data collection efforts focus on a general population defined only by broad demographic characteristics, such as all children under age 6 or all adolescent girls. (Note that population-based in this sense could encompass data collection using sampling, and thus is unrelated to census data collection from an entire population.) Other data collection efforts in health and health care operate only through specific sites or administrators of services, such as health plans or clinics; such service-based data collection can cover only subpopulations defined by their attachment to the service providers.

TABLE 3-1 Data Collection Methods

Population-based
  Administrative records
    Census: Vital statistics
    Sample: Some components of Medical Expenditure Panel Survey (MEPS) cost data; national samples of discharge abstracts, etc.
  Respondents
    Census: Decennial census
    Sample: Most national surveys (e.g., Behavioral Risk Factor Surveillance System [BRFSS], MEPS, National Health Interview Survey [NHIS], National Immunization Survey [NIS], National Survey of Family Growth [NSFG], Pregnancy Risk Assessment Monitoring System [PRAMS])

Service-based
  Administrative records
    Census: Some Healthcare Effectiveness Data and Information Set (HEDIS) measures (those available in plan billing records)
    Sample: Some HEDIS measures (those requiring medical record review)
  Respondents
    Census: Health plan collection of race/ethnicity data
    Sample: Consumer Assessment of Healthcare Providers and Systems (CAHPS) measures

SOURCE: Committee on Pediatric Health and Health Care Quality Measures.

While the above three features (summarized in Table 3-1) are not unrelated in practice, they are nonetheless conceptually and practically distinct. Two examples follow:

  • Census and administrative records: Given the costs and burden of respondent-based data collection, census (100 percent) data collection for a specific population is almost always limited to administrative records that can be accessed inexpensively and efficiently. However, not every data collection from administrative records is a census; cost, access, or confidentiality issues may necessitate use of a sample of records.
  • Respondent-based and population-based: For some data needs, the relevant administrative records are service based. To obtain general population coverage, either records must be consolidated across providers or a respondent-based collection must be conducted. However, many respondent-based data collections are aimed only at coverage of a set of service providers, not a general population.

It should also be noted that none of these distinctions bears a perfect relationship to the distinction between health and health care data. Compared with health care data, health data tend more often to be population based (at least in objective) and respondent based; however, many examples of health care data are population or respondent based, while many examples of health data are based on administrative records or service based. Furthermore, the same data on health might be regarded as a population measure or as a measure of quality (through sentinel care processes) for a health care provider, depending on how they are collected and reported. For example, immunization rates are both a population measure and a measure of system performance.

Assessment of child and adolescent health and health care quality relies on data collected through a variety of the methods discussed above and from a variety of sources. Sources may include primary or secondary sources, surveys or registries, and voluntary or required reports.
They may include parents or health care providers, as well as older children and adolescents who self-report their own data. Surveys may be conducted by telephone or through interviews with children and their families in health care or other service settings. Some surveys may involve a review of health records in providers’ offices or claims records submitted to public or private health plans. Surveys may be conducted at one point in time, or they may recur annually or over other time periods. The reporting source may change over different time periods, or the same population may be surveyed or interviewed on multiple occasions. Data may be retrospective, based on respondents’ recall of certain events or conditions, or prospective, which involves collecting data at multiple intervals over time to monitor changes in health characteristics. Surveys may be administered to a universal or randomized sample of children on a national, state, or local basis; or they may focus on selected populations, such as underserved children, children with special health care needs, or children with specific demographic characteristics. Registries are another common source for data on health and health care, especially when a specific procedure (such as immunization) can be recorded electronically in a central data collection site.

The consistency and rigor of the measurement method are directly associated with the quality of the data collected. In examining child and adolescent health and health care, therefore, it is important to know details about the sampling strategy, data collection method, and reporting source associated with surveys or reports.

EXISTING DATA SOURCES

The federal government supports numerous surveys and information systems that collect data about selected aspects of child and adolescent health and health services. Prior studies have reviewed many of these data sets, often with detailed analyses of their sampling strategy, periodicity, and specific data components (IOM and NRC, 2004; NRC, 1998, 2010; NRC and IOM, 1995).

Federal Population Health Data Sets

The committee developed Appendix F, a table briefly describing the major population health data sets that include information about child and adolescent health and health care services.
In developing this table, the committee examined the following sources:

  • Children’s Health, the Nation’s Wealth: Assessing and Improving Child Health (IOM and NRC, 2004), which identifies 30 federal data sets used for measuring children’s health and relevant influences and includes a gap analysis of specific measures for 12 of these data sets;
  • data sets reviewed by the Federal Interagency Forum on Child and Family Statistics, which produces the annual America’s Children reports (FIFCFS, 2010a);
  • the Directory of Health and Human Services Data Resources, prepared by the Department of Health and Human Services’ (HHS’) Data Council (HHS, 2003);
  • a list of federal data sets and repositories available on the research portal of the National Information Center on Health Services Research and Health Care Technology (NICHSR) at the National Institutes of Health (NIH, 2010a);
  • three research papers examining selected federal data sets for children, youth, and families (Hogan and Msall, 2008; NRC and IOM, 1995; Stagner and Zweigl, 2007);
  • a review of longitudinal data sets compiled during the planning for the National Children’s Study (The Lewin Group, 2000); and
  • a list compiled by the Agency for Healthcare Research and Quality’s (AHRQ’s) Data and Surveys web site (AHRQ, 2010a).

This inventory includes surveys of health and health care services administered for children and adolescents (aged 0–18) within the past 20 years (beginning in 1990). Data sources for these surveys include information provided by children, adolescents, parents, caregivers, and health care providers. Some surveys involve reviewing health records. Only surveys administered within the United States to sample sizes greater than 1,000 are included in the above list.

The largest number of population health surveys, registries, and studies are administered by HHS. Other federal agencies collect child health data as part of their administration of information systems for other purposes, such as environmental quality (Environmental Protection Agency), education (U.S. Department of Education), or occupational injuries (U.S. Department of Labor). In addition, some federal agencies collect data on health influences, such as poverty (Census Bureau), housing and homelessness (U.S. Department of Housing and Urban Development), and motor vehicle safety (U.S. Department of Transportation).

Longitudinal Studies of Children and Youth

In addition to data systems administered directly by federal agencies (or their contractors), federal funds have supported hundreds of longitudinal studies examining selected aspects of child health, frequently focusing on small populations that are followed intensely over several years or even decades.
No central source exists that can catalogue the information gleaned from these longitudinal studies, although many of these studies have been described in earlier reports (NRC, 1998). One example of a longitudinal study is the National Children’s Study (NCS), launched in January 2009. The NCS is the largest long-term study of environmental and genetic effects on children’s health conducted in the United States. A nationally representative probability sample of 100,000 births will be followed from before birth to age 21. Data will be collected on multiple exposures and multiple outcomes using repeated measures over time (NIH, 2010c).

Other longitudinal studies include the National Longitudinal Study of Adolescent Health (Add Health) and the Great Smoky Mountains Study (GSMS). Add Health, which began in 1994, examines how social contexts (such as families, friends, peers, schools, neighborhoods, and communities) influence adolescents’ health and risk behaviors (NICHD, 2007). The GSMS, a population-based community survey of children and adolescents in North Carolina, estimates the number of youth with emotional and behavioral disorders, the persistence of those disorders over time, the need for and use of services for those disorders, and the possible risk factors for developing them (Costello et al., 1996) (see Appendix F for additional information on selected longitudinal studies of children and adolescents).

Administrative Data Sources

In addition to the population health and longitudinal studies described above, data on child health and health care services can be derived from service-based records. These data sets include those prepared for administrative purposes, such as vital statistics (birth and death records), medical records, health plan payments, and quality measures. They also include surveys of populations from selected service settings, such as children or youth who are enrolled in specific health plans (e.g., Medicaid or CHIP), children who are hospitalized, or children who are identified in cases of abuse and neglect.

The committee identified and catalogued these service-based data sets by reviewing the sources on population health described above and drawing on a commissioned background paper (MacTaggart, 2010). Appendix F provides a listing of the individual data sets derived from service-based studies, which include, for example, Healthcare Effectiveness Data and Information Set (HEDIS) measures, National Committee for Quality Assurance (NCQA) measures, and hospital administrative data.
LIMITATIONS OF EXISTING DATA SOURCES

Estimates of the scope and severity of certain health conditions are sometimes derived from service-based information sources rather than general population surveys. Existing data sources have a number of limitations related to standardization, data collection, the ability to capture disparities, case mix adjustment, and data aggregation methods.

Standardization

There is no lack of standards; rather, there are multiple standards that are competing and conflicting in nature. The same is true of existing quality performance measures. A range of such measures exist for children and adolescents, and the administrative requirements for their collection vary with respect to which measures are collected, the sources of the data (based on administrative records or respondents or a mix of the two), validation of the data sources, and the reporting period. The lack of comparable, standardized data has limited the ability to develop benchmarks from national or state sources.

Interstate issues are significant as a result of variations in state reporting requirements, state information technology (IT) infrastructure capacity and specifications, state collection methods, cross-state access to data, and the way various parameters are defined. For instance, the definition of “fully” immunized and the components of a newborn screening can vary by state; therefore, the data elements that are collected and tracked may vary and not be comparable (Ferris et al., 2001). Data are more likely to be equivalent if claims data are used as the source and the services are provided in the same setting; however, the conversion from the ninth to the tenth edition of the International Classification of Diseases (ICD-9 to ICD-10) in the coming years will require additional scrutiny to ensure continued comparability.

One of the greatest challenges is standardizing the definition of children. For Medicaid early and periodic screening, diagnosis, and treatment (EPSDT), a child is defined as up to age 21. For the Children’s Health Insurance Program (CHIP), a child is defined as up to age 19. For the Consumer Assessment of Healthcare Providers and Systems (CAHPS) (Berdahl et al., 2010), a child is defined as age 17 or younger. And the Federal Interagency Forum on Child and Family Statistics (FIFCFS) of the National Center for Health Statistics defines teens as those aged 12–17 (FIFCFS, 2010a). Family structure likewise is not standardized across funding mechanisms and time. Other problems occur in attempting to compare similar health issues across data sets.
These problems illustrate both the advantages and difficulties of attempting to standardize definitions and data collection methods. For example, Bethell and colleagues’ (2002) characterization of good health raises concern about how the information is obtained. Many national surveys have converged on using a single question on how the individual rates his/her own health or parents rate their child’s health along a spectrum of excellent, very good, good, fair, or poor (Anderson et al., 2001; Andresen et al., 2003; Hennessey et al., 1994; NCHS, 1973; Roghmann and Pless, 1993). Such convergence allows for comparison over time and across age groups. However, little variation in the responses is seen, and the measure is insensitive to fairly major differences in health. A more nuanced measure that captures more dimensions of perceived health status would be useful, but its use might sacrifice the value of comparability. Addressing such issues would require ongoing methodological work on assessing and refining measures and establishing comparability over time, as is done with changes in the ICD (Anderson et al., 2001).

Likewise, the Maternal and Child Health Bureau has developed a short screener to identify children with special health care needs (Bethell et al., 2002). While ensuring comparable ascertainment across populations, the use of this instrument hinders comparisons with data sets that rely on diagnoses. Standardized measures of child health and the quality of relevant health care are also important for all child health problems, but especially for those children with preventable, ongoing, or serious health conditions (Kuhlthau et al., 2002).

Child health problems include a large number of relatively rare conditions (see Chapter 4). Moreover, the implications of the existence of a health condition may vary with child development (IOM and NRC, 2004). Thus, an early sign of a health problem may be slower rates of physical growth, but later implications may include poorer school achievement, perhaps due to repeated absences (Byrd and Weitzman, 1994; Weitzman et al., 1982), and may be associated with behavioral issues that may further impede school success (Gortmaker et al., 1990). In addition, conditions may vary in severity across different children and over time and have implications for adult health.

Criteria for the design of health measures are identified in Children’s Health, the Nation’s Wealth (IOM and NRC, 2004, p. 43):

  • importance to current and future health,
  • reliability and validity,
  • meaning in terms of the special aspects of child health and development,
  • cultural appropriateness,
  • sensitivity to change, and
  • feasibility of collection.

Inherent in these criteria is the challenge of a measurement system that speaks to the various parties engaged in improving the health of children. Diagnoses (ICD codes), for example, may be meaningful to health care providers but less so to parents, who, in turn, may be concerned about functional implications, including management strategies. Both types of information may be critical to the development of an education plan for special education students.
Data Collection

The use of administrative data to assess child health and health care quality is limited to some extent to certain dimensions of quality, such as access and some process measures. The combining of medical records and claims data through the development and operation of electronic health record (EHR) systems and electronic health information exchange (e-HIE) will appreciably reduce this limitation. The evolution to ICD-10 coding will also expand the value of claims data. Data linkages resulting from Medicaid Transformation Grant initiatives, Children’s Health Insurance Program Reauthorization Act (CHIPRA) provisions, and American Recovery and Reinvestment Act (ARRA) funding are providing critical data elements. For example, the opportunity to collect some measures more efficiently is enhanced through the linkage of Medicaid with vital statistics, state laboratories, and registries. In addition, the availability of web-based interfaces expands options for the collection and transmission of data.

Given that the cost of quality oversight and performance measurement reporting is a cost to public and private purchasers and providers, the fiscal impact as well as efficiency of using standardized, formatted data through an ongoing infrastructure is considerable. However, the realization of these benefits assumes that the data are collected and documented at the site of care, which is not always the case. Also assumed is that the individual is identifiable. A current issue is that Medicaid requires coverage of newborns under their mother’s identification until their own eligibility can be established, which may take up to a year. Data coded to a mother’s identification may or may not be tracked back to the newborn when the child becomes individually enrolled.

Another factor that can potentially affect the data collected is a change in payment methods. For example, while there is significant interest in episode-of-care payment methods, there is a risk that some of the previous detailed claims data may be lost. A lesson learned from the transition from individual to bundled payments for prenatal visits and delivery was that the requirement to collect and track the number of prenatal visits through administrative data no longer existed.

Identification and Monitoring of Disparities

As discussed in Chapter 2, it is crucial to identify and monitor health and health care equity issues among children and adolescents.
Racial/ethnic and linguistic disparities in children’s health and health care cannot be identified, tracked, addressed, or eliminated without consistent collection of race/ethnicity and language data on all patients (Flores, 2009). Yet, one-third of all health plan enrollees (28.7 million individuals) are covered by plans that collect no race/ethnicity data (AHIP and RWJF, 2006). A national survey of 272 hospitals found that only 39 percent collected data on patients’ primary language (Hasnain-Wynia et al., 2004), and no information is available on what proportions of hospitals or health plans collect data on English proficiency. Parental limited English proficiency (defined by the U.S. Census Bureau [Shin and Kominski, 2010] as the self-rated ability to speak English less than “very well”) has been shown to be superior to primary language spoken at home as a measure of the impact of language barriers on children’s health and health care (Flores et al., 2005a).

Although the Office of Management and Budget (OMB) requires highly

  • Restriction to homogeneous populations: Some measures can be made comparable by restriction to a homogeneous population. For example, childhood immunizations typically run on strict age-based schedules and are appropriate for essentially all children in the age window; hence the measure can be calculated from a specific age group, and no age adjustment is needed. One can then compare immunization rates in different states at that single age.
  • Stratified reporting: There might be several groups of interest for a measure, each of which is homogeneous. For example, one might be interested in immunization rates across a range of ages, but recognize that younger children are more likely than older ones to have immunizations complete. A simple comparison of childhood immunization rates across states could be confounded if one state has a higher proportion of young children. Instead, one might stratify reporting by age, that is, prepare a separate measure for each of several nearly homogeneous age groups. Unconfounded comparisons could then be made for each stratum.
  • Direct standardization: Stratified reporting might be impractical for any of at least three reasons: (1) there might be insufficient data with which to calculate measures for each of the relevant strata with adequate precision for stratified reporting; (2) stratified reports might provide more detail than is desired (for example, comparing 51 states in 10 age strata involves cognitively processing 510 measures, obscuring overall state differences); and (3) when a control variable has many levels or several control variables must be considered at once, the number of strata can become very large, exacerbating both of the previous problems. A set of stratified measures can be consolidated into a simpler single measure by combining measures across strata with fixed weights corresponding to some reference population. To develop a single immunization measure for comparison of states, for example, one might combine immunization rates by year of age with weights based on the national age distribution. Then no state would receive a higher score simply because it had a larger proportion of young children.
  • Model-based standardization: Direct standardization may fail when the number of observations per cell is small or zero. Model-based (regression) standardization is a generalization that can be more robust against such problems (Little, 1982). Regression standardization can accommodate simultaneous adjustment for multiple variables. A variety of models are appropriate for use with different kinds of data.
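
The contrast between an unadjusted comparison and direct standardization, as described in the list above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the committee's method: the two states, the two age strata, the rates, the counts, and the reference weights are hypothetical values invented for the example, and the function names are introduced here only for illustration.

```python
from typing import Dict

# Hypothetical age-specific immunization rates, assumed identical in both states.
RATES: Dict[str, float] = {"younger": 0.90, "older": 0.70}

# Hypothetical numbers of children in each age stratum, by state.
STATE_A_COUNTS: Dict[str, int] = {"younger": 600, "older": 400}
STATE_B_COUNTS: Dict[str, int] = {"younger": 300, "older": 700}

# Hypothetical reference (for example, national) age distribution used as fixed weights.
REFERENCE_WEIGHTS: Dict[str, float] = {"younger": 0.5, "older": 0.5}


def _weighted_mean(rates: Dict[str, float], weights: Dict[str, float]) -> float:
    """Average stratum rates using the given stratum weights."""
    if set(rates) != set(weights):
        raise ValueError("rates and weights must cover the same strata")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(rates[s] * weights[s] for s in rates) / total


def crude_rate(rates: Dict[str, float], counts: Dict[str, int]) -> float:
    """Unadjusted rate: strata are weighted by the state's own age mix."""
    return _weighted_mean(rates, {s: float(n) for s, n in counts.items()})


def directly_standardized_rate(rates: Dict[str, float],
                               reference_weights: Dict[str, float]) -> float:
    """Standardized rate: strata are weighted by a fixed reference population."""
    return _weighted_mean(rates, reference_weights)


if __name__ == "__main__":
    print("State A crude:", round(crude_rate(RATES, STATE_A_COUNTS), 2))
    print("State B crude:", round(crude_rate(RATES, STATE_B_COUNTS), 2))
    print("Standardized (either state):",
          round(directly_standardized_rate(RATES, REFERENCE_WEIGHTS), 2))
```

With these made-up inputs, the crude rates differ (roughly 0.82 versus 0.76) only because the first state has a larger share of young children, while the standardized rate is the same for both states (about 0.80) because their age-specific rates are identical. The sketch does not cover model-based standardization, which would replace the stratum rates with predictions from a fitted regression model.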

Given the existence of technical methods for implementing case mix adjustment in a variety of settings, the key scientific or policy question is which variables to adjust for in reporting any particular comparison. Since case mix adjustment is a method of removing extraneous compositional effects from a comparison, the key is to figure out which effects are extraneous for a given purpose and which are of interest. For example, it is common to adjust for severity of illness and comorbidities when using outcome measures to evaluate the quality of care provided by hospitals. Without such adjustment, hospitals that treat more severely ill patients might be rated as worse than those of similar quality that treat mildly ill patients. Similarly, when evaluation is based on a measure of process, it is appropriate to adjust for patient variables associated with either the degree of appropriateness of the process or the difficulty of applying it.

To consider a slightly more complex example, one might be interested in unadjusted rates of severe emotional distress (SED) if one simply wanted to determine how to distribute funds for mental health services across schools. If one wanted to compare schools on their psychological climates, one might want to adjust for age distributions (if age is a predictor of a determination of SED). If one wanted to evaluate schools on how well they (and their associated support systems) help children cope with stressors that tend to engender SED, one might further adjust for known stressors such as family poverty or instability.

While adjusting for age is rarely controversial, adjusting for socioeconomic or race/ethnicity variables raises more subtle issues. Suppose, for example, that low-income patients with a certain condition at each hospital are less likely than upper-income patients at the same hospital to obtain a service equally needed by both.
Without adjustment, of two hospitals that perform identically on a measure of this service, the one with a greater proportion of low-income patients would receive a worse quality score. By the logic of the previous examples, adjustment for patient composition by income group might be considered. It has been argued that such adjustment obscures and excuses inferior performance for disadvantaged (low-income, in this case) patients (Romano, 2000). On the other hand, by hypothesis in this example and perhaps empirically in many cases, inferior performance for low-income patients is a systemwide failure, not just a failure of the hospitals that see many such patients. Such a systemwide failure might arise, for example, from a lack of insurance coverage for needed medications, a lack of resources required to enable less educated patients to master complex treatment regimens, or unconscious discrimination against such patients. Indeed, such a pattern of inferior treatment within each hospital is not discernible in unadjusted hospital-level reports, which combine income groups. (If some hospitals serving many low-income patients have

generally inferior performance—that is, for each income group—this could be observed in either adjusted or unadjusted reports.) Reports stratified by income for each hospital would reveal the pattern, albeit only after further analysis, and become subject to the disadvantages discussed above. In fact, the pattern would be revealed most explicitly in the coefficients of the case mix regression model, which summarize the within-hospital differences in a single number (Zaslavsky, 2001). The point here is that hospital (or other unit-specific) reports are good for some purposes but are best examined in conjunction with analysis of more general patterns.

Another controversy concerns the applicability of case mix adjustment in assessment of racial/ethnic health and health care disparities. It is logical to age- and sex-adjust intergroup comparisons of health, and similarly to adjust comparisons of health care for clinical characteristics affecting need and outcome. However, the IOM report Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care (2003a) argues that it is not appropriate to adjust for socioeconomic measures (that is, remove their effects) in such comparisons since worse socioeconomic status is one of the aspects of disadvantage imposed on disadvantaged racial/ethnic groups and a mediator of effects on health, treatment, and outcomes. Others have argued for adjustment for socioeconomic variables, thus more or less explicitly taking a much narrower view of what counts as a disparity that excludes effects mediated through socioeconomic differences between groups at variance with the IOM-endorsed definitions (Satel and Klick, 2006). This controversy illustrates how important scientific and normative principles may arise in case mix adjustment.

Data Aggregation Methods

Any analysis of data used to measure health or health care quality requires aggregation of the data.
These data may be collected with the primary goal of measurement, using any combination of tools and design approaches as described previously; in this case, the time-consuming and expensive process of data collection for measurement must be balanced against the rigor with which these data can be collected. In many cases, secondary data, such as those collected for clinical, billing, research, or other purposes, may be used to assess health or health care quality. These data are often less well validated and may contain errors or formats that compromise data analysis; for some data types in some populations, however, secondary data are the only accessible source of the needed information. In either case, IT often plays an important role. Databases, medical data registries, and clinical health information technology (HIT) are three common approaches to data aggregation and reuse.

Databases, defined as a structured collection of organized, retrievable,

and (typically) machine-readable information (Frawley et al., 1992), are a common tool for assembling data before conducting analyses. Database software is specifically designed to support the storage, manipulation, and retrieval of data, and is a critical tool for the biostatistician dealing with large data sets. One of the key features of databases is the ability to define relationships among data elements. For example, databases allow billing system data that include provider identifiers and sites of care to be combined with survey data that may include a provider name. These two collections of data can be combined because the provider name and date of visit may match the provider name and date of completion in the survey. This relationship allows the site of care to be linked to the survey, thereby supporting a variety of analyses that compare some measure across sites of care.

Medical data registries are a specialized type of database designed to contain data collected in the course of caring for a specific patient population (Drolet and Johnson, 2008). Because the goal of medical data registries is often to support secondary data analysis, they feature well-characterized data collection methods and carefully constructed data fields that rely on controlled terminologies to support the aggregation of data in ways not always defined a priori. Medical data registries also characteristically support longitudinal data collection (i.e., the collection of data on a particular patient over time), as well as cross-sectional data collection (e.g., survey results on functional status after hip replacement in clinics across the country).
Finally, the use of a medical data registry implies attention not only to the quality of the data, but also to the rigorous policies of human subjects assurance, the Health Insurance Portability and Accountability Act (HIPAA), and internationally sanctioned approaches to privacy and security.

Clinical HIT has received significant attention because of its potential impact on quality and safety (IOM, 1999). EHR and, more recently, personal health record (PHR) systems are primary data sources that provide a rich source of information about health and health care quality. These systems promote the collection of comprehensive, patient-specific data on active medications, allergies, medical diagnoses, encounter summaries, referrals, and laboratory tests, as well as other longitudinal data. As utilization of EHRs and PHRs continues to grow, they will provide an important opportunity to integrate data across specialty care, such as care for mental health and substance use disorders.

In addition to the above three approaches, the adoption of controlled terminologies, such as the Systematized Nomenclature of Medicine (SNOMED) or the ICD, together with relatively structured formats for encounter summaries or document types, makes it possible to aggregate data across patients, sites of care, and even entire regions, as demonstrated

by numerous health information exchange demonstration projects around the United States (Denny et al., 2009; Doan et al., 2010). These systems may catalyze the formulation of new health and health care quality measures and may radically lower the implementation cost of measurement. Moreover, through the use of algorithmic approaches to data analysis, researchers are beginning to demonstrate near-real-time feedback of quality measures to providers at the point of care (Roberts et al., 2009; Starmer and Giuse, 2008; Starmer and Waitman, 2006; Zaydfudim et al., 2009). Unfortunately, as of 2008, fewer than 20 percent of providers were using a comprehensive EHR in their practice (DesRoches et al., 2008). Similarly, demonstration projects of e-HIE have achieved usage for under 20 percent of encounters (Johnson et al., 2008; Vest, 2009), although with recent federal incentives, the adoption of both EHRs and e-HIE is expected to increase dramatically over the next 5 years.

The promise of these technologies suggests that measurement researchers should modify validated measures to support them and investigate how best to integrate efforts to collect valid and reliable data with available populationwide data samples that may be of lower quality. Furthermore, issues surrounding privacy and access to state-based Medicaid data continue to underscore challenges in EHR and e-HIE implementation.

While the issues of privacy and confidentiality are of critical concern, detailed discussion of these issues is beyond the scope of the report. (For a more comprehensive discussion of privacy and confidentiality issues, see Engaging Privacy and Information Technology in a Digital Age [NRC, 2007] and Beyond the HIPAA Privacy Rule: Enhancing Privacy, Improving Health Through Research [IOM, 2009b].) HIPAA and the regulations that followed protect personal health information held by third parties and give patients an array of rights.
They also established a range of administrative, physical, and technical safeguards to ensure the confidentiality, integrity, and availability of electronic health information. HIPAA was followed by the Patient Safety and Quality Improvement Act of 2005 (PSQIA), which established a voluntary reporting system to resolve patient safety and health care quality issues: “To encourage the reporting and analysis of medical errors, PSQIA provides Federal privilege and confidentiality protections for patient safety information called patient safety work product. Patient safety work product includes information collected and created during the reporting and analysis of patient safety events” (HHS, 2011a).

Both of these pieces of legislation represent the policy consensus and technical capabilities at the time they were enacted. It is unlikely that new legislation will be enacted in the near future to refine and update this policy consensus and incorporate technical advances. In the meantime, well-designed systems that produce robust data with strong privacy protection

will be able not only to meet the needs and protections encompassed by these two pieces of legislation, but also to self-adjust to adapt to the needs and challenges of the future.

At present, privacy protections can conflict with attempts at data aggregation. The adolescent population poses special data collection issues, particularly with regard to privacy and security concerns, as confidentiality is known to be a significant and necessary component when interviewing adolescents. Conflicts also exist at the state and local levels with respect to accessing Medicaid and vital statistics data; there is marked variation in the way states have interpreted recent guidance from the Centers for Medicare and Medicaid Services (CMS) regarding access to and the availability of Medicaid data. Successful future efforts to conduct cross-state quality measurement will require specific guidance from CMS to the states regarding the priority associated with these efforts. Although necessary safeguards for patient confidentiality are essential, they need not preclude the ability to develop and utilize analytic methods to conduct both cross-sectional and longitudinal comparisons among states. The failure of CMS to facilitate the comfort of states in providing limited yet essential access to Medicaid data would restrict the ability to perform quality measurement across the nation for this important patient population.

Illustrative Examples

This section presents two illustrative examples of the challenges discussed above: an assessment of a state-based demonstration program and measurement of health insurance coverage.

Hypothetical State-Based Demonstration Program

The first example is a hypothetical state-based demonstration program designed to examine the effect of changes in insurance coverage strategies aimed at reducing preventable hospitalizations and hospital costs among low-income children.
To conduct such an assessment would require data on the details of insurance coverage; on the details of hospitalizations; and on personal characteristics of each child’s family, notably income, by state. The Medical Expenditure Panel Survey (MEPS) is carried out by interviewing parents of a nationally representative sample of children about their children’s health and health care use (AHRQ, 2010b), the parents’ employers about insurance benefits, and health care providers about the children’s use of services and charges. Thus, this data set would appear to contain all the necessary data. In 2006, however, the sample included only 12,609 individuals younger than 24, slightly fewer than half of whom were from low-income families. Moreover, hospitalization is a relatively infrequent

event for children: only 6.5 percent of children younger than 5 and 1.5 percent of those aged 5–17 have any hospital expenditures. With such small samples, further winnowing by specific diagnoses (e.g., those preventable), by subgroups of interest (e.g., by race/ethnicity or type of insurance coverage), and by state would preclude stable or meaningful estimates.

Two state-based data systems might prove more useful. The Kids’ Inpatient Database (KID) contains data on all admissions for those younger than 20 from 38 states in the most recent compilation (HCUP, 2006). Data elements include primary and secondary diagnoses and procedures, admission and discharge status, demographic information such as age and gender, hospital characteristics, length of stay and charges, and expected source of payment on 2–3 million discharges per year. While providing a substantial window on hospital use by children, however, this data set has significant limitations. Among these is the characterization of socioeconomic status, as the income data reflect the median income of the zip code of the hospital, not the income of the child’s family, and the insurance data (expected source of payment) may not be for the final payer. In addition, the data set does not permit linkage of multiple hospitalizations for the same child, nor does it provide much information on the events before and after hospitalization. Even with substantial numbers of events, quality indicators designed to parallel those used for adults may not occur in sufficient numbers to yield information on safety (Scanlon et al., 2008) or to support stratification by important covariates such as race/ethnicity, income, or insurance status (Berdahl et al., 2010).
Other state-based assessments of child health can be obtained from the series of surveys funded by the Maternal and Child Health Bureau on general child health (NCHS, 2009c) and the health experience of children with special health care needs (NCHS, 2009b) based on the State and Local Area Integrated Telephone Survey (NCHS, 2009a). These surveys are designed to provide robust samples for analysis at the state level and a wealth of data on health conditions and functional status, insurance coverage, use of medical care and other services, and individual family health behaviors for children generally and for the more vulnerable subgroup of those with special needs. As with the MEPS, however, the data come from parent reports and may be limited on any one issue because of the breadth of the topics covered. Unlike the MEPS, moreover, these surveys include no longitudinal component, so that assessing changes in health status or use of care is not possible. For the purposes of assessment of a hypothetical state-based demonstration program, virtually no data on costs of care are available except for out-of-pocket costs for families with children with special needs. Thus, each of these data sets might provide some insight, but none would be sufficient to support a comprehensive assessment.
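The winnowing problem described above for the MEPS can be illustrated with rough arithmetic. The sketch below starts from the 2006 sample size and the hospitalization rate for ages 5 to 17 quoted in the text; the share of the sample in that age range and the equal split across 50 states are assumptions made only for illustration, as are the variable and function names.

```python
# Rough cell-size arithmetic for a survey sample. A sketch, not an estimate.

MEPS_2006_SAMPLE = 12_609          # sample size quoted in the text
LOW_INCOME_SHARE = 0.5             # text says "slightly fewer than half"; 0.5 is an upper bound
HOSPITAL_SHARE_AGES_5_17 = 0.015   # share with any hospital expenditures (from the text)

SHARE_AGED_5_17 = 0.6              # ASSUMPTION: not given in the text
STATE_SHARE = 1 / 50               # ASSUMPTION: sample spread evenly across 50 states


def expected_count(sample_size, *fractions):
    """Multiply a starting sample by successive subgroup fractions."""
    count = float(sample_size)
    for fraction in fractions:
        count *= fraction
    return count


national_cell = expected_count(
    MEPS_2006_SAMPLE, LOW_INCOME_SHARE, SHARE_AGED_5_17, HOSPITAL_SHARE_AGES_5_17
)
state_cell = expected_count(national_cell, STATE_SHARE)

print(f"low-income, aged 5-17, hospitalized (national): about {national_cell:.0f}")
print(f"same cell in an average state: about {state_cell:.1f}")
```

Under these assumptions the national cell holds only a few dozen children and an average state's cell holds roughly one, before any further restriction to preventable diagnoses or to race/ethnicity subgroups.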

Measurement of Health Insurance Coverage

Another example of the limitations imposed by the fragmentation of current data collection systems is measurement of health insurance coverage. Currently, there is no agreement on the number of children who are uninsured (CBO, 2003; Kenney et al., 2006; SHADAC and RWJF, 2009). Confusion as to the number of uninsured children arises in part because a range of different insurance concepts are relevant, in part because there is no proven method for collecting health insurance information, and in part because multiple surveys produce coverage estimates for children on an annual basis.

A number of different insurance coverage concepts exist—for example, the number of children who are uninsured at a particular point in time, the number of children who have been insured for a year or longer, the number of children who experienced short periods (less than 12 months) without coverage in a 12-month period, and the average number of children who are uninsured over a particular period in time. A priori, one would expect the number of uninsured children to depend on the particular concept: the number of children who are uninsured for a full year is expected to be smaller than the number of children who are uninsured at a particular point in time, which in turn is expected to be smaller than the number of children who experienced any period without coverage in a given year. Indeed, according to one source, which includes measures of two different insurance concepts, the number of children who are uninsured at a particular point in time is 1.6 times larger than the number of those who are uninsured for a full year (Davern et al., 2009; Klerman et al., 2009). Each of the different insurance concepts provides valuable information about the nature of the coverage problem facing children.
In particular, estimates of the number of children who are uninsured at a particular point in time are useful for budgeting purposes (Orszag, 2007). For example, when Medicaid and CHIP programs assess how eligibility expansions could affect program enrollment and spending, they rely on estimates of how many children are uninsured in the targeted income group. Similarly, knowing how many children are uninsured for a full year or longer provides important information on the extent to which uninsurance is a chronic problem for children, whereas knowing how many children experience short bouts of uninsurance could provide key insights about program operations related to churning (how individuals move back and forth between having and not having insurance) and retention (Tang et al., 2003).

Since there is no proven method for accurately measuring a given insurance concept, moreover, each survey’s approach to measuring the uninsured differs along a number of dimensions that likely affect the estimated number of uninsured children. In particular, surveys differ in the wording

of the insurance questions they include, the names used to designate different Medicaid and CHIP programs, the order of the questions, whether the insurance questions pertain to a specific child or to multiple individuals in the family, who is providing information on the insurance coverage of a particular child, what survey mode is used to collect the data (e.g., mail, telephone, in person), whether the survey is cross-sectional or longitudinal (which likely affects duration-dependent concepts such as the number of children who have lacked insurance coverage for a full year), how missing data on coverage are handled, how a response that requires some interpretation is coded (e.g., when respondents reply that they have both private coverage and Medicaid), and whether an explicit attempt is made to adjust for what appears to be a systematic underreporting of Medicaid and CHIP coverage in household surveys (Kenney et al., 2006; SHADAC and RWJF, 2009). The factors listed here shape the coverage estimates that emerge from a particular survey.

Four federal surveys—the CPS, the American Community Survey (ACS), the MEPS, and the National Health Interview Survey (NHIS)—currently provide annual estimates of the number of children who are uninsured. The ACS, MEPS, and NHIS all ask explicitly about coverage at the time of the survey, which corresponds to the point-in-time concept. The MEPS and NHIS also include measures of full-year uninsurance, with the MEPS tracking coverage over the course of a year through multiple interviews at 3- to 4-month intervals and the NHIS collecting information on current and prior coverage from a single interview. In principle, the CPS provides an estimate of the number of children who were uninsured for a full year.
However, the survey’s long recall period (14–16 months) may lead to inaccurate responses, especially among individuals who were enrolled in Medicaid for a brief period in the previous calendar year or at the beginning of the previous calendar year (DeNavas-Walt et al., 2009; Klerman et al., 2009). For 2008, the most recent year for which official estimates are available from each of these surveys, the number of uninsured children aged 0–17 at a particular point in time ranges from 6.6 million on the NHIS to 10.7 million on the MEPS (the CPS [unadjusted] and ACS estimates are both 7.3 million). Not only is there disagreement about how many children lack health insurance coverage at a particular point in time nationally, but state-level estimates vary across surveys as well (Blewett and Davern, 2006; Call et al., 2007).
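The distinction among coverage concepts drawn in this section can be made concrete with a small sketch. The monthly coverage histories below are hypothetical, as are the function names and the choice of the final month as the survey date; the code simply counts the same five children under three of the concepts and is not any survey's actual method.

```python
# Counting uninsured children under three coverage concepts.
# Hypothetical monthly histories for one year; True = insured that month.
histories = {
    "child_1": [True] * 12,                              # insured all year
    "child_2": [False] * 12,                             # uninsured all year
    "child_3": [True] * 6 + [False] * 6,                 # lost coverage midyear
    "child_4": [True] * 3 + [False] * 2 + [True] * 7,    # short gap in coverage
    "child_5": [False] * 4 + [True] * 8,                 # gained coverage in month 5
}

SURVEY_MONTH = 11  # ASSUMPTION: interview takes place in the final month (index 11)


def uninsured_full_year(months):
    """No coverage in any month of the year."""
    return not any(months)


def uninsured_at(months, month_index):
    """No coverage in the month of the interview (point in time)."""
    return not months[month_index]


def uninsured_any_time(months):
    """At least one month without coverage during the year."""
    return not all(months)


full_year = sum(uninsured_full_year(m) for m in histories.values())
point_in_time = sum(uninsured_at(m, SURVEY_MONTH) for m in histories.values())
any_time = sum(uninsured_any_time(m) for m in histories.values())

print(f"uninsured all year:        {full_year}")
print(f"uninsured at survey date:  {point_in_time}")
print(f"uninsured at any time:     {any_time}")
```

In this toy population the counts rise from the full-year concept to the point-in-time concept to the any-time concept, matching the ordering the text expects a priori.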

THE NEED FOR A COORDINATED APPROACH TO INTEGRATE MEASURES OF CHILD AND ADOLESCENT HEALTH AND HEALTH CARE QUALITY

Much progress has been made in developing and expanding the scope of measures of child and adolescent health and health care quality. However, a comprehensive set of ideal measures does not yet exist for children and adolescents that can support the types of analyses needed in both of these areas. What is available instead is a patchwork of measures of health and health care quality drawn from different population surveys, administrative data sets, and longitudinal studies of children and adolescents, each of which was designed for different specific purposes, as reviewed above. In the absence of a framework that can prioritize selected measures of health outcomes, health services, or care processes, it is difficult to achieve an appropriate balance between population-based measures of health and service-based measures of health care quality.

Separate efforts to strengthen both systems of measurement are currently under way at the federal, state, and local levels, as well as in private-sector initiatives (see, for example, How et al., 2011; IOM, 2011a; NQF, 2011). But the nation lacks a coherent strategy and process for coordinating these efforts and for establishing national priorities to guide emerging health informatics efforts at the federal, state, and local levels. One example of the latter activity is the new Health Indicators Warehouse, part of the Community Health Data Initiative (Bilheimer, 2010), which is aimed at improving data transparency and timeliness and access to federal health and health care data sets.
The committee believes a coordinated approach is needed to link these data sets and recommended measures to accomplish several objectives:
  • prioritize the health domains that should inform the next generation of quality improvement efforts;
  • suggest strategies by which child health indicators could be developed from existing child and adolescent data sources; and
  • identify gaps that should be addressed through future research on health measures or enhanced data collection efforts.

Any effort to create such an integrated approach is challenged by multiple factors:
  • a lack of consensus on the fundamental areas of health that are important to monitor both for the general population of children and adolescents and for vulnerable groups;
  • the absence of high-quality state-level data that make it possible to monitor the health status of children and adolescents over time;

  • a growing realization that children’s and adolescents’ health status and levels of functioning are frequently influenced by social and economic factors;
  • methodological challenges in establishing relationships among children’s and adolescents’ health status, insurance status, use of health care services and their quality, care processes, and health outcomes;
  • the recognition that access to and utilization of high-quality health care services may be insufficient to compensate for adverse social and economic conditions within families and communities; and
  • the persistent inability within various data sets to link measures of children’s and adolescents’ health status with measures of social and economic status and family conditions.

A coordinated approach is a necessary step toward building consensus on the definition of health and the types of health indicators that are important to monitor in assessing the health status of children and adolescents, especially those from disadvantaged and underserved communities.

SUMMARY

This chapter has provided an overview of current methods used to collect data and demonstrated how the consistency and rigor of measurement methods are directly associated with the quality of the data collected. In examining the measurement of child and adolescent health and health care, the committee identified several key findings that highlight areas in which current measurement efforts fall short. In particular, the evidence reveals a need for greater consistency, standardization, and interoperability of data. From its examination of the evidence, the committee determined that consistent standards for data elements, based on common definitions of key concepts, are necessary to facilitate the integration of data across health care systems and geographic areas. In particular, greater consistency is needed in measuring such characteristics as insurance coverage.
Improving linkages among administrative record systems and between population-based survey data sets and administrative records would enhance the comprehensive assessment of child and adolescent health and the quality of their health care. Finally, the emergence of EHRs and personal health records (PHRs) has the potential to provide an important and novel source of primary data for assessing health and health care quality. The committee believes that the use and interoperability of EHRs and PHRs will create a robust source of data that can be readily analyzed and acted upon.