| Copyright © 2009. National Academy of Sciences. All rights reserved. Terms of Use and Privacy Statement |
Chapter 4
SUMMARY AND RECOMMENDATIONS
The study reported here was conducted by the Institute of Medicine to
assess the reliability of six selected items of information describing
the utilization of hospital services by a sample of Medicare benefi-
ciaries during calendar year 1974. The information was obtained from
a Medicare record created from a data base maintained by the Health
Care Financing Administration (HCFA).
The analysis was needed to assist in identifying an existing data base
that might be used in assessing the effects of Professional Standards
Review Organizations (PSROs), since baseline data were not gathered prior
to implementing the PSRO program. The study is a logical extension of an
earlier IOM examination of the reliability of hospital utilization data
compiled by private abstracting services and based on abstracts of med-
ical records.
The Medicare data and information compiled by abstracting services con-
stitute two of the very few existing national data bases for monitoring
utilization of health services. Both are potentially useful for a variety
of health services research, policy, and administrative needs, in addition
to the PSRO evaluation, provided they are sufficiently reliable.
The accuracy of six information items on the Medicare record was determined
by comparing those items with the results of an independent abstracting of
patient medical records by a trained field team and noting the frequency
and type of discrepancies. In addition, selected hospital characteristics
and the process by which data are forwarded by the hospitals to the fiscal
intermediaries were studied to determine if any were related to the accuracy
of abstracted data and to assist in identifying areas for improvement.
The analysis showed that information on hospital admission date, discharge
date, and patient's sex was highly reliable. For all principal diagnoses
combined, however, when codes were compared to four digits, the Medicare
record and the IOM abstract agreed for only 57.2 percent of the cases.
For a small percent of discrepancies, the field team was unable to state
with certainty which data source was correct. If these cases are re-
distributed, the increased percent of correct Medicare records ranges
from 59.8 to 64.1. The presence of additional diagnoses was accurately
noted on 74.5 percent of the records. The Medicare record and the IOM
abstract agreed on principal procedure in 78.9 percent of the cases.
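The agreement figures above are weighted percentages, and a share of the
discrepancies could not be attributed with certainty to either source.
A minimal sketch of how such a rate and its bounds might be computed is
shown below. The weights, the outcome labels, and the rule of crediting
indeterminate cases either fully or not at all are assumptions made for
illustration; the report's own redistribution rule is not described here,
so this does not reproduce the 59.8 to 64.1 range.

```python
from dataclasses import dataclass


@dataclass
class Case:
    weight: float   # sampling weight (hypothetical values)
    outcome: str    # "agree", "disagree", or "indeterminate"


def agreement_bounds(cases):
    """Return (low, high) weighted percent agreement.

    low treats indeterminate cases as disagreements;
    high treats them as agreements.
    """
    total = sum(c.weight for c in cases)
    agree = sum(c.weight for c in cases if c.outcome == "agree")
    unsure = sum(c.weight for c in cases if c.outcome == "indeterminate")
    low = 100.0 * agree / total
    high = 100.0 * (agree + unsure) / total
    return low, high


# Made-up records, not study data.
cases = [
    Case(2.0, "agree"),
    Case(1.0, "agree"),
    Case(1.0, "disagree"),
    Case(1.0, "indeterminate"),
]
low, high = agreement_bounds(cases)
```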
These findings are very similar to those from the earlier study (see
Appendix I). In particular, when the principal diagnoses on the original
abstract and IOM re-abstract were compared to four digits for the total
data base (both Medicare and Medicaid patients), 65.2 percent of the
cases were in agreement; for the Medicare portion of the data, 64.4
percent agreed. Comparable levels of agreement on principal procedure
in the earlier study were 73.2 percent for the total data base and 72.5
percent for the Medicare patients alone.
Interpretation of the findings for diagnoses and procedures in this Med-
icare study is extremely difficult for a variety of reasons. The infor-
mation is highly technical and almost every statement needs to be quali-
fied. The limitations of the study should also be noted. The set of
information items examined is very restricted and relevant primarily
for utilization statistics and, in particular, diagnostic-specific util-
ization statistics. The data were gathered in only seventy-one hospi-
tals, although they were selected and weighted to be representative of
all Medicare discharges for persons aged sixty-five and older from short-
term general hospitals during 1974. The weights used in the analysis
may have introduced some instability. There may have been instances in
which the field work was not reliable. Finally, the role of the fiscal
intermediary in data reliability is not known. Although some conclusions
are reached regarding the relative contributions of hospitals and the
Health Care Financing Administration to the accuracy of Medicare records,
they are based on the assumption that the intermediaries do little more
than simply forward information received from the hospitals to HCFA. If
additional error is introduced by the intermediary, it is not reflected
in the IOM findings. In general, extreme care was taken in designing
the sample, conducting the field work, and processing and analyzing
the data. Any errors associated with the limitations specified above
are expected to be small.
Despite these limitations, it is possible to draw some rather firm con-
clusions about the accuracy of information on diagnoses and procedures
contained on the Medicare record and the factors that appear to influence
reliability. The credibility of the findings is heightened by the strik-
ing similarity to the findings of the earlier study of the reliability of
utilization data processed by private abstracting firms.
CONCLUSIONS
1. Individual diagnoses contribute to the overall level of reliability.
The data were quite reliable for some diagnoses. As examples, no
discrepancies were found on 97.3 percent of the records for patients
with cataract; 96.7 percent for inguinal hernia without mention of
obstruction; 89.9 percent for bronchitis; 87.1 percent for hyperplasia
of the prostate; and 86.5 for diverticulosis of the intestine--all
analyzed to four digits. However, discrepancies were found for
63.2 percent of the records for patients with chronic ischemic
heart disease; 50.3 percent for diabetes; 41.9 percent for intes-
tinal obstruction without mention of hernia; 41.6 percent for con-
gestive heart failure; and 41.5 percent for cerebrovascular disease.
Similar findings resulted from the previous analysis (see Appendix J
for comparisons of reliability of diagnoses common to both studies).
In particular, no discrepancies were found on 94.3 percent of the
records for Medicare patients with cataract; and 87.9 percent for
inguinal hernia without mention of obstruction. Discrepancies were
found for 72.1 percent of the records for Medicare patients with
chronic ischemic heart disease; and 39.5 percent for diabetes
mellitus.
2. For most diagnoses that had a relatively low level of reliability,
the percent of cases with no discrepancy increased if the analysis
was confined to those with no additional diagnoses. This was par-
ticularly evident for chronic ischemic heart disease, acute myo-
cardial infarction, diabetes, congestive heart failure, and intes-
tinal obstruction without mention of hernia. Since the presence
of additional diagnoses influences the accuracy of data, adjustments
in analysis might be made if the fact of co-morbidity were accurately
noted on the Medicare claim form. However, this occurs for only 74.5
percent of the discharges. In most cases, the inaccuracy stems from
an incomplete review of the medical record within the hospital.
The prior study raised the possibility that the designation of
primary diagnosis was likely to be more difficult for cases with
co-morbidity. Directly comparable data were not gathered, however.
3. The reliability of diagnostic information varies according to the
level of coding refinement. For all diagnoses combined, AUTOGRP
comparisons were most reliable and comparisons to four digits of
the diagnostic codes were least reliable. The increased reliability
of the more aggregated data must be balanced against the loss of
precision in the information, however.
AUTOGRP was not used in the earlier study, but three-digit compari-
sons were more reliable than four-digit.
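The effect of coding refinement described in item 3 can be illustrated
with a short sketch that compares paired codes at four digits and then
at three. The helper names and the sample pairs are hypothetical; only
codes 250.0 and 250.9 appear in this chapter, and the other codes are
placeholders rather than study data.

```python
def truncate(code, digits):
    """Keep the first `digits` digits of a code such as '250.0'."""
    plain = code.replace(".", "")
    return plain[:digits]


def percent_agreement(pairs, digits):
    """Percent of (medicare, abstract) pairs that match at `digits` digits."""
    matches = sum(
        1 for medicare, abstract in pairs
        if truncate(medicare, digits) == truncate(abstract, digits)
    )
    return 100.0 * matches / len(pairs)


# (Medicare record code, re-abstract code); made-up pairs.
pairs = [
    ("250.0", "250.9"),  # same three-digit category, different fourth digit
    ("412.9", "412.9"),
    ("427.0", "412.9"),
    ("550.0", "550.0"),
]
four_digit = percent_agreement(pairs, 4)
three_digit = percent_agreement(pairs, 3)
```

In this toy sample the first pair disagrees at four digits but agrees at
three, which is the trade-off the text describes: agreement rises while
the distinction carried by the fourth digit is lost.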
4. There does not appear to be a systematic bias on the part of hospitals
to submit on the claim an "admitting diagnosis" (reflecting the
patient's condition at the time of hospital admission) instead of a
"principal diagnosis" (established after a more thorough review of
the medical record). This is generally true regardless of whether
discrepancies were found between the principal diagnoses on the
Medicare record and the IOM abstract.
The earlier report did not include a similar analysis.
5. Sometimes the field team detected a discrepancy between their work
and the Medicare record, but after re-examining the patient's medi-
cal record, they were unable to state with certainty which was cor-
rect. Instead, they concluded that it was a matter of judgment and
that data on either document was equally acceptable. The problem
stemmed from difficulty in determining which diagnosis should be regarded
as principal. This situation occurred for 4.6 percent of
all diagnoses combined and for 1.7 percent of all procedures.
Comparable percents for individual diagnoses ranged from 7.6 percent
for chronic ischemic heart disease to 1.4 percent for bronchopneumonia
and pneumonia unspecified.
Similar figures in the earlier study were somewhat higher--10.7 per-
cent for all diagnoses combined and 16.3 percent for all procedures.
For individual diagnoses the percents ranged from 0.3 for fracture
of neck of femur to 18.0 for low back pain. The presence of such
cases in both studies suggests that for some patients it may be very
difficult to isolate a single primary diagnosis responsible for hospital
admission--either because of inadequacies in physicians' recording
habits or because of limitations in current diagnostic classification
schemes.
6. The difficulty of determining a principal diagnosis with certainty
was confirmed in the independent assessment of the field work
(see Appendix F). The levels of discrepancies between the
consultant and the field team are not directly comparable to those
between the field team and the Medicare record. However, when the
consultant and field team disagreed, quite often both also disagreed
with the Medicare record. This happened for about fifty percent
of the discrepant principal diagnoses and about forty-one percent of
the discrepant principal procedures in the assessment of the field
work. In other words, each of the three data sources contained
different pieces of information--all based on the same patient
medical record.
Similar variability was found in the assessment of the field work
in the earlier study.
7. The reasons for discrepancy vary by diagnosis. For most diagnoses
with high discrepancy rates and a likelihood of co-morbidity where
the IOM abstract was found to be correct (especially chronic ischemic
heart disease, diabetes, and congestive heart failure), discrepancies
occurred primarily because of erroneous selection of principal diag-
nosis (an ordering discrepancy), rather than mistakes in assigning a
code number (a coding discrepancy). Diagnoses whose reliability im-
proved with less specific coding tended to have coding discrepancies.
These findings affirm the conclusions of the earlier study.
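The distinction in item 7 between an ordering discrepancy and a coding
discrepancy can be sketched as a simple rule. The rule below is an
assumption made for illustration: it calls a mismatch "ordering" when the
Medicare principal code appears among the re-abstract's additional
diagnoses, and "coding" otherwise. The field team's actual determinations
rested on review of the medical record, not on a mechanical test like
this one, and the sample codes other than 250.0 and 250.9 are placeholders.

```python
from collections import Counter


def classify(medicare_principal, abstract_principal, abstract_additional):
    """Label one case as 'agree', 'ordering', or 'coding' (assumed rule)."""
    if medicare_principal == abstract_principal:
        return "agree"
    if medicare_principal in abstract_additional:
        # The condition is documented, but was not the principal diagnosis.
        return "ordering"
    # The code on the Medicare record does not appear on the re-abstract.
    return "coding"


# (Medicare principal, re-abstract principal, re-abstract additional codes)
cases = [
    ("250.9", "250.9", []),
    ("250.9", "412.9", ["250.9"]),
    ("250.0", "250.9", []),
]
tally = Counter(classify(m, a, extra) for m, a, extra in cases)
```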
8. Several factors affect the reliability of information on procedures.
An incomplete review of the medical record is the single most fre-
quent reason for error and accounts for about forty percent of the
discrepancies when the IOM abstract was correct. For about sixty-
three percent of the discharges, no procedure was coded, presumably
because no procedure was performed and/or the hospital billing
office did not submit this information to the fiscal intermediary.
The field team agreed with this determination in about ninety per-
cent of the cases. When the analysis was confined to cases for
which a procedure was coded on the Medicare record, the level of
agreement on principal procedure dropped to about fifty-seven per-
cent.
The reasons for discrepancies on principal procedure are not
directly comparable between studies. Nevertheless, the percents
of discharges for which no procedure was coded are similar (65.2
percent in the initial study). The field team agreed with this
determination in 86.9 percent of the cases. The level of agree-
ment dropped to 64.8 percent when only cases containing a procedure
code were compared.
9. When the IOM abstract was correct, many of the discrepancies orig-
inated in hospital procedures. Many hospitals routinely select the
first-listed diagnosis on the face sheet of the medical record as the
principal diagnosis, regardless of additional documentation in the
record. The other major related reason that led to discrepancies
(both ordering and coding) was failure to adequately review the med-
ical record before designating a principal diagnosis and/or principal
procedure. Perhaps because of these practices, the claims information
submitted to the fiscal intermediary was frequently incorrect. More
specifically, when there were discrepancies between the Medicare re-
cord and IOM abstract, the information on principal diagnosis sub-
mitted by the hospital on the claim form did not accurately reflect
the patient's condition for about seventy percent of the cases. Com-
parable figures for principal procedure were about forty-three percent
for all claims for which discrepancies were detected and about fifty-
five percent when the analysis was confined to cases for which the
Medicare record indicated that a procedure had been performed.
These findings confirm similar, but more tentative, impressions
raised in the earlier report. The two studies reinforce one another,
but the findings from the Medicare study about the influences of hospital
policies on data reliability are more persuasive.
10. Coding by HCFA coders also influences the reliability of the Medicare
record, but the variability within the coding function is difficult
to specify. When cases with correct claims information but incorrect
Medicare information for principal diagnosis were submitted to more
senior Medicare RRAs for re-coding, agreement with the claims
codes reached about forty-two percent. The comparable figure for
principal procedure was about sixty percent. Though the level of
agreement increased, an unexplained residual of discrepancies remained.
Where the claims information submitted by the hospital was incorrect,
about seventy-five percent of the codes on the Medicare record for both
principal diagnosis and procedure agreed with the first-listed item
on the claim form, and the overall percents remained fairly constant
upon recoding. For these cases the work of the Medicare coders was
apparently reliable, even though the narratives submitted by the
hospitals on which their codes were based were not. When the Medicare
record did not agree with the first-listed diagnosis or related
procedure on the claim form, about one-half of the cases reflected
the appropriate application of a Medicare coding guideline
requiring the coding of something other than the first-listed item.
Nevertheless, an unexplained residual of discrepancies persisted.
The earlier study did not contain claims information, and there was
no parallel assessment.
11. If one examines the number of cases that should have been coded as
specific diagnoses but were not, the influence of false positive
and false negative diagnoses on admission rates and lengths of stay
can be assessed. Proxy admission rates based on the IOM abstracts
(including false negatives) suggest that reliance on Medicare data
with both three- and four-digit comparisons will under-estimate the
number of admissions for most diagnoses except chronic ischemic heart
disease and diabetes mellitus. Similar results are found when AUTOGRP
is used to analyze entire DRGs. Consistent differences were not detected
for diagnostic-specific lengths of stay, although variations
were noted.
IOM admission rates were higher in the earlier study as well--again,
with the notable exception of chronic ischemic heart disease. The
earlier study also suggested that diagnostic-specific lengths-of-
stay based on abstract service data may over-estimate the average
stay; consistent differences for length-of-stay were not detected in
the Medicare study.
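The role of false positives and false negatives in item 11 can be made
concrete with a small tally. The sketch below is illustrative only: the
diagnosis labels and records are invented, and the re-abstract is treated
as the reference source, which is an assumption rather than something the
study guarantees for every case.

```python
from collections import Counter


def admission_counts(records):
    """Summarize counts per diagnosis from (medicare, abstract) pairs.

    A false positive is a case coded to the diagnosis on the Medicare
    record but not on the re-abstract; a false negative is the reverse.
    """
    medicare = Counter(m for m, _ in records)
    abstract = Counter(a for _, a in records)
    true_pos = Counter(m for m, a in records if m == a)
    summary = {}
    for dx in set(medicare) | set(abstract):
        summary[dx] = {
            "medicare": medicare[dx],
            "abstract": abstract[dx],
            "false_pos": medicare[dx] - true_pos[dx],
            "false_neg": abstract[dx] - true_pos[dx],
        }
    return summary


# (diagnosis on Medicare record, diagnosis on re-abstract); made-up data.
records = [
    ("ischemic", "ischemic"),
    ("ischemic", "chf"),
    ("ischemic", "obstruction"),
    ("chf", "chf"),
    ("obstruction", "obstruction"),
]
summary = admission_counts(records)
```

In this toy data the Medicare count for the "ischemic" label exceeds the
re-abstract count because of false positives, while the other labels are
under-counted because of false negatives.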
12. Some hospital characteristics were associated with the reliability
of data. Hospital location within the Northeast census region was
consistently associated with less reliable data. Billing office
personnel with no medical record experience appeared to provide
accurate diagnostic information, if medical record department per-
sonnel abstracted the medical record. The accuracy of the data was
greater if coded, rather than narrative, information was sent to the
billing offices. The submission of updated diagnostic and procedure
information from the medical record department to the billing office
and fiscal intermediary increased the accuracy of the data. The
role of the physician in completing the medical record was also an
important variable.
The hospital characteristics considered in the earlier study differed
somewhat from those included in the Medicare study, but there
are some similarities. Variables associated with increased accuracy
of the abstract service data included the thoroughness with which
the medical record was reviewed before code numbers were assigned,
frequency of communication between the hospital and abstract service,
regular medical staff review and use of abstracted reports, and pos-
sibly, hospital location within a Standard Metropolitan Statistical
Area. Because of the frequent difficulty encountered by the field
team in determining which of several diagnoses should be regarded
as principal, the care with which the physician completed the record
was also considered as an influential structural variable, even
though it was not measured directly.
The cumulative effect of both studies elicits serious reservations
about the adequacy of existing hospital utilization data on diag-
noses and procedures. The Social Security Administration (and more
recently Health Care Financing Administration) is to be commended
for successfully initiating and continuing to administer and monitor
a program of critical significance to the health and social welfare
of millions of Americans. Private abstracting services have made
a similar, if less monumental, contribution by providing informa-
tion with the potential to assist in administering and monitoring
the provision of hospital services by their subscribers. The
uncritical processing of diagnostic-specific utilization information
must be questioned, however, since increasingly important decisions
regarding the adequacy of health care, the allocation of resources,
and, perhaps, hospital reimbursement rates will be based on these
data. To improve the quality of future data and to assist in ap-
propriately using existing data, the following recommendations are
offered.
RECOMMENDATIONS
1. One must assume that diagnostic and procedural data on Medicare
records contain errors and use them with caution. As in the
previous study, the seriousness of the error depends on the
purpose to which the data are applied.
2. Existing Medicare data are adequate for some aspects of general
program monitoring, such as descriptions of general utilization
patterns by age and sex or comparisons of overall lengths-of-stay
among hospitals for entire institutions. However, diagnostic-specific
discrepancies are of sufficient magnitude to preclude the
use of such data for detailed research and evaluation or to measure
diagnostic case mix as an indication of intensity of services
that could then form the basis for determining reimbursement rates.
Similarly, the usefulness of Medicare data on principal procedures
should be seriously questioned.
3. The reliability of existing and future information may be improved
by selective adjustments (assuming that the same kinds of errors
persist). It was noted that the accuracy of diagnostic information
increases when diagnostic codes are analyzed to three digits, rather
than four, or when broader diagnostic classifications such as AUTOGRP
are used. For determining basic utilization trends, and, in some
instances, lengths of stay, these more aggregate analyses are sufficient.
The increased reliability resulting from less precise
coding must be balanced against the loss of ability to detect the
presence of complications (usually denoted by the fourth digit),
which may significantly affect length of stay, however. In addition,
adjustments for differences in patient mix may be approximated
in the future if the presence or absence of additional diagnoses
were accurately noted.
4. If the Medicare data are to be used by Professional Standards Review
Organizations for either routine review activities or program eval-
uation, many of the recommendations from the earlier study are also
appropriate here. If quality assurance programs discontinue the
current practice of reviewing all patients and physicians and move
to a more targeted review of cases likely to be associated with
poor quality, in many cases this will require improving the data
base in order to detect changes in utilization patterns. As an
example, it is quite likely that criteria for hospital admission and
continued stay for a diabetic patient with mention of ketoacidosis
and coma (code 250.0, using HCFA's modification of ICDA-8) would be
quite different from criteria for a patient without mention of acidosis
or coma due to diabetes unspecified (code 250.9). In order to
evaluate the effect of review, it is essential that diagnostic in-
formation be accurately coded to the fourth digit.
5. The likely reliability of data on diagnoses and procedures might be
one consideration for selecting cases for targeted review. This
should not be the only criterion, however, since it may result in
eliminating from review those diagnoses or conditions for which
both the quality of care and data are questionable. In such cases
it would be important to intensify review efforts, but also to im-
prove medical recording and diagnostic coding.
6. Whenever Medicare data are used to measure changes in utilization
patterns, the amount of error, including the influence of false
negative and false positive diagnoses, must be assessed at each
time that measurements are taken. This is essential in order to
determine whether perceived changes are truly associated with altered
utilization or, instead, with changes in the reliability of data.
If Medicare data are used by PSROs, reliability must be assessed
at the local level, as well as nationally. The Health Standards and
Quality Bureau of HCFA (formerly the Bureau of Quality Assurance)
should develop guidelines to assist in such assessments.
7. Because much of the error in Medicare data is introduced at the
hospital level, hospitals should be assisted in developing programs
to improve the information submitted on Medicare claim forms. This
should include additional training for persons abstracting informa-
tion from the medical record, routinization of hospital procedures
so that the activities of billing personnel could be limited to infor-
mation transfer, rather than interpretation of medical record data,
and instructional programs for physicians in classifying diagnoses,
determining primary diagnosis, and completing the medical record.
8. If the current hospital practice of determining principal diagnosis
and principal procedure by referring to the first-listed item on the
face sheet of the medical record continues, several steps should be
taken to assure that the first-listed item is appropriately selected.
The medical record format should be revised, so that the sequence of
conditions recorded are in order of priority, with the principal
diagnosis listed first. Additional training of physicians and med-
ical record personnel is required, as noted above. Alternatively,
a more comprehensive review of the medical record might be required
before selecting a primary diagnosis and/or procedure.
9. Officials of the Health Care Financing Administration should initiate
intensified efforts to assure the reliability of HCFA coding of in-
formation on diagnoses and procedures. In addition, serious con-
sideration should be given to up-dating or replacing the Current Pro-
cedural Terminology classification for procedures.
10. Data recording and reporting guidelines should continue to require
that diagnoses are coded to at least four digits. This is the only
way to assure that the resulting data base will have sufficient
flexibility to meet a variety of data needs. For less precise re-
quirements, the data can easily be analyzed to three digits only.
The influence on reliability of coding to five digits, as antici-
pated by the 9th International Classification of Diseases-Clinical
Modification, should be thoroughly evaluated.
11. Additional research is needed to develop more appropriate classification
schemes for describing patient status that could be incorporated
into health information systems. These should include
explicit consideration of signs and symptoms, co-morbidity, and
functional status. For many patients it may be unrealistic to
expect that a single primary diagnosis can or should be determined.
12. Regardless of the type of activity initiated to improve the quality
of abstracted information, assessments should be made of the result-
ing improvements in reliability and the associated costs. Increas-
ing the quality of data is likely to be costly, and careful evalua-
tion would help to insure that only the most effective methods are
disseminated.
The deficiencies in the accuracy of currently available diagnostic information do not lessen
the need for a comprehensive and reliable national health information
system, however. Because of the likely expense of improving the
reliability of data and maintaining a high level of accuracy, the
Department of Health, Education, and Welfare should explore the feasibility
of integrating the multiple data systems located throughout its
component agencies. The National Center for Health Statistics' Hospital
Discharge Survey has recently incorporated UHDDS definitions for
principal diagnosis and principal procedure. Therefore, the Department's
activities might begin with an assessment of the reliability of that
information and an exploration of the potential for expanding the survey
to include information needed for management, reimbursement, program
evaluation, and epidemiological purposes.