Copyright © 2009. National Academy of Sciences. All rights reserved.
3
Attributes of Good Practice Guidelines
Remember the drunk who searched for his keys under the lamp post
because that's where the light was? Science is a highly systematic process
of creating lamps and then looking under them.
David Warsh, Washington Post
Developing practice guidelines that enlighten practitioners and patients
is an exceptionally challenging task. It requires diverse skills ranging from
the analysis of scientific evidence to the management of group decision making
to the presentation of complex information in useful forms. Although
the need for these skills has not always been recognized in the past, the
recent focus on guidelines is bringing not only a greater awareness of what
is required for their development but also a higher level of expertise to the
field. The Office of the Forum for Quality and Effectiveness in Health Care
should make every effort to reinforce this trend as it works with contractors,
expert panels, and others to develop and disseminate practice guidelines.
This chapter describes eight attributes that the committee believes
are essential if a set of guidelines are to serve their intended purposes
of assisting practitioners and patients, providing a better foundation for
the evaluation of services and practitioners, and improving health out-
comes. These attributes are ideal characteristics to which real guidelines
are unlikely to conform fully either now or in the future. However, in the
committee's judgment, guidelines can approach these ideals to a greater
extent than has generally been achieved to date.
ATTRIBUTES OF GOOD GUIDELINES
The next four sections review the context, working assumptions, prin-
ciples, and sources that guided the committee in developing its list of
attributes, followed by a discussion of the attributes themselves. This
chapter, however, is not intended either as an exhaustive description of
how guidelines should be developed or as an endorsement of one specific
method.1 The discussion in this chapter focuses on attributes of guidelines
rather than attributes of medical review criteria, standards of quality, and
performance measures. The recent IOM report on quality assurance in
the Medicare program (1990d) discusses some attributes that good medical
review criteria should have, for example, specificity and sensitivity.
One further introductory point: the committee has urged AHCPR and
its Forum to focus their efforts on guidelines for clinical conditions rather
than specific treatments or procedures. This focus will undoubtedly make
their task more difficult: a consideration of conditions generally involves a
broader look at alternatives, evidence, practice settings, and outcomes. The
result, however, should be guidelines that are both more broadly and more
specifically useful to clinicians and patients. The discussion of attributes in
this chapter reflects this emphasis on conditions rather than procedures.
BACKGROUND AND TERMINOLOGY
OBRA 89 specifies that "the Director [of the Forum] shall establish
standards and criteria to be utilized by the recipients of contracts" for
"developing and periodically reviewing and updating" guidelines, standards,
performance measures, and review criteria. Confusion is likely if "criteria
and standards" are used to label both the bases for prospectively assessing
practice guidelines and the bases for assessing clinician practice. Con-
sequently, to reduce possible terminological confusion, this report refers
to "attributes of guidelines" rather than to "standards and criteria" for
"guidelines, standards, performance measures, and review criteria." Synonyms
include properties and characteristics.
The Forum must be able to employ the list of attributes set forth
in this chapter in at least two ways. First, it will need to communicate
its expectations in advance to the contractors or expert panels that may
develop guidelines for the agency. Second, the Forum and potential users
of the guidelines must be able to assess the soundness of a given set of
guidelines after they are developed. The IOM expects in a second project
1 The list of works by Eddy, Gottlieb and associates, and Park and colleagues at the end of this
chapter contains more detailed discussions of processes for developing guidelines.
2 This language generally follows the precedent set by the IOM report Medicare: A Strategy for
Quality Assurance (1990d). It is also consistent with the booklet "Attributes to Guide the
Development of Practice Parameters" (AMA, 1990a).
CLINICAL PRACTICE GUIDELINES
to prepare a practical assessment instrument that the Forum can use to
systematically review guidelines developed by its panels or by other groups
(Appendix C).
During the committee's deliberations, a question was raised about
whether the Forum has formal authority under OBRA 89 either to reject
or approve the guidelines developed by its contractors or expert panels.
This report does not speak to that legal point. Nevertheless, regardless
of the Forum's statutory authority in this regard, it is reasonable that the
agency should examine the soundness of guidelines developed under its
auspices. This examination may (1) improve the way the agency works
with contractors or panels in the future, (2) contribute to more informed
consideration of dissemination options and evaluation strategies, (3) allow
more sophisticated consultations with HCFA and other government agen-
cies about their use of the guidelines, and (4) provide feedback about the
feasibility of the assessments proposed here.
In this report, assessment means the prospective or initial judgment
of the soundness and feasibility of a set of guidelines. In contrast, the
empirical evaluation of the cost, quality, and other effects of guidelines
occurs after they are published and implemented.
Further, a set of guidelines includes a series of statements or recom-
mendations about appropriate practice and the accompanying descriptions
of evidence, methodology, and rationale. A guideline in the singular refers
to a discrete statement or recommendation (for example, annual breast
physical examination for women aged 40 to 49 with no family or personal
history of breast cancer). Each of the appropriateness reports published
by the RAND Corporation clearly exemplifies a set of guidelines (Park et
al., 1986). Likewise, using this terminology, the report of the U.S. Preventive
Services Task Force (1989) contains 60 sets of guidelines and not 60
guidelines.
WORKING ASSUMPTIONS
The committee's first working assumption has been that a set of guide-
lines will be assessed as a whole; that is, its elements will not be assessed
individually in isolation. Under this assumption the Forum could judge
a set of guidelines acceptable even if individual statements lacked for
legitimate reasons some essential attributes. Realistically, early guidelines
and (especially) existing guidelines are not likely to score well on all eight
attributes collectively. However, the committee expects that, as the devel-
opment process matures, guidelines will continue to comprise more and
more of the attributes.
Second, the committee assumes that the Forum will (in line with OBRA
89 provisions) convene expert panels to assess either existing guidelines or
guidelines for which the Forum has contracted. These panels will need to
make both objective and subjective assessments guided by instructions from
the Forum. This report is a step toward the preparation of an assessment
instrument that the expert panels can use in their reviews and deliberations
(Appendix C). The AMA has recently taken a similar step by developing a
preliminary worksheet to evaluate what it terms practice parameters (AMA,
1990b).
Third, the committee sees the initial assessment of guidelines as part of
an evolutionary process of guidelines development, assessment, use, evalu-
ation, and revision. This evolutionary process will involve the government,
professional organizations, health service researchers, consumers, and oth-
ers. As a result, the committee fully expects the set of attributes presented
here to be tested, reassessed, and revised, if necessary.
PRINCIPLES
The identification of attributes of practice guidelines rests on four
principles. These principles call for:
· clarity in the definition of each attribute;
· compatibility of each attribute and its definition with professional
usage;
· clear rationales or justifications for the selection of each attribute;
and
· sensitivity to practical issues in using the attributes to assess actual
sets of practice guidelines ("assessibility").
That the definition of an attribute be clear and succinct is obviously
desirable, although often difficult when one is working with very abstract
or technical concepts. It is also desirable that the term used to label an
attribute be recognizable and consistent with customary professional usage.
The label should be a single word or short phrase that is carefully chosen to
convey the core concept. (Thus, attributes will not be described by number,
for example, Attribute No. 1.)
The rationale or justification for each attribute should be clearly de-
scribed, and it should also be consistent with the professional and technical
literature and the legislative mandate. The rationale should describe explic-
itly any trade-offs between the theoretically ideal attribute and the practical,
usable one.
Practicality requires that attributes be definable in operational as well
as conceptual terms; that is, it should be possible to devise an instrument
that instructs assessors of a set of guidelines on how they can determine
whether the guidelines conform to the attributes. Not only is this necessary
if the Forum is to judge the soundness of the guidelines that emerge from
its expert panels; it is fundamental that the Forum instruct developers of
guidelines on the desired properties of guidelines and on the documentation
needed as a basis for assessment. As mentioned earlier, the development
of a formal instrument for assessing guidelines is an important next step
for this committee.
More generally, the number of attributes must be sensible and practical.
An appropriate balance must be reached between enough attributes to allow
adequate assessment of the guidelines but not so many that the assessment
exercise becomes infeasible, confusing, or excessive, given limited resources.
It is likely that an instrument for assessing guidelines will need to weight
the eight attributes in some manner, specifying which of them are more
significant in determining whether a given set of guidelines are sound. Given
its time and resource constraints, this committee did not systematically
rank the different attributes by relative importance, although the discussion
below does distinguish some of the more important ones.
A final point: this report differentiates between the priorities for selecting
particular targets for guidelines and the desirable attributes of guidelines.
The attributes listed in this chapter do not incorporate the OBRA 89 pro-
visions requiring that priorities for the development of guidelines reflect
the needs and priorities of the Medicare program and include clinical
treatments or conditions accounting for a significant portion of Medicare
expenditures.
The legislation also calls on the Secretary of Health and Human Services
to consider the extent to which guidelines can be expected "(i) to
improve methods of prevention, diagnosis, treatment, and clinical manage-
ment for the benefit of a significant number of individuals; (ii) to reduce
clinically significant variations among physicians in the particular services
and procedures utilized in making diagnoses and providing treatment; and
(iii) to reduce clinically significant variations in the outcomes of health care
services and procedures." In arriving at its eight recommended properties
of guidelines, the committee did not incorporate these factors. Priority
setting is a crucial but separate task and one that IOM has undertaken as
part of other studies (IOM, 1990a,b,c,e).
PAST WORK ON DEFINING ATTRIBUTES
This committee considered three primary sources in identifying at-
tributes for practice guidelines: (1) the legislation, (2) the IOM report on
quality assurance for Medicare, and (3) work by the AMA. Other important
materials, which in some cases were used in the primary sources, include
the work of Brook, Chassin, Eddy, Greenfield, and their collaborators, as
cited elsewhere in this report.
In addition to describing priorities to guide the Forum in selecting
topics for guidelines, OBRA 89 sets forth some characteristics that guide-
lines should have. The committee distinguished these four points from the
legislation.
1. Guidelines should be based on the best available research and
professional judgment regarding the effectiveness and appropriateness of
health care services and procedures.
2. The Forum director is expected to ensure that appropriate, inter-
ested individuals and organizations will be consulted during the develop-
ment of guidelines.
3. The director has the power to pilot-test the guidelines.
4. Guidelines should be presented in forms appropriate for use in
clinical practice, in educational programs, and in reviewing quality and
appropriateness of medical care.
A second major source for the committee's work, the IOM report on
a quality assurance strategy for Medicare (1990d), included a chapter on
attributes of quality of care and appropriateness criteria. These attributes
derived from a June 1989 meeting of experts on the construction and use of
practice guidelines. Some of the distinctions proposed by the quality panel
are not used here. For example, this committee's report emphasizes key
attributes of good guidelines but contains relatively little discussion of de-
sirable but less critical attributes. In addition, this report drops the panel's
distinction between substantive and implementation guidelines because the
committee found it awkward to label every attribute as either one or the
other. The point that lay behind the original distinction should, nonethe-
less, be stressed: the designers of guidelines need to keep implementation
in mind, namely whether and how the guidelines can be used.
A third source considered by the committee was the AMA's booklet,
"Attributes to Guide the Development of Practice Parameters" (1990a),
which sets forth five attributes. They are (minus their accompanying dis-
cussion and more detailed descriptions) as follows: (1) practice parameters
should be developed by or in conjunction with physician organizations; (2)
reliable methodologies that integrate relevant research findings and appro-
priate clinical expertise should be used to develop practice parameters; (3)
practice parameters should be as comprehensive and specific as possible;
(4) practice parameters should be based on current information; and (5)
practice parameters should be widely distributed.
ATTRIBUTES FOR ASSESSING PRACTICE GUIDELINES: OVERVIEW
The art of developing practice guidelines is in an early stage, and the
strengths and weaknesses of specific approaches are still being debated.
As a consequence, the committee recognizes that what is expected of
guidelines, in terms of their development and implementation, will need to
evolve beyond these initial specifications.
Table 3-1 lists the eight attributes for assessing guidelines that the
committee identified. One theme emphasized here, which ties these guideline
attributes together, is credibility: credibility with practitioners, with
patients, with payers, with policymakers. This theme encompasses the scientific
grounding of the guidelines, the qualifications of those involved in
the development process, and the relevance of the guidelines to the actual
world in which practitioners and patients make decisions.
A second and related theme is the importance of accountability, a
key element of which is disclosure. That is, the committee expects that
procedures, participants, evidence, assumptions, rationales, and analytic
methods will be meticulously documented, preferably in an accompanying
background paper. This documentation will help those not participating
in any given process of guidelines formulation to assess independently the
soundness of the developers' work.
Explanations should be provided for any conflict or inconsistency be-
tween the guidelines in question and those developed by others. The issue
of disagreement or inconsistency among practice guidelines is an important
one for patients, practitioners, managers, payers, and policymakers. As
discussed in Chapter 5 of this report, merely identifying inconsistencies in
guidelines says nothing about the legitimacy of such differences. Careful
documentation of the evidence and rationales can help potential users of
guidelines judge whether inconsistencies arise from differences in the in-
terpretation of scientific evidence, from differences in the care taken in
developing the guidelines, or from other factors.
VALIDITY
In the committee's view, the validity of practice guidelines ranks as
the most critical attribute, even though it may be the hardest to define and
measure. Conceptually, a valid practice guideline is one that, if followed,
will lead to the health and cost outcomes projected for it, other things being
equal. In the research literature, validity is commonly defined by three
questions. Do the instruments for measuring some concept (for example,
quality of care) really measure that concept? Does the relationship or effect
that the researchers assert exists (for example, following a set of guidelines
improves quality of care) really exist? Can that relationship be generalized
(for example, from clinical trials to everyday medical practice)?
Until a guideline is actually applied and the results evaluated, validity
must be assessed primarily by reference to the substance and quality of
the evidence cited, the means used to evaluate the evidence, and the
TABLE 3-1 Eight Attributes of Good Practice Guidelines

Attribute: Discussion

Validity: Practice guidelines are valid if, when followed, they lead to the health and cost outcomes projected for them, other things being equal. A prospective assessment of validity will consider the projected health outcomes and costs of alternative courses of action, the relationship between the evidence and recommendations, the substance and quality of the scientific and clinical evidence cited, and the means used to evaluate the evidence.

Reliability/reproducibility: Practice guidelines are reliable and reproducible (1) if, given the same evidence and methods for guidelines development, another set of experts would produce essentially the same statements and (2) if, given the same clinical circumstances, the guidelines are interpreted and applied consistently by practitioners or other appropriate parties. A prospective assessment of reliability may consider the results of independent external reviews and pretests of the guidelines.

Clinical applicability: Practice guidelines should be as inclusive of appropriately defined patient populations as scientific and clinical evidence and expert judgment permit, and they should explicitly state the populations to which statements apply.

Clinical flexibility: Practice guidelines should identify the specifically known or generally expected exceptions to their recommendations.

Clarity: Practice guidelines should use unambiguous language, define terms precisely, and use logical, easy-to-follow modes of presentation.

Multidisciplinary process: Practice guidelines should be developed by a process that includes participation by representatives of key affected groups. Participation may include serving on panels that develop guidelines, providing evidence and viewpoints to the panels, and reviewing draft guidelines.

Scheduled review: Practice guidelines should include statements about when they should be reviewed to determine whether revisions are warranted, given new clinical evidence or changing professional consensus.

Documentation: The procedures followed in developing guidelines, the participants involved, the evidence used, the assumptions and rationales accepted, and the analytic methods employed should be meticulously documented and described.
relationship between the evidence and recommendations.3 In the context of
the Forum's practical needs, the committee recommends that an assessment
of validity look for 11 elements in a set of guidelines. These elements are
listed below:
Projected health outcomes
Projected costs
Relationship between the evidence and the guidelines
Preference for empirical evidence over expert judgment
Thorough literature review
Methods used to evaluate the scientific literature
Strength of the evidence
Use of expert judgment
Strength of expert consensus
Independent review
Pretesting.
PROJECTED HEALTH OUTCOMES
A key reason for developing and using practice guidelines is the
expectation that they will improve health outcomes. Ideally, a set of
guidelines should give practitioners, patients, and policymakers an explicit
description of the projected health benefits (for example, a reduction in
postoperative infection rates from 4 to 2 percent) and the projected harms
or risks (for example, an increase in the risk of incontinence from 10 to 20
percent). If reasonable and technically feasible, the net effects of a course
of action (the balance of benefits against risks or harms) also need to be
estimated. In addition, projected outcomes should be compared with those
for alternative courses of care for the clinical condition in question.
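
The two illustrative figures in this paragraph can be restated as expected events per 1,000 patients, which is one simple way to lay projected benefits and harms side by side. The sketch below is only arithmetic on the example percentages given above. The function name and the cohort size of 1,000 are assumptions made for illustration, and the sketch deliberately does not combine the two outcomes into a single net figure, since that would require a judgment about their relative importance.

```python
def change_per_cohort(rate_before_pct, rate_after_pct, cohort_size=1000):
    """Return the change in expected events for a cohort (negative = fewer events)."""
    return (rate_after_pct - rate_before_pct) * cohort_size / 100


# Example percentages from the text: postoperative infections fall from
# 4 to 2 percent, and the risk of incontinence rises from 10 to 20 percent.
infection_change = change_per_cohort(4, 2)
incontinence_change = change_per_cohort(10, 20)

print(f"Postoperative infections per 1,000 patients: {infection_change:+.0f}")
print(f"Cases of incontinence per 1,000 patients: {incontinence_change:+.0f}")
```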
The ideal set of projections just described will often be technically or
practically beyond the reach of guidelines developers. In most situations,
the assistance of outside consultants or specialized technical advisory panels
will be at least helpful or at most essential; yet even with such help,
projecting health outcomes is intrinsically a complex and subjective process.
The nature of the process makes it particularly important that the methods
3 The committee discussed four types of validity: face validity, content validity, criterion validity,
and construct validity. These concepts may have the following connotations when applied to
practice guidelines. First, the content of guidelines and their development processes need to
be plausible, at first pass, to practitioners (to have face validity). Second, content validity has to
be assessed by reviewing the scientific evidence on which a set of guidelines are based: how
much evidence there is, how clear it is, how directly it relates to the guidelines, how sound its
methodology is. Third, for a prospective assessment of criterion validity, one judges whether the
guidelines would be likely to produce predicted results when applied in the real world of health
care delivery. Construct validity involves the fit of the guideline to broader scientific theories.
INFORMATION ABOUT HEALTH BENEFITS AND RISKS OR HARMS
1. What are the potential benefits and risks (type and importance) for individual
patients associated with an intervention or procedure?
2. What is the probability that a benefit or harm will occur?
3. Do benefits exceed harms? By how much?
4. What characteristics of delivery settings or patients affect the probability of a
benefit or harm?
5. How are benefits and harms distributed across time and populations?
6. How do these benefits and harms compare for alternative practices?
INFORMATION ABOUT COSTS AND SAVINGS
1. What is the production cost/price of a particular test, intervention, etc.?
What are other important costs, such as for program administration?
2. What is the total cost, given the projected number of services provided?
3. What is the cost per unit of some benefit, including not only the immediate
cost of providing service but the cost of follow-up services (for example, the
cost of screening for cancer and the cost of treating cancers that are
detected)?
4. What costs may be averted or saved (individual, total)?
5. How do costs and savings compare for alternative courses of action?
FIGURE 3-1 A possible checklist for describing benefits, risks, and costs. SOURCE:
This figure is adapted in part from the National Research Council report, Improving Risk
Communication (1989).
for projecting outcomes, the limitations in these methods, and the evidence
for such projections be described.
When empirical evidence is limited, potential effects may only be listed,
not quantitatively compared or weighed. In addition, in cases in which
patient preferences about different risks and benefits may differ, practice
guidelines will need to be sensitive to such variation, and a comprehensive
statement of net effects may have to be omitted (Mulley, 1990). In any
event, a systematic effort should be made to provide practitioners, patients,
and others with information that will help them make their own judgments
of the balance of benefits and risks.
Figure 3-1 provides a simple checklist of outcomes that might be
estimated. The particular outcomes to be considered will vary with the
clinical conditions and practices under consideration.
To support the eventual evaluation of the actual impact of guidelines,
guidelines developers should indicate what information related to outcomes
will be needed, where it can be obtained, and whether better means for
collecting and analyzing data need to be established to permit evaluation.
On this last point, limitations in the sources of data and the variables used
to project outcomes are likely to provide inspiration for recommending
improvements.
PROJECTED COSTS
Recent interest in practice guidelines is founded in part on the explicit
or implicit expectation that they can help control escalating health care
costs. The committee has already cautioned that some guidelines, if fol-
lowed, may increase short- or long-term costs and that the net cost effects of
current initiatives are not clear. These kinds of uncertainty underscore the
desirability of including some form of cost projections in the background
documentation for guidelines.
Cost estimation, like the projection of health outcomes, has its own
special technical complexities and subjective aspects that will often require
the services of outside consultants or specialized technical advisory panels.
Even with such assistance, the committee recognizes that the results will be
imperfect. In general, estimates of the costs associated with a set of guide-
lines should follow the same principles of documentation and discussion
described for the estimation of health outcomes, including comparisons of
alternative courses of care (see Figure 3-1). The remainder of this section
describes desirable elements of cost projections, elements the committee
sees as goals rather than minimum requirements.
Ideally, cost estimates should have two components, one involving
projected health care costs and the other relating to administrative costs.
The estimated health care costs of following the guidelines should reflect
(1) the estimated total number of services that will be added, substituted,
or deleted if a guideline is followed and (2) the substantiated charges (or
production costs) for these services. For example, for screening services,
the expected costs of providing the services and of treating the problems
that are detected all need to be included. Depending on the available
information and the assumptions used, estimates will often take the form
of ranges rather than point estimates.
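
The two inputs named above (the change in the number of services and the charge for each) multiply into a cost range whenever either input is uncertain. The following is a minimal sketch, assuming a screening service plus treatment of the problems it detects. Every figure and name in it is hypothetical and none comes from the report.

```python
def cost_range(volume_low, volume_high, charge_low, charge_high):
    """Return (low, high) total cost for a service with uncertain volume and unit charge.

    Assumes volumes and charges are non-negative, so the extremes of the
    product occur at the extremes of the inputs.
    """
    return volume_low * charge_low, volume_high * charge_high


# Hypothetical screening service: 8,000 to 12,000 added tests at $40 to $55 each.
screening = cost_range(8_000, 12_000, 40, 55)
# Hypothetical treatment of detected problems: 80 to 150 cases at $2,000 to $3,500 each.
treatment = cost_range(80, 150, 2_000, 3_500)

# Both components are included, as the text asks for screening services.
total_low = screening[0] + treatment[0]
total_high = screening[1] + treatment[1]
print(f"Projected added cost: ${total_low:,} to ${total_high:,}")
```

Services that a guideline would delete rather than add would enter as savings and need separate handling, because the non-negativity assumption in this sketch would no longer hold.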
If health outcomes are projected in terms of additional life expectancy
or similar measures, then the cost per unit of each identified outcome should
be projected. Again, ranges may be more suitable than point estimates. If
the guidelines indicate acceptable alternative courses of care, the total costs
of the major alternatives and their cost per unit of each expected benefit
should be described.
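
Cost per unit of outcome is a ratio, so uncertainty in either the cost or the outcome widens the resulting range. The sketch below assumes hypothetical figures and a strictly positive outcome gain. The function and variable names are illustrative only.

```python
def cost_per_unit_range(cost_low, cost_high, gain_low, gain_high):
    """Return (low, high) cost per unit of outcome.

    The lowest ratio pairs the lowest cost with the largest gain, and the
    highest ratio pairs the highest cost with the smallest gain.
    Assumes costs are non-negative and gains are strictly positive.
    """
    if gain_low <= 0:
        raise ValueError("outcome gain must be positive")
    return cost_low / gain_high, cost_high / gain_low


# Hypothetical: $480,000 to $1,200,000 in added cost for 40 to 60 added life-years.
low, high = cost_per_unit_range(480_000, 1_200_000, 40, 60)
print(f"Cost per added life-year: ${low:,.0f} to ${high:,.0f}")
```

Running the same function for each acceptable alternative course of care gives the side-by-side comparison the paragraph describes.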
Cost estimates should also consider the additional expenses that may
be associated with administering or using the guidelines. For example,
computer hardware or software may be required to support easy access to
PRETESTING
Pretesting a set of guidelines on members of the intended user group
(for example, practitioners or patients) using a real organization or a set
of prototypical cases is desirable. (See also the discussions of
reliability/reproducibility and clarity, below.) The methods, settings,
and results of any pretests of the guidelines should be described. The
Forum has been given authority to pretest guidelines, and the committee
believes it should exercise that authority.
RELIABILITY/REPRODUCIBILITY
As conventionally used in a research context, reliability is linked to the
measuring, diagnosing, or scoring of some phenomenon such as intelligence
or bacterial infection.5 In the context of guidelines, the committee uses the
concept to refer to the ability of some method or process to produce
consistent results across time or across users, or both. In strictly technical
terms, levels of reliability dictate possible (achievable) levels of validity;
that is, qualitative and quantitative instruments and tools cannot be valid
if they are not reliable.
One kind of reliability is methodological. Ideally, if another group of
qualified individuals using the same evidence, assumptions, and methods
for guidelines development (for example, the same rules for literature
review) were to develop guidelines, then they should produce essentially
the same statements. In practice, such replications are almost unknown
given the expense of the process,6 but discussion of previous trials of the
methodology (for different conditions) and any resulting revisions may be
useful. Likewise, review of the guidelines by an outside panel can help in
assessing reliability. (Recall that independent review is also important to an
assessment of validity, a fact that underscores the link between reliability
and validity.)
5 The committee discussed how two common methodological concepts, sensitivity and specificity,
applied to practice guidelines. For medical review and other evaluation criteria, these two related
terms are fairly straightforward. Sensitivity and specificity refer, respectively, to a high "true pos-
itive rate" in detecting deficient or inappropriate care and a high "true negative rate" in passing
over cases of adequate care. The concepts can be operationalized by requiring some evidence,
drawn, for example, from pretesting of the review criteria on "prototype" cases or through pilot-
testing in a specific organization. As described in Chapter 10 of the Medicare quality report,
case-finding screens have often been found to be deficient on these two attributes. The commit-
tee concluded that these attributes need to be considered for evaluation instruments but do not
add anything to the assessment of practice guidelines.
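As an illustration only, the two rates described in this note could be computed from a pretest of review criteria on prototype cases. The Python sketch below is not drawn from the report: the function name, argument names, and sample data are hypothetical, and it assumes that each pretested case carries a reference judgment about whether care was in fact deficient.

```python
def sensitivity_specificity(flagged, deficient):
    """Return (sensitivity, specificity) for a set of pretested cases.

    flagged[i]   -- True if the review criteria flagged case i (assumed input)
    deficient[i] -- True if a reference judgment found care in case i deficient
    """
    if len(flagged) != len(deficient):
        raise ValueError("both lists must describe the same cases")
    pairs = list(zip(flagged, deficient))
    true_pos = sum(1 for f, d in pairs if f and d)
    false_neg = sum(1 for f, d in pairs if not f and d)
    true_neg = sum(1 for f, d in pairs if not f and not d)
    false_pos = sum(1 for f, d in pairs if f and not d)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one deficient and one adequate case")
    # Sensitivity: share of deficient cases that the criteria flagged.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of adequate cases that the criteria passed over.
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical pretest on six prototype cases.
flagged = [True, True, False, False, True, False]
deficient = [True, False, False, False, True, True]
sens, spec = sensitivity_specificity(flagged, deficient)
```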
6 One effort at replication has been undertaken by those involved with the RAND Corporation's
work to develop appropriateness indicators (Chassin, 1990).
A second kind of reliability that is important for practice guidelines
is clinical reliability. Practice guidelines are reliable if, given the same
clinically relevant circumstances, the guidelines are interpreted and applied
consistently by practitioners (or other appropriate parties). That is, the
same practitioner, using the guidelines, makes the same basic clinical
decision under the same circumstances from one time to the next, and
different practitioners using the guidelines make the same decisions under
the same circumstances. Pretesting of guidelines in actual delivery settings
or on prototypical cases can help test this kind of reliability as well as
contribute to assessments of validity.
For medical review criteria and other specific tools for evaluating
health care actions or outcomes, the concept of reliability (or reproducibil-
ity) seems straightforward. Ideally, review criteria and other tools for
evaluating performance should be pretested to provide evidence that they
meet a specified level of reliability over time for the same user (test-retest
reliability) and between users (interrater reliability). Review criteria often
run into reliability problems when they use undefined terms such as "fre-
quent" or "serious" or "presence of comorbid conditions" that different
users may interpret quite differently. Thus, one tactic developers of guide-
lines and review criteria should use to maximize reliability is to avoid such
terms unless precise definitions are provided.
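The pretesting described above could be summarized with an agreement statistic. The report does not prescribe one; the Python sketch below uses Cohen's kappa purely as an example of a chance-corrected measure, and the function name and sample ratings are hypothetical. Passing two different reviewers' ratings gives a rough interrater figure; passing the same reviewer's ratings from two occasions gives a rough test-retest figure.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two sets of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of cases rated identically.
    observed = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b) / n
    # Expected agreement if the two sets of ratings were independent.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four prototype cases by two reviewers.
reviewer_1 = ["appropriate", "appropriate", "inappropriate", "inappropriate"]
reviewer_2 = ["appropriate", "inappropriate", "inappropriate", "inappropriate"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

A value near 1 indicates near-complete agreement and a value near 0 indicates agreement no better than chance; what counts as a "specified level of reliability" remains a judgment for the developers.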
CLINICAL APPLICABILITY
Because of the considerable resources and opportunity costs involved
in developing practice guidelines, guidelines should be written to cover as
inclusive a patient population as possible, consistent with knowledge about
critical clinical and sociodemographic factors relevant for the condition or
technology in question. For instance, a guideline should not be restricted
to Medicare patients only through age 75 or through age 85 if evidence
and expert judgment indicate that the clinical condition or the technology
in question is pertinent to those over age 85.
This attribute requires that guidelines explicitly describe the population
or populations to which statements apply. These populations may be
defined in terms of diagnosis, pathophysiology, age, gender, race, social
support systems, and other characteristics. The purpose of such a definition
is to help physicians concentrate specific services on classes of patients that
can benefit from those services and avoid such services for classes of patients
for whom the services might do harm or produce no net benefit. Again,
the relevant scientific literature needs to be cited or its absence noted.
CLINICAL FLEXIBILITY
Flexibility requires that a set of guidelines identify, where warranted,
exceptions to their recommendations. The objective of this attribute is
to allow necessary leeway for clinical judgment, patient preferences, and
clinically relevant conditions of the delivery system (including necessary
equipment and skilled personnel).7
Operationalizing this attribute may be difficult. In the committee's
view, a fairly rigorous approach should be adopted, one that requires a set
of guidelines to (1) list the major foreseeable exceptions and the rationale
for such exceptions, (2) categorize generally the less foreseeable or highly
idiosyncratic circumstances that may warrant exceptions, (3) describe the
basic information to be provided to patients and the kinds of patient
preferences that may be appropriately considered, and (4) indicate what
data are needed to document exceptions based on clinical circumstances,
patient preferences, or delivery system characteristics.
The role of patient preferences, whether considered in the context
of daily clinical practice or in the context of developing guidelines, is a
particularly complex issue. For example, there is much disagreement about
the proper behavior for practitioners faced with preferences they believe are
unreasonable or unacceptable (Brock and Wartman, 1990). Likewise, the
balance between patient preferences and societal resources is the subject
of intense debate.
A thorough treatment of this issue was not part of the committee's
charge. However, in addition to recommending that patient interests be
taken into account at several points in the process of developing guidelines,
the committee makes two observations. First, patient preference for a
service generally need not be acceded to when the service cannot be
expected to provide any benefit or when it can be expected to produce
a clear excess of harm over benefit. Second, when a mentally competent
patient unreasonably wishes (in a practitioner's view) to forego treatment,
the practitioner can try to persuade the patient to accept care but cannot,
with rare exceptions, insist on treatment.
7 Clinical applicability and clinical flexibility could be grouped together as one attribute. Keeping
them separate emphasizes the distinctions among the populations or settings that are covered by
guidelines and those that are not so covered.
CLARITY
Clarity means that guidelines are written in unambiguous language.
Their presentation is logically organized and easy to follow, and the use
of abbreviations, symbols, and similar aids is consistent and well explained.
Key terms and those subject to misinterpretation are defined. Vague clinical
language, such as "severe bleeding," should be avoided in favor of more
precise language, such as "a drop in hematocrit of more than 6 percent
in less than eight hours." Similarly, guidelines must be specific about
what populations and clinical circumstances are covered and what specific
elements of care are appropriate, inappropriate, and (if relevant) equivocal,
as those terms were defined earlier.
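One test of whether language is precise is whether it can be applied mechanically. The Python sketch below applies the hematocrit example to a series of timed readings. It is an illustration, not a clinical rule: the function name and sample data are hypothetical, and it assumes that "6 percent" means six percentage points of hematocrit rather than a relative change, an ambiguity that a guideline would itself need to resolve.

```python
from datetime import datetime, timedelta


def meets_bleeding_criterion(readings, drop_threshold=6.0,
                             window=timedelta(hours=8)):
    """True if hematocrit fell by more than drop_threshold within the window.

    readings -- list of (datetime, hematocrit in percent) pairs, any order
    """
    ordered = sorted(readings)
    for i, (t_early, hct_early) in enumerate(ordered):
        for t_late, hct_late in ordered[i + 1:]:
            if t_late - t_early >= window:
                break  # later readings are even further apart in time
            if hct_early - hct_late > drop_threshold:
                return True
    return False


# Hypothetical readings: a 7-point drop over seven hours.
fast_drop = [
    (datetime(1990, 1, 1, 8, 0), 42.0),
    (datetime(1990, 1, 1, 12, 0), 39.0),
    (datetime(1990, 1, 1, 15, 0), 35.0),
]
# The same 7-point drop spread over nine hours.
slow_drop = [
    (datetime(1990, 1, 1, 8, 0), 42.0),
    (datetime(1990, 1, 1, 17, 0), 35.0),
]
```

Under these assumptions the first series meets the criterion and the second does not, whereas "severe bleeding" gives no basis for distinguishing them.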
For practical reasons, assessments of language and modes of presen-
tation may have to be largely subjective. Depending on the audience,
somewhat different standards for assessing clarity may be needed. Materi-
als for consumers might be subject to the "readability" measures that have
been variously applied to regulations, consumer warranties, and similar
materials. Materials for practitioners may be more technical but should not
be burdened by needless jargon, awkward writing, or "unfriendly" software.
Software itself may soon allow organizations to apply computer-based "style
manuals" or "templates" to help standardize writing for different purposes
(Frankel, 1990).
MULTIDISCIPLINARY PROCESS
One of the committee's strongest recommendations is that guidelines
development include participation by representatives of key affected groups
and disciplines.8 The rationale for this position is threefold. First, multidis-
ciplinary participation increases the probability that all relevant scientific
evidence will be located and critically evaluated, thereby strengthening
the scientific grounding, scope, and flexibility of the guidelines. Second,
such participation increases the likelihood that practical problems with us-
ing guidelines will be identified and addressed, thus constructing a firmer
foundation for successful application of the guidelines in real-world situ-
ations. Third, participation helps build a sense of involvement or "own-
ership" among different audiences for the guidelines, thereby improving
the prospect for cooperation in implementing them. Figure 3-2 summa-
rizes these rationales and other key issues in developing or assessing a
participation strategy.
Among clinicians, multidisciplinary participation may call for the use
of clinicians with and without full-time academic ties, for the inclusion of
specialists and generalists, and for participation by relevant nonphysician
practitioners. Optometrists, for instance, could well have an important role
to play on panels to develop guidelines for cataract surgery. Experts in
research and analytic methods also need to be represented on guidelines
development panels; that is, methodological expertise should not be
obtained only on a contractual basis or from specialized technical advisory
panels.
8 The term multidisciplinary is used broadly here rather than narrowly; it does not refer only to
academic and professional disciplines.
WHY MULTIDISCIPLINARY PARTICIPATION IS NECESSARY
  Strengthens scientific base of guidelines
  Increases real-world utility
  Creates sense of ownership
  Expands foresight about conceptual and practical problems
WHO SHOULD PARTICIPATE
  Clinicians
    Academic and nonacademic
    Specialists and generalists
    Physicians, nurses, and others as appropriate
  Methodologists
    Experts in analyzing science bases
    Experts in group judgment methods
  Nonclinician users
    Patients, potential patients, and families
    Payers and health plan sponsors
    Peer review and quality assurance experts
    Public policymakers
HOW PARTICIPATION MIGHT OCCUR
  Membership on development panel
  Testimony at public hearings
  Review of draft guidelines
  Focus groups
  Consulting and contracting arrangements
WHEN PARTICIPATION MIGHT OCCUR
  Early: setting the goals, determining the processes
  Late: reviewing the results
  Throughout: beginning to end and into implementation and evaluation
WHAT PARTICIPANTS MIGHT CONSIDER
  Scope, quality, and evaluation of scientific evidence
  Mode of presentation
  Likely ease of application
  Identification of user problems and needs
FIGURE 3-2 Multidisciplinary participation in guidelines development.
User groups in addition to clinicians include health care administrators,
members of peer review organizations, payers, and patients or
consumers. If guidelines are expected to pertain to groups distinguished
mainly by sociodemographic characteristics (for example, age or minority
ethnic groups), special efforts are warranted to involve representatives of
those groups at some early stage of development. Successful involvement
of patients or consumers is a challenge that may require multiple strategies,
as described below.
Documentation for this attribute will need to describe the parties
involved, their credentials and potential biases, and the methods used to
solicit their views and arrive at group judgments. The committee does not
recommend, however, that the Forum develop detailed, rigid definitions
of what constitutes a consumer or other participant category. (The often
unproductive troubles such definitions created for federally funded health
planning agencies were cited during the committee discussion.)
A frequent although not necessarily valid criticism of guidelines is that
their content can be improperly manipulated by selecting group participants
for their known opinions rather than on the basis of their expertise. The
position taken here is that all participants in the guideline-setting process
are likely to have personal opinions, biases, and preferences about the
clinical problem or service at issue, and no amount of effort will expunge
those factors. What is critical is that those factors be known and balanced
insofar as possible.9
The committee discussed at some length the question of who should
develop guidelines. Some members felt quite strongly that the Forum
should not contract with medical specialty societies for guidelines develop-
ment services. Others felt that establishing such a blanket prohibition was
not the right approach. Instead, decisions should be based on a compara-
tive assessment of potential developers' track records and capacities. These
capabilities include, for example, related work that the groups or individ-
ual participants have already done, existing documentation of participants'
credentials and biases, and the methods and evidence with which they have
experience. Although the committee did not reach a specific consensus
that the Forum should completely exclude specialty societies as potential
direct contractors or subcontractors, the agency should be sensitive to the
credibility concerns raised by this question. Physician organizations in any
case should be extensively consulted by developers of guidelines, involved
in reviewing draft guidelines, and used to help disseminate guidelines.
9 The procedures of the National Academy of Sciences might serve as a model for the panel selection
process. These procedures require that members of study committees submit bias statements
and that an official of the Academy lead each committee through a member-by-member
discussion of possible biases. Major funders of a study cannot be represented on a study committee,
and every committee report must be reviewed by a panel of outside experts under the
oversight of the National Research Council.
Another debate arose during the committee's meetings over the question
of who should chair a guidelines development group. Again, some
felt that a specialist user of a particular technology (for example, a cardiac
surgeon who performs coronary artery bypass surgery) should never chair a
group developing guidelines on the use of that technology. Others felt that
exceptions to the general principle might sometimes be warranted. There
was considerable agreement that a physician should chair the development
of any clinical practice guidelines. Again, explicit attention to questions of
bias is essential.
Participation by affected groups in the process of guidelines develop-
ment can be achieved in several ways. The strongest form of participation
is membership on the panel charged with developing guidelines, but the
benefits of this approach have to be balanced against the practical management
problems created by too large a panel. Participation may also be
achieved through mechanisms other than the panel, for example, public
hearings, circulation of draft guidelines for review and comment by a wide
variety of groups, and contracts with particular interests for specific analy-
ses. Focus groups and pretests may uncover confusing language or highlight
the "hassle factor" associated with draft guidelines and allow practitioners
or patients to suggest more acceptable alternatives.
Different types of guidelines are likely to require different mechanisms
for participation, and the benefits of participation need to be balanced
against resource limitations and other constraints. Therefore, this report
stresses the principle and value of participation rather than the specific
vehicles. Creativity and experimentation should, in fact, be encouraged.
SCHEDULED REVIEW
Clinical evidence and judgment are not static. Therefore, guidelines
should designate a review date to determine whether they should be up-
dated or, potentially, withdrawn. In a clinical area where technologies are
changing rapidly and new research findings can be expected to accumulate
quickly, a relatively short timetable may be appropriate. More stable clin-
ical areas may permit a longer period before scheduled review. In every
case, however, a guideline should contain a specific review date or time
frame for review (for example, within three years of initial publication).
The greater the amount of change in a clinical area, the more the revision
process will resemble the initial development process in scope, cost, and
intensity.
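The time-frame example above (review within three years of initial publication) can be turned into a simple date check. The Python sketch below is a minimal illustration; the function names are hypothetical, and the interval would be chosen by the developers according to how quickly the clinical area is changing.

```python
from datetime import date


def review_due_date(published, years):
    """Date by which a guideline published on `published` should be reviewed."""
    try:
        return published.replace(year=published.year + years)
    except ValueError:
        # Published on February 29 and the target year is not a leap year.
        return published.replace(year=published.year + years, month=2, day=28)


def needs_review(published, years, today):
    """True once the scheduled review date has been reached."""
    return today >= review_due_date(published, years)


# Hypothetical guideline published September 1, 1990, with a three-year frame.
due = review_due_date(date(1990, 9, 1), 3)
```

A shorter interval would suit a rapidly changing clinical area and a longer one a more stable area; only the `years` argument changes.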
Follow-up on review schedules is part of the implementation process
(see Chapter 4) as is determination of whether review is needed before
the scheduled date. Unscheduled revisions may be prompted by major new
clinical evidence or by emerging or disintegrating professional consensus.
To oversee both scheduled and unscheduled reviews, an organization
responsible for the development of multiple sets of guidelines should subject
all of its guidelines to some kind of yearly examination to flag particular
guidelines for either scheduled or unscheduled review. As described
in the next chapter, the mechanisms for disseminating and administering
guidelines need to provide for guidelines updating or withdrawal.
TABLE 3-2 Provisional Documentation Checklist for Practice Guidelines
Attribute
    Item
Validity
    Projected health outcomes if guidelines are followed.
      Information required to evaluate outcomes.
    Projected costs if guidelines are followed. Information
      required to evaluate costs.
    Description of data, methods, and assumptions used to
      make projections.
    Explicit description of the relationship between the
      scientific evidence and the guidelines and explanations
      for any differences between the guidelines and the
      evidence. Explanations for any important differences
      between the guidelines in question and those
      developed by others.
    Thorough literature review describing scientific research
      including sponsors, settings, methodologies, findings
      and qualifications.
    Description of methodology for evaluating the scientific
      literature and the results.
    Explicit assessment of the quality, consistency, clarity,
      and strength of the scientific evidence.
    Description of methodology for using expert or group
      judgment as a basis for evaluating scientific evidence
      or, in the absence of evidence, reaching a consensus
      based on expert opinion.
    Explicit description of the strength of expert consensus.
    Description of procedures, participants, and findings of
      review by experts and others not involved in the
      original development process.
    Description of methods, settings, and results of any
      pretests of the guidelines.
Reliability/reproducibility
    Description of methods and results of testing (1) the
      reliability of the development method and (2) the
      reproducibility of the clinical decisions reached by
      users of the guidelines.
DOCUMENTATION
For the purposes of emphasis, the committee lists documentation as a
separate attribute even though it has already been referred to repeatedly in
the discussion of other attributes. As a practical matter, a documentation
checklist, such as the preliminary version presented in Table 3-2, may be
helpful for contractors and review panels.
TABLE 3-2 continued
Attribute
    Item
Clinical applicability
    Specification by age, sex, race, clinical diagnosis, and
      other factors of the populations to which a set of
      guidelines apply.
    Description and analysis of the scientific literature or
      expert consensus that forms the basis for statements
      about the age, sex, and other factors of the
      populations to which a set of guidelines apply.
Clinical flexibility
    Description and analysis of the scientific literature or
      expert consensus that forms the basis for statements
      about major foreseeable exceptions to application of
      the guidelines.
    Listing of the basic information to be provided to
      patients and the kinds of patient preferences that may
      be appropriately considered.
    Listing of the data needed to document exceptions based
      on clinical circumstances, patient preferences, or
      delivery system characteristics.
Clarity
    Methods and results of any testing of readability, logic, or
      understanding.
Multidisciplinary process
    Description of the parties involved in developing the
      guidelines, their credentials and interests, and the
      methods used to solicit their views or to arrive at
      group judgments.
    Description of the procedures used to subject guidelines
      to review and criticism by experts not involved in the
      original development process, with summary of results.
Scheduled review
    Timetable and method for the scheduled review.
    Description of the basis for arriving at the timetable or
      specific date.
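A contractor or review panel might use the checklist in Table 3-2 to see at a glance which attributes still lack documentation. The Python sketch below is one hypothetical way to do so: the attribute names come from the table, while the function name, the data layout, and the sample submission are assumptions made for illustration.

```python
# Attribute names taken from Table 3-2; everything else here is hypothetical.
CHECKLIST_ATTRIBUTES = [
    "Validity",
    "Reliability/reproducibility",
    "Clinical applicability",
    "Clinical flexibility",
    "Clarity",
    "Multidisciplinary process",
    "Scheduled review",
]


def missing_documentation(submitted):
    """List checklist attributes for which no documentation item was supplied.

    submitted -- dict mapping an attribute name to a list of document titles
    """
    return [name for name in CHECKLIST_ATTRIBUTES if not submitted.get(name)]


# Hypothetical submission from a guideline developer.
submission = {
    "Validity": ["Literature review", "Projected health outcomes"],
    "Clarity": ["Readability pretest results"],
    "Scheduled review": [],
}
gaps = missing_documentation(submission)
```

Such a check only records whether something was submitted under each attribute; judging whether the material is adequate still falls to the reviewers.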
CONCLUSION
This chapter has proposed eight attributes of practice guidelines that
the Forum should employ in advising its contractors and expert panels and
in assessing the quality of the guidelines it receives. The attributes are
validity, reliability (reproducibility), clinical applicability, clinical flexibil-
ity, clarity, multidisciplinary process, scheduled review, and documentation.
Definitions of these terms and some examples that may aid in their op-
erationalization are also given. Operationalization, that is, turning these
eight concepts into a practical instrument for the Forum to use in prospec-
tively assessing guidelines, is one task in a broader project that the IOM is
currently conducting (Appendix C).
Several issues about guidelines development need to be kept in mind
as the Forum proceeds. First, neither existing guidelines nor those likely to
be developed by the agency in the foreseeable future will "score well" on
all eight properties simultaneously; indeed, near-perfect scores may always
lie in the realm of aspiration rather than attainment. Second, a balance
needs to be maintained between an ideal process and a feasible one. For
example, this committee, and others, could design a very meticulous process
to take into account the views of all interested groups. At some level, that
process would consume more resources (in time, professional input, and
money) than the outputs would warrant. That is, it would be too slow, too
cumbersome to administer, and too costly to meet the needs of providers,
third-party payers, or patients. It undoubtedly would not conform to the
congressional deadlines of OBRA 89.
The third point to stress is that guidelines development must be an
evolutionary process, especially at the national (or federal) level. There
is no proven "right way" to conduct this endeavor, even if there clearly
are some "better ways." Guidelines that satisfactorily reflect the eight
attributes proposed here may not be products of an ideal process, but in
the committee's view they will be defensible.
Two other themes should be reiterated: the need for credibility among
practitioners, patients, payers, and policymakers, and the need for ac-
countability. The entire practice guidelines enterprise will not fulfill its
promise (and certainly the federal program will not) if the products lack
solid scientific grounding and widespread understanding and support from
the provider and patient communities. The significance accorded such attributes
as validity and reliability, clarity, multidisciplinary approach, and
documentation reflects the committee's concerns with these needs. Although
in the first instance the themes of credibility and accountability
apply to the procedures followed in guidelines development, they also carry
through to the procedures of implementation and evaluation, which are the
subjects of the next chapter.
REFERENCES
American College of Physicians. Clinical Efficacy Assessment Project: Procedural Manual.
Philadelphia, Pa.: 1986.
American Medical Association. Attributes to Guide the Development of Practice Parameters.
Chicago, Ill.: American Medical Association, 1990a.
American Medical Association. Preliminary Worksheet for the Evaluation of Practice
Parameters. Draft of ad hoc review panel. Chicago, Illinois, May 1990b.
Canadian Task Force on the Periodic Health Examination. Canadian Medical Association
Journal 121:1193-1254, 1979.
Battista, R., and Fletcher, S. Making Recommendations on Preventive Practices:
Methodological Issues. American Journal of Preventive Medicine 4:53-67, 1988 (Supplement).
Brock, D., and Wartman, S. When Competent Patients Make Irrational Choices. New
England Journal of Medicine 322:1595-1599, 1990.
Chassin, Mark. Presentation to the IOM Committee to Advise the Public Health Service
on Practice Guidelines. Washington, D.C., April 2, 1990.
Eddy, D. Comparing Benefits and Harms: The Balance Sheet. Journal of the American
Medical Association 263:2493-2505, 1990a.
Eddy, D. Guidelines for Policy Statements: The Explicit Approach. Journal of the American
Medical Association 263:2239-2240, 1990b.
Eddy, D. Practice Policies: Guidelines for Methods. Journal of the American Medical
Association 263:1839-1841, 1990c.
Eddy, D. Practice Policies: What Are They? Journal of the American Medical Association
263:877-880, 1990d.
Eddy, D. Practice Policies: Where Do They Come From? Journal of the American Medical
Association 263:1265-1275, 1990e.
Eddy, D. Designing a Practice Policy: Standards, Guidelines, and Options. Journal of the
American Medical Association, forthcoming (a).
Eddy, D. A Manual for Assessing Health Practices and Designing Practice Policies. American
College of Physicians, forthcoming (b).
Eddy, D., and Billings, J. The Quality of Medical Evidence and Medical Practice. Paper
prepared for the National Leadership Commission on Health, Washington, D.C., 1988.
Fink, A., Kosecoff, J., Chassin, M., et al. Consensus Methods: Characteristics and
Guidelines for Use. American Journal of Public Health 74:979-983, 1984.
Frankel, S. Hello, Mr. Chips: PCs Learn English. Washington Post, April 29, 1990, p. D3.
Gottlieb, L., Margolis, C., and Schoenbaum, S. Clinical Practice Guidelines at an HMO:
Development and Implementation in a Quality Improvement Model. Quality Review
Bulletin 16:80-86, 1990.
Institute of Medicine. Effects of Clinical Evaluation on the Diffusion of Medical Technology.
Chapter 4 in Assessing Medical Technologies. Washington, D.C.: National Academy
Press, 1985.
Institute of Medicine. Acute Myocardial Infarction: Setting Priorities for Effectiveness Research.
Washington, D.C.: National Academy Press, 1990a.
Institute of Medicine. Breast Cancer: Setting Priorities for Effectiveness Research. Washington,
D.C.: National Academy Press, 1990b.
Institute of Medicine. Hip Fracture: Setting Priorities for Effectiveness Research. Washington,
D.C.: National Academy Press, 1990c.
Institute of Medicine. Medicare: A Strategy for Quality Assurance, Lohr, K., ed. Washington,
D.C.: National Academy Press, 1990d.
Institute of Medicine. National Priorities for the Assessment of Clinical Conditions and
Medical Technologies, Lara, M., and Goodman, C., eds. Washington, D.C.: National
Academy Press, 1990e.
Institute of Medicine. Workshop to Improve Group Judgment for Medical Practice and
Technology Assessment, Washington, D.C., May 15-16, 1990f.
Lomas, J. Words Without Action? The Production, Dissemination and Impact of Consensus
Recommendations. Draft paper (dated May 1990) prepared for the Annual Review of
Public Health, Vol. 12, Omenn, G., ed. Palo Alto, Calif., forthcoming.
Mulley, A. Presentation to the Workshop to Improve Group Judgment for Medical Practice
and Technology Assessment, Washington, D.C., May 15, 1990.
National Research Council. Improving Risk Communication. Washington, D.C.: National
Academy Press, 1989.
Park, R., Fink, A., Brook, R., et al. Physician Ratings of Appropriate Indications for Six
Medical and Surgical Procedures. R-3280-CWF/HF/PMT/RWJ. Santa Monica, Calif.:
The RAND Corporation, 1986. See also the same authors and same title in the
American Journal of Public Health 76:766-772, 1986.
U.S. Preventive Services Task Force. Guide to Clinical Preventive Services: An Assessment of
the Effectiveness of 169 Interventions. Baltimore, Md.: Williams & Wilkins, 1989.