Clinical Practice Guidelines We Can Trust

5 Current Best Practices and Standards for Development of Trustworthy CPGs: Part II, Traversing the Process

Abstract: This chapter is devoted to the remaining steps in the guideline development process, including standards for establishing evidence foundations for and rating of strength of recommendations, articulation of recommendations, external review, and updating. The committee believes clinical practice guidelines (CPGs) should comply with all eight proposed standards contained within Chapters 4 and 5 to be considered trustworthy. The committee recommends that CPG developers adhere to these standards, and that CPG users adopt CPGs compliant with these standards. However, the committee is sympathetic to the time and other resource requirements the standards impose. Complying with the full body of standards may not be feasible immediately for guideline developers, and a process of evolutionary adoption over time may be more practicable. Importantly, whether evidence is lacking or abundant in a particular clinical domain, the committee expects guideline development groups to aspire to meet all standards.

INTRODUCTION

Like Chapter 4, Chapter 5 arose from the committee's adoption of the standards-setting methodologies elaborated in Chapter 1. This chapter is devoted to the remaining domains of the guideline development process: establishing evidence foundations for and rating strength of recommendations, articulation of recommendations, external review, and updating.

ESTABLISHING EVIDENCE FOUNDATIONS FOR AND RATING STRENGTH OF RECOMMENDATIONS

Appraising Evidence Quality and Recommendation Strength: Fundamentals

Clinical practice guidelines (CPGs) fundamentally rest on appraisal of the quality of relevant evidence, comparison of the benefits and harms of particular clinical recommendations, and value judgments regarding the importance of specific benefits and harms. Historically, value judgments regarding potential outcomes have been made implicitly rather than explicitly, and the basis for judgments regarding the quality of evidence and strength of a recommendation has often been unclear. As a result, many CPG developers now apply formal approaches to appraising both evidence quality and the strength of recommendations (Ansari et al., 2009; Schünemann et al., 2006a; Shekelle et al., 2010).

Although much has been written about the concept of "quality of evidence," there continues to be considerable variability in what the term is used to describe. Ultimately, the term is used to describe the level of confidence or certainty in a conclusion regarding the issue to which the evidence relates. Historically, as detailed hereafter, the notion of quality has emphasized research design, so that evidence quality evaluations arose from the inherent rigor of study designs (e.g., RCT vs. uncontrolled case series). This certainty or confidence is frequently expressed by assigning a score, rating, or grade (typically in the form of numerals, letters, symbols, or words) to the quality of evidence.
Although critically important, evidence quality as it often has been construed is not the only factor that needs to be considered when drawing a conclusion regarding optimal clinical practice. Other considerations include the relevance of available evidence to a patient with particular characteristics; the quantity (i.e., volume and completeness) and consistency (i.e., conformity of findings across investigations) of available evidence; the nature and estimated magnitude of the particular impacts of an individual clinical practice; and value judgments regarding the relative importance of those different impacts (Verkerk et al., 2006).
Clinical practice recommendations typically are based on consideration of a body of evidence, as well as clinical judgments extending from experience and potential variation in patient preferences. For example, high-quality evidence from well-designed and well-conducted clinical trials demonstrates that administration of oral anticoagulants to patients with a first spontaneous deep vein thrombosis reduces the risk of recurrent thromboembolic events. Yet differences in patients' risk of bleeding complications, and in patient value judgments regarding the harms associated with oral anticoagulation therapy (including bleeding risk and the inconvenience of taking medication and monitoring anticoagulation levels), permit only a weak recommendation regarding whether all patients with a first spontaneous deep vein thrombosis should be treated with oral anticoagulants (Buller et al., 2004).

Economic value also can be included in the strength-of-recommendation decision process, as it relates to patients' out-of-pocket costs or overall healthcare spending. For a healthcare intervention to have value, clinical and economic benefits need to be greater than clinical harms and economic costs. Although value is a common term in health care, it has not been defined or studied in a way that is widely accepted by the healthcare evidence community. Value rarely is considered in CPGs, yet the committee acknowledges that patient preferences are often based in part on out-of-pocket costs that may affect personal decisions about alternative care options (Luce et al., 2010).

Consideration of these latter factors, as well as the fact that evidence regarding several different issues needs to be considered by CPG developers, has given rise to the concept of the strength of a recommendation regarding a particular patient management issue.
The strength of a recommendation reflects the degree of confidence that the desirable outcomes of adhering to the recommendation outweigh the undesirable ones. Like evidence quality, this certainty or confidence is captured by a score, rating, or grade (commonly taking the form of numerals, letters, symbols, or words) assigned to the clinical recommendation (Swiglo et al., 2008).

The appraisal of CPG evidence and recommendations presents considerable complexity, and a number of alternative strategies have been developed for these purposes. The literature demonstrates variability in rating of the same evidence when differing appraisal systems are employed, and variability in rating when identical systems are applied to identical evidence by different individuals (Ferreira et al., 2002). Judgments employed in translating evidence into a clinical recommendation are even more variable than those applied to evidence quality because their subjectivity (e.g., comparing disparate benefits and harms) is even greater (Calonge and Harris, 2010). Yet the literature also suggests that a reduction in variability may be achieved through structured, explicit approaches (Uhlig et al., 2006). Additionally, there is consensus among most guideline developers that standardized rating of evidence quality facilitates the balancing of benefits and harms requisite to healthcare decision making and guideline recommendation formulation. Furthermore, some have argued that an explicit, systematic scheme for assessing evidence quality and strength of recommendations likely results in fewer errors in judgment, greater facility in evaluating such judgments, and improved communication of related content by guideline developers (Atkins et al., 2004).

CPG users need to understand the evidentiary basis of, and value judgments associated with, particular recommendations (Schünemann et al., 2006a). Over the past decade, guideline developers have recognized the value of an efficient summary of the strength of recommendations, and of the quality of evidence buttressing them, in enhancing clinicians' comprehension of a CPG's basic clinical message (Swiglo et al., 2008). Moreover, a small empirical literature suggests that adopters of clinical guidelines' healthcare recommendations prefer detailed, explicit knowledge about the underlying quality of evidence and strength of recommendations (Akl et al., 2007; Shekelle et al., 2010).

Rating Quality of Evidence and Strength of Recommendation: State of the Art

Rating of healthcare recommendations, specifically, began with the Canadian Task Force on the Periodic Health Examination more than three decades ago (Anonymous, 1979).
The scheme was founded exclusively on study design, with randomized controlled trials (RCTs) classified as good (Level I) evidence; cohort and case-control studies as fair (Level II); and expert opinion as poor (Level III) evidence. Recommendation strength was derived directly from the quality of evidence, so that a strong recommendation (i.e., A) was based on good (i.e., Level I) evidence. The attractiveness of the Canadian Task Force approach was its simplicity and attendant ease of comprehension, application, and presentation. However, this approach did not consider how well a particular type of study (e.g., RCT) was designed or executed, or the number of patients included in particular studies. Furthermore, the rating applied only to the quality of evidence; the Canadian Task Force made no effort to rate the strength of its recommendations (e.g., the balance of benefits and harms) (Atkins et al., 2004).

Numerous systems for appraising quality of evidence and strength of recommendations have evolved since, representing the efforts of multiple, varied entities involved in guideline development. These systems range from the simple, founded exclusively on research design and ignoring methodological details of studies, consistency of effects, and the clinical relevance and generalizability of the patient population studied, to the more structured, which move beyond research design to the complexity of methods and the subjectivity of their appraisal. These schemes also vary with respect to the audiences and clinical foci they address. Overall, however, the approaches share two components. The first is a strategy for rating the evidence, resulting in the assignment of an ordinal score (e.g., good, fair, poor; A, B, C; 1++, 1+, 1−, 2++, 2+, 2−, 3, 4) driven by the methodological quality (e.g., RCTs without important limitations, RCTs with important limitations, observational studies, case series) of the available evidence. The second is a strategy for rating recommendation strength, resulting in the assignment of a dichotomous or ordinal score (e.g., strong recommendation, weak recommendation; A, B, C, D; GRADE I, GRADE II, GRADE III) derived from consideration of evidence quality and the trade-offs between a recommendation's benefits and harms.

In general, when CPG developers are confident that the beneficial effects of adherence to a recommendation outweigh the harms, a strong recommendation can be made. A strong recommendation commonly depends on high- or moderate-quality evidence regarding important patient outcomes. Much less often, CPG developers may offer strong recommendations on the basis of low- to very low-quality evidence.
Such a recommendation reflects guideline development group (GDG) confidence that the benefits of the recommendation clearly outweigh its harms, or vice versa. A weak recommendation, on the other hand, commonly arises from a development group judgment that the benefits of a recommendation outweigh its harms but confidence in this balance is not high (e.g., benefits and harms are closely balanced, or the balance is uncertain). Hence, low, very low, or even very high evidence quality may result in weak recommendations due to a complex or uncertain benefits/harms trade-off (Swiglo et al., 2008). Further specifications of rating schemes are captured within a selection of prominent approaches provided in Appendix D.

Although the literature argues in support of a mechanism for scoring quality of evidence and strength of recommendations, and a vast majority of GDGs apply one, we noted earlier the specific challenges in their application (Schünemann et al., 2006a). In addition, there is widespread agreement that the area of appraisal overall is "besieged with problems" (Kavanagh, 2009). In 2004, Atkins and colleagues conducted a comparison of six well-respected systems: those of the American College of Chest Physicians, Australian National Health and Medical Research Council, Oxford Centre for Evidence-Based Medicine, Scottish Intercollegiate Guidelines Network, U.S. Preventive Services Task Force, and U.S. Task Force on Community Preventive Services (Atkins et al., 2004). Atkins and colleagues also identified a number of additional systems in use by 51 organizations, each of which had developed from 2 to more than 10 CPGs and applied an explicit scheme to assess the quality of evidence or strength of recommendations. These additional systems reflect, with slight variations, the six approaches fully investigated by the authors. The authors' findings are based on assessments of all 6 systems by 12 independent evaluators applying 12 indicators of system "sensibility," or overall utility. These analyses uncovered poor agreement among assessors (Atkins et al., 2004), and still others claim the discord is indicative of the questionable validity of any unique scheme (Kavanagh, 2009).

Atkins and coauthors (2004) offer detailed qualitative insight into the state of the art of evidence quality and recommendation strength assessment. Their evaluation indicates the following:

- No one system was uniformly endorsed as clear and simple, and the clearer a system, the less likely it was simple to apply.
- For most approaches, the data necessary to employ them would at least sometimes be unavailable.
- All systems were missing at least one critical dimension.
- Although certain systems were considered to have some ability to discriminate, none was regarded as likely to clearly articulate the difference between quality of evidence and strength of recommendations.
- There was uncertainty regarding the reproducibility of assessments made using any of the tools (Atkins et al., 2004).

Based on these findings, and in pursuit of an improved strategy for appraising evidence quality and strength of recommendations, the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach was published in 2004 (Atkins, 2004). GRADE has been adopted "unchanged or with only minor modifications" by a large number and variety of organizations, including governments, professional medical societies, and UpToDate, an online medical resource used by a majority of U.S. academic medical centers (Schünemann et al., 2006b). GRADE's advantages include its (1) applicability across a great variety of clinical areas (e.g., prevention and therapy); (2) accounting for individual preferences and values; and (3) treatment of the quality of evidence and the strength of recommendation in a transparent, explicit manner. CPGs and recommendations applying the approach typically increase users' understanding of the rationale for the derivation of CPG recommendations (Calonge and Harris, 2010).

However, as in the case of the larger body of appraisal tools, criticism has been directed at GRADE, much of it reflecting the issues raised herein. As indicated above, a feature common to all rating systems is the part played by individual judgment, and although judgment criteria are well specified in GRADE, the identical body of evidence can be appraised differently by judges with different individual biases or values. Furthermore, although GRADE explicitly describes the means by which a recommendation is reached, the system may result in discordance in translating evidence into recommendations among GDGs, and potentially within a single group across varying clinical actions (Calonge and Harris, 2010). In fact, empirical assessment of the reliability of GRADE, conducted by the authors of the system, found very low inter-rater agreement for quality-of-evidence judgments. And although the theoretical underpinnings of GRADE are provided in multiple publications (Atkins, 2004; Atkins et al., 2004, 2005; Guyatt et al., 2006a,b, 2008b; Schünemann et al., 2006b), empirical assessment of the validity of GRADE is absent from the literature.
Derived from GRADE is the American College of Physicians (ACP) system for appraising evidence quality and strength of recommendations. The ACP judges evidence to be of high quality when it is based on one or more well-designed and well-conducted RCTs giving rise to consistent findings directly applicable to the target population (Qaseem et al., 2010). Moderate-quality evidence is that derived from RCTs characterized by significant deficiencies (e.g., large losses to follow-up, lack of blinding); indirect evidence arising from similar populations; and RCTs that include a small number of subjects or observed events. Additionally, well-designed nonrandomized controlled trials, well-designed cohort or case-control analytic studies, and multiple time-series designs constitute moderate-quality evidence. Low-quality evidence commonly derives from observational investigations, yet such evidence may be regarded as moderate or even high, as determined by specifics of research methods (e.g., a dose–response relationship, a large observed effect).

ACP guideline recommendations are graded as strong or weak. A strong recommendation indicates that benefits clearly outweigh harms, or that harms clearly outweigh benefits. Weak recommendations result from precariously balanced benefits and harms or a high level of uncertainty regarding the magnitude of benefits and harms. Lastly, when the evidence supporting or opposing a clinical action is lacking, conflicting, or of poor quality, the ACP rates the recommendation as "insufficient evidence to determine net benefits or risks" because the balance of benefits and harms cannot be determined (Qaseem et al., 2010, p. 196). The ACP's detailed interpretation of its system for grading the quality of evidence and strength of recommendations, provided in Table 5-1 below, depicts and defines elements basic to appraisal and to understanding the relationships between evidence quality and recommendation strength. It also highlights the implications of those relationships for clinical practice.

Currently available approaches to rating evidence quality and strength of recommendation are useful, but not adequate. They provide transparent, systematic frameworks for deriving clinical recommendations from consideration of evidence quality, in contrast to an unsystematic, implicit, non-transparent, intuitive approach. These strategies thereby allow inspection of the methods and judgments involved in translating evidence into clinical recommendations, increasing the trustworthiness of CPGs (Ansari et al., 2009; Calonge and Harris, 2010; Kavanagh, 2009).
As one aspect of establishing evidence foundations for, and ultimately deriving, evidence-based, clinically valid recommendations, the committee supports the adoption of systematic methods for rating quality of evidence and strength of recommendations that include the elements discussed above.

Integrating Guideline Development Group Values

Explaining Variation in Evidence Interpretation

CPG development usually requires interpretation of evidence regarding many different issues. Recommendations addressing the same topic therefore may vary among guidelines. This is especially so when evidence is limited or of low quality, because judgment is then more likely to come into play (Burgers et al., 2002).
TABLE 5-1 Interpretation of the American College of Physicians' Guideline Grading System

Strong recommendation; high-quality evidence
  Benefit versus risks and burdens: Benefits clearly outweigh risks and burden, or vice versa.
  Methodological quality of supporting evidence: Randomized controlled trials (RCTs) without important limitations, or overwhelming evidence from observational studies.
  Interpretation: Strong recommendation; can apply to most patients in most circumstances without reservation.
  Implications: For patients, would want the recommended course of action and only a small proportion would not; a person should request discussion if the intervention was not offered.

Strong recommendation; moderate-quality evidence
  Benefit versus risks and burdens: Benefits clearly outweigh risks and burden, or vice versa.
  Methodological quality of supporting evidence: RCTs with important limitations (inconsistent results; methodological flaws; indirect or imprecise) or exceptionally strong evidence from observational studies.
  Implications: For clinicians, most patients should receive the recommended course of action. For policy makers, the recommendation can be adopted as a policy in most situations.

Strong recommendation; low-quality evidence
  Benefit versus risks and burdens: Benefits clearly outweigh risks and burden, or vice versa.
  Methodological quality of supporting evidence: Observational studies or case series.
  Interpretation: Strong recommendation, but may change when higher-quality evidence becomes available.

Weak recommendation; high-quality evidence
  Benefit versus risks and burdens: Benefits closely balanced with risks and burden.
  Methodological quality of supporting evidence: RCTs without important limitations, or overwhelming evidence from observational studies.
  Interpretation: Weak recommendation; best action may differ depending on circumstances or patients' or societal values.
  Implications: For patients, most would want the recommended course of action, but some would not; a decision may depend on an individual's circumstances.

Weak recommendation; moderate-quality evidence
  Benefit versus risks and burdens: Benefits closely balanced with risks and burden.
  Methodological quality of supporting evidence: RCTs with important limitations (inconsistent results, methodological flaws, indirect, or imprecise) or exceptionally strong evidence from observational studies.
  Implications: For clinicians, different choices will be appropriate for different patients, and a management decision consistent with a patient's values, preferences, and circumstances should be reached. For policy makers, policy making will require substantial debate and involvement of many stakeholders.

Weak recommendation; low-quality evidence
  Benefit versus risks and burdens: Uncertainty in the estimates of benefits, risks, and burden; benefits, risks, and burden may be closely balanced.
  Methodological quality of supporting evidence: Observational studies or case series.
  Interpretation: Very weak recommendation; other alternatives may be equally reasonable.

Insufficient
  Benefit versus risks and burdens: Balance of benefits and risks cannot be determined.
  Methodological quality of supporting evidence: Evidence is conflicting, of poor quality, or lacking.
  Interpretation: Insufficient evidence to recommend for or against routinely providing the service.
  Implications: For patients, clinicians, and policy makers alike, decisions based on evidence from scientific studies cannot be made.

SOURCE: Qaseem et al. (2010).
Eisinger and coauthors (1999) investigated U.S. and French consensus statements regarding breast and ovarian cancer and identified important distinctions in clinical recommendations, particularly in the face of clinical uncertainty. Both consensus statements indicated that mastectomy and oophorectomy are reasonable options for women at high cancer risk, even given inadequate evidence and demonstrations of later breast or ovarian cancer development in women undergoing the procedures. However, the recommendations are vastly different. The French guidelines assert that physicians should "oppose" prophylactic mastectomy in women under age 30 and prophylactic oophorectomy in women under age 35, and that these treatment options should be considered only when breast cancer risk is greater than 60 percent and ovarian cancer risk is greater than 20 percent. In the United States, informed choice is adequate justification to perform both surgeries. Eisinger and coauthors (1999) suggested that delegating decision making to patients is less palatable to the French medical community; conversely, the French stance of clinician opposition would be perceived as paternalistic by American patients and providers, who are embedded in a context where patient preferences and participatory decision making are highly valued. However, even within national borders, credible guideline development groups reach contrasting conclusions despite a common evidence base, as Box 5-1 illustrates.

Burgers and colleagues investigated 15 Type 2 diabetes CPGs from 13 countries in an attempt to identify the variables influencing clinical recommendations (Burgers et al., 2002). In essence, the authors corroborated prior findings in determining that research evidence is not always the most important contributor to practice guideline recommendation content. Instead, their results demonstrate that there is little consistency in the studies selected for review.
References serving as evidentiary foundations for recommendations were highly variable across the 15 guidelines investigated. Specifically, for any single CPG, only 18 percent of citations were shared with any other guideline, and only 1 percent of citations overlapped across six or more guidelines. In spite of this, the level of concordance among guideline recommendations was strong, with a high degree of international consensus on the clinical care of Type 2 diabetes. Burgers and coauthors assert that "Guideline development is a social as well as technical process that is affected by access to and choice of research evidence and decisions about the interpretation of evidence and formulation of recommendations … guidelines go beyond simple reviews of available evidence and necessarily reflect value judgments in considering all the issues …"
7. External Review

7.1 External reviewers should comprise a full spectrum of relevant stakeholders, including scientific and clinical experts, organizations (e.g., health care, specialty societies), agencies (e.g., federal government), patients, and representatives of the public.

7.2 The authorship of external reviews submitted by individuals and/or organizations should be kept confidential unless that protection has been waived by the reviewer(s).

7.3 The GDG should consider all external reviewer comments and keep a written record of the rationale for modifying or not modifying a CPG in response to reviewers' comments.

7.4 A draft of the CPG at the external review stage or immediately following it (i.e., prior to the final draft) should be made available to the general public for comment. Reasonable notice of impending publication should be provided to interested public stakeholders.

UPDATING

Clinical practice guideline recommendations often require updating, although how often and by what process are debated. For certain clinical areas, frequent updating may be necessary given a preponderance of new evidence affecting treatment recommendations. Johnston et al. concluded that for purposes of updating cancer care guidance, a quarterly literature search was appropriate, although the yield of such searches varied across cancer guideline topics (Johnston et al., 2003). A review process detailed on the National Comprehensive Cancer Network (NCCN) website includes continuous institutional review, whereby each NCCN panel member is sent the current year's guideline for distribution to institutional experts for comment. Additionally, an annual panel review takes place, a full-day meeting is convened every 3 years, and conference calls or in-person meetings are conducted for updates between meetings (NCCN, 2003).
However, as alluded to above, there is evidence that recurrent updating may not be an efficient activity in all clinical areas. In a 2002 study of updated (from 1994/95 to 1998/99) primary care evidence-based guidelines on angina and asthma in adults, Eccles stated:

The fact that recommendations were not overturned and only one new drug treatment emerged suggests that, over the 3-year period from initial development to updating, the evidence base for both guidelines was relatively stable. This, plus the fact that there were few financial savings to be made within the updating process, highlights the questions of how frequently the updating process should be performed and whether or not it should be performed in its entirety or only in new areas. (Eccles, 2002, p. 102)

Shekelle et al. (2001) argued there are six situations (termed the "situational" approach) that may necessitate the updating of a clinical practice guideline:

- Changes in evidence on the existing benefits and harms of interventions
- Changes in outcomes considered important
- Changes in available interventions
- Changes in evidence that current practice is optimal
- Changes in values placed on outcomes
- Changes in resources available for health care

Changes in values placed on outcomes often reflect societal norms. Measuring values placed on outcomes, and how these change over time, is complex and has not been systematically studied. When changes occur in the availability of resources for health care or in the costs of interventions, a generic policy on updating is unlikely to be helpful, because policy makers in disparate healthcare systems consider different factors in deciding whether services remain affordable.

Most empirical effort in this area has been directed to defining when new evidence on interventions, outcomes, and performance justifies updating guidelines. This process includes two stages: (1) identifying significant new evidence, and (2) assessing whether the new evidence warrants updating. Within any individual guideline, some recommendations may be invalid while others remain current. A guideline on congestive heart failure, for example, includes 27 individual recommendations related to diagnosis (Jessup et al., 2009). How many must be invalid to require updating the entire guideline?
Clearly a guideline requires updating if a majority of its recommendations are out of date, with current evidence demonstrating that recommended interventions are inappropriate, ineffective, superseded by new interventions, or no longer (or newly) generalizable to a particular population. In other cases a single outdated recommendation could invalidate an entire document. Typically, Eccles reported in 2002, no systematic process exists to help determine whether, and in what areas, researchers have published significant new evidence (Eccles, 2002). Judgments about whether a guideline's recommendation(s) require updating typically are inherently subjective, reflecting the clinical importance and number of invalid recommendations.

In a relatively unusual empirical exercise, Shekelle and colleagues (2001) applied the six situational criteria presented above to assess the need for updating 17 clinical guidelines published by the Agency for Healthcare Research and Quality. They found that seven guidelines were so out of date a major update was required; six guidelines required a minor update; three guidelines remained valid; and one guideline's update needs were inconclusive. The authors concluded that, as a general rule, guidelines should be reevaluated no less frequently than every 3 years. Perhaps not coincidentally, in an evaluation of the need for updating systematic reviews, Shojania and colleagues found that nearly one quarter of systematic reviews are likely to be outdated 2 years after publication (Shojania et al., 2007). Shekelle and coauthors' (2001) methods provide for a balancing of the costs and benefits of guideline updating, from the perspective that full redevelopment is not always appropriate.

Gartlehner and colleagues (2004) directly addressed this issue by comparing the Shekelle et al. "situational" approach to a "traditional" updating strategy (comparable to de novo guideline development) across six topics from the 1996 U.S. Preventive Services Task Force Guide to Clinical Preventive Services (USPSTF, 1996). The authors examined completeness of study identification, importance of studies missed, and resources required. Gartlehner and coauthors demonstrated that "Although the [Shekelle] approach identified fewer eligible studies than the traditional approach, none of the studies missed was rated as important by task force members acting as liaisons to the project with respect to whether the topic required an update.
On average, the [Shekelle] approach produced substantially fewer citations to review than the traditional approach. The effort involved and potential time savings depended largely on the scope of the topic." On the basis of these findings, Gartlehner and coauthors concluded that "The [Shekelle] approach provides an efficient and acceptable method for judging whether a guideline requires updating" (Gartlehner et al., 2004, p. 399). From the time it publishes a CPG, the ACC/AHA Guidelines Task Force requires that a research analyst and committee chair monitor significant new clinical trials and peer-reviewed literature and compare current guideline recommendations against the latest evidence on the topic. At the behest of the entire guideline-writing committee, a full revision of the guideline is required when there have been at least two previous focused updates and/or new evidence suggests that a significant number of recommendations require revision. Revisions
are managed as new guidelines, except for writing committee selection, where half of the previous writing committee is rotated off to allow for the inclusion of new members (ACCF and AHA, 2008). Similar methods have been enshrined within the processes of other guideline development programs. In the United Kingdom, NICE recommends a combination of literature searching and professional opinion to inform the need for "full" or "partial" updates and describes related processes. Changes in relevant evidence as well as in guideline scope (outcomes of important or available interventions) are emphasized, and the assessment of update need occurs every 3 years. Guidelines admitted to the National Guideline Clearinghouse are required to have been reexamined within the previous 5 years (NGC, 2010a). Overall, another point to emphasize is that "Many guidelines in current use were developed before criteria were available to evaluate guideline quality. Efforts to improve quality should not be limited to frequent updates of the underlying evidence review, but should incorporate other guideline improvements during the revision process" (Clark et al., 2006, p. 166). Moreover, harmonization of guidelines from different development groups may also be an appropriate consideration at the time of updating.

8. Updating

8.1 The CPG publication date, date of pertinent systematic evidence review, and proposed date for future CPG review should be documented in the CPG.

8.2 Literature should be monitored regularly following CPG publication to identify the emergence of new, potentially relevant evidence and to evaluate the continued validity of the CPG.

8.3 CPGs should be updated when new evidence suggests the need for modification of clinically important recommendations.
For example, a CPG should be updated if new evidence shows that a recommended intervention causes previously unknown substantial harm; that a new intervention is significantly superior to a previously recommended intervention from an efficacy or harms perspective; or that a recommendation can be applied to new populations.

CONCLUSION

For a clinical practice guideline to be deemed trustworthy, the committee believes that adherence to the proposed development
standards articulated within Chapters 4 and 5 is essential, and thus recommends the following:

RECOMMENDATION: TRUSTWORTHINESS OF CPG DEVELOPMENT PROCESS

To be trustworthy, a clinical practice guideline should comply with proposed standards 1–8. Optimally, CPG developers should adhere to these proposed standards and CPG users should adopt CPGs compliant with these proposed standards.

In total, the committee's standards reflect best practices across the entire development process and thus comprise those relevant to establishing transparency; management of conflict of interest; development team composition and process; the clinical practice guideline–systematic review intersection; establishing evidence foundations for and rating strength of recommendations; articulation of recommendations; external review; and updating. Although the committee strongly supports compliance with the eight standards proposed herein, it is also sympathetic to the time and other resource requirements the standards imply. It may not be feasible, for example, for guideline developers to comply immediately with the full body of standards, and a process of evolutionary adoption over time may be more practicable. Additionally, certain standards, such as those directed to patient and public involvement in the CPG development process and to external review, may appear particularly resource intensive. The committee urges developers to comply with such standards while taking care to adopt each of their key elements (e.g., strategies to increase effective participation of patient and consumer representatives) so that efficiencies may be increased. Finally, the committee understands that the uniqueness of guideline development contexts may seem to preclude certain developers from fully adhering to the standards the committee has proposed.
For example, certain clinical areas (e.g., rare malignant tumors) are characterized by an exceptional dearth of scientific literature and an urgent need to deliver patient care. The committee recognizes that developers in this instance may conclude they are unable to comply with Standard 4.1: “Clinical practice guideline developers should use systematic reviews that meet standards set by the Institute of Medicine’s Committee on Standards for Systematic Reviews of Comparative Effectiveness Research.” However, SRs that conclude there are no high-quality RCTs or observational studies
on a particular clinical question would still fulfill Standard 4. In all cases, whether evidence is limited or abundant, GDGs should comply with the complementary Standard 5, "Establishing Evidence Foundations for and Rating Strength of Recommendations," by providing a summary of relevant available evidence (and evidentiary gaps); descriptions of the quality (including applicability), quantity (including completeness), and consistency of the aggregate available evidence; an explanation of the part played by values, opinion, theory, or clinical experience in deriving recommendations; a judgment regarding the level of confidence in (certainty regarding) the evidence underpinning the recommendations; and a rating of the strength of recommendations.

REFERENCES

ACCF and AHA (American College of Cardiology Foundation and American Heart Association). 2008. Methodology manual for ACCF/AHA guideline writing committees. In Methodologies and policies from ACCF/AHA Task Force on Practice Guidelines. ACCF and AHA.

AGREE (Appraisal of Guidelines for Research & Evaluation). 2001. Appraisal of Guidelines for Research & Evaluation (AGREE) Instrument.

AHRQ (Agency for Healthcare Research and Quality). 2008. U.S. Preventive Services Task Force procedure manual. AHRQ Publication No. 08-05118-EF. http://www.ahrq.gov/clinic/uspstf08/methods/procmanual.htm (accessed February 16, 2009).

Akl, E. A., N. Maroun, G. Guyatt, A. D. Oxman, P. Alonso-Coello, G. E. Vist, P. J. Devereaux, V. M. Montori, and H. J. Schünemann. 2007. Symbols were superior to numbers for presenting strength of recommendations to health care consumers: A randomized trial. Journal of Clinical Epidemiology 60(12):1298–1305.

Anonymous. 1979. Canadian Task Force on the Periodic Health Examination: The periodic health examination. CMAJ 121:1193–1254.

Ansari, M. T., A. Tsertsvadze, and D. Moher. 2009. Grading quality of evidence and strength of recommendations: A perspective. PLoS Medicine 6(9):e1000151.

Atkins, D. 2004. Grading quality of evidence and strength of recommendations. BMJ 328(7454):1490.

Atkins, D., M. Eccles, S. Flottorp, G. H. Guyatt, D. Henry, S. Hill, A. Liberati, D. O'Connell, A. D. Oxman, B. Phillips, H. Schünemann, T. T. T. Edejer, G. E. Vist, and J. W. Williams, Jr. 2004. Systems for grading the quality of evidence and the strength of recommendations I: Critical appraisal of existing approaches. BMC Health Services Research 4(38):1–7.

Atkins, D., P. Briss, M. Eccles, S. Flottorp, G. Guyatt, R. Harbour, S. Hill, R. Jaeschke, A. Liberati, N. Magrini, J. Mason, D. O'Connell, A. Oxman, B. Phillips, H. Schünemann, T. Edejer, G. Vist, J. Williams, and The GRADE Working Group. 2005. Systems for grading the quality of evidence and the strength of recommendations II: Pilot study of a new system. BMC Health Services Research 5(1):25.

Bogardus, S. T., Jr., E. Holmboe, and J. F. Jekel. 1999. Perils, pitfalls, and possibilities in talking about medical risk. JAMA 281(11):1037–1041.
Boyd, C. 2010. CPGs for people with multimorbidities. Presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, November 11, Washington, DC.

Boyd, C. M., J. Darer, C. Boult, L. P. Fried, L. Boult, and A. W. Wu. 2005. Clinical practice guidelines and quality of care for older patients with multiple comorbid diseases: Implications for pay for performance. JAMA 294(6):716–724.

Boyd, C. M., C. O. Weiss, J. Halter, K. C. Han, W. B. Ershler, and L. P. Fried. 2007. Framework for evaluating disease severity measures in older adults with comorbidity. Journals of Gerontology Series A: Biological Sciences and Medical Sciences 62(3):286–295.

Braddock, C. H., III, K. A. Edwards, N. M. Hasenberg, T. L. Laidley, and W. Levinson. 1999. Informed decision making in outpatient practice: Time to get back to basics. JAMA 282(24):2313–2320.

Bruera, E., J. S. Willey, J. L. Palmer, and M. Rosales. 2002. Treatment decisions for breast carcinoma: Patient preferences and physician perceptions. Cancer 94(7):2076–2080.

Buller, H. R., G. Agnelli, R. D. Hull, T. M. Hyers, M. H. Prins, and G. E. Raskob. 2004. Antithrombotic therapy for venous thromboembolic disease. Chest 126(Suppl 3):401S–428S.

Burgers, J. S., J. V. Bailey, N. S. Klazinga, A. K. van der Bij, R. Grol, and G. Feder. 2002. Inside guidelines: Comparative analysis of recommendations and evidence in diabetes guidelines from 13 countries. Diabetes Care 25(11):1933–1939.

Burgers, J. S., R. P. Grol, J. O. Zaat, T. H. Spies, A. K. van der Bij, and H. G. Mokkink. 2003. Characteristics of effective clinical guidelines for general practice. British Journal of General Practice 53(486):15–19.

Calonge, N., and R. Harris. 2010. United States Preventive Services Task Force (USPSTF). Presented at the IOM Committee on Standards for Developing Trustworthy Clinical Practice Guidelines meeting, April 19, Irvine, CA.

Calonge, N., and G. Randhawa. 2004. The meaning of the U.S. Preventive Services Task Force grade I recommendation: Screening for hepatitis C virus infection. Annals of Internal Medicine 141(9):718–719.

Carlsen, B., and O. F. Norheim. 2008. "What lies beneath it all?"—An interview study of GPs' attitudes to the use of guidelines. BMC Health Services Research 8(218).

Clark, E., E. F. Donovan, and P. Schoettker. 2006. From outdated to updated, keeping clinical guidelines valid. International Journal for Quality in Health Care 18(3):165–166.

Cluzeau, F., J. Burgers, M. Brouwers, R. Grol, M. Mäkelä, P. Littlejohns, J. Grimshaw, C. Hunt, J. Asua, A. Bataillard, G. Browman, B. Burnand, P. Durieux, B. Fervers, R. Grilli, S. Hanna, P. Have, A. Jovell, N. Klazinga, F. Kristensen, P. B. Madsen, J. Miller, G. Ollenschläger, S. Qureshi, R. Rico-Iturrioz, J. P. Vader, and J. Zaat. 2003. Development and validation of an international appraisal instrument for assessing the quality of clinical practice guidelines: The AGREE project. Quality and Safety in Health Care 12(1):18–23.

Coulter, A. 2002. The autonomous patient: Ending paternalism in medical care. London, UK: Nuffield Trust.

Cuervo, L. G., and M. Clarke. 2003. Balancing benefits and harms in health care. BMJ 327(7406):65–66.

Deber, R. B., N. Kraetschmer, and J. Irvine. 1996. What role do patients wish to play in treatment decision making? Archives of Internal Medicine 156(13):1414–1420.

Eccles, M., N. Rousseau, and N. Freemantle. 2002. Updating evidence-based clinical guidelines. Journal of Health Services Research and Policy 7(2):98–103.
Eisinger, F., G. Geller, W. Burke, and N. A. Holtzman. 1999. Cultural basis for differences between U.S. and French clinical recommendations for women at increased risk of breast and ovarian cancer. The Lancet 353(9156):919–920.

Elson, C. O., M. Ballew, J. A. Barnard, S. J. Bernstein, I. J. Check, M. B. Cohen, S. Fazio, J. F. Johanson, N. M. Lindor, E. Montgomery, L. H. Richardson, D. Rogers, and S. Vijan. 2004. National Institutes of Health Consensus Development Conference Statement June 28–30, 2004. Paper presented at NIH Consensus Development Conference on Celiac Disease, Bethesda, MD.

Ferreira, P. H., M. L. Ferreira, C. G. Maher, K. Refshauge, R. D. Herbert, and J. Latimer. 2002. Effect of applying different "levels of evidence" criteria on conclusions of Cochrane reviews of interventions for low back pain. Journal of Clinical Epidemiology 55(11):1126–1129.

Frosch, D. L., and R. M. Kaplan. 1999. Shared decision making in clinical medicine: Past research and future directions. American Journal of Preventive Medicine 17(4):285–294.

Gartlehner, G., S. L. West, K. N. Lohr, L. Kahwati, J. G. Johnson, R. P. Harris, L. Whitener, C. E. Voisin, and S. Sutton. 2004. Assessing the need to update prevention guidelines: A comparison of two methods. International Journal for Quality in Health Care 16(5):399–406.

Gochman, D. S. 1988. Health behavior: Emerging research perspectives. New York: Plenum Press.

Greenfield, S., J. Billimek, F. Pellegrini, M. Franciosi, G. De Berardis, A. Nicolucci, and S. H. Kaplan. 2009. Comorbidity affects the relationship between glycemic control and cardiovascular outcomes in diabetes: A cohort study. Annals of Internal Medicine 151(12):854–860.

Grol, R., J. Dalhuijsen, S. Thomas, C. Veld, G. Rutten, and H. Mokkink. 1998. Attributes of clinical guidelines that influence use of guidelines in general practice: Observational study. BMJ 317(7162):858–861.

Guyatt, G., D. Gutterman, M. H. Baumann, D. Addrizzo-Harris, E. M. Hylek, B. Phillips, G. Raskob, S. Z. Lewis, and H. Schünemann. 2006a. Grading strength of recommendations and quality of evidence in clinical guidelines: Report from an American College of Chest Physicians task force. Chest 129(1):174–181.

Guyatt, G., G. Vist, Y. Falck-Ytter, R. Kunz, N. Magrini, and H. Schünemann. 2006b. An emerging consensus on grading recommendations? Evidence Based Medicine 11(1):2–4.

Guyatt, G. H., A. D. Oxman, R. Kunz, Y. Falck-Ytter, G. E. Vist, A. Liberati, and H. J. Schünemann. 2008a. Going from evidence to recommendations. BMJ 336(7652):1049–1051.

Guyatt, G. H., A. D. Oxman, R. Kunz, G. E. Vist, Y. Falck-Ytter, H. J. Schünemann, and GRADE Working Group. 2008b. What is "quality of evidence" and why is it important to clinicians? BMJ 336(7651):995–998.

Hibbard, J. H. 2003. Engaging health care consumers to improve the quality of care. Medical Care 41(Suppl 1):I61–I70.

Hill, I. D., M. H. Dirks, G. S. Liptak, R. B. Colletti, A. Fasano, S. Guandalini, E. J. Hoffenberg, K. Horvath, J. A. Murray, M. Pivor, and E. G. Seidman. 2005. Guideline for the diagnosis and treatment of celiac disease in children: Recommendations of the North American Society for Pediatric Gastroenterology, Hepatology and Nutrition. Journal of Pediatric Gastroenterology and Nutrition 40(1):1–19.

Holmes, M. M., D. R. Rovner, M. L. Rothert, A. S. Elstein, G. B. Holzman, R. B. Hoppe, W. P. Metheny, and M. M. Ravitch. 1987. Women's and physicians' utilities for health outcomes in estrogen replacement therapy. Journal of General Internal Medicine 2(3):178–182.
Hussain, T., G. Michel, and R. N. Shiffman. 2009. The Yale guideline recommendation corpus: A representative sample of the knowledge content of guidelines. International Journal of Medical Informatics 78(5):354–363.

IOM (Institute of Medicine). 1992. Guidelines for clinical practice: From development to use. Edited by M. J. Field and K. N. Lohr. Washington, DC: National Academy Press.

Jessup, M., W. T. Abraham, D. E. Casey, A. M. Feldman, G. S. Francis, T. G. Ganiats, M. A. Konstam, D. M. Mancini, P. S. Rahko, M. A. Silver, L. W. Stevenson, and C. W. Yancy. 2009. 2009 Focused update: ACCF/AHA guidelines for the diagnosis and management of heart failure in adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines: Developed in collaboration with the International Society for Heart and Lung Transplantation. Circulation 119(14):1977–2016.

Johnston, M. E., M. C. Brouwers, and G. P. Browman. 2003. Keeping cancer guidelines current: Results of a comprehensive prospective literature monitoring strategy for twenty clinical practice guidelines. International Journal of Technology Assessment in Health Care 19(4):646–655.

Kaplan, S. H., and J. E. Ware. 1995. The patient's role in health care and quality assessment. In Providing quality care: Future challenges, 2nd ed., edited by N. Goldfield and D. B. Nash. Ann Arbor, MI: Health Administration Press.

Kassirer, J. P. 1994. Incorporating patients' preferences into medical decisions. New England Journal of Medicine 330(26):1895–1896.

Kassirer, J. P., and S. G. Pauker. 1981. The toss-up. New England Journal of Medicine 305(24):1467–1469.

Kavanagh, B. P. 2009. The GRADE system for rating clinical guidelines. PLoS Medicine 6(9):e1000094.

Krahn, M., and G. Naglie. 2008. The next step in guideline development: Incorporating patient preferences. JAMA 300(4):436–438.

Laine, C., and F. Davidoff. 1996. Patient-centered medicine: A professional evolution. JAMA 275(2):152–156.

Lomotan, E. A., G. Michel, Z. Lin, and R. N. Shiffman. 2010. How "should" we write guideline recommendations? Interpretation of deontic terminology in clinical practice guidelines: Survey of the health services community. Quality and Safety in Health Care 19:509–513.

Luce, B. R., M. Drummond, B. Jonsson, P. J. Neumann, J. S. Schwartz, U. Siebert, and S. D. Sullivan. 2010. EBM, HTA, and CER: Clearing the confusion. Milbank Quarterly 88(2):256–276.

McNutt, R. A. 2004. Shared medical decision making: Problems, process, progress. JAMA 292(20):2516–2518.

Merenstein, D. 2004. Winners and losers. JAMA 291(1):15–16.

Michie, S., and M. Johnston. 2004. Changing clinical behaviour by making guidelines specific. BMJ 328:343–345.

Michie, S., J. Berentson-Shaw, S. Pilling, G. Feder, P. Dieppe, R. Raine, F. Cluzeau, P. Alderson, and S. Ellis. 2007. Turning evidence into recommendations: Protocol of a study of guideline development groups. Implementation Science 2:29.

NCCN (National Comprehensive Cancer Network). 2003. About the NCCN clinical practice guidelines in oncology (NCCN Guidelines™). http://www.nccn.org/professionals/physician_gls/about.asp (accessed June 30, 2010).

Nelson, H. D., K. Tyne, A. Naik, C. Bougatsos, B. K. Chan, and L. Humphrey. 2009. Screening for breast cancer: An update for the U.S. Preventive Services Task Force. Annals of Internal Medicine 151(10):727–737, W237–W242.

NGC (National Guideline Clearinghouse). 2010a. Inclusion criteria: National Guideline Clearinghouse. http://ngc.gov/submit/inclusion.aspx (accessed April 5, 2010).
NGC. 2010b. National Guideline Clearinghouse. http://www.guideline.gov/ (accessed April 7, 2010).

NICE (National Institute for Health and Clinical Excellence). 2009. Methods for the development of NICE public health guidance, 2nd ed. London, UK: NICE.

O'Connor, A. M., I. D. Graham, and A. Visser. 2005. Implementing shared decision making in diverse health care systems: The role of patient decision aids. Patient Education and Counseling 57(3):247–249.

O'Connor, A. M., C. L. Bennett, D. Stacey, M. Barry, N. F. Col, K. B. Eden, V. A. Entwistle, V. Fiset, M. Holmes-Rovner, S. Khangura, H. Llewellyn-Thomas, and D. Rovner. 2009. Decision aids for people facing health treatment or screening decisions. Cochrane Database of Systematic Reviews (3):CD001431.

Ogan, K., H. G. Pohl, D. Carlson, A. B. Belman, and H. G. Rushton. 2001. Parental preferences in the management of vesicoureteral reflux. Journal of Urology 166(1):240–243.

Oxman, A. D., H. J. Schünemann, and A. Fretheim. 2006. Improving the use of research evidence in guideline development: Reporting guidelines. Health Research Policy and Systems 4(1):26.

Pauker, S. G., and J. P. Kassirer. 1997. Contentious screening decisions: Does the choice matter? New England Journal of Medicine 336(17):1243–1244.

Pignone, M., D. Bucholtz, and R. Harris. 1999. Patient preferences for colon cancer screening. Journal of General Internal Medicine 14(7):432–437.

Qaseem, A., V. Snow, D. K. Owens, and P. Shekelle. 2010. The development of clinical practice guidelines and guidance statements of the American College of Physicians: Summary of methods. Annals of Internal Medicine 153(3):194–199.

Raine, R., C. Sanderson, A. Hutchings, S. Carter, K. Larkin, and N. Black. 2004. An experimental study of determinants of group judgments in clinical guideline development. The Lancet 364(9432):429–437.

Rosenfeld, R., and R. N. Shiffman. 2009. Clinical practice guideline development manual: A quality-driven approach for translating evidence into action. Otolaryngology–Head & Neck Surgery 140(6)(Suppl 1):1–43.

Rostom, A., J. A. Murray, and M. F. Kagnoff. 2006. American Gastroenterological Association (AGA) Institute technical review on the diagnosis and management of celiac disease. Gastroenterology 131(6):1981–2002.

Schünemann, H. J., A. Fretheim, and A. D. Oxman. 2006a. Improving the use of research evidence in guideline development: Grading evidence and recommendations. Health Research Policy and Systems 4:21.

Schünemann, H. J., R. Jaeschke, D. J. Cook, W. F. Bria, A. A. El-Solh, A. Ernst, B. F. Fahy, M. K. Gould, K. L. Horan, J. A. Krishnan, C. A. Manthous, J. R. Maurer, W. T. McNicholas, A. D. Oxman, G. Rubenfeld, G. M. Turino, and G. Guyatt. 2006b. An official ATS statement: Grading the quality of evidence and strength of recommendations in ATS guidelines and recommendations. American Journal of Respiratory and Critical Care Medicine 174(5):605–614.

Shekelle, P., R. Kravitz, J. Beart, M. Marger, M. Wang, and M. Lee. 2000. Are nonspecific practice guidelines potentially harmful? A randomized trial of the effect of nonspecific versus specific guidelines on physician decision making. Health Services Research 34:1429–1448.

Shekelle, P., M. P. Eccles, J. M. Grimshaw, and S. H. Woolf. 2001. When should clinical guidelines be updated? BMJ 323(7305):155–157.

Shekelle, P. G., H. Schünemann, S. H. Woolf, M. Eccles, and J. Grimshaw. 2010. State of the art of CPG development and best practice standards. Committee on Standards for Trustworthy Clinical Practice Guidelines commissioned paper.
Shojania, K. G., M. Sampson, M. T. Ansari, J. Ji, S. Doucette, and D. Moher. 2007. How quickly do systematic reviews go out of date? A survival analysis. Annals of Internal Medicine 147(4):224–233.

Shrier, I., J.-F. Boivin, R. Platt, R. Steele, J. Brophy, F. Carnevale, M. Eisenberg, A. Furlan, R. Kakuma, M. Macdonald, L. Pilote, and M. Rossignol. 2008. The interpretation of systematic reviews with meta-analyses: An objective or subjective process? BMC Medical Informatics and Decision Making 8(1):19.

Sox, H. C., and S. Greenfield. 2010. Quality of care—how good is good enough? JAMA 303(23):2403–2404.

Strull, W. M., B. Lo, and G. Charles. 1984. Do patients want to participate in medical decision making? JAMA 252(21):2990–2994.

Swiglo, B. A., M. H. Murad, H. J. Schünemann, R. Kunz, R. A. Vigersky, G. H. Guyatt, and V. M. Montori. 2008. A case for clarity, consistency, and helpfulness: State-of-the-art clinical practice guidelines in endocrinology using the grading of recommendations, assessment, development, and evaluation system. Journal of Clinical Endocrinology and Metabolism 93(3):666–673.

Teno, J. M., R. B. Hakim, W. A. Knaus, N. S. Wenger, R. S. Phillips, A. W. Wu, P. Layde, A. F. Connors, Jr., N. V. Dawson, and J. Lynn. 1995. Preferences for cardiopulmonary resuscitation: Physician–patient agreement and hospital resource use. The SUPPORT Investigators. Journal of General Internal Medicine 10(4):179–186.

Tinetti, M. E., S. T. Bogardus, Jr., and J. V. Agostini. 2004. Potential pitfalls of disease-specific guidelines for patients with multiple conditions. New England Journal of Medicine 351(27):2870–2874.

Uhlig, K., A. Macleod, J. Craig, J. Lau, A. S. Levey, A. Levin, L. Moist, E. Steinberg, R. Walker, C. Wanner, N. Lameire, and G. Eknoyan. 2006. Grading evidence and recommendations for clinical practice guidelines in nephrology: A position statement from Kidney Disease: Improving Global Outcomes (KDIGO). Kidney International 70(12):2058–2065.

USPSTF (U.S. Preventive Services Task Force). 1996. Guide to clinical preventive services, 2nd ed. Alexandria, VA: International Medical Publishing.

Verkerk, K., H. Van Veenendaal, J. L. Severens, E. J. M. Hendriks, and J. S. Burgers. 2006. Considered judgement in evidence-based guideline development. International Journal for Quality in Health Care 18(5):365–369.

Wills, C. E., and M. Holmes-Rovner. 2003. Patient comprehension of information for shared treatment decision making: State of the art and future directions. Patient Education and Counseling 50(3):285–290.

Woloshin, S., L. M. Schwartz, M. Moncur, S. Gabriel, and A. N. Tosteson. 2001. Assessing values for health: Numeracy matters. Medical Decision Making 21(5):382–390.

Woolf, S. H. 1997. Shared decision-making: The case for letting patients decide which choice is best. Journal of Family Practice 45(3):205–208.

Woolf, S. H. 2010. The 2009 breast cancer screening recommendations of the U.S. Preventive Services Task Force. JAMA 303(2):162–163.

Woolf, S. H., and J. N. George. 2000. Evidence-based medicine: Interpreting studies and setting policy. Hematology/Oncology Clinics of North America 14(4):761–784.