
Strength of Recommendation Taxonomy (SORT): A Patient-Centered Approach to Grading Evidence in the Medical Literature

Mark H. Ebell, MD, MS, Jay Siwek, MD, Barry D. Weiss, MD, Steven H. Woolf, MD, MPH, Jeffrey Susman, MD, Bernard Ewigman, MD, MPH, and Marjorie Bowman, MD, MPA

A large number of taxonomies are used to rate the quality of an individual study and the strength of a
recommendation based on a body of evidence. We have developed a new grading scale that will be used
by several family medicine and primary care journals (required or optional), with the goal of allowing
readers to learn one taxonomy that will apply to many sources of evidence. Our scale is called the
Strength of Recommendation Taxonomy. It addresses the quality, quantity, and consistency of evidence
and allows authors to rate individual studies or bodies of evidence. The taxonomy is built around the
information mastery framework, which emphasizes the use of patient-oriented outcomes that measure
changes in morbidity or mortality. An A-level recommendation is based on consistent and good quality
patient-oriented evidence; a B-level recommendation is based on inconsistent or limited quality patient-
oriented evidence; and a C-level recommendation is based on consensus, usual practice, opinion,
disease-oriented evidence, or case series for studies of diagnosis, treatment, prevention, or screening.
Levels of evidence from 1 to 3 for individual studies also are defined. We hope that consistent use of
this taxonomy will improve the ability of authors and readers to communicate about the translation of
research into practice. (J Am Board Fam Pract 2004;17:59–67.)
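
The A, B, and C definitions in the abstract can be read as a small decision rule. The following is a minimal sketch of that rule only, not the authors' algorithm: the decision paths in Figures 2 and 3 are authoritative, and the sketch ignores the adjustments that authors and editors may make for benefits, harms, and costs. The class `Evidence`, its field names, and the function `sort_grade` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Summary of a body of evidence (field names are illustrative assumptions)."""
    patient_oriented: bool  # outcomes such as morbidity, mortality, or symptoms
    good_quality: bool      # good-quality rather than limited-quality studies
    consistent: bool        # findings agree across studies
    opinion_or_case_series_only: bool = False  # consensus, usual practice, opinion, case series


def sort_grade(ev: Evidence) -> str:
    """Return 'A', 'B', or 'C' following the wording of the abstract."""
    # Consensus, usual practice, opinion, case series, or disease-oriented evidence -> C
    if ev.opinion_or_case_series_only or not ev.patient_oriented:
        return "C"
    # Consistent and good-quality patient-oriented evidence -> A
    if ev.good_quality and ev.consistent:
        return "A"
    # Inconsistent or limited-quality patient-oriented evidence -> B
    return "B"


# Cases modeled on Examples 2, 3, and 4 in Table 2
consistent_but_poorly_designed = Evidence(patient_oriented=True, good_quality=False, consistent=True)
good_quality_but_inconsistent = Evidence(patient_oriented=True, good_quality=True, consistent=False)
disease_oriented_only = Evidence(patient_oriented=False, good_quality=True, consistent=True)

for case in (consistent_but_poorly_designed, good_quality_but_inconsistent, disease_oriented_only):
    print(sort_grade(case))
```

Under these assumptions the three cases print B, B, and C, matching the grades given for Examples 2, 3, and 4 in Table 2.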

Submitted, revised 20 November 2003.

From the Michigan State University College of Human Medicine, East Lansing (MHE), Georgetown University Medical Center, Washington, DC (JS), University of Arizona College of Medicine, Tucson (BDW), Virginia Commonwealth University School of Medicine, Richmond (SHW), University of Cincinnati College of Medicine, Cincinnati, Ohio (JS), University of Chicago, Pritzker School of Medicine, Chicago, Illinois (BE), and University of Pennsylvania Health System, Philadelphia (MAB). Address correspondence to Mark Ebell, MD, MS, 300 Snapfinger Dr., Athens, GA 30605 (e-mail: [email protected]).

Simultaneously published in print and online by American Family Physician, Journal of Family Practice, Journal of the American Board of Family Practice, and online by Family Practice Inquiries Network. Copyright 2004 American Family Physician, a publication of the American Academy of Family Physicians. All rights reserved.

JABFP January–February 2004 Vol. 17 No. 1

Review articles (or overviews) are highly valued by physicians as a way to keep up to date with the medical literature. Sometimes, though, these articles are based more on the authors' personal experience, or anecdotes, or incomplete surveys of the literature than on a comprehensive collection of the best available evidence. As a result, there is an ongoing effort in the medical publishing field to improve the quality of review articles through the use of more explicit grading of the strength of evidence on which recommendations are based.1–4

Several journals, including American Family Physician and Journal of Family Practice, have adopted evidence-grading scales that are used in some of the articles published in those journals. Other organizations and publications have also developed evidence-grading scales. The diversity of these scales can be confusing for readers. More than 100 grading scales are in use by various medical publications.5 A level B recommendation in one journal may not mean the same thing as a level B recommendation in another. Even within journals, different evidence-grading scales sometimes are used in different articles within the same issue of a journal. Journal readers do not have the time, energy, or interest to interpret multiple grading scales, and more complex scales are difficult to integrate into daily practice.

Therefore the editors of the US family medicine and primary care journals (ie, American Family Physician, Family Medicine, Journal of Family Practice, Journal of the American Board of Family Practice, and BMJ-USA) and the Family Practice Inquiries Network (FPIN) came together to develop a unified taxonomy for the strength of recommendations based on a body of evidence. The new taxonomy should include the following attributes: (1) be uniform in most family medicine journals and electronic databases; (2) allow authors to evaluate the strength of recommendation of a body of evidence; (3) allow authors to rate the level of evidence for an individual study; (4) be comprehensive and allow authors to evaluate studies of screening, diagnosis, therapy, prevention, and prognosis; (5) be easy to use and not too time-consuming for authors, reviewers, and editors who may be content experts but not experts in critical appraisal or clinical epidemiology; and (6) be straightforward enough that primary care physicians can readily integrate the recommendations into daily practice.

A number of relevant terms must be defined.

Disease-Oriented Outcomes
These outcomes include intermediate, histopathologic, physiologic, or surrogate results (ie, blood sugar, blood pressure, flow rate, coronary plaque thickness) that may or may not reflect improvements in patient outcomes.

Patient-Oriented Outcomes
These are outcomes that matter to patients and help them live longer or better lives, including reduced morbidity, reduced mortality, symptom improvement, improved quality of life, or lower cost.

Level of Evidence
The validity of an individual study is based on an assessment of its study design. According to some methodologies,6 levels of evidence can refer not only to individual studies but also to the quality of evidence from multiple studies about a specific question or the quality of evidence supporting a clinical intervention. For purposes of maintaining simplicity and consistency in this proposal, we use the term level of evidence to refer to individual studies.

Strength of Recommendation
The strength (or grade) of a recommendation for clinical practice is based on a body of evidence (typically more than one study). This approach takes into account the level of evidence of individual studies, the type of outcomes measured by these studies (patient-oriented or disease-oriented), the number, consistency, and coherence of the evidence as a whole, and the relationship between benefits, harms, and costs.

Practice Guideline (Evidence-Based)
These guidelines are recommendations for practice that involve a comprehensive search of the literature, an evaluation of the quality of individual studies, and recommendations that are graded to reflect the quality of the supporting evidence. All search, critical appraisal, and grading methods should be described explicitly and be replicable by similarly skilled authors.

Practice Guideline (Consensus)
Consensus guidelines are recommendations for practice based on expert opinions that typically do not include a systematic search, an assessment of the quality of individual studies, or a system to label the strength of recommendations explicitly.

Research Evidence
This evidence is presented in publications of original research, involving collection of original data or the systematic review of other original research publications. It does not include editorials, opinion pieces, or review articles (other than systematic reviews or meta-analyses).

Review Article
A nonsystematic overview of a topic is a review article. In most cases, it is not based on an exhaustive, structured review of the literature and does not evaluate the quality of included studies systematically.

Systematic Reviews and Meta-Analyses
A systematic review is a critical assessment of existing evidence that addresses a focused clinical question, includes a comprehensive literature search, appraises the quality of studies, and reports results in a systematic manner. If the studies report comparable quantitative data and have a low degree
of variation in their findings, a meta-analysis can be performed to derive a summary estimate of effect.

Existing Strength-of-Evidence Scales
In March 2002, the Agency for Healthcare Research and Quality (AHRQ) published a report that summarized the state of the art in methods of rating the strength of evidence.5 The report identified a large number of systems for rating the quality of individual studies: 20 for systematic reviews, 49 for randomized controlled trials, 19 for observational studies, and 18 for diagnostic test studies. It also identified 40 scales that graded the strength of a body of evidence consisting of one or more studies.

The authors of the AHRQ report proposed that any system for grading the strength of evidence should consider 3 key elements: quality, quantity, and consistency. Quality is the extent to which the identified studies minimize the opportunity for bias and is synonymous with the concept of validity. Quantity is the number of studies and subjects included in those studies. Consistency is the extent to which findings are similar between different studies on the same topic. Only 7 of the 40 systems identified and addressed all 3 of these key elements.

Strength of Recommendation Taxonomy
The authors of this article represent the major family medicine journals in the United States and a large family practice academic consortium. Our process began with a series of electronic mail exchanges, was developed during a meeting of the editors, and continued through another series of electronic mail exchanges.

We decided that our taxonomy for rating the strength of a recommendation should address the 3 key elements identified in the AHRQ report: quality, quantity, and consistency of evidence. We also were committed to creating a grading scale that could be applied by authors with varying degrees of expertise in evidence-based medicine and clinical epidemiology and interpreted by physicians with little or no formal training in these areas. We believed that the taxonomy should address the issue of patient-oriented evidence versus disease-oriented evidence explicitly and be consistent with the information mastery framework proposed by Slawson and Shaughnessy.2

After considering these criteria and reviewing the existing taxonomies for grading the strength of a recommendation, we decided that a new taxonomy was needed to reflect the needs of our specialty. Existing grading scales were focused on a particular kind of study (ie, prevention or treatment), were too complex, or did not take into account the type of outcome.

Our proposed taxonomy is called the Strength of Recommendation Taxonomy (SORT). It is shown in Figure 1. The taxonomy includes ratings of A, B, or C for the strength of recommendation for a body of evidence. The table in the center of Figure 1 explains whether a body of evidence represents good or limited-quality evidence and whether evidence is consistent or inconsistent. The quality of individual studies is rated 1, 2, or 3; numbers are used to distinguish ratings of individual studies from the letters A, B, and C used to evaluate the strength of a recommendation based on a body of evidence. Figure 2 provides information about how to determine the strength of recommendation for management recommendations, and Figure 3 explains how to determine the level of evidence for an individual study. These 2 algorithms should be helpful to authors preparing manuscripts for submission to family medicine journals. The algorithms are to be considered general guidelines, and special circumstances may dictate assignment of a different strength of recommendation (eg, a single, large, well-designed study in a diverse population may warrant an A-level recommendation).

Figure 1. The Strength of Recommendation Taxonomy (SORT). SR, systematic review; RCT, randomized controlled trial.

Figure 2. Algorithm for determining the strength of a recommendation based on a body of evidence (applies to clinical recommendations regarding diagnosis, treatment, prevention, or screening). Although this algorithm provides a general guideline, authors and editors may adjust the strength of recommendation based on the benefits, harms, and costs of the intervention being recommended. USPSTF, US Preventive Services Task Force.

Figure 3. Algorithm for determining the level of evidence for an individual study.

Recommendations based only on improvements in surrogate or disease-oriented outcomes are always categorized as level C, because improvements in disease-oriented outcomes are not always associated with improvements in patient-oriented outcomes, as exemplified by several well-known findings from the medical literature. For example, doxazosin lowers blood pressure in black patients, a seemingly beneficial outcome, but it also increases mortality rates.12 Similarly, encainide and flecainide reduce the incidence of arrhythmias after acute myocardial infarction, but they also increase mortality rates.13 Finasteride improves urinary flow rates, but it does not significantly improve urinary tract symptoms in patients with benign prostatic hypertrophy,14 whereas arthroscopic surgery for osteoarthritis of the knee improves the appearance of cartilage but does not reduce pain or improve joint function.15 Additional examples of clinical situations where disease-oriented evidence disagrees with patient-oriented evidence are shown in Table 1.12–24 Examples of how to apply the taxonomy are given in Table 2.

Table 1. Examples of Inconsistency between Disease-Oriented and Patient-Oriented Outcomes
Disease or Condition | Disease-Oriented Outcome | Patient-Oriented Outcome
Doxazosin for blood pressure12 | Reduces blood pressure | Increases mortality in blacks
Lidocaine for arrhythmia after acute myocardial infarction13 | Suppresses arrhythmias | Increases mortality
Finasteride for benign prostatic hypertrophy14 | Improved urinary flow rate | No clinically important change in […]
Sleeping infants on their stomach or […]16 | Knowledge of anatomy and physiology suggests that this will decrease the risk […] | Increased risk of sudden infant death
Vitamin E for heart disease17 | Reduces levels of free radicals | No change in mortality
Histamine antagonists and proton-pump inhibitors for nonulcer dyspepsia18 | Significantly reduce gastric pH levels | Little or no improvement in symptoms in patients with nongastroesophageal reflux disease, nonulcer dyspepsia
Arthroscopic surgery for osteoarthritis of the knee15 | Improved appearance of cartilage after […] | No change in function or symptoms at 1 […]
Hormone therapy19 | Reduced low-density lipoprotein cholesterol, increased high-density lipoprotein cholesterol | No decrease in cardiovascular or all-cause mortality and an increase in cardiovascular events in women older than 60 years (Women's Health Initiative) with combined hormone therapy
Insulin therapy in type 2 diabetes20 | Keeps blood sugar below 120 mg/dL (6.7 mmol/L) | Does not reduce overall mortality
Sodium fluoride for fracture prevention21 | Increases bone density | Does not reduce fracture rate
Lidocaine prophylaxis after acute myocardial infarction22 | Suppresses arrhythmias | Increases mortality
Clofibrate for hyperlipidemia23 | […] | Does not reduce mortality
β-blockers for heart failure24 | Reduce cardiac output | Reduce mortality in moderate to severe […]

Table 2. Examples of How to Apply the SORT in Practice
Example 1: Although a number of observational studies (level of evidence 2) suggested a cardiovascular benefit from vitamin E, a large, well-designed, randomized trial with a diverse patient population (level of evidence 1) showed the opposite. The strength of recommendation against routine, long-term use of vitamin E to prevent heart disease, based on the best available evidence, should be A.
Example 2: A Cochrane review finds 7 clinical trials that are consistent in their support of a mechanical intervention for low back pain, but the trials were poorly designed (ie, unblinded, nonrandomized, or with allocation to groups unconcealed). In this case, the strength of recommendation in favor of these mechanical interventions is B (consistent but lower quality clinical trials).
Example 3: A meta-analysis finds 9 high-quality clinical trials of the use of a new drug in the treatment of pulmonary fibrosis. Two of the studies find harm, 2 find no benefit, and 5 show some benefit. The strength of recommendation in favor of this drug would be B (inconsistent results of good-quality, randomized controlled trials).
Example 4: A new drug increases the forced expiratory volume in 1 second (FEV1) and peak flow rate in patients with an acute asthma exacerbation. Data on symptom improvement are lacking. The strength of recommendation in favor of using this drug is C (disease-oriented evidence only).

We believe there are several advantages to our proposed taxonomy. It is straightforward and comprehensive, is easily applied by authors and physicians, and explicitly addresses the issue of patient-oriented versus disease-oriented evidence. The latter attribute distinguishes SORT from most other evidence grading scales. These strengths also create some limitations. Some clinicians may be concerned that the taxonomy is not as detailed in its assessment of study designs as others, such as that of the Centre for Evidence-Based Medicine (CEBM).25 However, the primary difference between the 2 taxonomies is that the CEBM version distinguishes between good and poor observational studies whereas the SORT version does not. We concluded that the advantages of a system that provides the physician with a clear recommendation that is strong (A), moderate (B), or weak (C) in its support of a particular intervention outweigh the theoretic benefit of distinguishing between lower quality and higher quality observational studies, particularly because there is no objective evidence that the latter distinction carries important differences in clinical recommendations.

Any publication applying SORT (or any other evidence-based taxonomy) should describe carefully the search process that preceded the assignment of a SORT rating. For example, authors could perform a comprehensive search of MEDLINE and the gray literature, a comprehensive search of MEDLINE alone, or a more focused search of MEDLINE plus secondary evidence-based sources of information.

Walkovers: Creating Linkages with SORT
Some organizations, such as the CEBM,25 the Cochrane Collaboration,7 and the US Preventive Services Task Force (USPSTF),6 have developed their own grading scales for the strength of recommendations based on a body of evidence and are unlikely to abandon them. Other organizations, such as the FPIN,26 publish their work in a variety of settings and must be able to move between taxonomies. We have developed a set of optional walkovers that suggest how authors, editors, and readers might move from one taxonomy to another. Walkovers for the CEBM and BMJ's Clinical Evidence taxonomies are shown in Table 3.

Table 3. Suggested Walkovers between Taxonomies for Assessing the Strength of a Recommendation Based on a Body of Evidence
SORT | CEBM | BMJ's Clinical Evidence
A. Recommendation based on consistent and good quality patient-oriented evidence | A. Consistent level 1 studies | […]
B. Recommendation based on inconsistent or limited-quality patient-oriented evidence | B. Consistent level 2 or 3 studies or extrapolations from level 1 studies; C. Level 4 studies or extrapolations from level 2 or 3 studies | Likely to be beneficial; Likely to be ineffective or harmful (recommendation against); Unlikely to be beneficial (recommendation against)
C. Recommendation based on consensus, usual practice, disease-oriented evidence, case series for studies of treatment or screening, and/or opinion | D. Level 5 evidence or troublingly inconsistent or inconclusive studies of […] | Unknown effectiveness
SORT, Strength of Recommendation Taxonomy; CEBM, Centre for Evidence-Based Medicine; BMJ, BMJ Publishing Group.

Many authors and experts in evidence-based medicine use the "Level of Evidence" taxonomy from the CEBM to rate the quality of individual studies.25 A walkover from the 5-level CEBM scale to the simpler 3-level SORT scale for individual studies is shown in Table 4.

Table 4. Suggested Walkover between the CEBM and the SORT Taxonomies for Assessing the Level of Evidence of an Individual Study
[…] | Level 4 or 5 and any study that measures intermediate or surrogate outcomes | Level 5 and any study that measures intermediate or surrogate outcomes
CEBM, Centre for Evidence-Based Medicine; SORT, Strength of Recommendation Taxonomy.

The SORT is a comprehensive taxonomy for evaluating the strength of a recommendation based on a body of evidence and the quality of an individual study. If applied consistently by authors and editors in the family medicine literature, it has the potential to make it easier for physicians to apply the results of research in their practice through the information mastery approach and to incorporate evidence-based medicine into their patient care. Like any such grading scale, it is a work in progress. As we learn more about biases in study design, and as the authors and readers who use the taxonomy become more sophisticated about principles of information mastery, evidence-based medicine, and critical appraisal, it is likely to evolve. We remain open to suggestions from the primary care community for refining and improving SORT.

We thank Lee Green, MD, MPH, John Epling, MD, Kurt Stange, MD, PhD, and Margaret Gourlay, MD, for helpful comments on the manuscript.

References
1. Anonymous. Evidence-based medicine. A new approach to teaching the practice of medicine. Evidence-Based Medicine Working Group. JAMA 1992;268:2420–5.
2. Slawson DC, Shaughnessy AF, Bennett JH. Becoming a medical information master: feeling good about
not knowing everything. J Fam Pract 1994;38:
3. Shaughnessy AF, Slawson DC, Bennett JH. Becoming an information master: a guidebook to the medical information jungle. J Fam Pract 1994;39:
4. Siwek J, Gourlay ML, Slawson DC, Shaughnessy AF. How to write an evidence-based clinical review article. Am Fam Physician 2002;65:251–8.
5. Systems to rate the strength of scientific evidence. Evidence report/technology assessment: number 47. AHRQ publication no. 02-E015. Rockville (MD): Agency for Healthcare Research and Quality; 2002. Available at: URL: .
6. Harris RP, Helfand M, Woolf SH, et al. Current methods of the US Preventive Services Task Force: a review of the process. Am J Prev Med 2001;20(3
7. Clarke M, Oxman AD. Cochrane reviewer's handbook 4.0. The Cochrane Collaboration; 2003. Available at: URL:
8. Gyorkos TW, Tannenbaum TN, Abrahamowicz M, et al. An approach to the development of practice guidelines for community health interventions. Can J Public Health 1994;85(Suppl 1):S8–13.
9. Briss PA, Zaza S, Pappaioanou M, et al. Developing an evidence-based guide to community preventive services--methods. Am J Prev Med 2000;18(1 Suppl):35–43.
10. Greer N, Mosser G, Logan G, Halaas GW. A practical approach to evidence grading. Jt Comm J Qual Improv 2000;26:700–12.
11. Guyatt GH, Haynes RB, Jaeschke RZ, et al. Users' Guides to the Medical Literature: XXV. Evidence-based medicine: principles for applying the users' guides to patient care. JAMA 2000;284:1290–6.
12. Anonymous. Major cardiovascular events in hypertensive patients randomized to doxazosin vs chlorthalidone: the antihypertensive and lipid-lowering treatment to prevent heart attack trial (ALLHAT). JAMA 2000;283:1967–75.
13. Echt DS, Liebson PR, Mitchell LB, et al. Mortality and morbidity in patients receiving encainide, flecainide, or placebo. The Cardiac Arrhythmia Suppression Trial. N Engl J Med 1991;324:781–8.
14. Lepor H, Williford WO, Barry MJ, et al. The efficacy of terazosin, finasteride, or both in benign prostatic hyperplasia. Veterans Affairs Cooperative Studies Benign Prostatic Hyperplasia Study Group. N Engl J Med 1996;335:533–9.
15. Moseley JB, O'Malley K, Petersen NJ, et al. A controlled trial of arthroscopic surgery for osteoarthritis of the knee. N Engl J Med 2002;347:81–8.
16. Dwyer T, Ponsonby AL. Sudden infant death syndrome: after the "back to sleep" campaign. BMJ 1996;313:180–1.
17. Yusuf S, Dagenais G, Pogue J, Bosch J, Sleight P. Vitamin E supplementation and cardiovascular events in high-risk patients. N Engl J Med 2000;342:
18. Moayyedi P, Soo S, Deeks J, Delaney B, Innes M, Forman D. Pharmacological interventions for non-ulcer dyspepsia. Cochrane Database Syst Rev 2003;
19. Rossouw JE, Anderson GL, Prentice RL, et al. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the Women's Health Initiative randomized controlled trial. JAMA 2002;288:321–33.
20. Anonymous. Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33). UK Prospective Diabetes Study (UKPDS) Group [published erratum appears in Lancet 1999;354:602]. Lancet 1998;352:
21. Meunier PJ, Sebert JL, Reginster JY, et al. Fluoride salts are no better at preventing new vertebral fractures than calcium-vitamin D in postmenopausal osteoporosis: the FAVO Study. Osteoporos Int 1998;8:
22. MacMahon S, Collins R, Peto R, Koster RW, Yusuf S. Effects of prophylactic lidocaine in suspected acute myocardial infarction. An overview of results from the randomized, controlled trials. JAMA 1988;260:1910–6.
23. Grumbach K. How effective is drug treatment of hypercholesterolemia? A guided tour of the major clinical trials for the primary care physician. J Am Board Fam Pract 1991;4:437–45.
24. Heidenreich PA, Lee TT, Massie BM. Effect of beta-blockade on mortality in patients with heart failure: a meta-analysis of randomized clinical trials. J Am Coll Cardiol 1997;30:27–34.
25. Centre for Evidence-Based Medicine. Levels of evidence and grades of recommendation. Available at:
26. Family Practice Inquiries Network (FPIN). Available at: URL:


Personalized Medicine

How is it possible that two people with the same disease respond differently to treatment with the same drug? The answer lies in the genes.

1. Fewer side effects thanks to pharmacogenomics

If you compare the genomes of two people, for example a schoolgirl and the classmate sitting next to her, you will find that the two genomes differ at about 30 to 60 million base pairs, the "letters" of the genome (the only exception: the classmate is also her identical twin). That corresponds to about 1 to 2 percent of the entire genome. Only five years ago, scientists believed that two people differ genetically by only about 0.1 percent.
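
The percentages quoted above can be checked with simple arithmetic. The sketch below assumes a genome of roughly 3 billion base pairs, a commonly cited approximation that the passage itself does not state; the constant and function names are illustrative only.

```python
ASSUMED_GENOME_BP = 3_000_000_000  # assumption: approximate genome size in base pairs


def percent_of_genome(differing_bp: int, genome_bp: int = ASSUMED_GENOME_BP) -> float:
    """Share of the genome, in percent, covered by the differing base pairs."""
    return differing_bp * 100 / genome_bp


low = percent_of_genome(30_000_000)   # lower bound quoted in the text
high = percent_of_genome(60_000_000)  # upper bound quoted in the text
print(f"{low:.0f} to {high:.0f} percent")
```

Under that assumption, 30 to 60 million differing base pairs correspond to the 1 to 2 percent given in the text.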

PAIN® 152 (2011) 2399–2404

Validity of four pain intensity rating scales

Maria Alexandra Ferreira-Valente, José Luís Pais-Ribeiro, Mark P. Jensen

a Faculdade de Psicologia e Ciências da Educação da Universidade do Porto, Porto, Portugal
b Portuguese Foundation for Science and Technology, Lisbon, Portugal
c Unidade de Investigação em Psicologia e Saúde (Psychology and Health Unit), Lisbon, Portugal
d Department of Rehabilitation Medicine, University of Washington School of Medicine, Seattle, WA, USA