Appraise - Using Standardised Methods

 

Introduction

The next stage in the EBM process is to separate the wheat from the chaff: that is, to extract the best papers from the list you have found with your search. We do this by analysing the "Methods" section of a paper for validity and the "Results" section for strength.

Diagnostic Tests

"Validity" is a measure of how close a study result is to the truth. For example, was the study free of bias? Was it performed in an appropriate group of patients? Was an independent gold standard used? Were the results subsequently validated in an appropriate prospective group of patients? Is enough detail included for the study to be replicated? All these criteria need to be considered. Analysis of the Materials and Methods section achieves this [1].

This allows us to assign a "level of evidence" to each paper. The levels of evidence have been defined in detail by the Oxford EBM group for papers from different domains, including diagnosis and treatment. Once we have assigned a level of evidence to all the papers in our comprehensive search, we essentially use those from the best (and sometimes the next best) level of evidence and ignore those from lower levels. This reduces the amount of work that we have to do in evaluating all retrieved manuscripts.

"Strength" refers to the ability of a diagnostic test to reliably differentiate between disease and normality. When we are looking at test properties and comparing tests, this will depend on the sensitivity, specificity, confidence intervals, predictive values and likelihood ratios. Analysis of the Results section gives us this information. Then, when we try to apply a diagnostic test to specific patients or groups of patients, we move on to integrating the test properties with the pre- and post-test probabilities [2, 3]. Test properties are dealt with in this section. Probabilities are dealt with in detail in the APPLY section of this website. Thanks to computer software, it’s not as complicated as it sounds!
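
As a rough illustration of the test properties listed above, the following Python sketch computes them from a 2x2 table of test result against the gold standard. The function name `diagnostic_properties` and the counts in the usage lines are hypothetical, chosen only for illustration; they are not taken from any study or from the spreadsheet in reference [3].

```python
def diagnostic_properties(tp, fp, fn, tn):
    """Return common diagnostic test properties from 2x2 table counts.

    tp: diseased, test positive       fp: not diseased, test positive
    fn: diseased, test negative       tn: not diseased, test negative
    Assumes no denominator below is zero (a simplification).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical counts, for illustration only.
props = diagnostic_properties(tp=90, fp=20, fn=10, tn=80)
for name, value in props.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values in this sketch reflect the disease prevalence in the hypothetical sample (100 of 200), whereas sensitivity, specificity and the likelihood ratios do not depend on it.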

For anybody who wants to read about 'how we do it' in more detail, we published an article in European Radiology (May 2004) that comprehensively describes the appraisal of diagnostic radiology literature using 'EBR' methods [4].

Interventional Procedures

The same basic principles of evaluation of validity and strength apply but we must use different criteria and calculations [5-8]. These are briefly introduced here and discussed in detail within the links below.

Validity: when we assess the validity of a paper we are looking for systematic bias which might influence the results of the study. The study design, as described in the ‘materials and methods’ section of the paper, holds the key to determining the validity of a study. A few straightforward questions can expose major biases. For example: Was a control group used? Was allocation of patients randomised? Were all patients who entered the study accounted for at its completion?

Strength: The strength of a study is an expression of the statistical power of its results. Note that a ‘strong’ study may be worthless if the study is not ‘valid’.

Measures of diagnostic accuracy - sensitivity and specificity - are familiar. The strength of a therapeutic intervention is measured in terms of risk. Risk refers to the probability of an event occurring in a given time period.

Expressed as a proportion, risk varies between zero and one; for example, the 10-year risk of myocardial infarction may be 0.1 or 10%. This is the absolute risk and is the preferred index in expressing risk. Other indices such as relative risk are dependent on disease prevalence and so do not tell the whole story (a bit like predictive value measurements of diagnostic accuracy).

If we consider ‘Benefit’, interventional procedures aim to reduce the risk of an adverse outcome. The question ‘how much does a patient benefit from a procedure?’ may be rephrased as ‘by how much has the procedure altered the patient’s risk of an adverse outcome?’ This is expressed as the ‘absolute risk reduction’ and is simply the difference in risk between the treatment and control groups of a prospective controlled study.
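
A minimal Python sketch of this calculation follows. The function name `risk_reduction` and the event counts are hypothetical, chosen only to illustrate the arithmetic; the relative risk is included for comparison with the absolute figure.

```python
def risk_reduction(control_events, control_total,
                   treated_events, treated_total):
    """Return risks in each group, the absolute risk reduction and relative risk.

    Counts are adverse outcomes and group sizes from a prospective
    controlled study. Assumes the control risk is not zero.
    """
    control_risk = control_events / control_total
    treated_risk = treated_events / treated_total
    return {
        "control_risk": control_risk,
        "treated_risk": treated_risk,
        # Absolute risk reduction: difference in risk between the groups.
        "arr": control_risk - treated_risk,
        # Relative risk: ratio of treated risk to control risk.
        "relative_risk": treated_risk / control_risk,
    }


# Hypothetical trial: 20/100 adverse outcomes in controls, 15/100 in treated.
result = risk_reduction(20, 100, 15, 100)
print(f"ARR = {result['arr']:.2f}, RR = {result['relative_risk']:.2f}")
```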

Risks can also be assessed in retrospective case-control studies. These studies differ fundamentally from prospective studies. One of the consequences of these differences is the use of relative odds (also called odds ratio) when reporting the results of case-control studies.
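
The odds ratio can be sketched in Python as below. The function name `odds_ratio`, its parameter names and the counts are hypothetical and serve only to show the arithmetic for a 2x2 case-control table; the sketch assumes no cell is zero.

```python
def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls):
    """Return the odds ratio from a case-control 2x2 table.

    The odds of exposure among cases are divided by the odds of
    exposure among controls. Assumes all four counts are non-zero.
    """
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Hypothetical counts: 30 of 100 cases and 10 of 100 controls were exposed.
ratio = odds_ratio(30, 70, 10, 90)
print(f"odds ratio = {ratio:.2f}")
```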

Statistical significance

When interpreting the results of non-blinded, non-randomised trials reporting the effects of a therapeutic intervention, one should be satisfied that the reported change in patient outcome was so great that it could not be accounted for by the various biases prevalent in these trials.

Like sensitivity and specificity, risk measures are estimates. The accuracy of the estimate, how close the estimate is to the truth, is expressed as the confidence interval. The 95% confidence interval is the standard. This is the range of values around the reported value, in which we are 95% certain the true population value lies. The wider the confidence interval the less precise the reported value or point estimate. If the 95% confidence interval around an estimate of absolute risk reduction does not include zero, then the result is statistically significant.
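
The zero-inclusion check described above can be sketched in Python. The text does not say how the interval is calculated, so this sketch assumes a simple normal-approximation (Wald) interval, which is only one of several available methods. The function names and the trial counts are hypothetical.

```python
import math


def arr_confidence_interval(control_events, control_total,
                            treated_events, treated_total, z=1.96):
    """Return (ARR, lower, upper) using a normal-approximation interval.

    z=1.96 corresponds to a 95% confidence level. The Wald method is an
    assumption of this sketch, not a method prescribed by the text.
    """
    control_risk = control_events / control_total
    treated_risk = treated_events / treated_total
    arr = control_risk - treated_risk
    standard_error = math.sqrt(
        control_risk * (1 - control_risk) / control_total
        + treated_risk * (1 - treated_risk) / treated_total
    )
    return arr, arr - z * standard_error, arr + z * standard_error


def is_significant(lower, upper):
    """True when the confidence interval does not include zero."""
    return lower > 0 or upper < 0


# Two hypothetical trials with the same ARR (0.05) but different sizes.
small = arr_confidence_interval(20, 100, 15, 100)
large = arr_confidence_interval(200, 1000, 150, 1000)
for label, (arr, lower, upper) in (("small", small), ("large", large)):
    print(f"{label}: ARR={arr:.3f}, 95% CI {lower:.3f} to {upper:.3f}, "
          f"significant={is_significant(lower, upper)}")
```

In this sketch the smaller trial gives a wide interval that includes zero, while the larger trial gives a narrower interval that excludes it, even though the point estimate is the same.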

Clinical Application of the Results

Statistical and clinical significance are not equivalent. A useful expression of the clinical significance of an intervention is the number needed to treat (NNT). The NNT is the number of patients who would need to undergo the procedure to prevent one adverse outcome. This is simply the reciprocal of the absolute risk reduction. For example, if the absolute risk reduction of an adverse event achieved by an intervention was 0.05 (5%), the NNT to prevent one additional adverse event is 20.
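
The reciprocal relationship can be sketched in Python. The function name `number_needed_to_treat` is hypothetical; the input reuses the 0.05 figure from the example in the text.

```python
def number_needed_to_treat(arr):
    """Return the NNT as the reciprocal of the absolute risk reduction.

    arr is a proportion (e.g. 0.05 for 5%). A zero or negative value means
    no benefit was shown, so the NNT is undefined and an error is raised.
    """
    if arr <= 0:
        raise ValueError("NNT is only defined for a positive risk reduction")
    return 1 / arr


# The example from the text: an ARR of 0.05 (5%).
nnt = number_needed_to_treat(0.05)
print(f"NNT = {nnt:.0f}")
```

When the reciprocal is not a whole number, it is usually rounded up to the next whole patient; the sketch leaves that rounding to the caller.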

Harm

Similar calculations can be made about the harmful results of any therapy, including interventional procedures. These result in Absolute Risk Increase, Relative Risk Increase and Number Needed to Harm (NNH).
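
These harm measures mirror the benefit calculations, as the following Python sketch suggests. The function name `harm_measures` and the complication counts are hypothetical, chosen only for illustration.

```python
def harm_measures(control_events, control_total,
                  treated_events, treated_total):
    """Return absolute risk increase, relative risk increase and NNH.

    Counts are harmful events (e.g. complications) in each group.
    Assumes the control risk is non-zero and the treated risk is higher.
    """
    control_risk = control_events / control_total
    treated_risk = treated_events / treated_total
    ari = treated_risk - control_risk  # absolute risk increase
    if ari <= 0:
        raise ValueError("no increase in risk, so NNH is undefined")
    return {
        "ari": ari,
        "rri": ari / control_risk,  # relative risk increase
        "nnh": 1 / ari,             # number needed to harm
    }


# Hypothetical: complications in 8/200 treated and 4/200 control patients.
harm = harm_measures(4, 200, 8, 200)
print(f"ARI = {harm['ari']:.2f}, RRI = {harm['rri']:.2f}, "
      f"NNH = {harm['nnh']:.0f}")
```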

What next?

You now have several options if you want to appraise a study:

1. If you are the independent type -
Diagnostic: Read some ‘evidence-based’ literature on critical appraisal [1-3], then download a critical appraisal worksheet for diagnostic test studies (produced by the Canadian Association of Radiologists) and fill it in for the study in question.
Interventional: Read some ‘evidence-based’ literature on critical appraisal [5-8], then try your hand at it without further assistance.
Until recently, this was all that was available.

2. Try the interactive tools for different types of paper below:
I have a diagnostic paper and want to use an online tutorial to appraise it
I have an interventional paper and want to use an online tutorial to appraise it

3. If you already know how to appraise the validity and strength of diagnostic / interventional papers, and assign a level of evidence, here are some web-based tools to use for the calculations. The first time you use them, we suggest you read the abstract of the original publication. These are references [3] and [8] below.
View and use a spreadsheet for diagnostic test data analysis [3]
View and use a spreadsheet for interventional radiology data analysis [8]

4. If the study that you wish to appraise is relevant to radiologists, but is not primary literature about a diagnostic test or interventional procedure, you currently have to revert to the ‘independent type’ mode and may find the following non-interactive links to EBM ‘Users’ Guides’ useful.
Appraise a Clinical Practice Guideline
Appraise a Review article (Overview)
Appraise an article on Prognosis
Appraise an article on Screening

No, the study I wish to appraise is not one of these; I would like to link to ‘All Users’ Guides’

References

1. Jaeschke R, Guyatt G, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA 1994; 271 (5):389-391.

2. Jaeschke R, Guyatt GH, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? The Evidence-Based Medicine Working Group. JAMA 1994; 271 (9):703-707.

3. MacEneaney PM, Malone DE. The meaning of diagnostic test results: a spreadsheet for swift data analysis. Clin Radiol 2000; 55 (3):227-235.

4. Dodd JD, MacEneaney PM, Malone DE. Evidence-Based Radiology: how to quickly assess the validity and strength of publications in the diagnostic radiology literature. Eur Radiol 2004; 14:915-922.

5. Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature. II. How to use an article about therapy or prevention. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA 1993; 270 (21):2598-2601.

6. Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature. II. How to use an article about therapy or prevention. B. What were the results and will they help me in caring for my patients? Evidence-Based Medicine Working Group. JAMA 1994; 271 (1):59-63.

7. Malone DE, MacEneaney PM. Applying 'technology assessment' and 'evidence based medicine' theory to interventional radiology. Part 1: Suggestions for the phased evaluation of new procedures. Clin Radiol 2000; 55 (12):929-937.

8. MacEneaney PM, Malone DE. Applying 'evidence-based medicine' theory to interventional radiology. Part 2: a spreadsheet for swift assessment of procedural benefit and harm. Clin Radiol 2000; 55 (12):938-945.

   
© 2002 evidencebasedradiology.net