Introduction
The next stage in the EBM process is
to separate the wheat from the chaff, that is, to extract
the best papers from the list you have found with your search.
We do this by analysing the "Methods" section of
a paper for validity and strength.
Diagnostic Tests
"Validity" is a measure
of how close a study result is to the truth. For example,
was the study free of bias? Was it performed in an appropriate
group of patients? Was an independent gold standard used?
Were the results subsequently validated in an appropriate
prospective group of patients? Is enough detail included for
the study to be replicated? All these criteria need to be
considered. Analysis of the Materials and Methods section
achieves this [1].
This allows us to assign a "level
of evidence" to each paper. The levels
of evidence have been defined in detail by the Oxford
EBM group for papers from different domains, including diagnosis
and treatment. Once we have assigned a level of evidence to
all the papers in our comprehensive search, we essentially
use those from the best (and sometimes the next best) level
of evidence and ignore those from lower levels. This reduces
the amount of work that we have to do in evaluating all retrieved
manuscripts.
"Strength" refers to
the ability of a diagnostic test to reliably differentiate
between disease and normality. When we are looking at test
properties and comparing tests, this will depend on the sensitivity,
specificity, confidence intervals, predictive values and likelihood
ratios. Analysis of the Results section gives us this information.
Then, when we try to apply a diagnostic test to specific
patients / groups of patients, we move on to integrating the
test properties with the pre- and post-test probabilities
[2, 3]. Test properties
are dealt with in this section. Probabilities are dealt with
in detail in the APPLY
section of this website. Thanks to computer software, it's
not as complicated as it sounds! For anybody who wants to read about 'how we do it' in more detail, we published an article in European Radiology (May 2004) that comprehensively describes the appraisal of diagnostic radiology literature using 'EBR' methods [4].
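As a minimal sketch of the test properties named above, the following Python computes them from a 2x2 table of test result against gold standard, then converts a pre-test probability into a post-test probability using a likelihood ratio. The function names, the counts and the 30% pre-test probability are hypothetical and chosen for illustration only; this is not the spreadsheet described in reference [3], and it does not handle tables with empty cells.

```python
def diagnostic_properties(tp, fp, fn, tn):
    """Test properties from a 2x2 table of test result vs gold standard.

    tp, fn: diseased patients with a positive / negative test
    fp, tn: non-diseased patients with a positive / negative test
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical counts, for illustration only.
props = diagnostic_properties(tp=90, fp=20, fn=10, tn=80)
for name, value in props.items():
    print(f"{name}: {value:.2f}")

# A positive test in a patient with an assumed 30% pre-test probability.
post = post_test_probability(0.30, props["lr_pos"])
print(f"post-test probability: {post:.2f}")
```

Sensitivity, specificity and the likelihood ratios are calculated within the diseased and non-diseased columns of the table, whereas the predictive values are calculated across them and so change with the prevalence of disease in the study group.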
Interventional Procedures
The same basic principles of evaluation
of validity and strength apply but we must use different criteria
and calculations [5-8].
These are briefly introduced here and discussed in detail
within the links below.
Validity: when we assess the validity
of the paper we are looking for systematic bias which might
influence the results of the study. The study design, as described
in the materials and methods section of the paper,
holds the key to determining the validity of a study. A few
straightforward questions can expose major biases, for example:
Was a control group used? Was allocation of patients randomised?
Were all patients who entered the study accounted for at its
completion?
Strength: The strength of a study
is an expression of the statistical power of its results.
Note that a strong study may be worthless if the
study is not valid.
Measures of diagnostic accuracy - sensitivity
and specificity - are familiar. The strength of a therapeutic
intervention is measured in terms of risk. Risk refers to
the probability of an event occurring in a given time period.
Expressed as a proportion, risk varies
between zero and one, e.g. the 10-year risk of myocardial infarction
may be 0.1 or 10%. This is the absolute risk and is the preferred
index in expressing risk. Other indices such as relative risk
cannot be interpreted without knowing the baseline risk and so
do not tell the whole story (a bit like predictive value
measurements of diagnostic accuracy, which cannot be interpreted
without knowing the disease prevalence).
If we consider Benefit, interventional
procedures aim to reduce the risk of an adverse outcome. The
question "how much does a patient benefit from a procedure?"
may be rephrased as "by how much has the procedure altered
the patient's risk of an adverse outcome?" This is expressed
as the absolute risk reduction and is simply the
difference in risk between the treatment and control groups
of a prospective controlled study.
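The arithmetic can be sketched in a few lines of Python. The function names and the event counts are hypothetical, chosen only to illustrate the definitions of risk, absolute risk reduction and relative risk given above.

```python
def risk(events, patients):
    """Risk as a proportion: adverse events divided by patients followed."""
    return events / patients


def absolute_risk_reduction(control_risk, treated_risk):
    """Difference in risk between the control and treatment groups."""
    return control_risk - treated_risk


def relative_risk(control_risk, treated_risk):
    """Risk in the treatment group as a fraction of the control risk."""
    return treated_risk / control_risk


# Hypothetical high-risk trial: 20/100 events in controls, 15/100 if treated.
high_control, high_treated = risk(20, 100), risk(15, 100)
# Hypothetical low-risk trial: 20/1000 events in controls, 15/1000 if treated.
low_control, low_treated = risk(20, 1000), risk(15, 1000)

high_arr = absolute_risk_reduction(high_control, high_treated)
low_arr = absolute_risk_reduction(low_control, low_treated)
high_rr = relative_risk(high_control, high_treated)
low_rr = relative_risk(low_control, low_treated)

print(f"high-risk group: ARR = {high_arr:.3f}, RR = {high_rr:.2f}")
print(f"low-risk group:  ARR = {low_arr:.3f}, RR = {low_rr:.2f}")
```

In these made-up figures the relative risk is the same in both trials, but the absolute risk reduction is ten times smaller in the low-risk group.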
Risks can also be assessed in retrospective
case-control studies. These studies differ fundamentally from
prospective studies. One of the consequences of these differences
is the use of relative odds (also called odds ratio) when
reporting the results of case-control studies.
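In a case-control study the investigator chooses how many cases and controls to include, so the risk of the outcome cannot be calculated directly; instead, the odds of exposure in cases are compared with the odds of exposure in controls. A minimal sketch follows, with a hypothetical function name and made-up counts.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Hypothetical study: 100 cases (30 exposed) and 100 controls (10 exposed).
relative_odds = odds_ratio(
    exposed_cases=30,
    unexposed_cases=70,
    exposed_controls=10,
    unexposed_controls=90,
)
print(f"odds ratio = {relative_odds:.2f}")
```

An odds ratio of one means that exposure was equally common in cases and controls.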
Statistical significance
When interpreting the results of non-blinded
non-randomised trials reporting the effects of a therapeutic
intervention, one should be satisfied that the reported change
in patient outcome was so great that it could not be accounted
for by the various biases prevalent in these trials.
Like sensitivity and specificity, risk
measures are estimates. The precision of the estimate, how
close the estimate is likely to be to the truth, is expressed as the confidence
interval. The 95% confidence interval is the standard. This
is the range of values around the reported value, in which
we are 95% certain the true population value lies. The wider
the confidence interval the less precise the reported value
or point estimate. If the 95% confidence interval around an
estimate of absolute risk reduction does not include zero,
then the result is statistically significant.
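One way to sketch this check in Python is shown below. It assumes the simple normal-approximation (Wald) interval for a difference between two proportions, which is only one of several methods and may not be the one used in the spreadsheets referenced on this page. The function name and the counts are hypothetical.

```python
import math


def arr_confidence_interval(control_events, control_n, treated_events, treated_n, z=1.96):
    """Absolute risk reduction with an approximate 95% confidence interval.

    Assumption: normal-approximation (Wald) interval, reasonable only when
    both groups are fairly large and the risks are not close to 0 or 1.
    """
    control_risk = control_events / control_n
    treated_risk = treated_events / treated_n
    arr = control_risk - treated_risk
    standard_error = math.sqrt(
        control_risk * (1 - control_risk) / control_n
        + treated_risk * (1 - treated_risk) / treated_n
    )
    return arr, arr - z * standard_error, arr + z * standard_error


# Hypothetical trial: 40/200 events in controls, 20/200 in treated patients.
arr, lower, upper = arr_confidence_interval(40, 200, 20, 200)
significant = lower > 0 or upper < 0  # the interval does not include zero

print(f"ARR = {arr:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
print("statistically significant" if significant else "not statistically significant")
```

With the same risks but only 50 patients per group, the interval from this sketch widens enough to include zero, which illustrates how sample size affects precision.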
Clinical Application of the Results
Statistical and clinical significance
are not equivalent. A useful expression of the clinical significance
of an intervention is the number needed to treat (NNT). The
NNT is the number of patients who would need to undergo the procedure
to prevent one adverse outcome. This is simply the reciprocal
of the absolute risk reduction. For example, if the absolute
risk reduction of an adverse event achieved by an intervention
was 0.05 (5%), the NNT to prevent one additional adverse event
is 20.
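The calculation above can be sketched as follows. The function name is hypothetical, and rounding up to a whole patient is a common convention that is assumed here rather than taken from this page.

```python
import math


def number_needed_to_treat(absolute_risk_reduction):
    """Reciprocal of the absolute risk reduction, rounded up to a whole patient."""
    if absolute_risk_reduction <= 0:
        raise ValueError("NNT is only defined for a positive absolute risk reduction")
    # Round off floating-point noise first so that 1 / 0.05 gives exactly 20.
    return math.ceil(round(1 / absolute_risk_reduction, 9))


print(number_needed_to_treat(0.05))  # the worked example from the text
```

A smaller absolute risk reduction gives a larger NNT: with the same function, an absolute risk reduction of 0.005 corresponds to an NNT of 200.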
Harm
Similar calculations can be made about
the harmful results of any therapy, including interventional
procedures. These result in Absolute Risk Increase, Relative
Risk Increase and Number Needed to Harm (NNH).
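A minimal sketch of the harm calculations, mirroring those for benefit, is given below. The function name and the complication rates are hypothetical, and the NNH is left unrounded because rounding conventions vary.

```python
def harm_measures(control_risk, treated_risk):
    """Absolute risk increase, relative risk increase and number needed to harm."""
    absolute_risk_increase = treated_risk - control_risk
    if absolute_risk_increase <= 0:
        raise ValueError("the treated group shows no excess risk of harm")
    return {
        "ari": absolute_risk_increase,
        "rri": absolute_risk_increase / control_risk,
        "nnh": 1 / absolute_risk_increase,
    }


# Hypothetical complication rates: 2% in controls, 6% after the procedure.
harm = harm_measures(control_risk=0.02, treated_risk=0.06)
print(f"ARI = {harm['ari']:.2f}, RRI = {harm['rri']:.1f}, NNH = {harm['nnh']:.0f}")
```

Setting the NNH beside the NNT for the same procedure gives a quick sense of the balance between benefit and harm.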
What next?
You now have several options if you want
to appraise a study:
1. If you are the independent type -
Diagnostic: Read some evidence-based literature
on critical appraisal [1-3],
then download a critical appraisal worksheet for diagnostic
test studies (produced by the
Canadian Association of Radiologists) and fill it in
for the study in question.
Interventional: Read some evidence-based
literature on critical appraisal [5-8],
then try your hand at it without further assistance.
Until recently, this was all that was available.
2. Try the interactive tools for different
types of paper below:
I have a diagnostic paper and want to use an online tutorial to appraise it
I have an interventional paper and want to use an online tutorial to appraise it
3. If you already know how to appraise
validity and strength of diagnostic / interventional papers,
and assign a level of evidence, here are some web-based tools
to use for the calculations. The first time you use them
we suggest you read the abstract of the original publication.
These are references [3] and [8]
below.
View and use a spreadsheet for diagnostic test data analysis [3]
View and use a spreadsheet for interventional radiology data analysis [8]
4. If the study that you wish to appraise
is relevant to radiologists, but not primary literature about
a diagnostic test or interventional procedure, you currently
have to revert to the "independent type" mode and
may find the following non-interactive links to EBM Users
Guides useful.
Appraise a Clinical Practice Guideline
Appraise a Review article (Overview)
Appraise an article on Prognosis
Appraise an article on Screening
No, the study I wish to appraise is not
one of these; I would like to link to All
Users Guides
References
1. Jaeschke R, Guyatt G, Sackett DL, Users' guides to the medical
literature. III. How to use an article about a diagnostic test. A. Are
the results of the study valid? Evidence-Based Medicine Working
Group. JAMA 1994; 271 (5):389-391.
2. Jaeschke R, Guyatt GH, Sackett DL, Users' guides to the medical
literature. III. How to use an article about a diagnostic test. B. What
are the results and will they help me in caring for my patients?
The Evidence-Based Medicine Working Group. JAMA 1994; 271 (9):703-707.
3. MacEneaney PM, Malone DE, The meaning of diagnostic test results:
a spreadsheet for swift data analysis. Clin Radiol 2000; 55 (3):227-235.
4. Dodd JD, MacEneaney PM, Malone DE, Evidence-Based Radiology: how to
quickly assess the validity and strength of publications in the
diagnostic radiology literature. Eur Radiol 2004; 14:915-922.
5. Guyatt GH, Sackett DL, Cook DJ, Users' guides to the medical
literature. II. How to use an article about therapy or prevention. A.
Are the results of the study valid? Evidence-Based Medicine Working
Group. JAMA 1993; 270 (21):2598-2601.
6. Guyatt GH, Sackett DL, Cook DJ, Users' guides to the medical
literature. II. How to use an article about therapy or prevention. B.
What were the results and will they help me in caring for my patients?
Evidence-Based Medicine Working Group. JAMA 1994; 271 (1):59-63.
7. Malone DE, MacEneaney PM, Applying 'technology assessment' and
'evidence based medicine' theory to interventional radiology. Part 1:
Suggestions for the phased evaluation of new procedures. Clin Radiol
2000; 55 (12):929-937.
8. MacEneaney PM, Malone DE, Applying 'evidence-based medicine' theory
to interventional radiology. Part 2: a spreadsheet for swift assessment
of procedural benefit and harm. Clin Radiol 2000; 55 (12):938-945.