Explanation of Columns in Natural Standard Evidence Table
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Condition:
Refers to the medical condition or disease targeted by a therapy.
Study Design:
Common types include:
Randomized controlled trial (RCT): An experimental trial in which participants
are assigned randomly
to receive either an intervention being tested or placebo. Note that Natural
Standard defines RCTs as being placebo-controlled, while studies
using active controls are classified as equivalence trials (see below).
In RCTs, participants and researchers are often blinded (i.e., unaware of
group assignments), although unblinded and quasi-blinded RCTs are also often
performed. True random allocation to trial arms, proper blinding, and sufficient
sample size are the basis for an adequate RCT.
Equivalence trial: An RCT
which compares two active agents. Equivalence trials often compare new treatments
to usual (standard) care, and may not include a placebo arm.
Before and after comparison:
A study that reports only the change in outcome within each study group,
and does not report between-group comparisons. This is a common error in
studies that claim to be RCTs.
Case series: A description
of a group of patients with a condition, treatment, or outcome (e.g., 20
patients with migraine headache underwent acupuncture and 17 reported feeling
better afterwards). Case series are considered weak evidence of efficacy.
Case-control study: A study
in which patients with a certain outcome are selected and compared to similar
patients (without the outcome) to see if certain risk factors/predictors
are more common in patients with that outcome. This study design is not
common in the complementary & alternative medicine literature.
Cohort study: A study which
assembles a group of patients with certain baseline characteristics (for
example, use of a drug), and follows them forward in time for outcomes.
This study design is not common in the complementary & alternative medicine
literature.
Meta-analysis: A pooling
of multiple trials to increase statistical power (often used to pool data
from a number of RCTs with small sample sizes, none of which demonstrates
significance alone but which can achieve significance in aggregate). Multiple
difficulties are encountered when designing/reviewing these analyses; in particular,
outcome measures or therapies may differ from study to study, hindering
direct comparison.
Review: An author's
description of his or her opinion based on personal, non-systematic review
of the evidence.
Systematic review: A review conducted according to pre-specified criteria in an attempt to limit bias from the investigators. Systematic reviews often include a meta-analysis of data from the included studies.
P: Pending verification.
Author, Year:
Identifies the study being described in a row of the table.
N:
The total number of subjects included in a study (treatment group plus placebo
group). Some studies recruit a larger number of subjects initially, but do not
use them all because they do not meet the study's entry criteria. In this
case, it is the second, smaller number that qualifies as N. N includes all subjects
that are part of a study at the start date, even if they drop out, are lost
to follow-up, or are deemed unsuitable for analysis by the authors. Trials with
a large number of drop-outs that are not included in the analysis are considered
to be weaker evidence for efficacy. (For systematic reviews the number of studies
included is reported. For meta-analyses, the number of total subjects included
in the analysis or the number of studies may be reported.) P = pending verification.
Statistically Significant?:
Results are noted as being statistically significant if a study's authors
report statistical significance, or if quantitative evidence of significance
is present (such as p values). P = pending verification.
Quality of Study:
A numerical score from 0 to 5 is assigned as a rough measure of study
design/reporting quality (0 being weakest and 5 being strongest). This number
is based on a well-established, validated scale developed by Jadad et al. (Jadad
AR, Moore RA, Carroll D, et al. Assessing the quality of reports of randomized
clinical trials: is blinding necessary? Controlled Clinical Trials 1996;17[1]:1-12).
This calculation does not account for all study elements that may be used to
assess quality (other aspects of study design/reporting are addressed in the
"Evidence Discussion" sections of monographs).
A Jadad score is calculated using the seven items in the table below. The first five items are indications of good quality, and each counts as one point towards an overall quality score. The final two items indicate poor quality, and a point is subtracted for each if its criteria are met. The range of possible scores is 0 to 5.
P = pending verification.
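The scoring rule described above (five positive items, two negative items, range 0 to 5) can be sketched as follows. This is a minimal illustration, not Natural Standard's own tool: the item wording is assumed from the cited Jadad et al. (1996) paper, the field and function names are hypothetical, and clamping the result to 0-5 is an added safeguard.

```python
from dataclasses import dataclass


@dataclass
class JadadItems:
    # Field names are hypothetical; item wording is assumed from Jadad et al. (1996).
    described_as_randomized: bool
    randomization_method_appropriate: bool
    described_as_double_blind: bool
    blinding_method_appropriate: bool
    withdrawals_and_dropouts_described: bool
    randomization_method_inappropriate: bool  # negative item
    blinding_method_inappropriate: bool       # negative item


def jadad_score(items: JadadItems) -> int:
    """Add one point per positive item, subtract one per negative item."""
    positive = [
        items.described_as_randomized,
        items.randomization_method_appropriate,
        items.described_as_double_blind,
        items.blinding_method_appropriate,
        items.withdrawals_and_dropouts_described,
    ]
    negative = [
        items.randomization_method_inappropriate,
        items.blinding_method_inappropriate,
    ]
    raw = sum(positive) - sum(negative)
    # Assumption: clamp so the result always stays in the stated 0-5 range.
    return max(0, min(5, raw))


# Hypothetical trial: randomized and double-blind, methods not described,
# withdrawals reported.
example = JadadItems(True, False, True, False, True, False, False)
print(jadad_score(example))
```

A trial that is described as randomized but uses an inappropriate method (for example, allocation by date of birth) earns the first point and loses it again through the corresponding negative item.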
Magnitude of Benefit:
This summarizes how strong a benefit is: small, medium, large, or none. If results
are not statistically significant, "NA" (for "not applicable")
is entered. In order to be consistent in defining small, medium, and large benefits
across different studies and monographs, Natural Standard
defines the magnitude of benefit in terms of the standard deviation (SD) of
the outcome measure. Specifically, the benefit is considered:
Large: if >1 SD
Medium: if 0.5 to 0.9 SD
Small: if 0.2 to 0.4 SD
In many cases, studies do not report the standard deviation of change of the outcome measure. However, the change in the outcome measure expressed in standard deviation units (also known as effect size) can be calculated: subtract the mean (or mean difference) in the placebo/control group from the mean (or mean difference) in the treatment group, and divide that quantity by the pooled standard deviation (Effect size = [Mean Treatment - Mean Placebo]/SDp).
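The effect size calculation and the small/medium/large cut-offs can be sketched as below. Several points are assumptions rather than anything stated in the text: the pooled standard deviation uses the common two-group formula weighted by degrees of freedom, effect sizes below 0.2 are labelled "none", the gaps between the listed ranges are closed by treating 0.2 and 0.5 as lower cut-offs, and exactly 1.0 is treated as medium because "large" is given as strictly greater than 1 SD. All function names are hypothetical.

```python
import math


def pooled_sd(sd_treatment, n_treatment, sd_placebo, n_placebo):
    """Assumed pooled SD: square root of the df-weighted mean of the two variances."""
    numerator = (n_treatment - 1) * sd_treatment ** 2 + (n_placebo - 1) * sd_placebo ** 2
    return math.sqrt(numerator / (n_treatment + n_placebo - 2))


def effect_size(mean_treatment, mean_placebo, sd_p):
    """Effect size = (Mean Treatment - Mean Placebo) / SDp, as given in the text."""
    return (mean_treatment - mean_placebo) / sd_p


def magnitude_of_benefit(es):
    """Map an effect size to a label; the boundary handling is an assumption."""
    if es > 1.0:
        return "large"
    if es >= 0.5:
        return "medium"
    if es >= 0.2:
        return "small"
    return "none"


# Hypothetical trial: mean improvement of 12 points on treatment versus 6 on
# placebo, SD of 10 in each arm, 20 subjects per arm.
sd_p = pooled_sd(10.0, 20, 10.0, 20)
es = effect_size(12.0, 6.0, sd_p)
print(round(es, 2), magnitude_of_benefit(es))
```

The label only applies when the result is statistically significant; otherwise the column holds "NA" regardless of the computed effect size.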
Absolute Risk Reduction:
This describes the difference between the percent of people in the control/placebo
group experiencing a specific outcome (control event rate), and the percent
of people in the experimental/therapy group experiencing that same outcome (experimental
event rate). Mathematically, absolute risk reduction (ARR) equals experimental
event rate minus control event rate. ARR is better able to discriminate between
large and small treatment effects than relative risk reduction (RRR), a calculation
that is often cited in studies ([control event rate - experimental event
rate]/control event rate). Many studies do not include adequate data to calculate
the ARR, in which case "NA" is entered into this column. P = pending verification.
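A short sketch can show why ARR separates large from small treatment effects better than RRR. The two functions follow the formulas exactly as written above; note that they subtract in opposite orders, so for an outcome that counts a benefit the ARR is positive while the RRR is negative, and the comparison below looks at magnitudes only. The trial figures and function names are hypothetical.

```python
def absolute_risk_reduction(control_event_rate, experimental_event_rate):
    """ARR as stated in the text: experimental event rate minus control event rate."""
    return experimental_event_rate - control_event_rate


def relative_risk_reduction(control_event_rate, experimental_event_rate):
    """RRR as stated in the text: (control - experimental) / control."""
    if control_event_rate == 0:
        raise ValueError("RRR is undefined when the control event rate is zero")
    return (control_event_rate - experimental_event_rate) / control_event_rate


# Two hypothetical trials, rates given as proportions (0.10 means 10%).
# In both, the outcome is twice as frequent on treatment as on placebo.
common_outcome = (0.10, 0.20)  # (control event rate, experimental event rate)
rare_outcome = (0.01, 0.02)

for label, (cer, eer) in [("common", common_outcome), ("rare", rare_outcome)]:
    arr = absolute_risk_reduction(cer, eer)
    rrr = relative_risk_reduction(cer, eer)
    print(f"{label}: |ARR| = {abs(arr):.2f}, |RRR| = {abs(rrr):.2f}")
```

Both trials show the same relative change, but the absolute difference is ten times larger in the first, which is the distinction the ARR column is meant to capture.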
Number Needed to Treat:
This is the number of patients who would need to use the therapy under investigation,
for the period of time described in the study, in order for one person to experience
the specified benefit. It is calculated by dividing 1 by the Absolute Risk Reduction
(1/ARR), with ARR expressed as a proportion rather than a percent. P = pending verification.
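As a minimal sketch of the 1/ARR calculation: the function below expects ARR as a proportion (0.25 for 25 percentage points). Rounding up to a whole number of patients is a common reporting convention assumed here, not something the text specifies, and the function names are hypothetical.

```python
import math


def number_needed_to_treat(arr):
    """NNT = 1 / ARR, with ARR given as a proportion (e.g. 0.25, not 25)."""
    if arr <= 0:
        raise ValueError("NNT is only defined here for a positive ARR")
    return 1.0 / arr


def number_needed_to_treat_rounded(arr):
    """Assumed convention: round up to the next whole patient."""
    return math.ceil(number_needed_to_treat(arr))


# Hypothetical ARR values.
print(number_needed_to_treat(0.25))
print(number_needed_to_treat_rounded(0.03))
```

A small ARR produces a large NNT, so a therapy with an ARR of 0.03 would need to be given to many more patients than one with an ARR of 0.25 for one additional person to benefit.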
Comments:
When appropriate, this brief section may comment on design flaws (inadequately
described subjects, lack of blinding, brief follow-up, not intention-to-treat,
etc.), notable study design elements (crossover, etc.), dosing, and/or specifics
of study group/sub-groups (age, gender, etc.). More detailed description of studies
is found in the "Evidence Discussion" section that follows the "Evidence
Table" in Natural Standard monographs.