Choosing an effect measure
From what we have seen to date we know that if a randomised trial measures dichotomous outcomes, we can compare the event rates in the two groups using several different summary statistics:
- a risk ratio (of either the good or the bad outcome)
- the risk difference
- the NNT
- the odds ratio.
Unfortunately, there is no easy way to decide which statistic to use in your review. You will need to make two decisions:
- which statistic to choose for the analysis
- which statistic to choose to present the results.
Within RevMan, the choice you make for expressing the results of an individual study will also apply to the meta-analysis if you choose to do one. Whatever statistic you choose for your analysis, you always have the option of re-interpreting the results using another measure in the text of your review. For example, you might perform a meta-analysis by selecting risk ratio in RevMan, then interpret the results by converting it to an NNT in the results section of your review. Remember that NNT is useful for presenting results, but not for analysis purposes. The other three statistics (OR, RR, RD) can be used for either. But also remember that we have two quite different RRs depending on which way the events are coded.
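The conversion from a risk ratio to an NNT mentioned above can be sketched in a few lines. This is a minimal illustration, not RevMan's own calculation: the function name `nnt_from_risk_ratio` is hypothetical, and the control group risk must be assumed by the reviewer because a risk ratio alone does not determine an NNT. The example values (RR 0.67, control risk 24%) are borrowed from trial 1 of the hypothetical table later in this section.

```python
import math


def nnt_from_risk_ratio(risk_ratio: float, control_risk: float) -> float:
    """Convert a risk ratio to an NNT for an ASSUMED control group risk.

    Hypothetical helper for illustration only.
    """
    if not 0.0 < control_risk <= 1.0:
        raise ValueError("control_risk must be in (0, 1]")
    treated_risk = control_risk * risk_ratio
    risk_difference = treated_risk - control_risk
    if risk_difference == 0:
        # No difference between groups: no finite number needed to treat.
        return math.inf
    # NNT is the reciprocal of the absolute risk difference.
    return 1.0 / abs(risk_difference)


# Assumed inputs: RR of 0.67 applied to a control risk of 24%.
nnt = nnt_from_risk_ratio(risk_ratio=0.67, control_risk=0.24)
print(math.ceil(nnt))  # NNTs are conventionally rounded up to a whole patient
```

Changing `control_risk` while holding the risk ratio fixed gives a different NNT, which is one reason the NNT is better suited to presenting results than to analysis.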
There are three principal issues to consider when choosing a summary statistic:
- communication, i.e. a straightforward and clinically useful interpretation
- consistency of the statistic across different studies
- reasonable mathematical properties
Communication
We have seen in the section on putting statistics into words that it can be quite hard to explain an odds ratio. Most clinicians and consumers have less difficulty understanding a risk ratio. However, we need to be careful to also give some idea of the absolute difference between the two groups, as relative measures can be misleading. Take the example of buying two lottery tickets instead of one. We could say you are doubling your chances of winning, or we could say your chances of winning have gone up by 1 in 400,000. Both versions give you incomplete information because neither tells us clearly what the chance of winning is in the first place. The statements are also likely to be interpreted differently, because many people would think an increase of 1 in 400,000 sounds a lot less attractive than a doubling of the chance of winning.
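The lottery arithmetic can be written out as a short sketch. It assumes, as the example implies, that one ticket gives a 1 in 400,000 chance of winning; the variable names are illustrative only.

```python
from fractions import Fraction

# Assumed from the example: one ticket wins with probability 1 in 400,000.
one_ticket = Fraction(1, 400_000)
two_tickets = Fraction(2, 400_000)

relative_change = two_tickets / one_ticket  # the "doubling" statement
absolute_change = two_tickets - one_ticket  # the "1 in 400,000" statement

print(relative_change)     # 2
print(absolute_change)     # 1/400000
print(float(two_tickets))  # the chance itself, which neither statement reports
```

The relative and absolute statements describe the same change, but only the last line shows how small the chance of winning remains.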
NNT is a very useful way to express effect in clinical terms (for example, "I have to treat x of my patients with this treatment for y weeks in order to help 1 patient who would not otherwise have got better"), as is risk difference (for example, "for every 100 patients treated, x will benefit").
Consistency
The results of your review are probably drawn from many trials, and will be applied in many populations, so it is desirable that the statistic you choose is consistent, i.e. stays nearly the same, or is stable, when applied in different places. In any meta-analysis there is likely to be variation in event rates between trials. The risk ratio, risk difference and odds ratio all vary to some extent in different situations, but one may be more stable than the others. Let's have a look at a hypothetical example.
Table: Two hypothetical trials with varying event rates, with consistent OR, RR (of the event and of the non-event) and RD

| Trial | Relation to Trial 1 | Control group | Treatment group | OR | RR(E) | RD | RR(NE) |
|---|---|---|---|---|---|---|---|
| 1 | (reference) | 24/100 | 16/100 | 0.60 | 0.67 | -0.08 | 1.11 |
| 2A | Same OR | 42/100 | 30/100 | 0.60 | 0.71 | -0.12 | 1.21 |
| 2B | Same RR(E) | 42/100 | 28/100 | 0.54 | 0.67 | -0.14 | 1.24 |
| 2C | Same RD | 42/100 | 34/100 | 0.71 | 0.81 | -0.08 | 1.14 |
| 2D | Same RR(NE) | 42/100 | 36/100 | 0.78 | 0.86 | -0.06 | 1.11 |

RR(E) = risk ratio of the event; RR(NE) = risk ratio of the non-event.
In this table we take trial 1 as the reference trial. In every version of trial 2 the events are more common: in trial 2A, for example, 36% of all patients have events compared to 20% in trial 1. Despite this, in trial 2A the effect size is the same as in trial 1 if we use the odds ratio as the summary statistic. But if we were to use the risk ratio (of either the event or the non-event) or the risk difference, the effect measure would not be the same. Now consider trial 2B. Here the effect is the same in the two trials if we choose the risk ratio of the event as the effect measure, but it varies when the other measures are chosen. Trials 2C and 2D show the same phenomenon, with the risk difference and the risk ratio of the non-event respectively being consistent.
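The four measures in the table can be recomputed from the event counts with a short sketch. The function name `effect_measures` and the dictionary layout are illustrative choices, not part of RevMan; the counts are those given in the table.

```python
def effect_measures(events_treat, n_treat, events_ctrl, n_ctrl):
    """Return OR, RR(E), RD and RR(NE) for one two-arm trial (illustrative helper)."""
    risk_t = events_treat / n_treat
    risk_c = events_ctrl / n_ctrl
    odds_t = risk_t / (1 - risk_t)
    odds_c = risk_c / (1 - risk_c)
    return {
        "OR": odds_t / odds_c,
        "RR(E)": risk_t / risk_c,               # risk ratio of the event
        "RD": risk_t - risk_c,                  # risk difference
        "RR(NE)": (1 - risk_t) / (1 - risk_c),  # risk ratio of the non-event
    }


# (treatment events, treatment total, control events, control total)
trials = {
    "1": (16, 100, 24, 100),
    "2A": (30, 100, 42, 100),
    "2B": (28, 100, 42, 100),
    "2C": (34, 100, 42, 100),
    "2D": (36, 100, 42, 100),
}

for name, counts in trials.items():
    measures = effect_measures(*counts)
    print(name, {key: round(value, 2) for key, value in measures.items()})
```

Because the trials use whole numbers of patients, the "same" measure matches only approximately in places: for instance, the odds ratio for trial 2A comes out near 0.59 rather than exactly 0.60.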
So, as event rates vary, OR, RR and RD may vary to different degrees. When we are choosing which one to use, it would be helpful to choose the one which is most likely to be consistent across the event rates in the studies we have, as it is also the one most likely to be consistent in clinical practice. To try to help make this decision, some researchers looked at many meta-analyses in Cochrane reviews, calculating the results with OR, RR and RD. Of these, RD varied most across trials included in the meta-analyses, and OR and RR less so. This suggests that the relative measures are more likely to be consistent in Cochrane reviews than RD. The RR of the bad event varied less than the RR of the good event.
Mathematical properties
To be usable in a meta-analysis, a summary statistic needs one mathematical property: we must be able to estimate its variance reliably. This is because the weight we assign to each study within the meta-analysis is inversely proportional to the variance of its estimate (we will cover this in more detail in the next module). We cannot use NNT in a meta-analysis because we don't have a usable estimate of its variance.
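To make the idea of weighting by inverse variance concrete, here is a minimal sketch for the odds ratio. It assumes the commonly used large-sample variance of the log odds ratio (1/a + 1/b + 1/c + 1/d) and a simple fixed-effect pooling; it is not necessarily the method RevMan applies, and the function names are hypothetical. The data are trials 1 and 2A from the hypothetical table above.

```python
import math


def log_or_and_variance(a, b, c, d):
    """a, b: events / non-events on treatment; c, d: events / non-events on control."""
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d  # assumed large-sample approximation
    return log_or, variance


def inverse_variance_pool(tables):
    """Pool log odds ratios, weighting each study by 1 / variance."""
    estimates, weights = [], []
    for table in tables:
        estimate, variance = log_or_and_variance(*table)
        estimates.append(estimate)
        weights.append(1 / variance)
    pooled_log_or = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return math.exp(pooled_log_or), weights


# Trial 1 (16/100 vs 24/100) and trial 2A (30/100 vs 42/100).
pooled_or, weights = inverse_variance_pool([(16, 84, 24, 76), (30, 70, 42, 58)])
print(round(pooled_or, 2), [round(w, 1) for w in weights])
```

Trial 2A has more events, so its estimate has a smaller variance and it receives the larger weight.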
There are two other properties that are not essential but are mathematically desirable. We've already seen that it is easier to switch between odds ratios of 'good' and 'bad' outcomes than it is with risk ratios. This is sometimes argued to be a helpful mathematical property of odds ratios.
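The ease of switching between outcomes can be checked numerically. The sketch below uses trial 1 from the hypothetical table (16/100 events on treatment, 24/100 on control); the variable names are illustrative.

```python
# Trial 1: events and non-events in each arm.
events_t, nonevents_t = 16, 84
events_c, nonevents_c = 24, 76

# Odds ratios for the event and for the non-event.
or_event = (events_t / nonevents_t) / (events_c / nonevents_c)
or_nonevent = (nonevents_t / events_t) / (nonevents_c / events_c)

# Risk ratios for the event and for the non-event.
rr_event = (events_t / 100) / (events_c / 100)
rr_nonevent = (nonevents_t / 100) / (nonevents_c / 100)

print(round(or_event * or_nonevent, 6))  # the two odds ratios are reciprocals
print(round(1 / rr_event, 2), round(rr_nonevent, 2))  # the two risk ratios are not
```

Switching the outcome simply inverts the odds ratio, whereas the risk ratio of the non-event has to be recalculated from the original data.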
Another issue that arises when applying results is called 'bounding', as we can get predictions outside the bounds of possibility. For example, we calculated earlier on that the risk ratio of cure of UTI for antibiotic use in pregnant women was 6.6. What would happen if we tried to apply this to a group of women where we thought half would get better without antibiotics?
Risk without treatment x risk ratio = risk with treatment
0.5 x 6.6 = 3.3
This result is nonsense as it predicts that 330% (i.e. more than all) of the women will be cured!
A similar thing can happen with risk difference. The risk difference we calculated for risk of not being cured of UTI with antibiotics was -0.76. Let's try to apply that in a situation where we think that, without antibiotics, 30% of people would not be cured.
Risk without treatment + risk difference = risk with treatment
0.3 + (-0.76) = -0.46
Again this is nonsense as it predicts that using antibiotics will mean that -46% of women will not be cured.
In practice, this is less important than it may appear, as what we are doing in these examples is applying a result from one situation to a very different situation. In reality, we would not expect to apply results from very high risk populations to very low risk populations or vice versa.
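The two bounding examples above can be reproduced with a small sketch that applies each measure to a baseline risk and checks whether the prediction is a possible risk. The helper names are hypothetical; the numbers are the ones used in the text.

```python
def apply_risk_ratio(baseline_risk, risk_ratio):
    """Risk without treatment x risk ratio = risk with treatment."""
    return baseline_risk * risk_ratio


def apply_risk_difference(baseline_risk, risk_difference):
    """Risk without treatment + risk difference = risk with treatment."""
    return baseline_risk + risk_difference


def is_possible_risk(risk):
    """A risk must lie between 0 and 1 inclusive."""
    return 0.0 <= risk <= 1.0


predicted_cure = apply_risk_ratio(0.5, 6.6)                # 0.5 x 6.6
predicted_not_cured = apply_risk_difference(0.3, -0.76)    # 0.3 + (-0.76)

print(round(predicted_cure, 2), is_possible_risk(predicted_cure))
print(round(predicted_not_cured, 2), is_possible_risk(predicted_not_cured))
```

Both predictions fail the check, which is the 'bounding' problem: the measures were estimated in one population and applied to a very different one.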