Combining studies

Weighted averages

The solution is to calculate a weighted average.

A weighted average is an average where the results of some of the studies make a greater contribution to the total than others. All of the methods available for conducting meta-analyses in Cochrane reviews use forms of weighted averages. The various methods do this in different ways, and we will cover these methods in this module. In all methods, the underlying principle is to give more weight to studies that give us more information about the treatment effect.
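
To make the idea concrete, here is a minimal sketch comparing a simple average with a weighted average. The three estimates and their weights are invented for illustration and do not come from any real review; how the weights are actually chosen is the subject of the rest of this module.

```python
def weighted_average(estimates, weights):
    """Average in which each estimate counts in proportion to its weight."""
    total_weight = sum(weights)
    return sum(w * e for e, w in zip(estimates, weights)) / total_weight

# Hypothetical treatment effect estimates from three studies,
# with hypothetical weights (the first study is the most informative).
estimates = [0.10, 0.30, 0.20]
weights = [10, 1, 4]

simple = sum(estimates) / len(estimates)
weighted = weighted_average(estimates, weights)

print(f"simple average:   {simple:.2f}")
print(f"weighted average: {weighted:.2f}")
```

The weighted result sits closer to the estimate from the first study because that study carries most of the weight.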

Sample size is the main factor in determining the weight for a trial. However, the event rate also makes a difference. This is because effects are generally estimated more precisely when there are lots of events. So, trials with higher event rates get more weight. At the extreme, a trial with no events tells us nothing about the effect of the intervention, and so gets no weight at all. The exact relationship between event rates and study weights is complex, and depends on the summary statistic being used.

A statistical concept that takes into account both the size of a study’s population and its event rate is variance. The box below provides an outline of the concept of variance.

What is variance?

Standard errors, confidence intervals and variances

Imagine trying to estimate the proportion of females in the population. If we took a sample of ten people from a list of all the people in a country, we may, by chance, find there were 2, or 7, or even 10 females. From this sample we could not say with much confidence what the true proportion of females in the population is. Whenever we take samples from populations, there is uncertainty about the estimates we make of the true value in the whole population.

The same is true of trials – each trial involves taking a sample of the possible participants. The basic result of an individual trial is an estimate of treatment effect. The estimate is incomplete without a measure of how certain we can be about it. We’d be much more certain about an estimate from a mega-trial of tens of thousands of patients than we would be about an estimate from a small trial of fewer than a hundred. Uncertainty is often described using a confidence interval. For example, a 95% confidence interval gives a range within which we can be 95% confident the true effect lies.

Confidence intervals are calculated from a number known as a standard error. Standard errors are companions of all estimates. They describe the extent to which an estimate might be wrong due to random error. The smaller the standard error the more certain we are about the estimate. To get a feel for standard errors it is helpful to know that 95% confidence intervals are obtained by taking the estimate and creating limits that are 1.96 standard errors below it and 1.96 standard errors above it. Thus an estimate may be wrong by about a standard error, but to be 95% confident about where the true effect lies, we go roughly 2 standard errors either side.

Statistics as a discipline has more uses for the ‘standard error squared’ than for the ‘standard error’, so statisticians have a word for it, the variance. The variance of an estimate is just the square of its standard error. The standard error and variance are interchangeable in terms of the information they convey, but their numerical values are different. It’s the same as describing the size of a square: you could say either that the length of each side is 4 metres or that the area is 16 square metres; you end up with an identical shape. Quoting the length is like using the standard error; quoting the area is like using the variance.
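
The relationships between an estimate, its standard error, its 95% confidence interval and its variance can be sketched in a few lines. The function name and the numbers (an estimate of 0.5 with a standard error of 0.1) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def confidence_interval_95(estimate, standard_error):
    """Limits 1.96 standard errors below and above the estimate."""
    margin = 1.96 * standard_error
    return estimate - margin, estimate + margin

# Hypothetical estimate and standard error.
estimate = 0.5
standard_error = 0.1

lower, upper = confidence_interval_95(estimate, standard_error)
variance = standard_error ** 2  # the variance is the standard error squared

print(f"95% CI: {lower:.3f} to {upper:.3f}")
print(f"variance: {variance:.2f}")
```

Halving the standard error would halve the width of the interval but divide the variance by four, which is the side-versus-area relationship described above.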

One important point to note is that different treatment effects (OR, RR, RD) calculated for the same trial will have different variances.
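
As an illustration of this point, the sketch below computes the variance of three effect measures for a single hypothetical trial (10 events among 100 treated participants, 20 events among 100 controls). It assumes the usual large-sample formulae, with the odds ratio and risk ratio handled on the log scale as is standard practice. The function name is made up, and the calculations RevMan performs may differ in detail, for example in how zero cells are handled.

```python
def effect_variances(events_trt, n_trt, events_ctl, n_ctl):
    """Approximate variances of log OR, log RR and RD for one trial."""
    a, b = events_trt, n_trt - events_trt   # treated: events, non-events
    c, d = events_ctl, n_ctl - events_ctl   # control: events, non-events
    p_trt, p_ctl = a / n_trt, c / n_ctl     # risks in each group
    return {
        "log OR": 1 / a + 1 / b + 1 / c + 1 / d,
        "log RR": 1 / a - 1 / n_trt + 1 / c - 1 / n_ctl,
        "RD": p_trt * (1 - p_trt) / n_trt + p_ctl * (1 - p_ctl) / n_ctl,
    }

# One hypothetical trial, three different variances.
variances = effect_variances(10, 100, 20, 100)
for name, value in variances.items():
    print(f"variance of {name}: {value:.4f}")
```

Because the variances differ, the same trial can receive a different weight depending on which summary statistic is chosen.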

We could assume that variance is inversely proportional to importance, i.e. the less variance in a study’s estimate, the more weight the study should contribute. One method, the inverse variance method (planned for RevMan 4.2 but not available in earlier versions of RevMan), calculates study weights directly from this assumption.
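
A minimal sketch of inverse variance weighting under a fixed effect assumption follows. Each study's weight is one divided by its variance, and the pooled estimate is the weighted average of the study estimates. The three estimates and variances are hypothetical (think of them as log odds ratios), and the function name is an assumption made for this example rather than anything taken from RevMan.

```python
import math

def inverse_variance_pool(estimates, variances):
    """Fixed effect pooled estimate with weights equal to 1 / variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)  # standard error of the pooled estimate
    return pooled, pooled_se, weights

# Hypothetical study estimates (e.g. log odds ratios) and their variances.
estimates = [-0.5, -0.2, -0.3]
variances = [0.04, 0.01, 0.25]

pooled, pooled_se, weights = inverse_variance_pool(estimates, variances)
shares = [100 * w / sum(weights) for w in weights]

print(f"pooled estimate: {pooled:.3f} (SE {pooled_se:.3f})")
print("percentage weights:", [round(s, 1) for s in shares])
```

The second study has the smallest variance and so dominates the result, while the third, very imprecise study contributes only a few percent of the total weight.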

There are other methods, called Mantel-Haenszel methods, which attribute weight in a manner closely related to inverse variance. In this module we will expand a little on the various available methods and look at some of the differences between them, finishing with some guidance on which method to use in your review.

Within RevMan, the methods available are:

  • For dichotomous data:
    • Fixed effect assumption
      • Mantel-Haenszel risk ratio (RR)
      • Mantel-Haenszel odds ratio (OR)
      • Mantel-Haenszel risk difference (RD)
      • Peto odds ratio (Peto OR)
    • Random effects assumption (DerSimonian and Laird)
      • RR
      • OR
      • RD
  • For continuous data:
    • Fixed effect inverse variance model
      • Weighted mean difference (WMD)
      • Standardised mean difference (SMD)
    • Random effects assumption (DerSimonian and Laird)
      • WMD
      • SMD
  • For generic data (available in RevMan 4.2):
    • Fixed effect inverse variance
    • Random effects inverse variance

There is some information about the statistical techniques available in RevMan in Section 8.4 of the Reviewers’ Handbook, and you should read it now.

In order to choose the method you are going to use in your meta-analysis, the first concept to understand is the difference between a fixed effect model and a random effects model.

What does ‘fixed effect’ mean?

To come up with any statistical model, or method for meta-analysis, we first need to make some assumptions. It is these assumptions that form the differences between all the methods listed above.

A fixed effect model of meta-analysis is based on a mathematical assumption that every study is evaluating a common treatment effect. That means the effect of treatment, allowing for the play of chance, was the same in all studies. Another way of explaining this is to imagine that if all the studies were infinitely large they’d give identical results.

The summary treatment effect estimate resulting from this method of meta-analysis is this one ‘true’ or ‘fixed’ treatment effect, and the confidence interval describes how uncertain we are about the estimate.

Sometimes this underlying assumption of a fixed effect meta-analysis (i.e. that diverse studies can be estimating a single effect) is too simplistic. Therefore, the alternative approaches to meta-analysis are (i) to try to explain the variation or (ii) to use a random effects model.

Random effects meta-analyses (DerSimonian and Laird)

As we discussed above, fixed effect meta-analysis assumes that there is one identical true treatment effect common to every study. The random effects model of meta-analysis is an alternative approach that does not assume that a common (‘fixed’) treatment effect exists. The random effects model assumes that the true treatment effects in the individual studies may be different from each other. That means there is no single number to estimate in the meta-analysis, but a distribution of numbers. The most common random effects model also assumes that these different true effects are normally distributed. The meta-analysis therefore estimates the mean and standard deviation of the different effects.
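
The sketch below shows one way the DerSimonian and Laird calculation can be carried out, assuming the usual method-of-moments estimate of the between-study variance (often written tau squared). The estimates and variances are hypothetical and the function name is an assumption for this example. The code needs at least two studies, and it is an outline of the technique rather than a reproduction of what RevMan does internally.

```python
import math

def dersimonian_laird(estimates, variances):
    """Random effects mean, its standard error, and tau squared."""
    k = len(estimates)  # number of studies (must be at least 2)
    w = [1.0 / v for v in variances]  # fixed effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Q measures how far the studies scatter around the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, never negative
    # Random effects weights add tau squared to each study's own variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return mean, se, tau2

# Hypothetical, clearly inconsistent study estimates with equal variances.
mean, se, tau2 = dersimonian_laird([-0.8, 0.1, -0.3], [0.02, 0.02, 0.02])

print(f"mean of the true effects: {mean:.3f} (SE {se:.3f})")
print(f"standard deviation of the true effects: {math.sqrt(tau2):.3f}")
```

When the studies agree closely, tau squared is estimated as zero and the result is the same as the fixed effect inverse variance result. When they disagree, the weights become more equal and the confidence interval widens.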

By selecting ‘random effects’ in the analysis part of RevMan you can calculate an odds ratio, risk ratio or a risk difference based on this approach.

The Mantel-Haenszel approach

The Mantel-Haenszel approach was developed by Mantel and Haenszel over 40 years ago to analyse odds ratios, and has been extended by others to analyse risk ratios and risk differences. It is unnecessary to understand all the details; it is sufficient to say that the Mantel-Haenszel method assumes a fixed effect and determines the weight given to each study in a way similar to inverse variance approaches.
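
For readers who would like to see the arithmetic, here is a minimal sketch of the Mantel-Haenszel pooled odds ratio. Each trial is written as a hypothetical 2×2 table (a, b, c, d): events and non-events in the experimental group, then events and non-events in the control group. The sketch gives the point estimate only; the confidence interval calculation is more involved and is omitted.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio from a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Two hypothetical trials.
tables = [(10, 90, 20, 80), (15, 35, 20, 30)]

for a, b, c, d in tables:
    print(f"trial odds ratio: {(a * d) / (b * c):.3f}")

pooled_or = mantel_haenszel_or(tables)
print(f"Mantel-Haenszel pooled odds ratio: {pooled_or:.3f}")
```

The pooled value is a weighted average of the individual trial odds ratios, with each trial weighted by b × c divided by its total number of participants.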

The Peto method

The Peto method works for odds ratios only. Focus is placed on the observed number of events in the experimental intervention group. We call this O for the ‘observed’ number of events, and compare it with E, the ‘expected’ number of events. Hence an alternative name for this method is the ‘O – E’ method. The expected number is calculated using the overall event rate in both the experimental and control groups. Because of the way the Peto method calculates odds ratios, it is appropriate when trials have roughly equal numbers of participants in each group and treatment effects are small. Indeed, it was developed for use in mega-trials in cancer and heart disease where small effects are likely, yet very important.
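
The ‘O – E’ calculation can be sketched as follows, assuming the standard Peto formulae: for each trial E is the experimental group size multiplied by the overall event rate, V is the hypergeometric variance of O, and the pooled log odds ratio is the sum of (O – E) divided by the sum of V. The trial data and the function name are hypothetical.

```python
import math

def peto_odds_ratio(trials):
    """Peto pooled odds ratio.

    trials: list of (events_trt, n_trt, events_ctl, n_ctl) tuples.
    """
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for events_trt, n_trt, events_ctl, n_ctl in trials:
        n_total = n_trt + n_ctl
        events_total = events_trt + events_ctl
        non_events_total = n_total - events_total
        # Expected events in the experimental group under no treatment effect.
        expected = n_trt * events_total / n_total
        # Hypergeometric variance of the observed number of events.
        v = (n_trt * n_ctl * events_total * non_events_total
             / (n_total ** 2 * (n_total - 1)))
        sum_o_minus_e += events_trt - expected
        sum_v += v
    return math.exp(sum_o_minus_e / sum_v)

# Two hypothetical trials.
trials = [(10, 100, 20, 100), (15, 50, 20, 50)]
peto_or = peto_odds_ratio(trials)
print(f"Peto odds ratio: {peto_or:.3f}")
```

A trial with no events in either arm has O – E equal to zero and V equal to zero, so adding it to the list leaves the pooled result unchanged. A trial with no events in one arm only still contributes, without any correction for the zero cell.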

The Peto method is better than the other approaches at estimating odds ratios when there are lots of trials with no events in one or both arms. It is the best method to use with rare outcomes of this type.

The Peto method is generally less useful in Cochrane reviews, where trials are often small and some treatment effects may be large.