Diversity and heterogeneity

You can investigate heterogeneity with subgroup analyses or meta-regression.

Investigating sources of heterogeneity

Most meta-analyses aim to summarize the size of an effect across studies or to establish with greater power whether an effect exists. When different studies give different results, an alternative aim is to examine why effects differ across studies. Subgroup analyses and meta-regression are techniques for working out whether particular characteristics of studies are related to the size of the treatment effect.

One example is dose or intensity. In many reviews you may be able to determine some measure of how “intensely” an intervention was given in different studies. For drugs this might be dose; for personal contact therapies it might be the amount of contact time. The ideal way of looking at the effect of dose would be randomised trials comparing the doses directly (head-to-head comparisons), but these often don’t exist. Within your review, it may be of interest to determine whether the dose or intensity of the intervention is related to the extent of benefit of treatment in different studies. You could look at this by meta-regression. An alternative is to group the trials by the dose they used and compare the results across these subgroups.
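To make the idea of relating dose to benefit concrete, here is a minimal sketch of a fixed-effect meta-regression with a single covariate, written as a weighted straight-line fit in which each trial is weighted by the inverse of its variance. The function name, the doses and the trial results are all hypothetical and chosen only for illustration. This simple version ignores any heterogeneity left unexplained by dose, which a random-effects meta-regression would allow for, so treat it as an illustration of the principle and not as a recommended analysis.

```python
import math

def meta_regression_slope(effects, std_errors, covariate):
    """Weighted straight-line fit of trial effects on one trial-level covariate.

    Each trial is weighted by 1 / SE^2 (fixed-effect weights).
    Returns (intercept, slope, standard error of the slope).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    # Weighted means of the covariate and of the effects
    x_bar = sum(w * x for w, x in zip(weights, covariate)) / total_w
    y_bar = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Weighted sums of squares and cross-products
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, covariate))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    slope_se = math.sqrt(1.0 / sxx)  # valid when trial variances are treated as known
    return intercept, slope, slope_se

# Hypothetical data: four trials, each testing a different dose (mg)
doses = [10, 20, 40, 80]
log_rr = [-0.10, -0.15, -0.30, -0.45]   # log risk ratios (made up)
se_log_rr = [0.20, 0.15, 0.25, 0.30]    # their standard errors (made up)

intercept, slope, slope_se = meta_regression_slope(log_rr, se_log_rr, doses)
z = slope / slope_se
print(f"change in log RR per mg: {slope:.4f} (SE {slope_se:.4f}, z = {z:.2f})")
```

The slope estimates how the log risk ratio changes per unit of dose across trials. Because this is a comparison between trials, not within them, it remains observational.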

Subgroup analyses

Subgroup analyses are meta-analyses on subgroups of the studies. There are problems with subgroup analyses, and they can result in misleading conclusions if not undertaken with care. Some of the most important points are:

  1. Restrict the number of subgroup analyses to a minimum, to reduce the possibility of finding a “positive” or significant result by chance.
  2. Pre-specify subgroup analyses whenever possible in order to minimise spurious findings (apparent differences between subgroups that are purely due to chance variation). If there is a good clinical reason why a particular group of participants or studies needs to be looked at separately, you should have specified it in your protocol. Deciding on subgroups after you have the results of the review may lead to bias, because a subgroup may be put together on the basis of a particular result.
  3. Have a scientific rationale for all subgroup analyses.
  4. Remember that a difference between subgroups is based on an observational comparison, and may be due to confounding by other factors.
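The first point can be illustrated with a short calculation. The sketch below assumes that each subgroup analysis is an independent test at the conventional 5% significance level and that there is no true effect anywhere. Both are simplifying assumptions, and the function name is hypothetical.

```python
def chance_of_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests is
    'significant' purely by chance when no true effect exists."""
    return 1.0 - (1.0 - alpha) ** n_tests

for k in (1, 5, 10, 20):
    print(f"{k:2d} subgroup analyses: {chance_of_false_positive(k):.0%}")
# Prints 5%, 23%, 40% and 64% respectively
```

Under these assumptions, ten subgroup analyses already carry about a 40% chance of at least one spurious “significant” finding.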

To help explain subgroup analysis, think of the question of whether training reviewers results in higher quality reviews. Imagine we had 15 trials looking at training versus no training: seven used a self-directed learning module such as this one, and in eight the intervention was face-to-face training. You may decide to treat the method of delivery of training (self-directed or face-to-face) as defining separate subgroups. There are good reasons for doing this, as the effect of the intervention may differ between the two groups and it may not be appropriate to combine them (there is ‘clinical’ heterogeneity).

Although the use of subgroups in this review will give you some information about the effect of each method of training delivery compared with no training, it does not give you direct information about how the two methods compare with each other. This is because no trial in this example has directly compared self-directed with face-to-face training within the same sample. An indirect estimate of the difference between methods can be obtained by comparing the overall effects in the two subgroups. However, differences between the results of the two subgroups could be explained by other differences between the trials, not just the intervention. For example, the self-directed training could have been given to people from a different background to those given face-to-face training, and it might be this difference that is really responsible for the observed difference in treatment effects.
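The indirect estimate described above can be sketched as follows: pool each subgroup separately with inverse-variance (fixed-effect) weights, then take the difference between the two pooled effects, with a standard error that combines the uncertainty of both. The trial results are invented for illustration (seven self-directed and eight face-to-face trials, each given as an effect estimate and its standard error), and the function names are hypothetical.

```python
import math

def pool_fixed_effect(trials):
    """Inverse-variance pooled estimate and its standard error.

    trials: list of (effect, standard_error) pairs.
    """
    weights = [1.0 / se ** 2 for _, se in trials]
    pooled = sum(w * eff for w, (eff, _) in zip(weights, trials)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

def indirect_difference(subgroup_a, subgroup_b, z_crit=1.96):
    """Difference between two pooled subgroup effects with a 95% CI."""
    est_a, se_a = pool_fixed_effect(subgroup_a)
    est_b, se_b = pool_fixed_effect(subgroup_b)
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)  # subgroups share no trials
    return diff, (diff - z_crit * se_diff, diff + z_crit * se_diff)

# Hypothetical effects of training versus no training (made-up numbers)
self_directed = [(0.30, 0.20), (0.10, 0.25), (0.45, 0.30), (0.20, 0.15),
                 (0.35, 0.22), (0.05, 0.28), (0.25, 0.18)]
face_to_face = [(0.50, 0.21), (0.40, 0.19), (0.55, 0.26), (0.30, 0.24),
                (0.60, 0.30), (0.45, 0.17), (0.35, 0.23), (0.50, 0.27)]

diff, (low, high) = indirect_difference(face_to_face, self_directed)
print(f"face-to-face minus self-directed: {diff:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Whatever the numbers show, the comparison is between different sets of trials, so the caveat about confounding applies to the result.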

One common error in interpreting differences between subgroups is to note that the overall effect in one subgroup is statistically significant whilst the effect in the other subgroup is non-significant, and then to conclude that there is a significant difference between subgroups. The significance of a result depends on both the size of the effect and the amount of data. Consider an analysis where subgroup 1 has a statistically significant RR of 2.0, 95% CI (1.5, 3.0), and subgroup 2 has a statistically non-significant RR of 2.0, 95% CI (0.1, 100). The effect in subgroup 2 is no different in magnitude but is clearly based on far fewer data. It would be wrong to conclude that there is any difference in treatment efficacy between subgroup 1 and subgroup 2, despite the difference in statistical significance.
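The example above can be checked directly by testing the difference between the two subgroups instead of comparing their separate significance levels. This sketch works on the log RR scale and recovers each standard error from the width of the quoted 95% confidence interval. That step assumes the intervals are symmetric on the log scale, which the illustrative intervals above are only approximately. The function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of a log RR, recovered from its 95% CI limits."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def subgroup_difference_test(rr_1, ci_1, rr_2, ci_2):
    """z-test for the difference between two subgroup log RRs."""
    diff = math.log(rr_1) - math.log(rr_2)
    se_diff = math.sqrt(se_from_ci(*ci_1) ** 2 + se_from_ci(*ci_2) ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the normal distribution
    return diff, se_diff, z, p

diff, se_diff, z, p = subgroup_difference_test(2.0, (1.5, 3.0), 2.0, (0.1, 100))
print(f"difference in log RR = {diff:.2f}, z = {z:.2f}, p = {p:.2f}")
# Prints: difference in log RR = 0.00, z = 0.00, p = 1.00
```

The two subgroups have identical point estimates, so the test gives no evidence at all of a difference between them, even though only one of the subgroup results is statistically significant on its own.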


Meta-regression can formally test whether there is evidence of different effects in different subgroups of trials. For example, you can use meta-regression to test whether treatment effects are bigger in low-quality studies than in high-quality studies.

Meta-regression is potentially a very useful technique. However, it can’t be done in RevMan and, if used inappropriately, its interpretation can be misleading. This is again because differences between studies, even if they are well-performed randomized trials, are entirely observational in nature and are prone to ‘bias’ and ‘confounding’. If you summarize patient characteristics at a trial level, you run the risk of completely failing to detect genuine relationships between these characteristics and the size of treatment effect. Further, the risk of obtaining a spurious ‘explanation’ for variable treatment effects is high when you have a small number of studies and many characteristics that differ. Meta-regression is rarely performed in Cochrane reviews and is not an available option in Cochrane software, so if you have strong reason to include a meta-regression in your review, you will need the help of a statistician.

Summary of this module

  • Heterogeneity is simply diversity in the characteristics of trials. Not all trials addressing the same question will be identical with respect to clinical components (participants, interventions and outcomes), methodological components (blinding, sample size, method of randomisation) or their results.
  • When trials are ‘too different’, whether in clinical, methodological or statistical components, it may be best not to combine them in a meta-analysis, and you need to consider this carefully.
  • When doing or interpreting a meta-analysis you can identify heterogeneity graphically and by use of a statistical test.
  • When you are combining trials in a meta-analysis there are several methods available. When there is statistical heterogeneity, there is debate about whether a random-effects or fixed-effect analysis is best. The safest option is to look at both sets of results and be conservative in your conclusions. In reality, it is unlikely that trials you consider alike enough in clinical and methodological terms to combine will result in a very different point estimate, regardless of your choice of method.
  • Subgroup analysis is a method available in RevMan to look at the results of different subgroups of trials. Subgroup analyses should be planned at the protocol stage, based on good scientific reasoning, and kept to a minimum. Conclusions from subgroup analyses should be drawn cautiously, remembering that these conclusions are based on subdivision of studies and indirect comparisons, and not on formal statistical tests.
  • Meta-regression is a method, not available in RevMan, for formally testing whether there is evidence of different effects related to different characteristics of trials. It needs to be used with great care.