The Cochrane Collaboration open learning material

Meta-analysis of continuous data

On skew

As we have said above, skewed data are not bad data. They are simply data that create a few complications because the distribution of likely measurements is asymmetrical and less convenient for statistical analysis. The main problem is that means and standard deviations are not very useful summaries of skewed data. Having said this, many investigators still report means and standard deviations even when data are skewed.

Detecting skew

There is a handy trick to check the results of your included studies to see if they are skewed, even if they present a mean and standard deviation. This is often used in Cochrane reviews. The trick works if (i) you have a mean and standard deviation and (ii) there is an absolute minimum possible value for the outcome. Consider blood concentrations. These cannot be less than zero, so have an absolute minimum. Weight also has an absolute minimum possible value, as do scores on most psychometric scales. But weight loss and change-from-baseline measures can be negative and usually don't have an absolute minimum, so the trick won't work on these.

Here's the trick. Divide the mean by the standard deviation. If the result is less than 2 then there is some indication of skewness. If it is less than 1 (i.e. the standard deviation is bigger than the mean) then there is almost certainly skewness.
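The check can be sketched in a few lines of Python. This is only an illustration: the function name `skew_check` and its `minimum` argument are hypothetical. The text describes the case where the minimum is zero; subtracting a non-zero minimum before dividing is an assumption made here so that the same check can be applied to scales that do not start at zero.

```python
def skew_check(mean, sd, minimum=0.0):
    """Rule-of-thumb skew check for an outcome with a known lowest possible value.

    Returns the ratio (mean - minimum) / sd and a short interpretation.
    The `minimum` argument is an assumption of this sketch; with the
    default of 0 it reduces to mean / sd as described in the text.
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    ratio = (mean - minimum) / sd
    if ratio < 1:
        return ratio, "almost certainly skewed"
    if ratio < 2:
        return ratio, "some indication of skew"
    return ratio, "no indication of skew from this check"


# Hypothetical example: a blood concentration reported as mean 10, SD 12.
ratio, verdict = skew_check(10, 12)
print(round(ratio, 2), verdict)
```

Remember that the check is only meaningful for outcomes that have an absolute minimum, so it should not be applied to change-from-baseline measures.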

There are a number of ways to deal with skewed data, but unfortunately few of them tend to be useful in meta-analysis. It is worth remembering that methods for meta-analysis (being based on t-tests) are quite robust to a little bit of skewness, especially if sample sizes are large.

Options for dealing with skewed data

The strategies that you might consider using with skewed data depend on the way the original trialists analyse and report results. The options you might encounter include:

(a) The trialists have ignored (or not noticed) the skewness and simply report means, standard deviations, and sample sizes.

This appears to be the simplest situation, as you can directly enter these numbers into RevMan. However, as we have noted, there is a possibility that these 'improperly' analysed data may be misleading. So, we will be unsure of the validity of our findings.

(b) The trialists have log-transformed the data for analysis, and report geometric means.

When a positively skewed distribution is log-transformed the skewness will be reduced. This is a recommended method of analysis for skewed data. In some fields, such as analysing antibody concentrations after vaccination, this approach is the norm. The data we wish to analyse in RevMan should also be on the log scale: the mean of the logged data will be the log of the geometric mean. The standard deviation can be obtained from the confidence interval for the geometric mean, as described in section 7.7.3.2 of the Cochrane Handbook for Systematic Reviews of Interventions.
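As a rough sketch of this conversion, the following Python function takes a geometric mean and its confidence interval for a single group and returns the mean and standard deviation on the log scale. It assumes a 95% confidence interval that was calculated on the log scale using a normal approximation (hence the default multiplier of 1.96); for small samples a value from the t distribution would be more appropriate. The function name and the numbers in the example are hypothetical, and the Handbook section cited above should be treated as the authoritative description.

```python
import math


def log_scale_summary(geometric_mean, ci_lower, ci_upper, n, z=1.96):
    """Convert a geometric mean and its CI to a mean and SD on the log scale.

    Assumes the CI is for a single group and is symmetric on the log scale.
    z = 1.96 corresponds to a 95% CI under a normal approximation.
    """
    log_mean = math.log(geometric_mean)
    # The width of the CI on the log scale is 2 * z standard errors.
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    # SD of the logged data = standard error * sqrt(sample size).
    sd = se * math.sqrt(n)
    return log_mean, sd


# Hypothetical example: geometric mean 10 (95% CI 8.2 to 12.2) in 25 people.
log_mean, log_sd = log_scale_summary(10, 8.2, 12.2, 25)
print(round(log_mean, 3), round(log_sd, 3))
```

The two numbers returned are on the natural log scale, so the pooled result will also be on the log scale and will need to be back-transformed for interpretation.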

(c) The trialists use non-parametric tests (e.g. Mann-Whitney) and describe averages using medians.

Non-parametric tests are a satisfactory alternative for analysing skewed data in trials. But as we cannot obtain means and standard deviations, we cannot include results of such analyses directly in a meta-analysis. This is, of course, unsatisfactory, especially when the inappropriately analysed results described in (a) can be used. One suggestion is that results of all studies are reported in a table in your review, regardless of the method of analysis used in the trials. This means that such data will not be lost from the review, and their results can be considered when drawing conclusions, even if they cannot be formally pooled.

Statistical methods do exist for combining p values from non-parametric tests, but not for estimating effects or detecting heterogeneity.
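To illustrate what combining p values involves, here is a minimal sketch of one such method, Fisher's method. Choosing Fisher's method is an assumption made for this example; the text above does not name a particular technique. The method assumes the studies are independent and, ideally, that the p values are one-sided in the same direction. The function name is hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p values using Fisher's method.

    The statistic -2 * sum(ln p) follows a chi-squared distribution with
    2k degrees of freedom under the null hypothesis, where k is the
    number of p values. For even degrees of freedom the upper tail
    probability has a closed form, so no extra libraries are needed.
    """
    if not p_values or any(not 0 < p <= 1 for p in p_values):
        raise ValueError("p values must lie in the interval (0, 1]")
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Upper tail of chi-squared with 2k df: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical example: three trials reporting p = 0.04, 0.20 and 0.11.
statistic, combined_p = fisher_combined_p([0.04, 0.20, 0.11])
print(round(statistic, 2), round(combined_p, 4))
```

Note that the output is only a combined p value. As stated above, it gives no estimate of the size of the effect and no information about heterogeneity.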

Fixed effect and random effects for continuous data

In Module 11 we covered differences between fixed effect and random effects meta-analysis of dichotomous data, and the issues are similar in continuous data. In a fixed effect inverse variance meta-analysis, the assumption is that all included studies are estimating one true or fixed effect and so variations between studies are due only to random error. Studies are weighted according to the inverse of the variance of their effect estimate, which is determined by the standard deviations and the sample sizes. A potential problem therefore is that studies with restrictive eligibility criteria will have less variance (smaller standard deviation) and so will be given greater weight.
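The weighting can be illustrated with a short Python sketch for the difference in means between two groups. The function names and the trial data are hypothetical, and the code is a simplified illustration of inverse variance weighting rather than a reproduction of the calculations in RevMan.

```python
import math


def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Difference in means between two groups and the variance of that difference."""
    md = m1 - m2
    var = sd1 ** 2 / n1 + sd2 ** 2 / n2
    return md, var


def fixed_effect(effects, variances):
    """Fixed effect inverse variance pooled estimate, its standard error and the weights."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    return pooled, se, weights


# Two hypothetical trials of the same size (50 per group).
# Trial A has SD 5 in both groups; trial B has SD 10 in both groups.
md_a, var_a = mean_difference(18, 5, 50, 20, 5, 50)
md_b, var_b = mean_difference(16, 10, 50, 20, 10, 50)

pooled, se, weights = fixed_effect([md_a, md_b], [var_a, var_b])
shares = [w / sum(weights) for w in weights]
print(pooled, round(se, 3), shares)
```

Although the two trials have the same number of participants, trial A receives four times the weight of trial B because its standard deviation is half as large.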

A random effects meta-analysis of continuous data assumes that the studies are estimating different effects (as they will all have differences to do with population, setting etc.) and that these different effects are distributed according to a particular pattern. When there is no heterogeneity, little or no variation between studies needs to be accounted for, so a random effects meta-analysis and a fixed effect meta-analysis will approximate each other. Weight is attributed slightly differently when we use a random effects meta-analysis; however, studies with restrictive eligibility criteria will again be given greater weight.
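The following Python sketch shows how the weights change in a random effects analysis. It assumes the DerSimonian and Laird estimator of the between-study variance (called `tau2` here), which is one commonly used approach; the text above does not specify an estimator, so this choice, the function name and the example data are all assumptions made for illustration.

```python
import math


def random_effects_dl(effects, variances):
    """Random effects pooled estimate using the DerSimonian and Laird estimator.

    Returns the pooled effect, its standard error and the estimated
    between-study variance tau2.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # tau2 is set to zero when Q is no larger than its degrees of freedom.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Each study's variance is inflated by tau2 before inverting.
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sw_star
    se = math.sqrt(1.0 / sw_star)
    return pooled, se, tau2


# Hypothetical mean differences and their variances from three trials.
pooled, se, tau2 = random_effects_dl([-2.0, -6.0, 1.0], [1.0, 4.0, 2.0])
print(round(pooled, 3), round(se, 3), round(tau2, 3))
```

When `tau2` is estimated as zero, the weights are the same as in the fixed effect analysis and the two results coincide. When `tau2` is greater than zero, the weights become more similar across studies, but a study with a smaller variance still receives more weight than one with a larger variance.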

 © The Cochrane Collaboration 2002