Issues related to the unit of analysis

Read the section on cluster randomized trials in Section 8 of the Reviewers’ Handbook.

Cluster randomized trials

A cluster randomized trial is a trial in which individuals are randomized in groups (i.e. the group is randomized, not the individual). For example, in a rural area with an endemic disease, we might randomize whole villages to receive the intervention or not, rather than individual people. We then say that the village is the unit of randomization. In other situations, general practices, hospitals, families or school classrooms may be randomized. Reasons for performing cluster randomized trials vary. Sometimes the intervention can only be administered to the group, for example an addition to the water supply or a public education campaign; sometimes the motivation is to avoid contamination (participants who were not allocated to the intervention are nevertheless affected by it, because it reaches them through those who were); sometimes the design is simply more convenient or economical.

A simple approach to dealing with cluster randomized trials is to assess outcomes only at the level of the group, thereby keeping the unit of analysis the same as the unit of randomization. One might measure a dichotomous outcome of whether the practice/classroom/village was a ‘success’ or a ‘failure’, or a continuous outcome of the percentage of individuals in the group who benefited. In this way, we obtain one outcome measurement from each randomized unit, and the analysis can proceed as if the groups were individuals, that is, using the techniques described elsewhere in Modules 11 and 12. It will probably strike you that there are limitations to this approach. First, cluster randomized trials are likely to randomize fewer groups than most simple trials randomize individuals. For example, a trial might randomize ten villages with a total of 15,000 inhabitants. Analysing by village, we would end up with only ten observations. So we would have fewer data (and hence less statistical power) than a simple trial involving substantially fewer participants analysed as individuals. Second, not all groups will be the same size, and we would give the same weight to a village of 10,000 inhabitants as to a village of 150 inhabitants.
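The group-level approach described above can be sketched in a few lines of Python. The village sizes and counts below are invented purely for illustration, and the function name `cluster_level_summary` is hypothetical. Each village contributes exactly one observation (its percentage of inhabitants who benefited), so a village of 10,000 counts for no more than a village of 150.

```python
from statistics import mean

# Hypothetical data: (arm, number of inhabitants, number who benefited).
villages = [
    ("intervention", 10000, 6000),
    ("intervention", 150, 120),
    ("control", 5000, 2000),
    ("control", 200, 100),
]

def cluster_level_summary(rows):
    """Return, for each arm, the mean of the per-village percentages.

    Every village is one observation, whatever its size.
    """
    by_arm = {}
    for arm, size, benefited in rows:
        by_arm.setdefault(arm, []).append(100.0 * benefited / size)
    return {arm: mean(values) for arm, values in by_arm.items()}

summary = cluster_level_summary(villages)
difference = summary["intervention"] - summary["control"]
print(summary, difference)
```

With only two villages per arm the comparison rests on four observations in total, which illustrates why this approach has so little statistical power.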

An alternative possibility is to ignore the groupings and compare all the individuals in intervention groups with all those in control groups. This has been a common approach both to analysing individual cluster randomized trials and to representing them in systematic reviews. But it is problematic because it ignores the fact that individuals within a particular group tend to be more similar to each other than to members of other groups. Such analyses treat the individuals as if they were independent observations, so they produce confidence intervals that are too narrow and P values that are too small. They can therefore spuriously overstate the significance of differences, and should be avoided.

Think of the example where we randomize villages. Residents in one village may share the same climate, nutrition, education and health care, which make their outcomes more similar to each other than to residents in a different village. We use the term intra-cluster correlation coefficient to describe the extent to which two members of one cluster are more similar than two people from different clusters.
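To make the idea of within-cluster similarity concrete, here is a minimal sketch of one common way to estimate an intra-cluster correlation coefficient from a continuous outcome: the one-way analysis of variance estimator. It assumes equal-sized clusters, and it is offered only as an illustration, not as the method prescribed by the Handbook. The function name `anova_icc` and the data are hypothetical.

```python
from statistics import mean

def anova_icc(clusters):
    """Estimate the ICC from equal-sized clusters of continuous outcomes.

    Uses ICC = (MSB - MSW) / (MSB + (m - 1) * MSW), where MSB and MSW are
    the between-cluster and within-cluster mean squares and m is the
    (common) cluster size.
    """
    k = len(clusters)
    m = len(clusters[0]) if clusters else 0
    if k < 2 or m < 2 or any(len(c) != m for c in clusters):
        raise ValueError("need at least two clusters of equal size (2 or more)")
    grand_mean = mean([x for c in clusters for x in c])
    cluster_means = [mean(c) for c in clusters]
    msb = m * sum((cm - grand_mean) ** 2 for cm in cluster_means) / (k - 1)
    msw = sum(
        (x - cm) ** 2 for c, cm in zip(clusters, cluster_means) for x in c
    ) / (k * (m - 1))
    denominator = msb + (m - 1) * msw
    if denominator == 0:
        raise ValueError("all outcomes are identical; the ICC is undefined")
    return (msb - msw) / denominator

# Hypothetical villages whose residents resemble each other closely.
similar_within = [[1.0, 1.2, 0.9], [3.0, 3.1, 2.8], [5.1, 4.9, 5.0]]
print(anova_icc(similar_within))  # close to 1: strong clustering
```

An estimate near 1 means that knowing someone's village tells you almost everything about their outcome; an estimate near 0 means that village membership carries little information.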

There are statistical techniques for the appropriate analysis of cluster randomized trials. These techniques recognize that clusters are made up of individuals and that some clusters contain more individuals than others. The intra-cluster correlation coefficient plays an important role in these techniques. Further details can be found in Section 8 of the Reviewers’ Handbook. Appropriately analysed cluster randomized trials may be incorporated into meta-analyses using the generic inverse-variance method.
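As a rough illustration of how the intra-cluster correlation coefficient enters such an analysis, the sketch below uses a widely described approximation that is not spelled out in this text: the design effect, 1 + (m − 1) × ICC, where m is the average cluster size. A standard error from an analysis that ignored clustering is multiplied by the square root of the design effect, and the adjusted results are then pooled with fixed-effect inverse-variance weights. All function names and numbers are hypothetical, and the Handbook should be consulted for the recommended procedure.

```python
import math

def design_effect(mean_cluster_size, icc):
    """Approximate variance inflation caused by clustering."""
    return 1.0 + (mean_cluster_size - 1.0) * icc

def inflate_se(naive_se, mean_cluster_size, icc):
    """Adjust a standard error that was computed ignoring clustering."""
    return naive_se * math.sqrt(design_effect(mean_cluster_size, icc))

def inverse_variance_pool(estimates, standard_errors):
    """Fixed-effect pooled estimate and its standard error.

    Each study is weighted by 1 / SE^2 (the generic inverse-variance idea).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical cluster trial: average cluster size 21 and an assumed ICC
# of 0.05, so the design effect is 1 + 20 * 0.05 = 2.
adjusted_se = inflate_se(naive_se=0.10, mean_cluster_size=21, icc=0.05)

# Pool it with a hypothetical individually randomized trial.
pooled, pooled_se = inverse_variance_pool([0.50, 0.30], [adjusted_se, 0.20])
print(adjusted_se, pooled, pooled_se)
```

Because the adjusted standard error is larger than the naive one, the cluster trial receives less weight in the pooled result than it would if clustering were ignored.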

‘Body-part randomization’ and ‘body-part analysis’

We use the terms ‘body-part randomization’ and ‘body-part analysis’ to distinguish between two different types of study design involving parts of the body.

By ‘body-part randomization’ designs we mean those in which similar body parts are randomized to different interventions. For example, a person’s arms may be randomized so that each gets a different cream applied. Other examples include studies of eyes and teeth. One issue to think about is contamination: could a treatment in one part affect what happens in another? If so, this raises the question of whether such designs are appropriate in the first place. You may notice a similarity between this design and the crossover design described above. In both designs, each person receives both interventions. In fact, the analysis of a body-part randomization trial should proceed in the same way as the analysis of a crossover trial, involving a paired analysis.
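A paired analysis works with the within-person differences rather than with two independent groups. The sketch below computes a paired t statistic for a continuous outcome. It is one simple form of paired analysis, chosen here for illustration; the data and the function name `paired_t_statistic` are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(treated, comparator):
    """Return (t statistic, degrees of freedom) for paired measurements.

    treated[i] and comparator[i] must come from the same person, for
    example the arm given cream A and the arm given cream B.
    """
    if len(treated) != len(comparator) or len(treated) < 2:
        raise ValueError("need at least two complete pairs")
    differences = [a - b for a, b in zip(treated, comparator)]
    sd = stdev(differences)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd / math.sqrt(len(differences))
    return mean(differences) / standard_error, len(differences) - 1

# Hypothetical skin scores for three people, one arm per cream.
cream_a = [5.0, 7.0, 9.0]
cream_b = [4.0, 5.0, 6.0]
t_value, df = paired_t_statistic(cream_a, cream_b)
print(t_value, df)
```

Here the scores vary widely between people, but the differences within each person are consistent. A paired analysis uses that consistency, whereas an unpaired comparison would be swamped by the between-person variation.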

By ‘body-part analysis’ we mean a particular approach to the analysis of a standard parallel group design trial. Suppose a (whole) individual is randomized to receive a surgical intervention for cataracts. If he or she has cataracts in both eyes, you might collect outcomes for the vision in each eye separately, and you might therefore want the patient to contribute two measurements to the data analysis. This is rather like a cluster randomized trial, where the person is the cluster and the eyes are the individuals within the cluster. In fact, the analysis of these types of trials should proceed in the same way as the analysis of a cluster randomized trial. However, if there are only one or two measurements for each individual, it may be preferable and simpler to select only one measurement per individual.