MEIO - Summer School - Introduction to Clustered Data with Examples from Repeated Measurements Studies and Cluster Randomised Trials

Course title: Introduction to Clustered Data with Examples from Repeated Measurements Studies and Cluster Randomised Trials

Taught by: Rumana Omar, Reader and Head of the Biostatistics Group, Department of Statistical Science, University College London. rumana@stats.ucl.ac.uk

Course language: English

Course dates and times: 3, 4 and 5 May, from 9:00 to 13:00.

Venue: Room 100, FME

Type of activity and teaching load: 10-hour course

Academic recognition: 1.5 credits

Registration dates: 5 to 25 April 2010

Overview:

Summary: Clustered data are commonly collected in medical research studies. Two frequently used study designs that produce clustered data are repeated measurements studies and cluster randomised trials. Health research studies often collect repeated measurements of outcomes over time on patients or other subjects. The measurements may be planned at fixed or at variable time intervals, and the pattern of measurement times may be identical for all patients or may differ from patient to patient. Cluster randomised trials are trials in which groups of patients, rather than individual patients, are randomised to receive the different interventions of interest. Interventions typically evaluated in a cluster randomised trial include health promotion programmes, training packages for health professionals, educational interventions and screening policies. Special methods are required for the analysis of clustered data.

Summary Statistics Method: The principle underlying this approach is that each patient’s multiple measurements may be summarised by one summary measure of particular interest. Having reduced the information to one data point per patient, we can analyse the summary measures as if they were raw data. The choice of summary measure depends on the clinical questions to be answered and the nature of the study outcome. For example, the mean of the repeated measurements is commonly used when one is interested in each patient’s average response over time. Other possible summary measures include the area under the curve, the time spent above a given level, or the time to return to baseline level. This method does not easily handle clusters of unequal size.
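The two steps described above (reduce each patient to one summary measure, then analyse the summaries) can be sketched as follows. This is only an illustrative sketch: the patient data, the group labels and the helper names `area_under_curve` and `group_mean` are hypothetical and do not come from the course material.

```python
from statistics import mean

# Hypothetical repeated measurements: patient id -> (group, times, values).
# Patients may have different numbers of measurements.
patients = {
    "p1": ("control", [0, 1, 2, 4], [10.0, 9.5, 9.0, 9.0]),
    "p2": ("control", [0, 2, 4], [11.0, 10.0, 10.5]),
    "p3": ("treated", [0, 1, 2, 4], [10.5, 8.0, 7.0, 6.5]),
    "p4": ("treated", [0, 1, 4], [9.5, 8.5, 6.0]),
}


def area_under_curve(times, values):
    """Trapezoidal area under one patient's response curve."""
    auc = 0.0
    for i in range(1, len(times)):
        auc += (times[i] - times[i - 1]) * (values[i] + values[i - 1]) / 2.0
    return auc


# Step 1: reduce each patient's series to one value per summary measure.
summaries = {
    pid: {
        "group": group,
        "mean": mean(values),
        "auc": area_under_curve(times, values),
    }
    for pid, (group, times, values) in patients.items()
}


# Step 2: analyse the summaries as if they were raw data,
# here simply by comparing group means of the per-patient AUC.
def group_mean(measure, group):
    return mean(s[measure] for s in summaries.values() if s["group"] == group)


auc_difference = group_mean("auc", "treated") - group_mean("auc", "control")
print(f"Difference in mean AUC (treated - control): {auc_difference:.2f}")
```

In a real analysis the per-patient summaries would normally be compared with a standard test for independent observations (for example a two-sample t-test) rather than by a raw difference of means.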

Advanced methods for the analysis of clustered data: Statistical models are available which analyse all raw data measurements while allowing for the fact that measurements within clusters are correlated. Modelling the individual data in this way gives greater flexibility. For example, in repeated measurements studies, these models allow the inclusion of explanatory variables measured at the subject level as well as those which vary over time. In cluster randomised trials, we would need to analyse individual data points if we wished to include patient characteristics (e.g. age and gender) as explanatory variables. Individual data modelling approaches include linear mixed effects models (also known as multilevel or random effects models) and marginal models based on generalised estimating equations (GEE). It should be noted that, for non-continuous outcomes, the regression coefficients from GEE models have a population-averaged interpretation, whereas those from random effects models have a cluster-specific interpretation.
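The difference between the population-averaged and cluster-specific interpretations can be illustrated numerically. The sketch below assumes a random-intercept logistic model, logit P(Y = 1 | u, x) = b0 + b1*x + u with u ~ N(0, sigma^2), and averages over u by numerical integration to obtain the marginal (population-averaged) log odds ratio. The model, the parameter values and all function names are assumptions made for illustration only; this does not fit a mixed model or a GEE to data.

```python
import math


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def logit(p):
    return math.log(p / (1.0 - p))


def marginal_prob(linear_predictor, sigma, n_grid=2001, z_max=8.0):
    """P(Y = 1) averaged over a N(0, sigma^2) random intercept.

    Uses the trapezoid rule on a standard normal grid z, with u = sigma * z.
    """
    step = 2.0 * z_max / (n_grid - 1)
    total = 0.0
    weight_sum = 0.0
    for i in range(n_grid):
        z = -z_max + i * step
        w = math.exp(-0.5 * z * z)  # unnormalised normal density
        if i in (0, n_grid - 1):
            w *= 0.5  # trapezoid end-point weights
        total += w * expit(linear_predictor + sigma * z)
        weight_sum += w
    return total / weight_sum


def population_averaged_log_or(b0, b1, sigma):
    """Marginal log odds ratio for x = 1 versus x = 0."""
    p1 = marginal_prob(b0 + b1, sigma)
    p0 = marginal_prob(b0, sigma)
    return logit(p1) - logit(p0)


# Hypothetical cluster-specific coefficients and random-intercept SD.
b0, b1, sigma = -1.0, 1.0, 2.0
pa = population_averaged_log_or(b0, b1, sigma)
print(f"cluster-specific log OR: {b1:.3f}")
print(f"population-averaged log OR: {pa:.3f}")
```

Under these assumptions the population-averaged log odds ratio is closer to zero than the cluster-specific coefficient b1, and the two coincide when sigma is 0, that is, when there is no between-cluster variation.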

Sample size issues: The similarity of patients within clusters also has implications for the design of a clustered or longitudinal study, since the clustering must be taken into account in the sample size calculations. The sample size that would be required for the same study without clustering (that is, with a single level) must be inflated by a factor known as the design effect. The design effect is based on the intracluster correlation coefficient, which reflects the extent of similarity expected between patients within the same cluster.
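As a worked illustration of this inflation, the sketch below uses the standard design effect for clusters of equal size m, namely 1 + (m - 1) x ICC, where ICC is the intracluster correlation coefficient. The equal-cluster-size assumption, the numerical inputs and the function names are illustrative choices, not taken from the course material.

```python
import math


def design_effect(cluster_size, icc):
    """Design effect for equal cluster sizes: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1) * icc


def inflated_sample_size(n_unclustered, cluster_size, icc):
    """Total number of patients needed once clustering is allowed for."""
    n = n_unclustered * design_effect(cluster_size, icc)
    # Round up; the small tolerance guards against floating-point noise.
    return math.ceil(n - 1e-9)


# Hypothetical inputs: 200 patients needed without clustering,
# 21 patients per cluster, ICC of 0.05.
n_total = inflated_sample_size(200, cluster_size=21, icc=0.05)
n_clusters = math.ceil(n_total / 21)
print(f"design effect: {design_effect(21, 0.05):.2f}")
print(f"patients required: {n_total}, clusters required: {n_clusters}")
```

With an ICC of 0 or a cluster size of 1 the design effect is 1, so no inflation is needed, which matches the single-level case.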

Syllabus:

Introduction to clustered data with examples from repeated measurements studies and cluster randomised trials.
Summary Statistics Method
Linear Mixed Effects / Random Effects / Multilevel Models
Marginal Models (GEE)
Population-averaged versus cluster-specific interpretation
Sample size implications