Multi-state models: Rates, risks, and pseudo-values

Date:

June 17 to 21, mornings from 9:00 to 12:00.

Instructors

Henrik Ravn

Henrik Ravn is senior statistical director at Novo Nordisk A/S, Denmark. He graduated with an MSc in theoretical statistics from the University of Aarhus, Denmark, in 1992 and completed a PhD in biostatistics at the University of Copenhagen, Denmark, in 2002. He joined Novo Nordisk in late 2015 after more than 22 years of biostatistical and epidemiological research at Statens Serum Institut, Denmark, and in Guinea-Bissau, West Africa. He has co-authored more than 160 papers, mainly in epidemiology and applications of survival analysis, and has taught several courses as an external lecturer at the Section of Biostatistics, University of Copenhagen.

Per Kragh Andersen

Per Kragh Andersen has been professor of biostatistics at the Department of Public Health, University of Copenhagen, Denmark, since 1998. He graduated in mathematical statistics from the University of Copenhagen in 1978, received his PhD in 1982 and a DMSc degree in 1997. From 1993 to 2002 he worked half time as chief statistician at Danish Epidemiology Science. He is an author or co-author of more than 125 papers on statistical methodology and more than 250 papers in the medical literature. His research has concentrated on survival analysis, and he is a co-author of the 1993 book ‘Statistical Models Based on Counting Processes’. He has taught several courses, nationally and internationally, for students with a mathematical background as well as for students in medicine or public health.

Language

English

Description

In many fields of quantitative science, subjects are followed over time for the occurrence of certain events.

Examples include randomized controlled trials where patients are followed from time of randomization until time of death, and epidemiological cohort studies where disease-free individuals are followed from a given calendar time until diagnosis of a certain disease.

Data from such studies may be represented as events occurring in continuous time. A mathematical framework for studying such phenomena is that of multi-state models, in which an event is considered a transition between (discrete) states. We will refer to the resulting data as multi-state survival data.

An important feature of multi-state survival data is incomplete observation: observation of the event(s) of interest may be precluded by the occurrence of another event, such as end-of-study, drop-out from the study, or death of the individual.

An important distinction is between avoidable events (censoring) representing practical restrictions in data collection that prevent further observation of the subject (e.g., end-of-study or drop-out) and non-avoidable events (competing risks), such as the death of a patient.

In this course we will discuss two classes of statistical models for multi-state survival data: Intensity-based models and marginal models. Intensities ('rates') are parameters that describe the immediate future development of the multi-state process conditionally on past information on how it has developed, while marginal parameters, such as the probability ('risk') of being in a given state at a particular time, do not involve such a conditioning. Both classes of models often involve covariates.
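The distinction between the two kinds of parameters can be written compactly. The notation below is an assumption made for illustration, since the description itself fixes no symbols: V(t) denotes the state occupied at time t, and \mathcal{H}_{t-} denotes the history of the process just before time t.

```latex
% Transition intensity ('rate') from state h to state j (h \neq j):
% defined conditionally on the past of the process.
\alpha_{hj}(t) = \lim_{\mathrm{d}t \to 0}
  \frac{P\bigl(V(t + \mathrm{d}t) = j \mid V(t-) = h,\ \mathcal{H}_{t-}\bigr)}{\mathrm{d}t}

% State occupation probability ('risk'): a marginal parameter,
% with no conditioning on the past.
Q_h(t) = P\bigl(V(t) = h\bigr)
```

In the simplest case, the two-state survival model with states 'alive' (0) and 'dead' (1), the intensity \alpha_{01}(t) is the usual hazard function and Q_0(t) is the survival function.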

Intensity-based models are inspired by hazard models for survival data, and we shall see that models such as the Cox (1972) proportional hazards model also play an important role for more general multi-state data.

For marginal models, one approach is to use plug-in methods, where the marginal parameter is estimated from intensity-based models. Another approach is to use models that directly target the marginal parameters. Pseudo-values provide a useful alternative for analyzing marginal models. The basic idea is that a random variable that is incompletely observed due to censoring is replaced by its so-called pseudo-value when analyzing a regression model for its mean value. The pseudo-value is obtained from an estimator of the marginal mean that takes the censored data properly into account, such as the Kaplan-Meier estimator. Thereby, censoring is dealt with ‘once and for all’, and standard generalized estimating equations may be used with the pseudo-value as the response variable.
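As a rough illustration of the pseudo-value idea, the sketch below computes leave-one-out (jackknife) pseudo-values for the survival probability S(t0) = P(T > t0), using the Kaplan-Meier estimator as the base estimator: the pseudo-value for subject i is n times the estimate from all n subjects minus (n - 1) times the estimate with subject i left out. This is a minimal sketch under stated assumptions, not the course material itself: the course exercises use R, Python is used here only for a dependency-free example, and the function names kaplan_meier and pseudo_values and the toy data are hypothetical.

```python
def kaplan_meier(times, events, t0):
    """Kaplan-Meier estimate of S(t0) = P(T > t0).

    times  : observed follow-up times (event or censoring time)
    events : 1 if the event was observed at that time, 0 if censored
    """
    surv = 1.0
    # Distinct observed event times up to and including t0, in increasing order.
    event_times = sorted(
        {ti for ti, di in zip(times, events) if di == 1 and ti <= t0}
    )
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        n_events = sum(
            1 for ti, di in zip(times, events) if ti == t and di == 1
        )
        surv *= 1.0 - n_events / at_risk
    return surv


def pseudo_values(times, events, t0):
    """Jackknife pseudo-values for S(t0), one per subject."""
    times, events = list(times), list(events)
    n = len(times)
    full = kaplan_meier(times, events, t0)
    values = []
    for i in range(n):
        # Re-estimate with subject i left out.
        loo = kaplan_meier(times[:i] + times[i + 1:],
                           events[:i] + events[i + 1:], t0)
        values.append(n * full - (n - 1) * loo)
    return values


# Hypothetical toy data: five subjects, two of them censored (event = 0).
toy_times = [2, 3, 5, 7, 8]
toy_events = [1, 0, 1, 1, 0]
toy_pseudo = pseudo_values(toy_times, toy_events, t0=6)
```

When there is no censoring, the pseudo-value for subject i reduces to the indicator that subject i survived beyond t0, which is the sense in which pseudo-values stand in for the incompletely observed outcome. In a regression analysis, values such as toy_pseudo would then be used as the response variable in a generalized estimating equation.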

The course is based on the book 'Models for Multi-State Survival Data: Rates, Risks, and Pseudo-Values' by P.K. Andersen and H. Ravn (2023, Chapman and Hall).

Course goals

Knowledge: Participants should know what multi-state models are and how they are used and interpreted.

Skills: Through exercises, participants will be able to analyze multi-state survival data using R.

Competences: Participants should be able to recognize analysis situations in which the use of multi-state models may be beneficial, distinguish between intensity-based and marginal models, and know how to carry out the analysis in practice.

Course contents

Introduction to multi-state models; survival analysis; competing risks; recurrent events; intensities; marginal parameters; censoring; non-parametric estimation (Nelson-Aalen, Kaplan-Meier, Aalen-Johansen, Cook-Lawless).
Regression models for intensities; Cox model.
Regression models for marginal parameters; plug-in; direct models.
Pseudo-values.
Exercises in R.

Prerequisites

No formal prerequisites.

Targeted at

Statisticians and data scientists with a basic understanding of survival analysis and with fundamental R knowledge.

Evaluation

To be determined.

Software requirements

R installed (version 4.3.1 or later).