Course title

Statistical Evaluation of Diagnostic and Predictive Models

Faculty

Thomas Alexander Gerds, Section of Biostatistics, University of Copenhagen, Oester Farimagsgade 5, 1014 Copenhagen, Denmark.

Thomas Alexander Gerds is Associate Professor at the Section of Biostatistics, Department of Public Health, University of Copenhagen. He studied mathematics and biostatistics in Freiburg, Germany. His research focuses on risk prediction models and survival analysis. He is the developer and maintainer of several R packages. He has served as guest editor for a special issue of Statistics in Medicine and currently serves as Associate Editor for the Scandinavian Journal of Statistics and Sankhya (Series A).

Course language

English

Course schedule

June 29 to July 1: 3:00pm to 7:00pm
July 2: 3:00pm to 6:00pm

Type of activity and class load

15-hour classroom course with laptop practicals

Description

Risk predictions are useful to inform patients and to guide medical decision making, and they form the basis for developing personalized medicine. Prediction performance describes how well a model will do on future patients. Unfortunately, this can never be known exactly. The best we can do is to simulate applying the model to future patients by repeatedly splitting the data set into a training part and a validation part.
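The repeated splitting idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not course material: the data are simulated, the "model" is a toy rule that predicts the observed event frequency below and above a marker cutoff, and all function names (`fit_group_risks`, `brier`, `split_validation`) are hypothetical. Performance is measured by the mean squared difference between predicted risk and the 0/1 outcome (the Brier score).

```python
import random

def fit_group_risks(train, cutoff):
    """Toy 'model': observed event frequency below and above a marker cutoff."""
    risks = {}
    for group in (False, True):
        outcomes = [y for x, y in train if (x > cutoff) == group]
        # Fall back to the overall frequency if a group happens to be empty
        pool = outcomes or [y for _, y in train]
        risks[group] = sum(pool) / len(pool)
    return risks

def brier(risks, data, cutoff):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((risks[x > cutoff] - y) ** 2 for x, y in data) / len(data)

def split_validation(data, cutoff, n_splits=100, train_fraction=2 / 3, seed=1):
    """Average validation Brier score over repeated random train/validation splits."""
    rng = random.Random(seed)
    n_train = int(len(data) * train_fraction)
    scores = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        train, valid = shuffled[:n_train], shuffled[n_train:]
        risks = fit_group_risks(train, cutoff)     # fit on the training part only
        scores.append(brier(risks, valid, cutoff)) # evaluate on the held-out part
    return sum(scores) / len(scores)

# Simulated data (purely illustrative): higher marker values carry higher risk
rng = random.Random(42)
data = []
for _ in range(300):
    x = rng.gauss(0, 1)
    y = 1 if rng.random() < (0.6 if x > 0 else 0.2) else 0
    data.append((x, y))

apparent = brier(fit_group_risks(data, 0.0), data, 0.0)  # same data for fit and test
validated = split_validation(data, 0.0)                  # held-out data for test
print(f"apparent Brier score:  {apparent:.3f}")
print(f"split-sample estimate: {validated:.3f}")
```

The apparent score reuses the data the model was fitted on, whereas the split-sample estimate only scores subjects the model has not seen, which is closer to the situation of future patients.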

The course deals with suitable criteria for evaluating models for the current status (diagnosis) and the future status (prediction) of subjects in a population. A very simple diagnostic model is a medical test: given a marker variable that can be measured for each subject, the test diagnoses the disease if the marker value exceeds a certain threshold. The logistic regression model is more complex because it can adjust for confounding factors, and it can thereby often achieve better diagnostic and predictive results. A rather different approach is a classification tree, which can be extended to a random forest that combines many classification trees. The diversity of possible modelling approaches requires objective and flexible criteria for evaluation and comparison.
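The threshold-based medical test described above can be written down directly. The following Python sketch is illustrative only: the marker values, disease labels, and the threshold of 2.5 are made up, and the function name `evaluate_test` is hypothetical.

```python
def ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")

def evaluate_test(marker, diseased, threshold):
    """Diagnose disease when marker > threshold and summarize the 2x2 table."""
    tp = fp = tn = fn = 0
    for m, d in zip(marker, diseased):
        positive = m > threshold
        if positive and d:
            tp += 1      # diseased, test positive
        elif positive:
            fp += 1      # healthy, test positive
        elif d:
            fn += 1      # diseased, test negative
        else:
            tn += 1      # healthy, test negative
    return {
        "sensitivity": ratio(tp, tp + fn),  # positives among the diseased
        "specificity": ratio(tn, tn + fp),  # negatives among the healthy
        "ppv": ratio(tp, tp + fp),          # diseased among test positives
        "npv": ratio(tn, tn + fn),          # healthy among test negatives
    }

# Made-up example data: marker value and disease status (1 = diseased)
marker = [1.2, 3.4, 2.8, 0.7, 4.1, 2.2, 3.9, 1.5, 0.9]
diseased = [0, 1, 1, 0, 1, 1, 0, 0, 0]

result = evaluate_test(marker, diseased, threshold=2.5)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Moving the threshold trades sensitivity against specificity, which is why a single threshold rarely gives a complete picture of a marker.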

In medical applications, risk prediction is based on epidemiological variables such as age, gender, and smoking status, and also on measurements from biotechnological platforms, such as blood tests and genetic markers. When a new marker is added to an existing model, it is necessary to quantify its added value.

If the outcome is the time until an event occurs, then follow-up can end in three different ways: (1) The event of interest occurs. (2) The patient is lost to follow-up while still event-free (right censored). (3) A competing risk occurs.
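One common way to handle these three endings together is the Aalen-Johansen estimate of the cumulative incidence of the event of interest. The Python sketch below is an illustration under assumptions, not necessarily the method used in the course: the status coding (0 = right censored, 1 = event of interest, 2 = competing risk), the function name `cumulative_incidence`, and the five follow-up records are all made up.

```python
def cumulative_incidence(times, status, cause=1):
    """Aalen-Johansen estimate of the cumulative incidence of `cause`.

    Assumed status coding: 0 = right censored, 1 = event of interest,
    2 = competing risk. Returns (time, estimate) pairs at the observed times.
    """
    n_at_risk = len(times)
    event_free = 1.0  # probability of no event of any cause before the current time
    cif = 0.0
    curve = []
    for t in sorted(set(times)):
        at_t = [s for u, s in zip(times, status) if u == t]
        d_cause = sum(1 for s in at_t if s == cause)  # events of this cause at t
        d_any = sum(1 for s in at_t if s != 0)        # events of any cause at t
        cif += event_free * d_cause / n_at_risk
        event_free *= 1.0 - d_any / n_at_risk
        n_at_risk -= len(at_t)                        # events and censorings leave the risk set
        curve.append((t, cif))
    return curve

# Made-up follow-up data for five patients
times = [1, 2, 3, 4, 5]
status = [1, 0, 2, 1, 0]

curve_event = cumulative_incidence(times, status, cause=1)
curve_competing = cumulative_incidence(times, status, cause=2)
for t, value in curve_event:
    print(f"t={t}: cumulative incidence of the event of interest = {value:.3f}")
```

Treating competing events as censored and using one minus the Kaplan-Meier estimate instead would generally overestimate the risk of the event of interest, which is the reason for keeping the three status codes separate.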

Aim and content

The aim of the course is to provide insights into statistical analysis when the goal is prediction and diagnosis.
After finishing the course, the participants:

will have a general understanding of the role of statistical models for medical decision making
can compute and interpret common statistical measures of predictive accuracy:
- Sensitivity, specificity, prognostic values
- Reclassification tables
- ROC curves, AUC, Brier score, c-index
have learned how to obtain risk predictions from logistic and Cox regression models
understand the difference between model checking (in its own data) and model performance (in new data)
have basic knowledge of cross-validation and the bootstrap
have basic knowledge of tools for censored time-to-event outcomes and competing risks
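To make two of the listed measures concrete, the following Python sketch computes the points of an ROC curve and the AUC for a handful of predicted risks. It is a minimal illustration: the four predicted risks and outcomes are made up, and the function names (`roc_points`, `auc_trapezoid`, `auc_concordance`) are hypothetical.

```python
def roc_points(risk, outcome):
    """(false positive rate, true positive rate) pairs, strictest threshold first."""
    n_cases = sum(outcome)
    n_controls = len(outcome) - n_cases
    points = [(0.0, 0.0)]
    for threshold in sorted(set(risk), reverse=True):
        # A subject tests positive when the predicted risk is at or above the threshold
        tp = sum(1 for r, y in zip(risk, outcome) if r >= threshold and y == 1)
        fp = sum(1 for r, y in zip(risk, outcome) if r >= threshold and y == 0)
        points.append((fp / n_controls, tp / n_cases))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

def auc_concordance(risk, outcome):
    """Share of case-control pairs where the case has the higher risk (ties count 1/2)."""
    cases = [r for r, y in zip(risk, outcome) if y == 1]
    controls = [r for r, y in zip(risk, outcome) if y == 0]
    score = sum(1.0 if c > k else 0.5 if c == k else 0.0
                for c in cases for k in controls)
    return score / (len(cases) * len(controls))

# Made-up predicted risks and observed outcomes (1 = event)
risk = [0.1, 0.4, 0.35, 0.8]
outcome = [0, 0, 1, 1]

points = roc_points(risk, outcome)
print("ROC points:", points)
print("AUC (trapezoid):  ", auc_trapezoid(points))
print("AUC (concordance):", auc_concordance(risk, outcome))
```

Both AUC computations give the same value here, which reflects the usual interpretation of the AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control.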

Evaluation

Practical exercise

Classroom

PC2