
Module 3

Model Evaluation, Validation and Performance

This module focuses on how medical machine learning models should be assessed: train/test splits, resampling, cross-validation, bootstrap validation, classification metrics, calibration, clinical usefulness, leakage and reproducibility.

Module aim

Judge models before trusting them.

The purpose of this module is to show that performance is not a single number. A useful clinical model must be validated, calibrated, reproducible and relevant to the decision setting.

Lessons: 5

Coding labs: R

Module status: Preparing

Core focus: Validation

Honest validation

Students learn why apparent performance, measured on the same data used to build the model, is usually too optimistic compared with performance on data the model has not seen.
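A minimal sketch of this optimism, written in Python with the standard library only (an assumption for illustration; the course labs are listed as R). The data are simulated and the outcome labels are pure noise, so no model can genuinely predict them. The 1-nearest-neighbour classifier and the helper names are hypothetical choices, not the module's own method.

```python
import random

def nearest_neighbour_predict(train_x, train_y, x):
    """Return the label of the training point closest to x (1-NN)."""
    closest = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[closest]

def accuracy(train_x, train_y, eval_x, eval_y):
    """Proportion of eval points classified correctly by 1-NN."""
    hits = sum(
        nearest_neighbour_predict(train_x, train_y, x) == y
        for x, y in zip(eval_x, eval_y)
    )
    return hits / len(eval_y)

rng = random.Random(2024)  # fixed seed so the run is repeatable
n, p = 200, 5
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [rng.randint(0, 1) for _ in range(n)]  # labels unrelated to X

# Simple split: first half builds the model, second half is held out.
train_x, train_y = X[:100], y[:100]
test_x, test_y = X[100:], y[100:]

apparent = accuracy(train_x, train_y, train_x, train_y)  # same data
held_out = accuracy(train_x, train_y, test_x, test_y)    # unseen data

print(f"Apparent accuracy: {apparent:.2f}")
print(f"Held-out accuracy: {held_out:.2f}")
```

Each training point is its own nearest neighbour, so the apparent accuracy is perfect even though the labels are noise. The held-out accuracy should sit near chance, which is the honest answer.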

Clinical performance metrics

The module explains how discrimination, sensitivity, specificity, calibration and usefulness answer different clinical questions.
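To illustrate how these measures differ, the sketch below computes each one on the same ten predictions. It is written in Python (an assumption; the course labs are listed as R). The risks and outcomes are made-up numbers, and the 0.5 decision threshold is an arbitrary choice for the example. Net benefit is used here as one common way to express clinical usefulness.

```python
# Hypothetical predicted risks and observed outcomes (1 = event).
risk = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.35, 0.55, 0.70, 0.90]
outcome = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
threshold = 0.5  # assumed decision threshold, for illustration only
n = len(outcome)

# Confusion-matrix counts at the chosen threshold.
tp = sum(r >= threshold and y == 1 for r, y in zip(risk, outcome))
fp = sum(r >= threshold and y == 0 for r, y in zip(risk, outcome))
fn = sum(r < threshold and y == 1 for r, y in zip(risk, outcome))
tn = sum(r < threshold and y == 0 for r, y in zip(risk, outcome))

# Sensitivity: among patients with the event, how many are flagged?
sensitivity = tp / (tp + fn)
# Specificity: among patients without the event, how many are cleared?
specificity = tn / (tn + fp)

# Discrimination (AUC): chance that a random patient with the event
# has a higher predicted risk than a random patient without it.
pos = [r for r, y in zip(risk, outcome) if y == 1]
neg = [r for r, y in zip(risk, outcome) if y == 0]
concordant = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
auc = concordant / (len(pos) * len(neg))

# Calibration-in-the-large: mean predicted risk against observed rate.
mean_predicted = sum(risk) / n
observed_rate = sum(outcome) / n

# Brier score: mean squared error of the predicted risks.
brier = sum((r - y) ** 2 for r, y in zip(risk, outcome)) / n

# Net benefit: true positives minus false positives weighted by the
# odds of the threshold, per patient.
net_benefit = tp / n - (fp / n) * threshold / (1 - threshold)

print(f"Sensitivity {sensitivity:.3f}, specificity {specificity:.3f}")
print(f"AUC {auc:.3f}, Brier score {brier:.5f}")
print(f"Mean predicted {mean_predicted:.3f} vs observed {observed_rate:.3f}")
print(f"Net benefit at {threshold}: {net_benefit:.3f}")
```

Sensitivity, specificity and net benefit change when the threshold changes. The AUC, Brier score and mean predicted risk do not depend on it, which is one reason a single number cannot summarise performance.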

Reproducible assessment

Evaluation is treated as a transparent workflow involving resampling, leakage checks, reporting discipline and reproducibility.
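One way such a workflow might look is sketched below in Python (an assumption; the course labs are listed as R). The data are simulated, and the function names, the single-predictor nearest-class-mean rule and the seed values are all hypothetical. The sketch shows two habits: preprocessing parameters are estimated inside each training fold only, which avoids one common form of leakage, and the fold assignment is seeded so the result can be reproduced.

```python
import random
import statistics

def make_folds(n, k, seed):
    """Shuffle indices with a fixed seed and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validated_accuracy(x, y, k=5, seed=1):
    """Mean held-out accuracy of a nearest-class-mean rule."""
    fold_accuracy = []
    for held_out in make_folds(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]

        # Leakage check: mean and SD come from the training fold only.
        mu = statistics.mean([x[i] for i in train])
        sd = statistics.stdev([x[i] for i in train])

        def standardise(v):
            return (v - mu) / sd

        # "Model": class means on the standardised training data.
        m1 = statistics.mean([standardise(x[i]) for i in train if y[i] == 1])
        m0 = statistics.mean([standardise(x[i]) for i in train if y[i] == 0])

        # Held-out points are scaled with the training-fold parameters.
        correct = 0
        for i in held_out:
            z = standardise(x[i])
            predicted = 1 if abs(z - m1) < abs(z - m0) else 0
            correct += predicted == y[i]
        fold_accuracy.append(correct / len(held_out))
    return statistics.mean(fold_accuracy)

# Simulated data: one predictor shifted upwards when the event occurs.
rng = random.Random(7)
y = [rng.randint(0, 1) for _ in range(100)]
x = [rng.gauss(1.5 * label, 1.0) for label in y]

first_run = cross_validated_accuracy(x, y, k=5, seed=1)
second_run = cross_validated_accuracy(x, y, k=5, seed=1)
print(f"Cross-validated accuracy: {first_run:.3f}")
print(f"Identical on rerun: {first_run == second_run}")
```

Reporting the seed, the number of folds and the preprocessing steps alongside the estimate lets another analyst rerun the assessment and get the same figure.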

Module lessons

Study model evaluation as a full workflow.

The lessons add evaluation layers in turn: data splitting, resampling, classification metrics, calibration, clinical usefulness, leakage prevention and reproducible reporting.

Learning route

Complete validation before moving to modern prediction models.

Module 4 assumes students understand how to evaluate models honestly. Regularisation, forests and boosting are only useful when their performance is judged with careful validation.


Course pathway

Return to the full ML in Biostatistics course.

Use the course homepage to review modules, scripts, datasets, case studies and the full clinical prediction learning pathway.
