Evaluating machine learning models and their diagnostic value
Book chapter - June 2023

Gaël Varoquaux, Olivier Colliot

Gaël Varoquaux, Olivier Colliot, "Evaluating machine learning models and their diagnostic value", in Olivier Colliot (ed.), Machine Learning for Brain Disorders, forthcoming


This chapter describes model validation, a crucial part of machine learning, whether the goal is to select the best model or to assess the risk of a given model. We start by detailing the main performance metrics for different tasks (classification, regression) and how they may be interpreted, including in the face of class imbalance, varying prevalence, or asymmetric cost-benefit trade-offs. We then explain how to estimate these metrics in an unbiased manner using training, validation, and test sets. We describe cross-validation procedures, which use a larger part of the data for both training and testing, and the dangers of data leakage, that is, optimism bias due to training data contaminating the test set. Finally, we discuss how to obtain confidence intervals for performance metrics, distinguishing two situations: internal validation, or evaluation of learning algorithms, and external validation, or evaluation of the resulting prediction models.
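
As an illustration of why varying prevalence matters when interpreting a classification metric, the short sketch below computes the positive predictive value (PPV) from sensitivity, specificity, and prevalence using Bayes' rule. It is not taken from the chapter: the function name `positive_predictive_value` and the numerical values are assumptions chosen only for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return P(condition | positive prediction) via Bayes' rule.

    Illustrative helper (hypothetical name, not from the chapter).
    All three arguments are proportions between 0 and 1.
    """
    # Fraction of the population that is a true positive
    true_pos = sensitivity * prevalence
    # Fraction of the population that is a false positive
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Same classifier (90% sensitivity, 90% specificity), different populations.
    for prevalence in (0.5, 0.1, 0.01):
        ppv = positive_predictive_value(0.9, 0.9, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

With these assumed values, a classifier with unchanged sensitivity and specificity sees its PPV fall from 0.90 at a prevalence of 0.5 to 0.50 at a prevalence of 0.1, which is why a metric measured in one population may not carry over to another.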
