Machine Learning for Survival Analysis

Getting Started

by Andreas Bender and Raphael Sonabend

This electronic version (including a PDF download) will always be free and open access (CC BY-NC-SA 4.0). Buying the book will be the greatest indicator to us that a second edition may be useful in the future. We will strive to update this version after publication to correct mistakes (big and small). If you notice any mistakes, please feel free to open an issue.

Licensing

This book is licensed under CC BY-NC-SA 4.0, so you can adapt and redistribute the contents however you like as long as you do cite this book (information below); do not use any material for commercial purposes; and do use a CC BY-NC-SA 4.0 compatible license if you adapt the material. If you have any questions about licensing, just open an issue and we will help you out.

Citation Information

Please cite this book as:

Bender, A., Sonabend, R. (2027). “Machine Learning for Survival Analysis”. CRC Press. https://www.mlsabook.com.

@book{MLSA2027,
    title = {Machine Learning for Survival Analysis},
    author = {Bender, Andreas and Sonabend, Raphael},
    url = {https://www.mlsabook.com/},
    year = {2027},
    isbn = {9781032537498},
    publisher = {CRC Press}
}

Contributing to this book

We welcome contributions to our book, whether you are pointing out typos, requesting content, or even adding your own text. Major contributions (adding or reviewing content) will be acknowledged in the preface of the book. Before you contribute, please read our code of conduct and then open an issue to discuss your proposed contribution.

Biographies

Dr. Andreas Bender is a Senior Lecturer at the Department of Statistics, Head of the Machine Learning Consulting Unit (MLCU) at the Munich Center for Machine Learning (MCML), and founder of the Open Science Initiative in Statistics at LMU Munich. Machine Learning Survival Analysis is one of Andreas’ main research areas. Andreas created several open-source packages and actively contributes to survival analysis software, including pammtools for piecewise exponential additive mixed models and mlr3proba for machine learning survival analysis.

Dr. Raphael Sonabend-Friend is an Associate Director at the National Institute for Health and Care Excellence (NICE) and the CEO and Co-Founder of OSPO Now. Raphael holds a PhD focused on the accessible and transparent use of machine learning for survival analysis. Raphael has over a decade of experience at the intersection of AI and healthcare, including work with large philanthropies, small local charities, governmental bodies, and private sector organizations in the United Kingdom and globally. Raphael has created and maintained several software packages for survival analysis and machine learning, including mlr3proba, survivalmodels, and SurvivalAnalysis.jl. Raphael co-edited and co-authored Applied Machine Learning Using mlr3 in R.

Authors are listed alphabetically; both authors contributed equally to the concepts, research, and writing of this book.

Preface

Time-to-event data refers to data where the outcome of interest is defined by the time until an event occurs. This type of data arises across almost all domains, from medicine and public health to engineering, economics, and finance. At first glance, analysis of time-to-event data might appear to be a standard regression problem with a non-negative outcome (as the time taken until an event must be non-negative). However, as stated by Dr. Terry Therneau, “it takes time to observe time” (Therneau 2024), which, among other things, implies that at the time of an analysis not all observations in the dataset will have experienced the event. Some observations will go on to experience the event outside the observation window; some will experience the event without it being recorded, for example, because they are lost to follow-up; and others may never experience the event. Rather than discarding these subjects, survival analysis treats their event times as censored. Censoring is one defining feature of survival analysis and the reason it is mathematically distinct from standard regression or classification. Survival analysis makes use of the censoring information by modeling, for every observation: i) whether the observation experienced the event or was censored; and ii) the time until the event or censoring.
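
As a minimal sketch of this two-part outcome (not taken from the book's own code), each observation can be stored as a time together with an event indicator. The names `Observation` and `observe`, and the subject values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    time: float  # time until the event or until censoring, whichever came first
    event: bool  # True if the event was observed, False if censored


def observe(event_time, censor_time):
    """Return what is actually recorded for one subject.

    event_time is the (possibly never observed) true event time and
    censor_time is the end of that subject's follow-up.
    """
    if event_time <= censor_time:
        return Observation(time=event_time, event=True)
    return Observation(time=censor_time, event=False)


# Hypothetical subjects: (true event time, end of follow-up), both in years
subjects = [(1.2, 5.0), (7.5, 5.0), (3.4, 2.0), (4.9, 5.0)]
observed = [observe(t, c) for t, c in subjects]

for obs in observed:
    print(obs)
```

The second and third subjects are censored: only a lower bound on their event time (5.0 and 2.0 years) is recorded.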

For illustration, consider the following examples.

Stage IV lung cancer
In a five-year randomized trial of a new therapy for advanced non-small cell lung cancer, patients are observed from randomization until the end of the study, and it is recorded whether each patient dies within the observation window. Those who die during the trial are said to have experienced the event of interest and their event time is fully observed. For the patients alive at the end of the trial, their survival time is censored at the end of the observation window. Practically speaking, they are said to have survived at least five years, but no further assumptions are made about their time of death.

A regression model trained only on patients who died will be biased, overinflating the probability of mortality by throwing away information about the survivors. Alternatively, a regression model that treats deaths and censoring as equal would also overestimate mortality (though skewing towards the study end if the treatment is successful). Survival analysis uses both groups in a single coherent model. Typical questions in survival analysis include:

What fraction of patients on the new therapy survive three years?
How does mortality risk under the new therapy compare with the control arm?
For a given patient, what is the probability of them being alive after two years?
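
To make the bias argument concrete, the following sketch compares the two naive approaches with the Kaplan-Meier estimator, a standard nonparametric survival estimator that is not introduced in this preface and is used here only as an illustration. The data are a hypothetical toy sample, not trial results, and `kaplan_meier` is a hand-written helper rather than a library function.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of the probability of surviving beyond time t."""
    surv = 1.0
    # Step through the distinct observed death times up to and including t.
    death_times = sorted({ti for ti, ei in zip(times, events) if ei and ti <= t})
    for u in death_times:
        at_risk = sum(1 for ti in times if ti >= u)  # still under observation at u
        deaths = sum(1 for ti, ei in zip(times, events) if ei and ti == u)
        surv *= 1 - deaths / at_risk
    return surv


# Hypothetical toy data: follow-up time in years, event indicator (1 = died, 0 = censored)
times = [1, 2, 2, 3, 4, 5, 5, 5]
events = [1, 0, 1, 1, 0, 0, 0, 0]

# Estimated probability of surviving beyond three years, using all patients
km = kaplan_meier(times, events, 3)

# Naive approach 1: use only the patients who died
death_times_only = [ti for ti, ei in zip(times, events) if ei]
naive_deaths_only = sum(ti > 3 for ti in death_times_only) / len(death_times_only)

# Naive approach 2: treat every censoring time as if it were a death time
naive_all_as_deaths = sum(ti > 3 for ti in times) / len(times)

print(km, naive_deaths_only, naive_all_as_deaths)
```

In this toy sample both naive estimates of three-year survival (0 and 0.5) fall below the Kaplan-Meier estimate (about 0.6), which is the direction of bias described above.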

Unemployment durations
In a study of the labor market, workers are followed throughout a period of individual unemployment. A worker may exit unemployment in several mutually exclusive ways: 1) into a full-time job; 2) into one or more part-time jobs; or 3) out of the labor force entirely. Each exit may have very different drivers. Workers still unemployed (but in the labor force) at the end of the study are censored. Later this setting will be introduced as competing risks. The questions of interest mirror the medical trial:

How long until re-employment?
Which workers are at greatest risk of long-term unemployment?
What is the probability of re-employment within one year?
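
One possible way to encode such data, sketched here under assumptions, is a duration paired with a status code that distinguishes censoring from each type of exit. The `Exit` codes, the `spells` values, and the helper `cause_specific` are hypothetical; treating other exits as censored is one common (cause-specific) view of competing risks data, not necessarily the approach taken later in the book.

```python
from enum import IntEnum


class Exit(IntEnum):
    CENSORED = 0            # still unemployed at the end of the study
    FULL_TIME = 1           # exit into a full-time job
    PART_TIME = 2           # exit into one or more part-time jobs
    OUT_OF_LABOR_FORCE = 3  # exit out of the labor force entirely


# Hypothetical unemployment spells: (duration in months, type of exit)
spells = [
    (4, Exit.FULL_TIME),
    (12, Exit.CENSORED),
    (7, Exit.PART_TIME),
    (9, Exit.OUT_OF_LABOR_FORCE),
    (12, Exit.CENSORED),
]


def cause_specific(spells, cause):
    """Single-event view of the data for one exit type.

    Spells ending in any other exit type are treated as censored.
    """
    return [(duration, status == cause) for duration, status in spells]


full_time_view = cause_specific(spells, Exit.FULL_TIME)
print(full_time_view)
```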

These two examples, medical and economic, share the structural features that motivate survival analysis, which also appear across component maintenance, credit risk, customer churn, and many more. Application-specific names (‘reliability analysis’ in engineering, ‘duration analysis’ in economics) differ but the underlying mathematics is identical. Survival analysis has been developed over several decades to correctly handle partially observed (censored) data, which may be subject to (temporal) sampling bias (known as truncation) (Collett 2014; Kalbfleisch and Prentice 1980; Klein and Moeschberger 2003).

Scope

This book focuses entirely on predictive survival analysis (referred to simply as ‘survival analysis’), which means forward-looking predictions for new subjects. Inference methods, which examine model parameters to learn information about a given dataset or model, are not covered. This book also does not cover residual lifetime prediction for ‘imputing’ survival times for in-sample censored observations. Bayesian and unsupervised learning methods are excluded as the literature is less developed for these areas in the machine learning survival analysis setting; future editions will endeavor to include these methods. Predictive survival analysis is applied across a wide variety of industries, for example,

Manufacturing: Predict the time to equipment failure;
Pharmaceutical: Predict a patient’s survival trajectory after novel treatment;
Healthcare: Predict a patient’s survival time after infection with meningitis;
Finance: Predict the time until a customer defaults on a loan;
Marketing: Predict the risk of a customer churning;
E-commerce: Predict the time until next purchase for personalized marketing.

Despite its importance in many real-world settings, adoption of machine learning in survival analysis has lagged behind predictive modeling for classification and regression. This is partly because, in education, the analysis of time-to-event data is not part of the canonical machine learning curriculum, while biostatistics curricula often include survival analysis but do not integrate it with machine learning. In research, survival analysis literature often focuses on univariate, non-parametric techniques or regression models, traditionally due to sector-wide requirements for interpretability and uncertainty quantification (particularly in healthcare domains). Machine learning literature, on the other hand, focuses on regression and classification, which cover the majority of predictive use cases. In recent years there has been increasing overlap between survival analysis and machine learning, but there still appears to be a substantial gap in the knowledge and skills required to integrate the two fields. This book aims to further bridge the gap by introducing the fundamental concepts of both fields before combining them into machine learning survival analysis.

Overview of the book

This book is intended to fill a gap in the literature by providing a comprehensive introduction to machine learning in the survival setting. If you are interested in machine learning or survival analysis separately, then you might consider James et al. (2013); Hastie et al. (2001); or Bishop (2006) for machine learning, and Collett (2014) or Kalbfleisch and Prentice (1980) for survival analysis. This book complements the above works and introduces machine learning terminology from settings such as regression and classification, but without replicating the detail found in other sources. Instead, the primary focus is the intersection of the above two areas and defining the suitability of different methods and models depending on the available data. A particular aim is to introduce the different concepts and terminology necessary to correctly specify the machine learning survival analysis task. For example, before developing any models, it is necessary to identify the presence of different types of censoring and truncation as well as potentially competing risks and other complexities that could arise from time-to-event data. Failure to do so will lead to bias and potentially meaningless results.

This book may be useful for master’s and PhD students who are specializing in machine learning in survival analysis, machine learning engineers looking to solve problems involving partially observed time-to-event data, or practitioners familiar with survival analysis but without machine learning knowledge. The book can be read cover to cover, but should also be useful as a reference book to dip into as required.

Following the introduction, this book is structured in four parts:

Part I: Machine Learning and Survival Analysis
Part I opens with a brief overview of machine learning, introducing key concepts that are universal to any application of machine learning, regardless of the specific setting. It then turns to the basic terminology and concepts of survival analysis, followed by more advanced concepts in the more general ‘event history analysis’ setting, which encompasses competing risks and multi-state models. This part concludes by unifying terminology between machine learning and survival analysis to define what it means to have different survival prediction problems and a machine learning survival analysis task.

Part II: Evaluation
Part II introduces measures for evaluating the different types of predictive tasks introduced in Part I. Evaluation is crucial for choosing between models and eventually trusting the predictions from a trained machine learning model. In each chapter, the measure class is introduced, specific metrics are listed, and commentary is provided on how and when to use the measures. Recommendations for choosing measures are discussed in Chapter 10 (Choosing Measures). Because this book focuses on the predictive setting, the evaluation measures introduced in Part II are all ‘out-of-sample’ measures, to be used for evaluating models on new, unseen data, although some can also be used during training. This is in contrast to ‘in-sample’ measures, which evaluate how well a model is fit to data, and are usually preferred for inference tasks.

Part III: Models
Part III is a deep dive into models for solving survival analysis problems. This begins with the core models that some may not consider ‘machine learning’, although, as will be shown, with only limited adaptation these models can be exceptionally powerful. This part of the book continues by exploring different classes of machine learning models including random forests, support vector machines, gradient boosting machines, and neural networks. While this book does not go into full detail about deep learning, the final chapter of this part provides a foundation that can be complemented by works such as Goodfellow et al. (2016). Recommendations for choosing models are discussed in Chapter 16 (Choosing Models).

Each model class is first introduced in a classification or regression setting, and its extension to survival analysis is then discussed. Differences between model implementations are not discussed, that is, there is no extensive discussion of whether one specific algorithm is superior to another. Instead, the focus is on understanding how these models are built for survival analysis. In this way, readers are well equipped to independently follow papers that introduce specific implementations.

Part IV: Reduction Techniques
The final part introduces reduction techniques, which are methods to transform survival tasks into more standard regression or classification tasks. Practitioners who are comfortable with machine learning in general, but not necessarily with survival analysis, may find this part of the book most useful for quickly implementing familiar models within the survival analysis domain.

Competing risks are returned to throughout the book, both through model-specific extensions where available and, in Part IV, through reduction techniques that allow the single-event models introduced in Part III to be applied to competing risks tasks.

Reproducibility

This book includes simulations and figures generated in R and Python; the code for any figures or experiments in this book is freely available at https://github.com/mlsa-book/MLSA under an MIT license.

Acknowledgments

We would like to gratefully acknowledge our colleagues who reviewed the content of this book, including: Lukas Burk, Dr. Cesaire Fouodo, Prof. Dr. Helmut Küchenhoff, Dr. Lea Orsini, Johannes Piller, Prof. Dr. David Rügamer, Prof. Dr. Matthias Schmid, as well as all the anonymous reviewers who took the time to review and provide detailed feedback.

Parts of this book were reviewed and revised using generative AI tools. The authors fact-checked all responses and rewrote any suggested text to ensure our own voice can be found throughout the book.