16 Choosing Models

In contrast to measure selection, selecting models is more straightforward, and the same heuristics from regression and classification largely apply to survival analysis. First, for low-dimensional data, many experiments have demonstrated that machine learning may not improve upon more standard statistical methods (Christodoulou et al. 2019), and the same holds for survival analysis (Burk et al. 2026). Therefore, the costs that come with machine learning, namely lower interpretability and longer training times, are unlikely to be offset by performance gains when a dataset has relatively few covariates, especially when combined with a small sample size. In settings where machine learning is more useful, the choice largely falls among the four model classes discussed in this book: random survival forests, survival support vector machines, gradient boosting machines, and neural networks (deep learning). In benchmark experiments with sufficient data and computational resources, it is sensible to include models of varying complexity, from featureless and linear models to models that capture non-linear effects and interactions. However, without significant resources, the rules of thumb below can provide a starting point for smaller experiments.
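
As a reference point for the featureless end of that range, a common baseline is the Kaplan-Meier estimator, which ignores all covariates and predicts the same survival curve for every observation. The following is a minimal sketch in plain Python rather than an implementation taken from this book; the function names `kaplan_meier` and `predict_survival` and the toy data are illustrative assumptions.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : observed times (event or right-censoring time)
    events : 1 if the event was observed, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaves the risk set at their observed time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit update: multiply by the share surviving time t.
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


def predict_survival(curve, t):
    """Step-function lookup: the last estimate at or before time t."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


# Toy data (made up for illustration): six subjects, two censored.
toy_times = [2, 3, 3, 5, 8, 9]
toy_events = [1, 1, 0, 1, 0, 1]
baseline = kaplan_meier(toy_times, toy_events)
for time, value in baseline:
    print(f"S({time}) = {value:.3f}")
```

Any model that cannot beat this curve on the chosen evaluation measures is adding complexity without using the covariates effectively.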

Random survival forests and boosting methods are strong all-purpose choices that can handle different censoring types and competing risks settings. In single-event settings, both have been shown to perform well on high-dimensional data, outperforming other model classes (Spooner et al. 2020). Forests often work well without hyperparameter tuning and may therefore be a sensible first choice for high-dimensional data.

Survival support vector machines have generally shown limited empirical success and appear to have seen little real-world adoption; moreover, their runtime can be substantial, taking hours to produce models that may offer little (if any) benefit over simpler models (Burk et al. 2026; Fouodo et al. 2018; Pölsterl et al. 2015). Consequently, they are not generally recommended as a first-choice method.

Among machine learning methods, deep learning has become the dominant paradigm across many domains, and current research trends point to a similar trajectory in survival analysis (Wiegrebe et al. 2024). Neural network performance remains highly data-dependent. There are clear situations in which they are preferred or required, for example when handling image data such as MRI scans, or when identifying patterns in very large datasets such as omics data, yet there are no firm heuristics for selecting one architecture over another.

While deep learning has traditionally required large amounts of data, this constraint is increasingly relaxed by transfer learning, in which large pre-trained models are fine-tuned to a specific context, making powerful architectures usable on smaller datasets, particularly for feature extraction. This is especially valuable in multimodal settings, where image or text inputs are combined with tabular covariates. The development of foundation models opens further avenues, potentially leveraging general representations to benefit low-resource time-to-event problems. A recent example is prior-data fitted networks (PFNs), which learn to approximate Bayesian inference so that predictions can be obtained in a single forward pass. PFNs have been adapted to right-censored time-to-event data, enabling individualized survival prediction without dataset-specific training or tuning (Seletkov et al. 2026; Qi et al. 2026).

In practice, many real-world applications go beyond one-off analyses and require custom architectures developed with frameworks such as PyTorch (Paszke et al. 2017) or TensorFlow (Abadi et al. 2015) rather than off-the-shelf implementations.

Interpreting survival models

Interpreting models is increasingly important as practitioners rely on more complex black-box models (Molnar 2019). Classic methods for model comparison, such as the AIC and BIC, have been extended to survival models, though their application is limited to the core survival models (Chapter 11) (Liang and Zou 2008; Volinsky and Raftery 2000). As a more flexible alternative, any of the calibration measures in Chapter 7 can be used to evaluate a model’s fit to data. To assess algorithmic fairness, the majority of measures discussed in Part II can be used to detect bias in a survival context (Sonabend et al. 2022). Widely used interpretability methods such as SHAP and LIME (Molnar 2019) can be extended to survival analysis off-the-shelf (Langbein et al. 2025), and time-dependent extensions also exist to interpret the impact of variables on the survival probability over time (Krzyziński et al. 2023; Langbein et al. 2025).
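
To make the information criteria concrete, the sketch below computes the standard AIC and a BIC whose penalty uses the number of uncensored events instead of the number of observations, which is the revision proposed by Volinsky and Raftery (2000). It is a minimal illustration under stated assumptions: the function names are hypothetical, the log-likelihood values are made up, and for a Cox model `log_lik` would be the maximized partial log-likelihood. The improved AIC of Liang and Zou (2008) is not implemented here.

```python
import math


def aic(log_lik, n_params):
    """Standard AIC: 2k - 2 * log-likelihood (lower is better)."""
    return 2 * n_params - 2 * log_lik


def bic_censored(log_lik, n_params, n_events):
    """BIC with the penalty based on the number of uncensored events."""
    if n_events <= 0:
        raise ValueError("n_events must be positive")
    return n_params * math.log(n_events) - 2 * log_lik


# Hypothetical fits of two nested models to the same data (values invented).
n_events = 60
candidates = {
    "small model (3 covariates)": (-250.0, 3),
    "large model (8 covariates)": (-247.5, 8),
}
for name, (log_lik, k) in candidates.items():
    print(name, round(aic(log_lik, k), 2), round(bic_censored(log_lik, k, n_events), 2))
```

Because both criteria depend on a likelihood, they only compare models fitted to the same data with comparable likelihoods, which is why they do not carry over to most machine learning survival models.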

Abadi, Martín, Ashish Agarwal, Paul Barham, et al. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. https://www.tensorflow.org/.

Burk, Lukas, John Zobolas, Bernd Bischl, Andreas Bender, Marvin N. Wright, and Raphael Sonabend. 2026. “A large-scale neutral comparison study of survival models on low-dimensional data.” Bioinformatics 42 (5): btag186. https://doi.org/10.1093/bioinformatics/btag186.

Christodoulou, Evangelia, Jie Ma, Gary S Collins, Ewout W Steyerberg, Jan Y Verbakel, and Ben Van Calster. 2019. “A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models.” Journal of Clinical Epidemiology 110 (June): 12–22. https://doi.org/10.1016/j.jclinepi.2019.02.004.

Fouodo, Césaire J. K., Inke R. König, Claus Weihs, Andreas Ziegler, and Marvin N. Wright. 2018. “Support vector machines for survival analysis with R.” The R Journal 10 (July): 412–23.

Krzyziński, Mateusz, Mikołaj Spytek, Hubert Baniecki, and Przemysław Biecek. 2023. “SurvSHAP(t): Time-Dependent Explanations of Machine Learning Survival Models.” Knowledge-Based Systems 262: 110234. https://doi.org/10.1016/j.knosys.2022.110234.

Langbein, Sophie Hanna, Mateusz Krzyziński, Mikołaj Spytek, Hubert Baniecki, Przemysław Biecek, and Marvin N. Wright. 2025. “Interpretable Machine Learning for Survival Analysis.” Biometrical Journal 67 (6): e70089. https://doi.org/10.1002/bimj.70089.

Liang, Hua, and Guohua Zou. 2008. “Improved AIC Selection Strategy for Survival Analysis.” Computational Statistics & Data Analysis 52 (5): 2538–48. https://doi.org/10.1016/j.csda.2007.09.003.

Molnar, Christoph. 2019. Interpretable Machine Learning. https://christophm.github.io/interpretable-ml-book/.

Paszke, Adam, Sam Gross, Soumith Chintala, et al. 2017. Automatic Differentiation in PyTorch.

Pölsterl, Sebastian, Nassir Navab, and Amin Katouzian. 2015. “Fast Training of Support Vector Machines for Survival Analysis.” In Machine Learning and Knowledge Discovery in Databases, edited by Annalisa Appice, Pedro Pereira Rodrigues, Vítor Santos Costa, João Gama, Alípio Jorge, and Carlos Soares. Springer International Publishing.

Qi, Shi-ang, Vahid Balazadeh, Michael Cooper, Russell Greiner, and Rahul G. Krishnan. 2026. SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference. arXiv:2605.15488. arXiv. https://doi.org/10.48550/arXiv.2605.15488.

Seletkov, Dmitrii, Paul Hager, Georgios Kaissis, Rickmer Braren, Daniel Rueckert, and Raphael Rehms. 2026. Survival In-Context: Amortized Bayesian Survival Analysis via Prior-Fitted Networks. arXiv:2603.29475. arXiv. https://doi.org/10.48550/arXiv.2603.29475.

Sonabend, Raphael, Florian Pfisterer, Alan Mishler, et al. 2022. “Flexible Group Fairness Metrics for Survival Analysis.” DSHealth 2022 Workshop on Applied Data Science for Healthcare at KDD2022. http://arxiv.org/abs/2206.03256.

Spooner, Annette, Emily Chen, Arcot Sowmya, et al. 2020. “A comparison of machine learning methods for survival analysis of high-dimensional clinical data for dementia prediction.” Scientific Reports 10 (1): 20410. https://doi.org/10.1038/s41598-020-77220-w.

Volinsky, Chris T, and Adrian E Raftery. 2000. “Bayesian Information Criterion for Censored Survival Models.” Biometrics 56 (1): 256–62.

Wiegrebe, Simon, Philipp Kopper, Raphael Sonabend, Bernd Bischl, and Andreas Bender. 2024. “Deep learning for survival analysis: a review.” Artificial Intelligence Review 57 (3): 65. https://doi.org/10.1007/s10462-023-10681-3.