Article: Machine learning prediction of survival in centenarians after age 100: a retrospective, population-based cohort study

Title: Machine learning prediction of survival in centenarians after age 100: a retrospective, population-based cohort study
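Authors: Mak, Jonathan K. L.; Yue, Noel C.; Li, Gloria Hoi-Yee; Yuen, Jacqueline K.; Auyeung, Tung Wai; Tan, Kathryn Choon Beng; Cheung, Ching-Lung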
Keywords: Centenarians; Machine learning; Mortality; Oldest-old; Prediction model
Issue Date: 9-Oct-2025
Publisher: Oxford University Press
Citation: The Journals of Gerontology, Series A: Biological Sciences and Medical Sciences, 2025, v. 80, n. 12
Abstract

Background

Whether survival at extreme ages can be accurately predicted remains unclear. This study explored the feasibility of using machine learning (ML) and electronic health records (EHRs) to predict mortality in centenarians and identify key survival determinants.

Methods

We analyzed 9718 centenarians (83% women) from the population-based EHR database in Hong Kong (2004-2018). Data were randomly split into 70% training and 30% testing cohorts. Using 82 predictors, including demographics, diagnoses, prescriptions, and laboratory results, we trained stepwise logistic regression and four ML algorithms to predict 1-year, 2-year, and 5-year all-cause mortality after age 100. Model performance was evaluated using discrimination (area under the receiver operating characteristic curve [AUROC]) and calibration metrics. In an independent cohort of 174 606 oldest-old adults aged 85-105 years, we further compared AUROCs of models incorporating the identified predictors versus comorbidity and frailty scores across different age groups.
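The split and discrimination steps described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the fixed seed, and the toy inputs are assumptions made for illustration, and the AUROC is computed with the generic pairwise (Mann-Whitney) definition rather than any particular library routine.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Randomly split records into training and testing lists.

    The 0.7 default mirrors the 70%/30% split in the abstract;
    the seed is an arbitrary choice for reproducibility.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen case with the
    outcome (label 1) receives a higher score than a randomly chosen
    case without it (label 0); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy usage with made-up labels and predicted risks (not study data).
train, test = train_test_split(range(10))
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; a rank-based implementation would be preferable for a cohort of several thousand people.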

Results

Among the ML models, the eXtreme Gradient Boosting algorithm provided the best performance, with AUROCs of 0.707 (95% CI = 0.685-0.730) for 1-year mortality and 0.704 (95% CI = 0.686-0.723) for 2-year mortality in the testing cohort. However, all models showed poor calibration for 5-year mortality. The top three predictors of mortality were lower albumin levels, more frequent hospitalizations, and higher urea levels. Models including these predictors consistently outperformed comorbidity and frailty scores for mortality prediction among oldest-old adults.
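The abstract does not say which calibration metrics were used. As an illustration only, the sketch below computes three common ones: the Brier score, calibration-in-the-large (mean predicted risk minus observed event rate), and a binned comparison of predicted and observed risk. All function names and the toy inputs are assumptions, not taken from the study.

```python
def brier_score(labels, probs):
    """Mean squared difference between predicted risk and outcome (0 or 1)."""
    if not labels or len(labels) != len(probs):
        raise ValueError("labels and probs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def calibration_in_the_large(labels, probs):
    """Mean predicted risk minus observed event rate; 0 means no overall bias."""
    return sum(probs) / len(probs) - sum(labels) / len(labels)


def calibration_table(labels, probs, n_bins=5):
    """Group cases into equal-width risk bins and return, for each
    non-empty bin, (mean predicted risk, observed event rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    rows = []
    for group in bins:
        if group:
            mean_pred = sum(p for _, p in group) / len(group)
            observed = sum(y for y, _ in group) / len(group)
            rows.append((mean_pred, observed, len(group)))
    return rows


# Toy usage with made-up outcomes and predicted risks (not study data).
toy_labels = [1, 0, 1, 0]
toy_probs = [0.9, 0.2, 0.8, 0.1]
toy_brier = brier_score(toy_labels, toy_probs)
toy_bias = calibration_in_the_large(toy_labels, toy_probs)
toy_rows = calibration_table(toy_labels, toy_probs, n_bins=2)
```

A model can discriminate reasonably well and still be poorly calibrated, which is why the two properties are reported separately: large gaps between the first two values in each row of the table signal miscalibration even when the AUROC is acceptable.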

Conclusions

ML models applied to routinely collected EHRs can predict short-term survival in centenarians with moderate accuracy. Further research is needed to determine whether mortality predictors differ across age in the oldest-old population.


Persistent Identifier: http://hdl.handle.net/10722/368336
ISSN: 1079-5006
2023 Impact Factor: 4.3
2023 SCImago Journal Rankings: 1.285

 

DC Field | Value | Language
dc.contributor.author | Mak, Jonathan K. L. | -
dc.contributor.author | Yue, Noel C. | -
dc.contributor.author | Li, Gloria Hoi-Yee | -
dc.contributor.author | Yuen, Jacqueline K. | -
dc.contributor.author | Auyeung, Tung Wai | -
dc.contributor.author | Tan, Kathryn Choon Beng | -
dc.contributor.author | Cheung, Ching-Lung | -
dc.date.accessioned | 2025-12-24T00:37:42Z | -
dc.date.available | 2025-12-24T00:37:42Z | -
dc.date.issued | 2025-10-09 | -
dc.identifier.citation | The Journals of Gerontology, Series A: Biological Sciences and Medical Sciences, 2025, v. 80, n. 12 | -
dc.identifier.issn | 1079-5006 | -
dc.identifier.uri | http://hdl.handle.net/10722/368336 | -
dc.description.abstract | Background: Whether survival at extreme ages can be accurately predicted remains unclear. This study explored the feasibility of using machine learning (ML) and electronic health records (EHRs) to predict mortality in centenarians and identify key survival determinants. Methods: We analyzed 9718 centenarians (83% women) from the population-based EHR database in Hong Kong (2004-2018). Data were randomly split into 70% training and 30% testing cohorts. Using 82 predictors, including demographics, diagnoses, prescriptions, and laboratory results, we trained stepwise logistic regression and four ML algorithms to predict 1-year, 2-year, and 5-year all-cause mortality after age 100. Model performance was evaluated using discrimination (area under the receiver operating characteristic curve [AUROC]) and calibration metrics. In an independent cohort of 174 606 oldest-old adults aged 85-105 years, we further compared AUROCs of models incorporating the identified predictors versus comorbidity and frailty scores across different age groups. Results: Among the ML models, eXtreme Gradient Boosting algorithm provided the best performance, with AUROCs of 0.707 (95% CI = 0.685-0.730) for 1-year mortality and 0.704 (0.686-0.723) for 2-year mortality in the testing cohort. However, all models showed poor calibration for 5-year mortality. Top three predictors of mortality included lower albumin levels, more frequent hospitalizations, and higher urea levels. Models including these predictors consistently outperformed comorbidity and frailty for mortality prediction among oldest-old adults. Conclusions: Utilizing ML models and routinely collected EHRs can predict short-term survival in centenarians with moderate accuracy. Further research is needed to determine whether mortality predictors differ across age in the oldest-old population. | -
dc.language | eng | -
dc.publisher | Oxford University Press | -
dc.relation.ispartof | The Journals of Gerontology, Series A: Biological Sciences and Medical Sciences | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject | Centenarians | -
dc.subject | Machine learning | -
dc.subject | Mortality | -
dc.subject | Oldest-old | -
dc.subject | Prediction model | -
dc.title | Machine learning prediction of survival in centenarians after age 100: a retrospective, population-based cohort study | -
dc.type | Article | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.1093/gerona/glaf218 | -
dc.identifier.scopus | eid_2-s2.0-105021283573 | -
dc.identifier.volume | 80 | -
dc.identifier.issue | 12 | -
dc.identifier.eissn | 1758-535X | -
dc.identifier.issnl | 1079-5006 | -
