A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study

Ning, Yilin; Li, Siqi; Ong, Marcus Eng Hock; Xie, Feng; Chakraborty, Bibhas; Ting, Daniel Shu Wei; Liu, Nan

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1371/journal.pdig.0000062
Scopus: eid_2-s2.0-85176370869

Appears in Collections:
- Biological Sciences: Journal/Magazine Articles

Title	A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study
Authors	Ning, Yilin; Li, Siqi; Ong, Marcus Eng Hock; Xie, Feng; Chakraborty, Bibhas; Ting, Daniel Shu Wei; Liu, Nan
Issue Date	2022
Citation	PLOS Digital Health, 2022, v. 1, n. 6 June, article no. e0000062. DOI: http://dx.doi.org/10.1371/journal.pdig.0000062
Abstract	Risk scores are widely used for clinical decision making and commonly generated from logistic regression models. Machine-learning-based methods may work well for identifying important predictors to create parsimonious scores, but such ‘black box’ variable selection limits interpretability, and variable importance evaluated from a single model can be biased. We propose a robust and interpretable variable selection approach using the recently developed Shapley variable importance cloud (ShapleyVIC) that accounts for variability in variable importance across models. Our approach evaluates and visualizes overall variable contributions for in-depth inference and transparent variable selection, and filters out non-significant contributors to simplify model building steps. We derive an ensemble variable ranking from variable contributions across models, which is easily integrated with an automated and modularized risk score generator, AutoScore, for convenient implementation. In a study of early death or unplanned readmission after hospital discharge, ShapleyVIC selected 6 variables from 41 candidates to create a well-performing risk score, which had similar performance to a 16-variable model from machine-learning-based ranking. Our work contributes to the recent emphasis on interpretability of prediction models for high-stakes decision making, providing a disciplined solution to detailed assessment of variable importance and transparent development of parsimonious clinical risk scores.
Persistent Identifier	http://hdl.handle.net/10722/351482

DC Field	Value	Language
dc.contributor.author	Ning, Yilin	-
dc.contributor.author	Li, Siqi	-
dc.contributor.author	Ong, Marcus Eng Hock	-
dc.contributor.author	Xie, Feng	-
dc.contributor.author	Chakraborty, Bibhas	-
dc.contributor.author	Ting, Daniel Shu Wei	-
dc.contributor.author	Liu, Nan	-
dc.date.accessioned	2024-11-20T03:56:37Z	-
dc.date.available	2024-11-20T03:56:37Z	-
dc.date.issued	2022	-
dc.identifier.citation	PLOS Digital Health, 2022, v. 1, n. 6 June, article no. e0000062	-
dc.identifier.uri	http://hdl.handle.net/10722/351482	-
dc.language	eng	-
dc.relation.ispartof	PLOS Digital Health	-
dc.title	A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1371/journal.pdig.0000062	-
dc.identifier.scopus	eid_2-s2.0-85176370869	-
dc.identifier.volume	1	-
dc.identifier.issue	6 June	-
dc.identifier.spage	article no. e0000062	-
dc.identifier.epage	article no. e0000062	-
dc.identifier.eissn	2767-3170	-
