Predicting Overall Survival Using Machine Learning Algorithms in Oral Cavity Squamous Cell Carcinoma

Tan, JY; Adeoye, J; Thomson, P; Sharma, D; Ramamurthy, P; Choi, SW


Publisher Website: 10.21873/anticanres.16094
Scopus: eid_2-s2.0-85143181130
Appears in Collections:
- Orthopaedics & Traumatology: Journal/Magazine Articles

Title	Predicting Overall Survival Using Machine Learning Algorithms in Oral Cavity Squamous Cell Carcinoma
Authors	Tan, JY; Adeoye, J; Thomson, P; Sharma, D; Ramamurthy, P; Choi, SW
Keywords	interpretability; machine learning; Oral cavity cancer; prognosis; SHapley values
Issue Date	1-Dec-2022
Publisher	International Institute of Anticancer Research
Citation	Anticancer Research, 2022, v. 42, n. 12, p. 5859-5866. DOI: http://dx.doi.org/10.21873/anticanres.16094
Abstract	Background/Aim: Machine learning (ML) models are often developed to predict cancer prognosis but rarely consider spatial factors within a region. This study therefore explored ML algorithms that use Local Government Areas (LGAs) in Queensland, Australia, to spatially predict the 3- and 5-year prognosis of oral cancer patients and to provide clinical interpretability of the outcomes predicted by the ML model. Patients and Methods: Data from a total of 3,841 oral cancer patients were retrieved from the Queensland Cancer Registry (QCR). The synthetic minority oversampling technique combined with edited nearest neighbours (SMOTE-ENN) was used to pre-process the unbalanced datasets. Five ML models were trained: logistic regression, random forest classifier, XGBoost, Gaussian Naïve Bayes and Voting Classifier. Predictive features were age, sex, LGA, tumour site and differentiation. Outcomes were 3- and 5-year overall survival of patients. Model performance on the test set was evaluated using area under the curve and F1 scores. The SHapley Additive exPlanations (SHAP) method was applied to the best performing model to interpret its predicted outcomes. Results: The Voting Classifier was the best performing model, with F1 scores of 0.58 and 0.64 for 3- and 5-year overall survival, respectively. Age was the most important feature in the Voting Classifier for both 3- and 5-year prognosis prediction. LGA at diagnosis was among the top three predictive features in both the 3- and 5-year models. Conclusion: The Voting Classifier demonstrated the best overall performance in classifying both 3- and 5-year overall survival of oral cancer patients in Queensland. The SHAP method provided clinical understanding of the predictive features of the Voting Classifier.
Persistent Identifier	http://hdl.handle.net/10722/329176
ISSN	0250-7005
2021 Impact Factor	2.435
2020 SCImago Journal Rankings	0.735
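
The abstract names a Voting Classifier evaluated with F1 scores but does not say how votes were combined. The following is a minimal sketch, not the authors' implementation: it assumes soft voting (averaging each base model's predicted probability for the positive class) and a 0.5 decision threshold. The function names soft_vote and f1_score and all of the numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean


def soft_vote(prob_lists, threshold=0.5):
    """Average per-model positive-class probabilities, then threshold.

    prob_lists holds one list per base model; each inner list has one
    probability per patient. Soft voting is an assumption here.
    """
    averaged = [mean(per_patient) for per_patient in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical probabilities from three base models for six patients.
model_probs = [
    [0.9, 0.2, 0.6, 0.4, 0.7, 0.1],
    [0.8, 0.4, 0.3, 0.6, 0.6, 0.2],
    [0.7, 0.3, 0.7, 0.2, 0.9, 0.3],
]
y_true = [1, 0, 1, 1, 1, 0]  # hypothetical outcome labels

y_pred = soft_vote(model_probs)
score = f1_score(y_true, y_pred)
print(y_pred, round(score, 3))
```

In practice, a library implementation such as scikit-learn's VotingClassifier would replace the hand-written averaging; the pure-Python version is used here only to make the arithmetic visible.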

DC Field	Value	Language
dc.contributor.author	Tan, JY	-
dc.contributor.author	Adeoye, J	-
dc.contributor.author	Thomson, P	-
dc.contributor.author	Sharma, D	-
dc.contributor.author	Ramamurthy, P	-
dc.contributor.author	Choi, SW	-
dc.date.accessioned	2023-08-05T07:55:51Z	-
dc.date.available	2023-08-05T07:55:51Z	-
dc.date.issued	2022-12-01	-
dc.identifier.citation	Anticancer Research, 2022, v. 42, n. 12, p. 5859-5866	-
dc.identifier.issn	0250-7005	-
dc.identifier.uri	http://hdl.handle.net/10722/329176	-
dc.description.abstract	Background/Aim: Machine learning (ML) models are often developed to predict cancer prognosis but rarely consider spatial factors within a region. This study therefore explored ML algorithms that use Local Government Areas (LGAs) in Queensland, Australia, to spatially predict the 3- and 5-year prognosis of oral cancer patients and to provide clinical interpretability of the outcomes predicted by the ML model. Patients and Methods: Data from a total of 3,841 oral cancer patients were retrieved from the Queensland Cancer Registry (QCR). The synthetic minority oversampling technique combined with edited nearest neighbours (SMOTE-ENN) was used to pre-process the unbalanced datasets. Five ML models were trained: logistic regression, random forest classifier, XGBoost, Gaussian Naïve Bayes and Voting Classifier. Predictive features were age, sex, LGA, tumour site and differentiation. Outcomes were 3- and 5-year overall survival of patients. Model performance on the test set was evaluated using area under the curve and F1 scores. The SHapley Additive exPlanations (SHAP) method was applied to the best performing model to interpret its predicted outcomes. Results: The Voting Classifier was the best performing model, with F1 scores of 0.58 and 0.64 for 3- and 5-year overall survival, respectively. Age was the most important feature in the Voting Classifier for both 3- and 5-year prognosis prediction. LGA at diagnosis was among the top three predictive features in both the 3- and 5-year models. Conclusion: The Voting Classifier demonstrated the best overall performance in classifying both 3- and 5-year overall survival of oral cancer patients in Queensland. The SHAP method provided clinical understanding of the predictive features of the Voting Classifier.	-
dc.language	eng	-
dc.publisher	International Institute of Anticancer Research	-
dc.relation.ispartof	Anticancer Research	-
dc.subject	interpretability	-
dc.subject	machine learning	-
dc.subject	Oral cavity cancer	-
dc.subject	prognosis	-
dc.subject	SHapley values	-
dc.title	Predicting Overall Survival Using Machine Learning Algorithms in Oral Cavity Squamous Cell Carcinoma	-
dc.type	Article	-
dc.identifier.doi	10.21873/anticanres.16094	-
dc.identifier.scopus	eid_2-s2.0-85143181130	-
dc.identifier.volume	42	-
dc.identifier.issue	12	-
dc.identifier.spage	5859	-
dc.identifier.epage	5866	-
dc.identifier.eissn	1791-7530	-
dc.identifier.issnl	0250-7005	-
