Development and validation of HBV surveillance models using big data and machine learning

Dong, Weinan; Da Roza, Cecilia Clara; Cheng, Dandan; Zhang, Dahao; Xiang, Yuling; Seto, Wai Kay; Wong, William C.W.

File Download

There are no files associated with this item.

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1080/07853890.2024.2314237
Scopus: eid_2-s2.0-85184691885
PMID: 38340309

Citations:
- Scopus: 0
- PubMed Central: 0
Appears in Collections:
- Family Medicine and Primary Care: Journal/Magazine Articles
- Medicine: Journal/Magazine Articles

Article: Development and validation of HBV surveillance models using big data and machine learning

Title	Development and validation of HBV surveillance models using big data and machine learning
Authors	Dong, Weinan; Da Roza, Cecilia Clara; Cheng, Dandan; Zhang, Dahao; Xiang, Yuling; Seto, Wai Kay; Wong, William C.W.
Keywords	big data analytics; Big data management; China; infectious disease surveillance; machine learning
Issue Date	10-Feb-2024
Publisher	Taylor and Francis Group
Citation	Annals of Medicine, 2024, v. 56, n. 1. DOI: http://dx.doi.org/10.1080/07853890.2024.2314237
Abstract	Background: The construction of a robust healthcare information system is fundamental to enhancing countries’ capabilities in the surveillance and control of hepatitis B virus (HBV). Making use of China’s rapidly expanding primary healthcare system, this innovative approach using big data and machine learning (ML) could help towards the World Health Organization’s (WHO) HBV infection elimination goals of reaching 90% diagnosis and treatment rates by 2030. We aimed to develop and validate HBV detection models using routine clinical data to improve the detection of HBV and support the development of effective interventions to mitigate the impact of this disease in China. Methods: Relevant data records extracted from the Family Medicine Clinic of the University of Hong Kong-Shenzhen Hospital’s Hospital Information System were structuralized using state-of-the-art Natural Language Processing techniques. Several ML models have been used to develop HBV risk assessment models. The performance of the ML model was then interpreted using the Shapley value (SHAP) and validated using cohort data randomly divided at a ratio of 2:1 using a five-fold cross-validation framework. Results: The patterns of physical complaints of patients with and without HBV infection were identified by processing 158,988 clinic attendance records. After removing cases without any clinical parameters from the derivation sample (n = 105,992), 27,392 cases were analysed using six modelling methods. A simplified model for HBV using patients’ physical complaints and parameters was developed with good discrimination (AUC = 0.78) and calibration (goodness of fit test p-value >0.05). Conclusions: Suspected case detection models of HBV, showing potential for clinical deployment, have been developed to improve HBV surveillance in primary care settings in China.
Persistent Identifier	http://hdl.handle.net/10722/347364
ISSN	0785-3890
2023 Impact Factor	4.9
2023 SCImago Journal Rankings	1.306
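
The validation design described in the abstract (a random 2:1 division of the cohort, five-fold cross-validation, discrimination reported as AUC) can be illustrated with a minimal sketch. Two thirds of the 158,988 attendance records is 105,992, which matches the reported derivation sample size. Everything else below is an assumption made for illustration: the data are synthetic, the one-feature "model" (`fit_direction`) is a hypothetical stand-in rather than any of the six modelling methods used in the article, and running the cross-validation inside the derivation set is only one plausible reading of the abstract.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def split_two_to_one(n, rng):
    """Randomly divide indices 0..n-1 into derivation and validation sets at 2:1."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = (2 * n) // 3
    return idx[:cut], idx[cut:]

def k_folds(indices, k=5):
    """Partition a list of indices into k near-equal, non-overlapping folds."""
    return [indices[i::k] for i in range(k)]

def fit_direction(xs, ys):
    """Hypothetical stand-in model: learn whether higher x means higher risk."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0

# Synthetic toy cohort (NOT the study data): one numeric feature per case.
rng = random.Random(0)
n = 900
y = [1 if rng.random() < 0.2 else 0 for _ in range(n)]
x = [rng.gauss(1.0 if label else 0.0, 1.0) for label in y]

derivation, validation = split_two_to_one(n, rng)
folds = k_folds(derivation, k=5)

# Five-fold cross-validation inside the derivation set:
# fit on four folds, score the held-out fold.
cv_aucs = []
for k, held_out in enumerate(folds):
    train = [i for j, fold in enumerate(folds) if j != k for i in fold]
    w = fit_direction([x[i] for i in train], [y[i] for i in train])
    cv_aucs.append(auc([y[i] for i in held_out], [w * x[i] for i in held_out]))

# Refit on the whole derivation set, then evaluate once on the validation set.
w_final = fit_direction([x[i] for i in derivation], [y[i] for i in derivation])
validation_auc = auc([y[i] for i in validation], [w_final * x[i] for i in validation])

print("cross-validated AUCs:", [round(a, 3) for a in cv_aucs])
print("validation AUC:", round(validation_auc, 3))
```

The sketch covers only the partitioning and the discrimination metric. The calibration test and the SHAP-based interpretation mentioned in the abstract are not shown.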

DC Field	Value	Language
dc.contributor.author	Dong, Weinan	-
dc.contributor.author	Da Roza, Cecilia Clara	-
dc.contributor.author	Cheng, Dandan	-
dc.contributor.author	Zhang, Dahao	-
dc.contributor.author	Xiang, Yuling	-
dc.contributor.author	Seto, Wai Kay	-
dc.contributor.author	Wong, William C.W.	-
dc.date.accessioned	2024-09-21T00:31:31Z	-
dc.date.available	2024-09-21T00:31:31Z	-
dc.date.issued	2024-02-10	-
dc.identifier.citation	Annals of Medicine, 2024, v. 56, n. 1	-
dc.identifier.issn	0785-3890	-
dc.identifier.uri	http://hdl.handle.net/10722/347364	-
dc.description.abstract	Background: The construction of a robust healthcare information system is fundamental to enhancing countries’ capabilities in the surveillance and control of hepatitis B virus (HBV). Making use of China’s rapidly expanding primary healthcare system, this innovative approach using big data and machine learning (ML) could help towards the World Health Organization’s (WHO) HBV infection elimination goals of reaching 90% diagnosis and treatment rates by 2030. We aimed to develop and validate HBV detection models using routine clinical data to improve the detection of HBV and support the development of effective interventions to mitigate the impact of this disease in China. Methods: Relevant data records extracted from the Family Medicine Clinic of the University of Hong Kong-Shenzhen Hospital’s Hospital Information System were structuralized using state-of-the-art Natural Language Processing techniques. Several ML models have been used to develop HBV risk assessment models. The performance of the ML model was then interpreted using the Shapley value (SHAP) and validated using cohort data randomly divided at a ratio of 2:1 using a five-fold cross-validation framework. Results: The patterns of physical complaints of patients with and without HBV infection were identified by processing 158,988 clinic attendance records. After removing cases without any clinical parameters from the derivation sample (n = 105,992), 27,392 cases were analysed using six modelling methods. A simplified model for HBV using patients’ physical complaints and parameters was developed with good discrimination (AUC = 0.78) and calibration (goodness of fit test p-value >0.05). Conclusions: Suspected case detection models of HBV, showing potential for clinical deployment, have been developed to improve HBV surveillance in primary care settings in China.	-
dc.language	eng	-
dc.publisher	Taylor and Francis Group	-
dc.relation.ispartof	Annals of Medicine	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject	big data analytics	-
dc.subject	Big data management	-
dc.subject	China	-
dc.subject	infectious disease surveillance	-
dc.subject	machine learning	-
dc.title	Development and validation of HBV surveillance models using big data and machine learning	-
dc.type	Article	-
dc.identifier.doi	10.1080/07853890.2024.2314237	-
dc.identifier.pmid	38340309	-
dc.identifier.scopus	eid_2-s2.0-85184691885	-
dc.identifier.volume	56	-
dc.identifier.issue	1	-
dc.identifier.eissn	1365-2060	-
dc.identifier.issnl	0785-3890	-
