File Download
There are no files associated with this item.
Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1186/s12916-023-02941-4
- Scopus: eid_2-s2.0-85163852386
- PMID: 37400814
- WOS: WOS:001022895400003
Supplementary
- Citations:
- Appears in Collections:
Article: Sampling inequalities affect generalization of neuroimaging-based diagnostic classifiers in psychiatry
Title | Sampling inequalities affect generalization of neuroimaging-based diagnostic classifiers in psychiatry |
---|---|
Authors | |
Keywords | Diagnostic classification Meta-analysis Neuroimaging Psychiatric machine learning Sampling inequalities |
Issue Date | 2023 |
Citation | BMC Medicine, 2023, v. 21, n. 1, article no. 241 How to Cite? |
Abstract | Background: The development of machine learning models for aiding in the diagnosis of mental disorder is recognized as a significant breakthrough in the field of psychiatry. However, clinical practice of such models remains a challenge, with poor generalizability being a major limitation. Methods: Here, we conducted a pre-registered meta-research assessment on neuroimaging-based models in the psychiatric literature, quantitatively examining global and regional sampling issues over recent decades, from a view that has been relatively underexplored. A total of 476 studies (n = 118,137) were included in the current assessment. Based on these findings, we built a comprehensive 5-star rating system to quantitatively evaluate the quality of existing machine learning models for psychiatric diagnoses. Results: A global sampling inequality in these models was revealed quantitatively (sampling Gini coefficient (G) = 0.81, p <.01), varying across different countries (regions) (e.g., China, G = 0.47; the USA, G = 0.58; Germany, G = 0.78; the UK, G = 0.87). Furthermore, the severity of this sampling inequality was significantly predicted by national economic levels (β = − 2.75, p <.001, R 2adj = 0.40; r = −.84, 95% CI: −.41 to −.97), and was plausibly predictable for model performance, with higher sampling inequality for reporting higher classification accuracy. Further analyses showed that lack of independent testing (84.24% of models, 95% CI: 81.0–87.5%), improper cross-validation (51.68% of models, 95% CI: 47.2–56.2%), and poor technical transparency (87.8% of models, 95% CI: 84.9–90.8%)/availability (80.88% of models, 95% CI: 77.3–84.4%) are prevailing in current diagnostic classifiers despite improvements over time. Relating to these observations, model performances were found decreased in studies with independent cross-country sampling validations (all p <.001, BF10 > 15). In light of this, we proposed a purpose-built quantitative assessment checklist, which demonstrated that the overall ratings of these models increased by publication year but were negatively associated with model performance. Conclusions: Together, improving sampling economic equality and hence the quality of machine learning models may be a crucial facet to plausibly translating neuroimaging-based diagnostic classifiers into clinical practice. |
Persistent Identifier | http://hdl.handle.net/10722/330331 |
ISI Accession Number ID |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chen, Zhiyi | - |
dc.contributor.author | Hu, Bowen | - |
dc.contributor.author | Liu, Xuerong | - |
dc.contributor.author | Becker, Benjamin | - |
dc.contributor.author | Eickhoff, Simon B. | - |
dc.contributor.author | Miao, Kuan | - |
dc.contributor.author | Gu, Xingmei | - |
dc.contributor.author | Tang, Yancheng | - |
dc.contributor.author | Dai, Xin | - |
dc.contributor.author | Li, Chao | - |
dc.contributor.author | Leonov, Artemiy | - |
dc.contributor.author | Xiao, Zhibing | - |
dc.contributor.author | Feng, Zhengzhi | - |
dc.contributor.author | Chen, Ji | - |
dc.contributor.author | Chuan-Peng, Hu | - |
dc.date.accessioned | 2023-09-05T12:09:40Z | - |
dc.date.available | 2023-09-05T12:09:40Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | BMC Medicine, 2023, v. 21, n. 1, article no. 241 | - |
dc.identifier.uri | http://hdl.handle.net/10722/330331 | - |
dc.description.abstract | Background: The development of machine learning models for aiding in the diagnosis of mental disorder is recognized as a significant breakthrough in the field of psychiatry. However, clinical practice of such models remains a challenge, with poor generalizability being a major limitation. Methods: Here, we conducted a pre-registered meta-research assessment on neuroimaging-based models in the psychiatric literature, quantitatively examining global and regional sampling issues over recent decades, from a view that has been relatively underexplored. A total of 476 studies (n = 118,137) were included in the current assessment. Based on these findings, we built a comprehensive 5-star rating system to quantitatively evaluate the quality of existing machine learning models for psychiatric diagnoses. Results: A global sampling inequality in these models was revealed quantitatively (sampling Gini coefficient (G) = 0.81, p <.01), varying across different countries (regions) (e.g., China, G = 0.47; the USA, G = 0.58; Germany, G = 0.78; the UK, G = 0.87). Furthermore, the severity of this sampling inequality was significantly predicted by national economic levels (β = − 2.75, p <.001, R 2adj = 0.40; r = −.84, 95% CI: −.41 to −.97), and was plausibly predictable for model performance, with higher sampling inequality for reporting higher classification accuracy. Further analyses showed that lack of independent testing (84.24% of models, 95% CI: 81.0–87.5%), improper cross-validation (51.68% of models, 95% CI: 47.2–56.2%), and poor technical transparency (87.8% of models, 95% CI: 84.9–90.8%)/availability (80.88% of models, 95% CI: 77.3–84.4%) are prevailing in current diagnostic classifiers despite improvements over time. Relating to these observations, model performances were found decreased in studies with independent cross-country sampling validations (all p <.001, BF10 > 15). In light of this, we proposed a purpose-built quantitative assessment checklist, which demonstrated that the overall ratings of these models increased by publication year but were negatively associated with model performance. Conclusions: Together, improving sampling economic equality and hence the quality of machine learning models may be a crucial facet to plausibly translating neuroimaging-based diagnostic classifiers into clinical practice. | - |
dc.language | eng | - |
dc.relation.ispartof | BMC Medicine | - |
dc.subject | Diagnostic classification | - |
dc.subject | Meta-analysis | - |
dc.subject | Neuroimaging | - |
dc.subject | Psychiatric machine learning | - |
dc.subject | Sampling inequalities | - |
dc.title | Sampling inequalities affect generalization of neuroimaging-based diagnostic classifiers in psychiatry | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1186/s12916-023-02941-4 | - |
dc.identifier.pmid | 37400814 | - |
dc.identifier.scopus | eid_2-s2.0-85163852386 | - |
dc.identifier.volume | 21 | - |
dc.identifier.issue | 1 | - |
dc.identifier.spage | article no. 241 | - |
dc.identifier.epage | article no. 241 | - |
dc.identifier.eissn | 1741-7015 | - |
dc.identifier.isi | WOS:001022895400003 | - |