Deep Mining External Imperfect Data for Chest X-Ray Disease Screening

Luo, Luyang; Yu, Lequan; Chen, Hao; Liu, Quande; Wang, Xi; Xu, Jiaqi; Heng, Pheng Ann

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/TMI.2020.3000949
Scopus: eid_2-s2.0-85094933040
PMID: 32746106
WOS: WOS:000586352000029

Supplementary

Citations:
- Scopus: 0
- Web of Science: 0
- PubMed Central: 0
Appears in Collections:
- Statistics & Actuarial Science: Journal/Magazine Articles

Article: Deep Mining External Imperfect Data for Chest X-Ray Disease Screening

Title	Deep Mining External Imperfect Data for Chest X-Ray Disease Screening
Authors	Luo, Luyang; Yu, Lequan; Chen, Hao; Liu, Quande; Wang, Xi; Xu, Jiaqi; Heng, Pheng Ann
Issue Date	2020
Citation	IEEE Transactions on Medical Imaging, 2020, v. 39, n. 11, p. 3583-3594. DOI: http://dx.doi.org/10.1109/TMI.2020.3000949
Abstract	Deep learning approaches have demonstrated remarkable progress in automatic chest X-ray analysis. Because deep models are data-driven, their training data must cover a broad distribution. It is therefore important to integrate knowledge from multiple datasets, especially for medical images. However, learning a disease classification model with extra chest X-ray (CXR) data remains challenging. Recent studies have shown that a performance bottleneck exists in joint training on different CXR datasets, and few have made efforts to address this obstacle. In this paper, we argue that incorporating an external CXR dataset leads to imperfect training data, which gives rise to these challenges. Specifically, the imperfection is twofold: domain discrepancy, as image appearance varies across datasets; and label discrepancy, as different datasets are partially labeled. To this end, we formulate the multi-label thoracic disease classification problem as weighted independent binary tasks according to the categories. For common categories shared across domains, we adopt task-specific adversarial training to alleviate the feature differences. For categories existing in a single dataset, we present uncertainty-aware temporal ensembling of model predictions to further mine the information from the missing labels. In this way, our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability. We conduct extensive experiments on three datasets with more than 360,000 chest X-ray images. Our method outperforms other competing models and sets state-of-the-art performance on the official NIH test set with 0.8349 AUC, demonstrating its effectiveness in utilizing the external dataset to improve the internal classification.
Persistent Identifier	http://hdl.handle.net/10722/299477
ISI Accession Number ID	WOS:000586352000029
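
The abstract describes treating multi-label classification as weighted independent binary tasks when datasets are only partially labeled. As a rough illustration of that general idea, and not the paper's actual loss, the sketch below computes a weighted binary cross-entropy that simply skips categories with no annotation. The function name `masked_weighted_bce`, the use of `None` for a missing label, and the per-category `weights` are all assumptions made for this example; the adversarial training and temporal ensembling components are not shown.

```python
import math


def masked_weighted_bce(probs, labels, weights):
    """Weighted binary cross-entropy averaged over labeled categories only.

    probs   -- predicted probability per category, each in [0, 1]
    labels  -- 1, 0, or None (None = category not annotated; an assumption
               of this sketch, not a convention taken from the paper)
    weights -- hypothetical per-category task weights (non-negative)
    """
    total, weight_sum = 0.0, 0.0
    for p, y, w in zip(probs, labels, weights):
        if y is None:
            # Missing label: this binary task contributes nothing.
            continue
        # Clamp the probability so log() never receives 0.
        p = min(max(p, 1e-7), 1.0 - 1e-7)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    # If no category is labeled, define the loss as 0.
    return total / weight_sum if weight_sum > 0 else 0.0


# One image with three categories; the second category is unlabeled
# because it is absent from the (hypothetical) source dataset.
loss = masked_weighted_bce([0.5, 0.9, 0.5], [1, None, 0], [2.0, 1.0, 1.0])
print(loss)
```

Skipping unlabeled categories is the simplest way to handle label discrepancy; the abstract indicates the authors go further and mine information from the missing labels, which this sketch does not attempt.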

DC Field	Value	Language
dc.contributor.author	Luo, Luyang	-
dc.contributor.author	Yu, Lequan	-
dc.contributor.author	Chen, Hao	-
dc.contributor.author	Liu, Quande	-
dc.contributor.author	Wang, Xi	-
dc.contributor.author	Xu, Jiaqi	-
dc.contributor.author	Heng, Pheng Ann	-
dc.date.accessioned	2021-05-21T03:34:29Z	-
dc.date.available	2021-05-21T03:34:29Z	-
dc.date.issued	2020	-
dc.identifier.citation	IEEE Transactions on Medical Imaging, 2020, v. 39, n. 11, p. 3583-3594	-
dc.identifier.uri	http://hdl.handle.net/10722/299477	-
dc.language	eng	-
dc.relation.ispartof	IEEE Transactions on Medical Imaging	-
dc.title	Deep Mining External Imperfect Data for Chest X-Ray Disease Screening	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/TMI.2020.3000949	-
dc.identifier.pmid	32746106	-
dc.identifier.scopus	eid_2-s2.0-85094933040	-
dc.identifier.volume	39	-
dc.identifier.issue	11	-
dc.identifier.spage	3583	-
dc.identifier.epage	3594	-
dc.identifier.eissn	1558-254X	-
dc.identifier.isi	WOS:000586352000029	-
