
Conference Paper: Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation

Title: Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation
Authors:
Issue Date: 1-Dec-2023
Abstract

Automatic Brain CT report generation can improve the efficiency and accuracy of diagnosing cranial diseases. However, current methods are limited by 1) coarse-grained supervision: the training data in image-text format lacks detailed supervision for recognizing subtle abnormalities, and 2) coupled cross-modal alignment: visual-textual alignment may be inevitably coupled in a coarse-grained manner, resulting in tangled feature representations for report generation. In this paper, we propose a novel Pathological Graph-driven Cross-modal Alignment (PGCA) model for accurate and robust Brain CT report generation. Our approach effectively decouples the cross-modal alignment by constructing a Pathological Graph to learn fine-grained visual cues and align them with textual words. This graph comprises heterogeneous nodes representing essential pathological attributes (i.e., tissue and lesion) connected by intra- and inter-attribute edges with prior domain knowledge. Through carefully designed graph embedding and updating modules, our model refines the visual features of subtle tissues and lesions and aligns them with textual words using contrastive learning. Extensive experimental results confirm the viability of our method. We believe that our PGCA model holds the potential to greatly enhance the automatic generation of Brain CT reports and ultimately contribute to improved cranial disease diagnosis.
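The abstract describes aligning graph-refined visual node features with textual word embeddings via contrastive learning. The sketch below is an illustrative reconstruction of such a contrastive term, not the paper's implementation: the symmetric InfoNCE form, the function name, the feature shapes, and the 0.07 temperature are all assumptions.

```python
import numpy as np

def contrastive_alignment_loss(node_feats, word_feats, temperature=0.07):
    """Symmetric InfoNCE-style loss between visual graph nodes and words.

    node_feats: (N, D) visual features of pathological graph nodes
    word_feats: (N, D) embeddings of the matched textual words
    Row i of each matrix is treated as a positive pair; all other rows
    in the batch serve as negatives.
    """
    # L2-normalise both modalities so the dot product is cosine similarity
    node = node_feats / np.linalg.norm(node_feats, axis=1, keepdims=True)
    word = word_feats / np.linalg.norm(word_feats, axis=1, keepdims=True)
    logits = node @ word.T / temperature  # (N, N) similarity matrix

    def xent(l):
        # cross-entropy with the diagonal (matched pairs) as targets
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # average the vision-to-text and text-to-vision directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

When the two modalities are well aligned (matched rows nearly parallel), the diagonal dominates the similarity matrix and the loss approaches zero; mismatched features yield a larger loss.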


Persistent Identifier: http://hdl.handle.net/10722/346190


DC Field | Value | Language
dc.contributor.author | Shi, Yanzhao | -
dc.contributor.author | Ji, Junzhong | -
dc.contributor.author | Zhang, Xiaodan | -
dc.contributor.author | Qu, Liangqiong | -
dc.contributor.author | Liu, Ying | -
dc.date.accessioned | 2024-09-12T00:30:45Z | -
dc.date.available | 2024-09-12T00:30:45Z | -
dc.date.issued | 2023-12-01 | -
dc.identifier.uri | http://hdl.handle.net/10722/346190 | -
dc.description.abstract | Automatic Brain CT report generation can improve the efficiency and accuracy of diagnosing cranial diseases. However, current methods are limited by 1) coarse-grained supervision: the training data in image-text format lacks detailed supervision for recognizing subtle abnormalities, and 2) coupled cross-modal alignment: visual-textual alignment may be inevitably coupled in a coarse-grained manner, resulting in tangled feature representations for report generation. In this paper, we propose a novel Pathological Graph-driven Cross-modal Alignment (PGCA) model for accurate and robust Brain CT report generation. Our approach effectively decouples the cross-modal alignment by constructing a Pathological Graph to learn fine-grained visual cues and align them with textual words. This graph comprises heterogeneous nodes representing essential pathological attributes (i.e., tissue and lesion) connected by intra- and inter-attribute edges with prior domain knowledge. Through carefully designed graph embedding and updating modules, our model refines the visual features of subtle tissues and lesions and aligns them with textual words using contrastive learning. Extensive experimental results confirm the viability of our method. We believe that our PGCA model holds the potential to greatly enhance the automatic generation of Brain CT reports and ultimately contribute to improved cranial disease diagnosis. | -
dc.language | eng | -
dc.relation.ispartof | EMNLP (01/12/2023-08/12/2023, Singapore) | -
dc.title | Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation | -
dc.type | Conference_Paper | -
