Medical Entity-balanced Prompting Network for Brain CT Report Generation

Zhang, Xiaodan; Shi, Yanzhao; Ji, Junzhong; Zheng, Chengxin; Qu, Liangqiong

Publisher Website: 10.1609/aaai.v39i24.3478

Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Medical Entity-balanced Prompting Network for Brain CT Report Generation

Title	Medical Entity-balanced Prompting Network for Brain CT Report Generation
Authors	Zhang, Xiaodan; Shi, Yanzhao; Ji, Junzhong; Zheng, Chengxin; Qu, Liangqiong
Issue Date	25-Feb-2025
Abstract	The automatic generation of brain CT reports has gained widespread attention, given its potential to assist radiologists in diagnosing cranial diseases. However, brain CT scans involve numerous medical entities, such as diverse anatomical regions and lesions, which exhibit highly inconsistent spatial patterns in 3D volumetric space. This leads existing methods to learn medical entities in a biased way, resulting in repetitive and inaccurate generated reports. To this end, we propose a Medical Entity-balanced Prompting Network (MEPNet), which harnesses a large language model (LLM) to fairly interpret various entities for accurate brain CT report generation. By introducing the visual embedding and the learning status of medical entities as enriched clues, our method prompts the LLM to balance the learning of diverse entities, thereby enhancing reports with comprehensive findings. First, to extract visual embeddings of entities, we propose Knowledge-driven Joint Attention to explore and distill entity patterns using both explicit and implicit medical knowledge. Then, a Learning Status Scorer is designed to evaluate the learning of entity visual embeddings, yielding a unique learning status for each entity. Finally, these entity visual embeddings and statuses are integrated into multi-modal prompts to guide the LLM's text generation. This process allows the LLM to self-adapt its learning process for entities with biased fitting, thereby covering detailed findings in generated reports. We conduct experiments on two brain CT report generation benchmarks, demonstrating effectiveness in clinical accuracy and text coherence. Code: https://github.com/YanzhaoShi/MEPNet.
Persistent Identifier	http://hdl.handle.net/10722/359707

DC Field	Value	Language
dc.contributor.author	Zhang, Xiaodan	-
dc.contributor.author	Shi, Yanzhao	-
dc.contributor.author	Ji, Junzhong	-
dc.contributor.author	Zheng, Chengxin	-
dc.contributor.author	Qu, Liangqiong	-
dc.date.accessioned	2025-09-10T00:30:59Z	-
dc.date.available	2025-09-10T00:30:59Z	-
dc.date.issued	2025-02-25	-
dc.identifier.uri	http://hdl.handle.net/10722/359707	-
dc.description.abstract	The automatic generation of brain CT reports has gained widespread attention, given its potential to assist radiologists in diagnosing cranial diseases. However, brain CT scans involve numerous medical entities, such as diverse anatomical regions and lesions, which exhibit highly inconsistent spatial patterns in 3D volumetric space. This leads existing methods to learn medical entities in a biased way, resulting in repetitive and inaccurate generated reports. To this end, we propose a Medical Entity-balanced Prompting Network (MEPNet), which harnesses a large language model (LLM) to fairly interpret various entities for accurate brain CT report generation. By introducing the visual embedding and the learning status of medical entities as enriched clues, our method prompts the LLM to balance the learning of diverse entities, thereby enhancing reports with comprehensive findings. First, to extract visual embeddings of entities, we propose Knowledge-driven Joint Attention to explore and distill entity patterns using both explicit and implicit medical knowledge. Then, a Learning Status Scorer is designed to evaluate the learning of entity visual embeddings, yielding a unique learning status for each entity. Finally, these entity visual embeddings and statuses are integrated into multi-modal prompts to guide the LLM's text generation. This process allows the LLM to self-adapt its learning process for entities with biased fitting, thereby covering detailed findings in generated reports. We conduct experiments on two brain CT report generation benchmarks, demonstrating effectiveness in clinical accuracy and text coherence. Code: https://github.com/YanzhaoShi/MEPNet.	-
dc.language	eng	-
dc.relation.ispartof	The 39th Annual AAAI Conference on Artificial Intelligence (AAAI) (25/02/2025-04/03/2025, Philadelphia, Pennsylvania)	-
dc.title	Medical Entity-balanced Prompting Network for Brain CT Report Generation	-
dc.type	Conference_Paper	-
dc.identifier.doi	10.1609/aaai.v39i24.3478	-
