Links for fulltext (may require subscription):
- Publisher Website (DOI): 10.3233/SHTI240557
- Scopus: eid_2-s2.0-85202002415
Book Chapter: Exploring Offline Large Language Models for Clinical Information Extraction: A Study of Renal Histopathological Reports of Lupus Nephritis Patients
Title | Exploring Offline Large Language Models for Clinical Information Extraction: A Study of Renal Histopathological Reports of Lupus Nephritis Patients |
---|---|
Authors | Tai, Isaac CY; Wong, Emmanual CK; Wu, Joseph T; Leung, Kathy; Yap, Desmond YH; Wong, Zoie SY |
Keywords | commodity hardware; information extraction; Large language models; lupus nephritis; offline; prompt strategy; renal biopsy reports |
Issue Date | 22-Aug-2024 |
Publisher | IOS Press |
Abstract | Open source, lightweight and offline generative large language models (LLMs) hold promise for clinical information extraction due to their suitability to operate in secured environments using commodity hardware without token cost. By creating a simple lupus nephritis (LN) renal histopathology annotation schema and generating gold standard data, this study investigates prompt-based strategies using three state-of-the-art lightweight LLMs, namely BioMistral-DARE-7B (BioMistral), Llama-2-13B (Llama 2), and Mistral-7B-instruct-v0.2 (Mistral). We examine the performance of these LLMs within a zero-shot learning environment for renal histopathology report information extraction. Incorporating four prompting strategies, including combinations of batch prompt (BP), single task prompt (SP), chain of thought (CoT) and standard simple prompt (SSP), our findings indicate that both Mistral and BioMistral consistently demonstrated higher performance compared to Llama 2. Mistral recorded the highest performance, achieving an F1-score of 0.996 [95% CI: 0.993, 0.999] for extracting the numbers of various subtypes of glomeruli across all BP settings and 0.898 [95% CI: 0.871, 0.921] in extracting relational values of immune markers under the BP+SSP setting. This study underscores the capability of offline LLMs to provide accurate and secure clinical information extraction, which can serve as a promising alternative to their heavy-weight online counterparts. |
Persistent Identifier | http://hdl.handle.net/10722/354938 |
ISSN | 0926-9630 |
SCImago Journal Rankings (2023) | 0.289 |
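The abstract contrasts a batch prompt (BP), which asks the model for every field in one pass, with single-task prompts (SP), which issue one prompt per field. A minimal sketch of that contrast, with the offline LLM call stubbed out; the field names and prompt wording here are illustrative assumptions, not the study's actual prompts:

```python
import re

# Hypothetical extraction fields for a renal biopsy report (illustrative only).
FIELDS = [
    "total glomeruli",
    "globally sclerosed glomeruli",
    "segmentally sclerosed glomeruli",
]

def build_batch_prompt(report: str, fields=FIELDS) -> str:
    """BP: one zero-shot prompt covering all extraction tasks at once."""
    tasks = "\n".join(f"- {f}" for f in fields)
    return (
        "Extract the following values from the renal biopsy report.\n"
        f"Answer as 'field: value' lines.\n{tasks}\n\nReport:\n{report}"
    )

def build_single_task_prompts(report: str, fields=FIELDS):
    """SP: one zero-shot prompt per extraction task."""
    return [
        f"From the renal biopsy report below, extract '{f}'.\n"
        f"Answer as '{f}: value'.\n\nReport:\n{report}"
        for f in fields
    ]

def parse_answers(completion: str) -> dict:
    """Parse 'field: value' lines from a model completion."""
    answers = {}
    for line in completion.splitlines():
        m = re.match(r"\s*([^:]+):\s*(\S+)", line)
        if m:
            answers[m.group(1).strip().lower()] = m.group(2)
    return answers

# A stubbed completion stands in for the offline LLM call.
completion = "total glomeruli: 18\nglobally sclerosed glomeruli: 3"
print(parse_answers(completion)["total glomeruli"])  # -> 18
```

In practice the prompts would be sent to a locally hosted model (the paper evaluates BioMistral-DARE-7B, Llama-2-13B, and Mistral-7B-instruct-v0.2) and the parsed values scored against the gold-standard annotations.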
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Tai, Isaac CY | - |
dc.contributor.author | Wong, Emmanual CK | - |
dc.contributor.author | Wu, Joseph T | - |
dc.contributor.author | Leung, Kathy | - |
dc.contributor.author | Yap, Desmond YH | - |
dc.contributor.author | Wong, Zoie SY | - |
dc.date.accessioned | 2025-03-18T00:35:28Z | - |
dc.date.available | 2025-03-18T00:35:28Z | - |
dc.date.issued | 2024-08-22 | - |
dc.identifier.issn | 0926-9630 | - |
dc.identifier.uri | http://hdl.handle.net/10722/354938 | - |
dc.description.abstract | <p>Open source, lightweight and offline generative large language models (LLMs) hold promise for clinical information extraction due to their suitability to operate in secured environments using commodity hardware without token cost. By creating a simple lupus nephritis (LN) renal histopathology annotation schema and generating gold standard data, this study investigates prompt-based strategies using three state-of-the-art lightweight LLMs, namely BioMistral-DARE-7B (BioMistral), Llama-2-13B (Llama 2), and Mistral-7B-instruct-v0.2 (Mistral). We examine the performance of these LLMs within a zero-shot learning environment for renal histopathology report information extraction. Incorporating four prompting strategies, including combinations of batch prompt (BP), single task prompt (SP), chain of thought (CoT) and standard simple prompt (SSP), our findings indicate that both Mistral and BioMistral consistently demonstrated higher performance compared to Llama 2. Mistral recorded the highest performance, achieving an F1-score of 0.996 [95% CI: 0.993, 0.999] for extracting the numbers of various subtypes of glomeruli across all BP settings and 0.898 [95% CI: 0.871, 0.921] in extracting relational values of immune markers under the BP+SSP setting. This study underscores the capability of offline LLMs to provide accurate and secure clinical information extraction, which can serve as a promising alternative to their heavy-weight online counterparts.</p> | - |
dc.language | eng | - |
dc.publisher | IOS Press | - |
dc.relation.ispartof | Studies in Health Technology and Informatics | - |
dc.subject | commodity hardware | - |
dc.subject | information extraction | - |
dc.subject | Large language models | - |
dc.subject | lupus nephritis | - |
dc.subject | offline | - |
dc.subject | prompt strategy | - |
dc.subject | renal biopsy reports | - |
dc.title | Exploring Offline Large Language Models for Clinical Information Extraction: A Study of Renal Histopathological Reports of Lupus Nephritis Patients | - |
dc.type | Book_Chapter | - |
dc.identifier.doi | 10.3233/SHTI240557 | - |
dc.identifier.scopus | eid_2-s2.0-85202002415 | - |
dc.identifier.volume | 316 | - |
dc.identifier.spage | 899 | - |
dc.identifier.epage | 903 | - |
dc.identifier.eisbn | 9781643685335 | - |
dc.identifier.issnl | 0926-9630 | - |