Book Chapter: Exploring Offline Large Language Models for Clinical Information Extraction: A Study of Renal Histopathological Reports of Lupus Nephritis Patients

Title: Exploring Offline Large Language Models for Clinical Information Extraction: A Study of Renal Histopathological Reports of Lupus Nephritis Patients
Authors: Tai, Isaac CY; Wong, Emmanual CK; Wu, Joseph T; Leung, Kathy; Yap, Desmond YH; Wong, Zoie SY
Keywords: commodity hardware; information extraction; Large language models; lupus nephritis; offline; prompt strategy; renal biopsy reports
Issue Date: 22-Aug-2024
Publisher: IOS Press
Abstract

Open source, lightweight and offline generative large language models (LLMs) hold promise for clinical information extraction due to their suitability to operate in secured environments using commodity hardware without token cost. By creating a simple lupus nephritis (LN) renal histopathology annotation schema and generating gold standard data, this study investigates prompt-based strategies using three state-of-the-art lightweight LLMs, namely BioMistral-DARE-7B (BioMistral), Llama-2-13B (Llama 2), and Mistral-7B-instruct-v0.2 (Mistral). We examine the performance of these LLMs within a zero-shot learning environment for renal histopathology report information extraction. Incorporating four prompting strategies, including combinations of batch prompt (BP), single task prompt (SP), chain of thought (CoT) and standard simple prompt (SSP), our findings indicate that both Mistral and BioMistral consistently demonstrated higher performance compared to Llama 2. Mistral recorded the highest performance, achieving an F1-score of 0.996 [95% CI: 0.993, 0.999] for extracting the numbers of various subtypes of glomeruli across all BP settings and 0.898 [95% CI: 0.871, 0.921] in extracting relational values of immune markers under the BP+SSP setting. This study underscores the capability of offline LLMs to provide accurate and secure clinical information extraction, which can serve as a promising alternative to their heavy-weight online counterparts.
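The headline results above are F1-scores reported with 95% confidence intervals. The chapter does not state here how those intervals were obtained; a common choice for extraction metrics is a percentile bootstrap over the evaluated reports. The sketch below is an illustrative assumption, not the authors' code: the function names (`micro_f1`, `bootstrap_ci`) and the exact TP/FP/FN accounting are hypothetical.

```python
import random

def micro_f1(gold, pred):
    """Micro-averaged F1 for extraction output, item by item.
    A non-null prediction that exactly matches gold is a TP; a non-null
    mismatch is an FP; any gold value not exactly recovered is an FN."""
    tp = sum(1 for g, p in zip(gold, pred) if p is not None and p == g)
    fp = sum(1 for g, p in zip(gold, pred) if p is not None and p != g)
    fn = sum(1 for g, p in zip(gold, pred) if g is not None and p != g)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def bootstrap_ci(gold, pred, n_boot=2000, alpha=0.05, seed=42):
    """Percentile-bootstrap CI for micro F1: resample item indices with
    replacement, recompute F1 per resample, take the alpha/2 tails."""
    rng = random.Random(seed)
    n = len(gold)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(micro_f1([gold[i] for i in idx],
                               [pred[i] for i in idx]))
    scores.sort()
    lo = scores[int((alpha / 2) * n_boot)]
    hi = scores[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

Resampling at the report level (rather than the individual field level), as sketched here, keeps correlated errors within a report together, which generally yields wider and more honest intervals.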


Persistent Identifier: http://hdl.handle.net/10722/354938
ISSN: 0926-9630
2023 SCImago Journal Rankings: 0.289


DC Field: Value
dc.contributor.author: Tai, Isaac CY
dc.contributor.author: Wong, Emmanual CK
dc.contributor.author: Wu, Joseph T
dc.contributor.author: Leung, Kathy
dc.contributor.author: Yap, Desmond YH
dc.contributor.author: Wong, Zoie SY
dc.date.accessioned: 2025-03-18T00:35:28Z
dc.date.available: 2025-03-18T00:35:28Z
dc.date.issued: 2024-08-22
dc.identifier.issn: 0926-9630
dc.identifier.uri: http://hdl.handle.net/10722/354938
dc.description.abstract: Open source, lightweight and offline generative large language models (LLMs) hold promise for clinical information extraction due to their suitability to operate in secured environments using commodity hardware without token cost. By creating a simple lupus nephritis (LN) renal histopathology annotation schema and generating gold standard data, this study investigates prompt-based strategies using three state-of-the-art lightweight LLMs, namely BioMistral-DARE-7B (BioMistral), Llama-2-13B (Llama 2), and Mistral-7B-instruct-v0.2 (Mistral). We examine the performance of these LLMs within a zero-shot learning environment for renal histopathology report information extraction. Incorporating four prompting strategies, including combinations of batch prompt (BP), single task prompt (SP), chain of thought (CoT) and standard simple prompt (SSP), our findings indicate that both Mistral and BioMistral consistently demonstrated higher performance compared to Llama 2. Mistral recorded the highest performance, achieving an F1-score of 0.996 [95% CI: 0.993, 0.999] for extracting the numbers of various subtypes of glomeruli across all BP settings and 0.898 [95% CI: 0.871, 0.921] in extracting relational values of immune markers under the BP+SSP setting. This study underscores the capability of offline LLMs to provide accurate and secure clinical information extraction, which can serve as a promising alternative to their heavy-weight online counterparts.
dc.language: eng
dc.publisher: IOS Press
dc.relation.ispartof: Studies in Health Technology and Informatics
dc.subject: commodity hardware
dc.subject: information extraction
dc.subject: Large language models
dc.subject: lupus nephritis
dc.subject: offline
dc.subject: prompt strategy
dc.subject: renal biopsy reports
dc.title: Exploring Offline Large Language Models for Clinical Information Extraction: A Study of Renal Histopathological Reports of Lupus Nephritis Patients
dc.type: Book_Chapter
dc.identifier.doi: 10.3233/SHTI240557
dc.identifier.scopus: eid_2-s2.0-85202002415
dc.identifier.volume: 316
dc.identifier.spage: 899
dc.identifier.epage: 903
dc.identifier.eisbn: 9781643685335
dc.identifier.issnl: 0926-9630
