Computer Science > Computation and Language

arXiv:2403.12374 (cs)

[Submitted on 19 Mar 2024]

Title: Improving Generalizability of Extracting Social Determinants of Health Using Large Language Models through Prompt-tuning

Authors: Cheng Peng, Zehao Yu, Kaleb E Smith, Wei-Hsuan Lo-Ciganic, Jiang Bian, Yonghui Wu

Abstract: Progress in natural language processing (NLP) using large language models (LLMs) has greatly improved patient information extraction from clinical narratives. However, most methods based on the fine-tuning strategy have limited transfer learning ability for cross-domain applications. This study proposes a novel approach that employs a soft prompt-based learning architecture, which introduces trainable prompts to guide LLMs toward desired outputs. We examined two types of LLM architectures, the encoder-only GatorTron and the decoder-only GatorTronGPT, and evaluated their performance in extracting social determinants of health (SDoH) using a cross-institution dataset from the 2022 n2c2 challenge and a cross-disease dataset from University of Florida (UF) Health. The results show that decoder-only LLMs with prompt tuning achieved better performance in cross-domain applications. GatorTronGPT achieved the best F1 scores on both datasets, outperforming traditional fine-tuned GatorTron by 8.9% and 21.8% in the cross-institution setting, and by 5.5% and 14.5% in the cross-disease setting.
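
As a rough illustration of the soft prompt idea mentioned in the abstract (trainable prompt vectors prepended to the input while the pretrained model stays frozen), here is a minimal toy sketch in plain Python. It is not the authors' implementation: the embedding table, the mean-pooling classifier, the data, and all names (`EMBEDDINGS`, `CLASSIFIER_W`, `tune_prompt`) are hypothetical stand-ins chosen only to show that gradient updates touch the prompt and nothing else.

```python
import math

# Frozen "pretrained" parameters (toy values, assumed for illustration only).
EMBEDDINGS = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.5, 0.5]}
CLASSIFIER_W = [1.0, -1.0]

# Hypothetical training data: (token ids, binary label).
DATA = [([0, 2], 1), ([2, 2], 1), ([1, 2], 0)]


def forward(prompt, tokens):
    """Prepend soft-prompt vectors to token embeddings, mean-pool, and score."""
    seq = prompt + [EMBEDDINGS[t] for t in tokens]
    dim = len(CLASSIFIER_W)
    pooled = [sum(vec[i] for vec in seq) / len(seq) for i in range(dim)]
    logit = sum(w * h for w, h in zip(CLASSIFIER_W, pooled))
    return 1.0 / (1.0 + math.exp(-logit))


def mean_loss(prompt, data):
    """Mean binary cross-entropy over the data set."""
    total = 0.0
    for tokens, label in data:
        p = forward(prompt, tokens)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(data)


def tune_prompt(data, prompt_len=1, lr=0.5, steps=200):
    """Gradient descent on the prompt vectors only; the model is never updated."""
    dim = len(CLASSIFIER_W)
    prompt = [[0.0] * dim for _ in range(prompt_len)]
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(prompt_len)]
        for tokens, label in data:
            err = forward(prompt, tokens) - label  # d(loss)/d(logit)
            seq_len = prompt_len + len(tokens)
            for j in range(prompt_len):
                for i in range(dim):
                    # d(logit)/d(prompt[j][i]) = w[i] / seq_len under mean pooling
                    grad[j][i] += err * CLASSIFIER_W[i] / seq_len / len(data)
        for j in range(prompt_len):
            for i in range(dim):
                prompt[j][i] -= lr * grad[j][i]
    return prompt


untuned = [[0.0, 0.0]]
tuned = tune_prompt(DATA)
print("loss before tuning:", mean_loss(untuned, DATA))
print("loss after tuning: ", mean_loss(tuned, DATA))
```

In a real setting the frozen component would be a full LLM and the prompt would be optimized with an autodiff framework; the sketch only mirrors the division between frozen model weights and trainable prompt parameters.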

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2403.12374 [cs.CL]
	(or arXiv:2403.12374v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.12374

Submission history

From: Cheng Peng
[v1] Tue, 19 Mar 2024 02:34:33 UTC (325 KB)
