
WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning

Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, Michael Gertz


Abstract
Fine-tuning biomedical pre-trained language models (BioPLMs) such as BioBERT has become a common practice dominating leaderboards across various natural language processing tasks. Despite their success and wide adoption, prevailing fine-tuning approaches for named entity recognition (NER) naively train BioPLMs on targeted datasets without considering class distributions. This is especially problematic when dealing with imbalanced biomedical gold-standard datasets for NER, in which most biomedical entities are underrepresented. In this paper, we address the class imbalance problem and propose WeLT, a cost-sensitive fine-tuning approach based on new re-scaled class weights for the task of biomedical NER. We evaluate WeLT’s fine-tuning performance on mixed-domain and domain-specific BioPLMs using eight biomedical gold-standard datasets. We compare our approach against vanilla fine-tuning and three other existing re-weighting schemes. Our results show the positive impact of handling the class imbalance problem: WeLT outperforms all the vanilla fine-tuned models, and our method demonstrates advantages over the other existing weighting schemes in most experiments.
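The cost-sensitive idea behind the abstract can be illustrated with a generic inverse-frequency re-weighting sketch. Note that this is not the paper's actual WeLT re-scaling formula (which is not reproduced on this page); the function name `class_weights` and the normalization choice are illustrative assumptions.

```python
from collections import Counter

def class_weights(labels):
    """Generic inverse-frequency class weights for an imbalanced label set.

    NOTE: illustrative sketch only, not the WeLT formula from the paper.
    Rare classes (e.g. biomedical entity tags) get larger weights than the
    dominant "O" (outside) tag, so the loss penalizes their errors more.
    Weights are normalized to sum to the number of classes.
    """
    counts = Counter(labels)
    total = len(labels)
    # Raw inverse-frequency weight per class.
    raw = {c: total / cnt for c, cnt in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {c: w * scale for c, w in raw.items()}
```

In a framework such as PyTorch, weights like these would typically be passed to the loss function (e.g. the `weight` argument of `torch.nn.CrossEntropyLoss`) during fine-tuning, so that misclassifying underrepresented entity tokens costs more than misclassifying the majority class.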
Anthology ID:
2023.bionlp-1.40
Volume:
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
427–438
URL:
https://aclanthology.org/2023.bionlp-1.40
DOI:
10.18653/v1/2023.bionlp-1.40
Cite (ACL):
Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, and Michael Gertz. 2023. WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 427–438, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning (Mobasher et al., BioNLP 2023)
PDF:
https://aclanthology.org/2023.bionlp-1.40.pdf
Video:
https://aclanthology.org/2023.bionlp-1.40.mp4