Biomedical Named Entity Recognition via Knowledge Guidance and Question Answering

Published: 18 July 2021


In this work, we formulated the named entity recognition (NER) task as a multi-answer, knowledge-guided question-answering (KGQA) task and showed that knowledge guidance helps achieve state-of-the-art results on 11 of 18 biomedical NER datasets. We prepended five different knowledge contexts (entity types, questions, definitions, and examples) to the input text, then trained and tested BERT-based neural models on these input sequences using a combined dataset built from the 18 datasets. This novel formulation of the task (a) improved named entity recognition and illustrated the impact of different knowledge contexts, (b) reduced system confusion by limiting prediction to a single entity class for each input token (i.e., B, I, O only) compared to multiple entity classes in traditional NER (i.e., B_entity1, B_entity2, I_entity1, I, O), (c) made detection of nested entities easier, and (d) enabled the models to jointly learn NER-specific features from a large number of datasets. We performed extensive experiments with this KGQA formulation on the biomedical datasets and showed when knowledge improved named entity recognition. We analyzed the effect of the task formulation, the impact of the different knowledge contexts, the multi-task aspect of the generic format, and the generalization ability of KGQA. We also probed the model to better understand the key contributors to these improvements.
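The reformulation described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `to_kgqa_examples`, the `[SEP]` separator, the example sentence, and the question wording are all assumptions made for illustration, and the abstract does not specify the exact input layout. The sketch shows how one sentence tagged with multiple entity classes might be split into one input per entity type, each with a knowledge context prepended and labels collapsed to B, I, O only.

```python
from typing import Dict, List, Tuple


def to_kgqa_examples(
    tokens: List[str],
    tags: List[str],
    contexts: Dict[str, str],
    sep: str = "[SEP]",  # assumed separator between context and text
) -> List[Tuple[List[str], List[str]]]:
    """Turn one multi-class BIO sentence into one B/I/O example per entity type."""
    examples = []
    for entity_type, context in contexts.items():
        # Collapse tags: keep B/I only for the entity type being asked about.
        labels = []
        for tag in tags:
            prefix, _, tag_type = tag.partition("-")
            labels.append(prefix if tag_type == entity_type else "O")
        context_tokens = context.split()
        # Prepend the knowledge context. Context and separator positions are
        # labeled "O" here (an assumption; a real model might mask them instead).
        input_tokens = context_tokens + [sep] + tokens
        input_labels = ["O"] * (len(context_tokens) + 1) + labels
        examples.append((input_tokens, input_labels))
    return examples


# Hypothetical sentence with two entity classes in traditional BIO format.
tokens = ["Aspirin", "reduces", "heart", "attack", "risk"]
tags = ["B-Chemical", "O", "B-Disease", "I-Disease", "O"]
# Hypothetical "question" contexts; entity types, definitions, or examples
# could be substituted as the context string.
contexts = {
    "Chemical": "What chemicals are mentioned ?",
    "Disease": "What diseases are mentioned ?",
}

examples = to_kgqa_examples(tokens, tags, contexts)
for input_tokens, input_labels in examples:
    print(list(zip(input_tokens, input_labels)))
```

Because each entity type is handled in its own input sequence, overlapping mentions of different types never compete for the same label slot, which is one way to read the abstract's claim about nested entities.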


Published In

ACM Transactions on Computing for Healthcare  Volume 2, Issue 4
October 2021
199 pages

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 July 2021
Accepted: 01 May 2021
Revised: 01 April 2021
Received: 01 September 2020
Published in HEALTH Volume 2, Issue 4


Author Tags

  2. BIO tagging
  3. NER
  4. Named entity recognition
  5. biomedical
  6. multitask training
  7. question answering
  8. text tagging
  9. transfer learning


  • Research-article
  • Research
  • Refereed
