
DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing

Published: 01 February 2023

Abstract

The meaningful use of electronic health records (EHRs) continues to progress in the digital era with clinical decision support systems augmented by artificial intelligence. A priority in improving the provider experience is to overcome information overload and reduce cognitive burden so that fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error, which arises from systematic or predictable errors in judgment that rely on heuristics. The potential for clinical natural language processing (cNLP) to model human diagnostic reasoning, reasoning forward from data to diagnosis, and thereby reduce cognitive burden and medical error has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition framed as classification tasks. We introduce a novel suite of tasks, the Diagnostic Reasoning Benchmark (DR.BENCH), for developing and evaluating cNLP models with clinical diagnostic reasoning ability. The suite includes six tasks drawn from ten publicly available datasets addressing clinical text understanding, medical knowledge reasoning, and diagnosis generation. DR.BENCH is the first clinical suite of tasks designed as a natural language generation framework for evaluating pre-trained language models on diagnostic reasoning. The goal of DR.BENCH is to advance the science in cNLP to support downstream applications in computerized diagnostic decision support and to improve the efficiency and accuracy of healthcare providers during patient care. We fine-tune and evaluate state-of-the-art generative models on DR.BENCH. Experiments show that with domain-adaptation pre-training on medical knowledge, the model demonstrates opportunities for improvement when evaluated on DR.BENCH. We share DR.BENCH as a publicly available GitLab repository with a systematic approach to loading and evaluating models for the cNLP community. We also discuss the carbon footprint produced during the experiments and encourage future work on DR.BENCH to report its carbon footprint.
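Because DR.BENCH frames its tasks as natural language generation, a pre-trained sequence-to-sequence model can be loaded, prompted with clinical text, and asked to generate a diagnosis-oriented output in a single text-to-text loop, with the run's carbon footprint logged alongside. The sketch below is a minimal illustration of that framing, not the authors' released GitLab code: it assumes the Hugging Face transformers API and the codecarbon library, and the model name, task prefix, and input note are hypothetical placeholders.

```python
# Minimal sketch of a DR.BENCH-style text-to-text evaluation loop.
# Assumptions: Hugging Face transformers for the seq2seq model and
# codecarbon for emissions reporting; the model name, task prefix,
# and input note below are illustrative, not from the DR.BENCH repo.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from codecarbon import EmissionsTracker

MODEL_NAME = "t5-base"  # a biomedical/clinical T5 variant could be swapped in

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Hypothetical diagnosis-generation input: a progress-note assessment
# prefixed with a task instruction, as in a text-to-text benchmark framing.
source = (
    "summarize problems: 65 year old with fever, productive cough, "
    "and a right lower lobe infiltrate on chest radiograph."
)

tracker = EmissionsTracker()  # estimates kg CO2-eq consumed by the run
tracker.start()

inputs = tokenizer(source, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
prediction = tokenizer.decode(outputs[0], skip_special_tokens=True)

emissions = tracker.stop()
print(f"Generated output: {prediction}")
print(f"Estimated emissions for this run: {emissions:.6f} kg CO2-eq")
```

In practice, the same loop would be wrapped around each DR.BENCH task's dataset, with generated outputs scored against references using task-appropriate generation metrics.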



Published In

Journal of Biomedical Informatics, Volume 138, Issue C
February 2023
150 pages

Publisher

Elsevier Science

San Diego, CA, United States


Author Tags

  1. Natural language processing
  2. Clinical diagnostic reasoning
  3. Clinical diagnostic decision support
  4. Clinical natural language processing benchmark

Qualifiers

  • Research-article
