Abstract
Information extraction from text is becoming more important as the volume of data available on the internet grows. Automatic extraction of information from biomedical text is particularly useful to researchers, saving them time and effort. Relation extraction between medical entities is an active research area. In this paper we present a deep learning relation extraction model based on SciBERT that extracts relations between drug/chemical and protein/gene entities in PubMed literature. The model achieves an average micro F1 score of 91.75% on the ChemProt test set.
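As a rough illustration of the reported metric, the sketch below computes a micro-averaged F1 score for a relation classification task. The abstract does not state how the score was computed, so the details here are assumptions: the function name `micro_f1`, the `NONE` label for entity pairs with no relation, the choice to exclude that label from the counts, and the toy label sequences are all hypothetical and are not taken from the paper.

```python
def micro_f1(gold, pred, negative_label="NONE"):
    """Micro-averaged F1 over all relation labels except `negative_label`.

    Assumption: pairs labelled `negative_label` (no relation) do not count
    as true positives; this is a common convention, not one the abstract states.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative_label:
            if p == g:
                tp += 1  # predicted a relation and it matches the gold label
            else:
                fp += 1  # predicted a relation that is not the gold label
        if g != negative_label and p != g:
            fn += 1      # a gold relation was missed or mislabelled
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data with illustrative label names (not results from the paper).
gold = ["CPR:3", "CPR:4", "NONE", "CPR:9", "CPR:3"]
pred = ["CPR:3", "CPR:3", "NONE", "NONE", "CPR:3"]
score = micro_f1(gold, pred)
print(f"micro F1 = {score:.4f}")
```

Micro-averaging pools true positives, false positives, and false negatives across all relation classes before computing precision and recall, so frequent classes weigh more heavily than rare ones.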
Acknowledgment
This research is part of the “Jesor” project, funded by the Academy of Scientific Research and Technology (ASRT).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
GabAllah, N., Rafea, A. (2022). Drug Protein Interaction Extraction Using SciBERT Based Deep Learning Model. In: Daimi, K., Al Sadoon, A. (eds) Proceedings of the ICR’22 International Conference on Innovations in Computing Research. ICR 2022. Advances in Intelligent Systems and Computing, vol 1431. Springer, Cham. https://doi.org/10.1007/978-3-031-14054-9_16
DOI: https://doi.org/10.1007/978-3-031-14054-9_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14053-2
Online ISBN: 978-3-031-14054-9
eBook Packages: Intelligent Technologies and Robotics (R0)