research-article

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

Authors:

Chao Zhang

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Pages 1054 - 1064

https://doi.org/10.1145/3394486.3403149

Published: 20 August 2020

Abstract

We study the open-domain named entity recognition (NER) problem under distant supervision. Distant supervision does not require large amounts of manual annotation, but the distant labels it derives from external knowledge bases are highly incomplete and noisy. To address this challenge, we propose a new computational framework, BOND, which leverages the power of pre-trained language models (e.g., BERT and RoBERTa) to improve the prediction performance of NER models. Specifically, we propose a two-stage training algorithm. In the first stage, we adapt the pre-trained language model to the NER task using the distant labels, which can significantly improve both recall and precision. In the second stage, we drop the distant labels and propose a self-training approach to further improve model performance. Thorough experiments on 5 benchmark datasets demonstrate the superiority of BOND over existing distantly supervised NER methods. The code and distantly labeled data have been released at https://github.com/cliang1453/BOND.
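The two-stage idea in the abstract can be illustrated with a deliberately tiny sketch. It is not the released BOND code: a count-based classifier over a single capitalization feature stands in for the pre-trained language model, the gazetteer stands in for the external knowledge base, and every name here (`distant_labels`, `feature`, `fit`, `predict`, `two_stage`, the example sentences) is hypothetical. It only shows the flow: dictionary matching yields incomplete labels, stage 1 fits on them, and stage 2 discards them and refits on the model's own predictions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


def distant_labels(tokens: List[str], gazetteer: Set[str], max_len: int = 3) -> List[str]:
    """Tag tokens by longest-match lookup in a toy knowledge base."""
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]).lower() in gazetteer:
                labels[i:i + n] = ["ENT"] * n
                i += n
                break
        else:
            i += 1  # no gazetteer entry starts at this token
    return labels


def feature(token: str) -> bool:
    """Single hand-made feature (an assumption): is the token capitalized?"""
    return token[:1].isupper()


def fit(sentences: List[List[str]], labels: List[List[str]]) -> Dict[bool, str]:
    """Majority label per feature value; a stand-in for fine-tuning a model."""
    counts = defaultdict(Counter)
    for toks, labs in zip(sentences, labels):
        for tok, lab in zip(toks, labs):
            counts[feature(tok)][lab] += 1
    return {f: c.most_common(1)[0][0] for f, c in counts.items()}


def predict(model: Dict[bool, str], tokens: List[str]) -> List[str]:
    return [model.get(feature(t), "O") for t in tokens]


def two_stage(sentences: List[List[str]], gazetteer: Set[str],
              rounds: int = 2) -> Tuple[List[List[str]], Dict[bool, str]]:
    # Stage 1: adapt the model to the task using the distant labels.
    distant = [distant_labels(s, gazetteer) for s in sentences]
    teacher = fit(sentences, distant)
    # Stage 2: drop the distant labels; a student is fit on the teacher's
    # pseudo-labels and then becomes the next teacher.
    for _ in range(rounds):
        pseudo = [predict(teacher, s) for s in sentences]
        teacher = fit(sentences, pseudo)
    return distant, teacher


sentences = [
    "Alice Smith flew to New York".split(),
    "Bob visited Paris and Berlin".split(),
    "she moved to London".split(),
]
gazetteer = {"new york", "paris", "berlin", "london"}  # no person names: incomplete
distant, model = two_stage(sentences, gazetteer)
print(distant[0])                     # distant labels miss "Alice Smith"
print(predict(model, sentences[0]))   # the trained model tags it as an entity
```

In this toy setting the distant labels leave "Alice Smith" untagged because the gazetteer has no person names, while the model trained on them generalizes from capitalization and recovers the span. A real system would replace `fit` and `predict` with fine-tuning and inference of a token-classification model.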

Index Terms

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction
  2. Machine learning
    1. Learning paradigms
      1. Multi-task learning
        1. Transfer learning

Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

August 2020

3664 pages

ISBN:9781450379984

DOI:10.1145/3394486

General Chairs: Rajesh Gupta (UC San Diego, USA), Yan Liu (USC, USA)

Program Chairs: Mohak Shah (LG Electronics, USA), Suju Rajan (LinkedIn, USA)

Publications Chairs: Jiliang Tang (Michigan State, USA), B. Aditya Prakash (Georgia Tech, USA)

Copyright © 2020 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD '20

KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

July 6 - 10, 2020

Virtual Event, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
