DOI: 10.1145/3394486.3403149
research-article

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

Published: 20 August 2020

Abstract

We study the open-domain named entity recognition (NER) problem under distant supervision. Distant supervision does not require large amounts of manual annotation, but it yields highly incomplete and noisy distant labels, because these labels are obtained from external knowledge bases. To address this challenge, we propose a new computational framework, BOND, which leverages the power of pre-trained language models (e.g., BERT and RoBERTa) to improve the prediction performance of NER models. Specifically, we propose a two-stage training algorithm. In the first stage, we adapt the pre-trained language model to the NER task using the distant labels, which can significantly improve both recall and precision. In the second stage, we drop the distant labels and propose a self-training approach to further improve model performance. Thorough experiments on 5 benchmark datasets demonstrate the superiority of BOND over existing distantly supervised NER methods. The code and distantly labeled data have been released at https://github.com/cliang1453/BOND.
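
To make the notion of incomplete and noisy distant labels concrete, the following is a minimal sketch of how distant labels might be produced by matching text against a knowledge base. It is not the labeling procedure used in the paper: the toy dictionary `TOY_KB`, the function `distant_label`, the BIO tag scheme, and the greedy longest-match rule are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base (an assumption for illustration only):
# lowercased token tuple -> entity type.
TOY_KB: Dict[Tuple[str, ...], str] = {
    ("washington",): "LOC",
    ("new", "york"): "LOC",
}


def distant_label(tokens: List[str], kb: Dict[Tuple[str, ...], str]) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup against the knowledge base."""
    max_len = max((len(key) for key in kb), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tok.lower() for tok in tokens[i:i + n])
            if span in kb:
                entity_type = kb[span]
                tags[i] = "B-" + entity_type
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + entity_type
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


tokens = ["Washington", "met", "Alice", "Smith", "in", "New", "York"]
print(distant_label(tokens, TOY_KB))
# ['B-LOC', 'O', 'O', 'O', 'O', 'B-LOC', 'I-LOC']
```

In this toy sentence, "Washington" refers to a person but is tagged as a location (a noisy label), and "Alice Smith" is missed entirely because the name is not in the dictionary (an incomplete label). These are the two kinds of error the abstract attributes to distant supervision.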

Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN: 9781450379984
DOI: 10.1145/3394486
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. distant supervision
  2. named entity recognition
  3. open-domain
  4. pre-trained language models
  5. self-training
  6. text mining

Qualifiers

  • Research-article

Conference

KDD '20
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Downloads (last 12 months): 175
  • Downloads (last 6 weeks): 17

Reflects downloads up to 06 Jan 2025.

Cited By

  • (2024) Weakly Supervised Named Entity Recognition Using Thesaurus with Hierarchical Structure (シソーラスの階層的構造を利用した弱教師あり固有表現抽出). Journal of Natural Language Processing, 31:3 (984-1014). DOI: 10.5715/jnlp.31.984. Online publication date: 2024
  • (2024) Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge. Journal of Natural Language Processing, 31:2 (407-432). DOI: 10.5715/jnlp.31.407. Online publication date: 2024
  • (2024) A Survey on Challenges and Advances in Natural Language Processing with a Focus on Legal Informatics and Low-Resource Languages. Electronics, 13:3 (648). DOI: 10.3390/electronics13030648. Online publication date: 4-Feb-2024
  • (2024) PMRC. Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence (18316-18326). DOI: 10.1609/aaai.v38i16.29791. Online publication date: 20-Feb-2024
  • (2024) Improved self-training-based distant label denoising method for cybersecurity entity extractions. PLOS ONE, 19:12 (e0315479). DOI: 10.1371/journal.pone.0315479. Online publication date: 17-Dec-2024
  • (2024) GBRAIN: Combating Textual Label Noise by Granular-ball based Robust Training. Proceedings of the 2024 International Conference on Multimedia Retrieval (357-365). DOI: 10.1145/3652583.3658084. Online publication date: 30-May-2024
  • (2024) A template augmented distant supervision framework for Chinese named entity recognition. International Journal of Modeling, Simulation, and Scientific Computing, 15:01. DOI: 10.1142/S1793962324500181. Online publication date: 19-Jan-2024
  • (2024) A Survey on Arabic Named Entity Recognition: Past, Recent Advances, and Future Trends. IEEE Transactions on Knowledge and Data Engineering, 36:3 (943-959). DOI: 10.1109/TKDE.2023.3303136. Online publication date: Mar-2024
  • (2024) Enhancing Low-Resource NLP by Consistency Training With Data and Model Perturbations. IEEE/ACM Transactions on Audio, Speech and Language Processing, 32 (189-199). DOI: 10.1109/TASLP.2023.3325970. Online publication date: 1-Jan-2024
  • (2024) SaRLog: Semantic-Aware Robust Log Anomaly Detection via BERT-Augmented Contrastive Learning. IEEE Internet of Things Journal, 11:13 (23727-23736). DOI: 10.1109/JIOT.2024.3386183. Online publication date: 1-Jul-2024
