
Diachronic Named Entity Disambiguation for Ancient Chinese Historical Records

  • Conference paper
Neural Information Processing (ICONIP 2023)

Abstract

Named entity disambiguation (NED) is a fundamental task in NLP. Although numerous methods have been proposed for NED in recent years, they ignore the fact that many real-world corpora, such as historical documents and news articles, are diachronic by nature and vary greatly over time. As a consequence, most current methods fail to fully exploit the temporal information inside corpora and knowledge bases. To address this issue, we propose a novel model that integrates temporal features into a pretrained language model to make it aware of time, along with a new sample re-weighting scheme for diachronic NED that penalizes highly frequent mention-entity pairs to improve performance on rare and unseen entities. We present WikiCMAG and WikiSM, two new NED datasets annotated on ancient Chinese historical records. Experiments show that our model outperforms existing methods by large margins, proving the effectiveness of integrating diachronic information and of our re-weighting scheme. Our model also achieves competitive performance in out-of-distribution (OOD) settings. WikiSM is publicly available at https://github.com/PKUDHC/WikiSM.
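To make the abstract's first idea concrete, the sketch below shows one generic way to inject temporal information into a pretrained encoder: a learned embedding of the text's year is added to the token embeddings before encoding. This is an illustration under our own assumptions (the checkpoint name, the one-bucket-per-year scheme, and the fusion point are all ours), not the architecture described in the paper.

    # Illustrative sketch only -- NOT the paper's architecture.
    # One generic way to make a pretrained encoder "time-aware": add a
    # learned embedding of the document's year to every token embedding.
    import torch.nn as nn
    from transformers import AutoModel

    class TimeAwareEncoder(nn.Module):
        def __init__(self, model_name="bert-base-chinese", n_year_buckets=2500):
            super().__init__()
            # The paper initializes from SikuRoBERTa; bert-base-chinese is a
            # stand-in here.
            self.encoder = AutoModel.from_pretrained(model_name)
            hidden = self.encoder.config.hidden_size
            # One bucket per year; 2500 buckets could cover 500 BC .. 2000 AD.
            self.year_embedding = nn.Embedding(n_year_buckets, hidden)

        def forward(self, input_ids, attention_mask, year_bucket):
            token_emb = self.encoder.embeddings.word_embeddings(input_ids)
            time_emb = self.year_embedding(year_bucket).unsqueeze(1)  # (B, 1, H)
            out = self.encoder(
                inputs_embeds=token_emb + time_emb,
                attention_mask=attention_mask,
            )
            return out.last_hidden_state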


Notes

  1. https://github.com/PKUDHC/WikiSM.

  2. In real scenarios, the time that a piece of text describes is easy to obtain, since most texts have known sources and metadata that can help with time identification.

  3. If there is more than one entity with the same minimum edit distance, a random one is chosen (see the sketch after this list).

  4. The name comes from that of their public GitHub repository.

  5. Here, k-shot means that the correct entity occurs k times in the training data.

  6. https://huggingface.co/fnlp/bart-base-chinese.
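As a concrete reading of note 3, here is a minimal sketch of minimum-edit-distance candidate selection with a random tie-break; the distance function is a plain Levenshtein implementation, and the helper names are ours for illustration, not the paper's code.

    # Sketch of note 3: pick the candidate entity whose name has the smallest
    # edit distance to the mention, breaking ties at random.
    import random

    def edit_distance(a: str, b: str) -> int:
        # Standard one-row dynamic-programming Levenshtein distance.
        dp = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            prev, dp[0] = dp[0], i
            for j, cb in enumerate(b, 1):
                prev, dp[j] = dp[j], min(
                    dp[j] + 1,         # deletion
                    dp[j - 1] + 1,     # insertion
                    prev + (ca != cb)  # substitution
                )
        return dp[len(b)]

    def pick_entity(mention: str, candidates: list[str]) -> str:
        dists = [(edit_distance(mention, c), c) for c in candidates]
        best = min(d for d, _ in dists)
        tied = [c for d, c in dists if d == best]
        return random.choice(tied)  # random tie-break, as in note 3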

References

  1. Agarwal, D., Angell, R., Monath, N., McCallum, A.: Entity linking via explicit mention-mention coreference modeling. In: Proceedings of NAACL 2022 (2022)

  2. Agarwal, P., Strötgen, J., Del Corro, L., Hoffart, J., Weikum, G.: DiaNED: time-aware named entity disambiguation for diachronic corpora. In: Proceedings of ACL 2018 (Volume 2: Short Papers), pp. 686–693 (2018)

  3. Angell, R., Monath, N., Mohan, S., Yadav, N., McCallum, A.: Clustering-based inference for biomedical entity linking. In: Proceedings of NAACL 2021 (2021)

  4. Barba, E., Procopio, L., Navigli, R.: ExtEnD: extractive entity disambiguation. In: Proceedings of ACL 2022 (Volume 1: Long Papers), Dublin, Ireland (2022)

  5. Beigman Klebanov, B., Leong, C.W., Flor, M.: Supervised word-level metaphor detection: experiments with concreteness and reweighting of examples. In: Proceedings of the Third Workshop on Metaphor in NLP, pp. 11–20 (2015)

  6. Byrd, J., Lipton, Z.: What is the effect of importance weighting in deep learning? In: Proceedings of ICML 2019, vol. 97, pp. 872–881. PMLR (2019)

  7. Chen, S., Wang, J., Jiang, F., Lin, C.Y.: Improving entity linking by modeling latent entity type information. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 7529–7537 (2020)

  8. Crestani, F., Lalmas, M., Van Rijsbergen, C.J., Campbell, I.: "Is this document relevant? ...Probably": a survey of probabilistic models in information retrieval. ACM Comput. Surv. 30(4), 528–552 (1998)

  9. De Cao, N., Izacard, G., Riedel, S., Petroni, F.: Autoregressive entity retrieval. In: ICLR 2021. OpenReview.net (2021)

  10. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL 2019, pp. 4171–4186 (2019)

  11. Gillick, D., Kulkarni, S., Lansing, L., Presta, A., Baldridge, J., Ie, E., Garcia-Olano, D.: Learning dense representations for entity retrieval. In: Proceedings of CoNLL 2019, pp. 528–537 (2019)

  12. Hoffart, J., et al.: Robust disambiguation of named entities in text. In: Proceedings of EMNLP 2011, pp. 782–792 (2011)

  13. Hu, X., Wu, X., Shu, Y., Qu, Y.: Logical form generation via multi-task learning for complex question answering over knowledge bases. In: Proceedings of the 29th International Conference on Computational Linguistics, pp. 1687–1696 (2022)

  14. van Hulst, J.M., Hasibi, F., Dercksen, K., Balog, K., de Vries, A.P.: REL: an entity linker standing on the shoulders of giants. In: Proceedings of SIGIR 2020 (2020)

  15. Khalid, M.A., Jijkoun, V., de Rijke, M.: The impact of named entity normalization on information retrieval for question answering. In: Advances in Information Retrieval, pp. 705–710. Springer, Berlin, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78646-7_83

  16. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) ICLR 2015 (2015). http://arxiv.org/abs/1412.6980

  17. Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv:1907.11692 (2019)

  18. Logeswaran, L., Chang, M.W., Lee, K., Toutanova, K., Devlin, J., Lee, H.: Zero-shot entity linking by reading entity descriptions. In: Proceedings of ACL 2019, pp. 3449–3460. Association for Computational Linguistics, Florence, Italy (2019)

  19. Martinez-Rodriguez, J.L., Lopez-Arevalo, I., Rios-Alvarado, A.B.: OpenIE-based approach for knowledge graph construction from text. Expert Syst. Appl. 113, 339–355 (2018)

  20. Rijhwani, S., Preotiuc-Pietro, D.: Temporally-informed analysis of named entity recognition. In: Proceedings of ACL 2020, pp. 7605–7617 (2020)

  21. Su, Y., Zhang, H., Song, Y., Zhang, T.: Rare and zero-shot word sense disambiguation using Z-reweighting. In: Proceedings of ACL 2022 (Volume 1: Long Papers), pp. 4713–4723 (2022)

  22. Wang, D., Liu, C., Zhu, Z., Liu, J., Hu, H., Shen, S., Li, B.: Construction and application of pre-trained models of Siku Quanshu in orientation to digital humanities. Library Tribune 42(06) (2022)

  23. Wang, J., Jatowt, A., Yoshikawa, M.: TimeBERT: extending pre-trained language representations with temporal information. arXiv:2204.13032 (2022)

  24. Wang, S., Zhuang, S., Zuccon, G.: BERT-based dense retrievers require interpolation with BM25 for effective passage retrieval. In: Proceedings of SIGIR 2021 (2021)

  25. Wu, L., Petroni, F., Josifoski, M., Riedel, S., Zettlemoyer, L.: Scalable zero-shot entity linking with dense entity retrieval. In: Proceedings of EMNLP 2020 (2020)

  26. Yamada, I., et al.: Wikipedia2Vec: an efficient toolkit for learning and visualizing the embeddings of words and entities from Wikipedia. In: Proceedings of EMNLP 2020: System Demonstrations, pp. 23–30. Association for Computational Linguistics (2020)

  27. Yamada, I., Washio, K., Shindo, H., Matsumoto, Y.: Global entity disambiguation with BERT. In: Proceedings of NAACL 2022, pp. 3264–3271 (2022)

  28. Zaporojets, K., Kaffee, L.A., Deleu, J., Demeester, T., Develder, C., Augenstein, I.: TempEL: linking dynamically evolving and newly emerging entities. In: NeurIPS 2022 Datasets and Benchmarks Track (2022)

  29. Zhang, W., Hua, W., Stratos, K.: EntQA: entity linking as question answering. In: ICLR 2022. OpenReview.net (2022)


Acknowledgements

This research is supported by the NSFC project “the Construction of the Knowledge Graph for the History of Chinese Confucianism” (Grant No. 72010107003).

Author information

Correspondence to Jun Wang.

Appendix A. Implementation Details

Details of our implementations are listed as follows. Unless otherwise specified, we adopt the same hyperparameters as in the original works for the baseline methods. Apart from GPT 3.5-ZS, all the baselines below use candidate entities generated by our fully trained retriever, since it performs better than the entity retrieval methods commonly used in English NED benchmarks.

GPT 3.5-ZS. We use gpt-3.5-turbo with the following prompt:

[Prompt template shown as figure b in the original paper; not reproduced here.]

We manually extract the answer when the response contains irrelevant information. Only exact matches are counted.
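For reference, here is a hedged sketch of how such a zero-shot baseline can be queried and scored with the OpenAI chat API. The prompt wording is a hypothetical stand-in (the paper's actual prompt appears only as a figure), and scoring counts exact string matches as described above.

    # Hedged sketch of a GPT-3.5 zero-shot NED baseline with exact-match
    # scoring. The prompt below is a HYPOTHETICAL placeholder, not the
    # paper's prompt.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def disambiguate(context: str, mention: str, candidates: list[str]) -> str:
        prompt = (
            f"Context: {context}\n"
            f"Mention: {mention}\n"
            f"Candidate entities: {', '.join(candidates)}\n"
            "Answer with the name of the correct entity only."
        )
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        return resp.choices[0].message.content.strip()

    def exact_match_accuracy(preds: list[str], golds: list[str]) -> float:
        # Only exact string matches count as correct, as described above.
        return sum(p == g for p, g in zip(preds, golds)) / len(golds)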

DiaNED. We use Wikipedia2Vec [26] to obtain pretrained Chinese word and entity embeddings. The temporal vector dimension is altered to 2500 to represent the years 500 BC to 2000 AD. Document creation time is replaced by the approximate year described by the document. We use regular expressions to extract year expressions from Chinese Wikipedia.
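Below is a minimal sketch of the year-to-dimension mapping implied above, assuming one dimension per year starting at 500 BC (2500 dimensions in total) and glossing over the year-zero subtlety; the index convention and the sample regex for Chinese year expressions are our assumptions, not DiaNED's code.

    # Sketch: a 2500-dim one-hot temporal vector covering 500 BC .. 2000 AD,
    # one dimension per year. The index convention is our assumption.
    import re
    import numpy as np

    N_DIMS = 2500
    YEAR_OFFSET = 500  # year -500 (500 BC) maps to index 0

    def temporal_vector(year: int) -> np.ndarray:
        """year < 0 means BC; e.g. -500 -> index 0, 1999 -> index 2499."""
        idx = year + YEAR_OFFSET
        if not 0 <= idx < N_DIMS:
            raise ValueError(f"year {year} outside 500 BC .. 2000 AD")
        vec = np.zeros(N_DIMS, dtype=np.float32)
        vec[idx] = 1.0
        return vec

    # A simple (assumed) regex for CE year expressions such as "1368年";
    # real Chinese Wikipedia text needs more patterns (BC years, reign
    # titles, etc.).
    YEAR_RE = re.compile(r"(\d{1,4})年")

    def extract_years(text: str) -> list[int]:
        return [int(m) for m in YEAR_RE.findall(text)]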

GENRE. We use bart-base-chinese (see note 6) to initialize the parameters.
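For reference, loading this checkpoint with Hugging Face Transformers looks like the following; the fnlp Chinese BART checkpoints are documented to use a BERT-style tokenizer. This shows only parameter initialization, not the GENRE training setup.

    # Standard initialization of the checkpoint from note 6.
    from transformers import BertTokenizer, BartForConditionalGeneration

    tokenizer = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")
    model = BartForConditionalGeneration.from_pretrained("fnlp/bart-base-chinese")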

ExtEnD. We use SikuRoBERTa for our re-implementation because there is no pretrained Longformer for ancient Chinese.

LUKE. We use SikuRoBERTa to initialize the parameters.

Ours. See Table 5 for the major hyperparameters of our model. We use SikuRoBERTa to initialize the parameters. See Fig. 3 for a plot of our re-weighting function with the hyperparameters from Table 5; a generic sketch of frequency-based re-weighting follows Table 5.

Fig. 3. The plot of the re-weighting function with our hyperparameters. (Figure not reproduced here.)

Table 5. Major hyperparameters in the experiment. (Table not reproduced here.)
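Since the exact re-weighting function is given only in Fig. 3 and Table 5 (not reproduced here), the following is a hedged sketch of a generic frequency-penalizing weight, w(f) = f^(-alpha), normalized to mean 1 over the training set; both the functional form and the alpha value are illustrative assumptions, not the paper's function.

    # Hypothetical frequency-based sample re-weighting, illustrating the idea
    # of penalizing frequent mention-entity pairs. The actual function and
    # hyperparameters are in Fig. 3 / Table 5; alpha here is made up.
    from collections import Counter

    def reweight(pairs: list[tuple[str, str]], alpha: float = 0.5) -> list[float]:
        """Weight each (mention, entity) pair by freq**-alpha, normalized so
        weights average to 1 over the training set."""
        freq = Counter(pairs)
        raw = [freq[p] ** -alpha for p in pairs]
        scale = len(raw) / sum(raw)
        return [w * scale for w in raw]

    # Example: the pair seen 3 times gets a smaller weight than the singleton.
    pairs = [("丞相", "诸葛亮")] * 3 + [("司马", "司马懿")]
    print(reweight(pairs))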


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Deng, Z., Yang, H., Wang, J. (2024). Diachronic Named Entity Disambiguation for Ancient Chinese Historical Records. In: Luo, B., Cheng, L., Wu, Z.-G., Li, H., Li, C. (eds.) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol. 1965. Springer, Singapore. https://doi.org/10.1007/978-981-99-8145-8_24

  • DOI: https://doi.org/10.1007/978-981-99-8145-8_24

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8144-1

  • Online ISBN: 978-981-99-8145-8

  • eBook Packages: Computer Science (R0)
