Abstract
Named entity disambiguation (NED) is a fundamental task in NLP. Although numerous NED methods have been proposed in recent years, they ignore the fact that many real-world corpora, such as historical documents and news articles, are diachronic by nature and vary greatly over time. As a consequence, most current methods fail to fully exploit the temporal information in corpora and knowledge bases. To address this issue, we propose a novel model that integrates temporal features into a pretrained language model to make it time-aware, together with a new sample re-weighting scheme for diachronic NED that penalizes highly frequent mention-entity pairs to improve performance on rare and unseen entities. We also present WikiCMAG and WikiSM, two new NED datasets annotated on ancient Chinese historical records. Experiments show that our model outperforms existing methods by large margins, demonstrating the effectiveness of integrating diachronic information and of our re-weighting scheme. Our model also achieves competitive performance in out-of-distribution (OOD) settings. WikiSM is publicly available at https://github.com/PKUDHC/WikiSM.
Notes
- 1.
- 2. In real scenarios, the time a piece of text describes is easy to obtain, since most texts have known sources and metadata that help with time identification.
- 3. If more than one entity has the same minimum edit distance, one of them is chosen at random (see the sketch after this list).
- 4. The name comes from that of their public GitHub repository.
- 5. Here, k-shot means that the correct entity occurs k times in the training data.
- 6.
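Note 3 describes how a single candidate is chosen when several entities are equally close to the mention. A minimal sketch of this selection rule, assuming a plain character-level Levenshtein distance (the function names are ours, not from the paper's code):

```python
import random

def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution (0 if chars match)
        prev = curr
    return prev[-1]

def pick_candidate(mention: str, candidates: list) -> str:
    """Return the candidate entity closest to the mention; ties are broken at random."""
    scored = [(edit_distance(mention, c), c) for c in candidates]
    best = min(d for d, _ in scored)
    return random.choice([c for d, c in scored if d == best])
```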
References
Agarwal, D., Angell, R., Monath, N., McCallum, A.: Entity linking via explicit mention-mention coreference modeling. In: Proceedings of NAACL 2022 (2022)
Agarwal, P., Strötgen, J., Del Corro, L., Hoffart, J., Weikum, G.: DiaNED: time-aware named entity disambiguation for diachronic corpora. In: Proceedings of ACL 2018 (Volume 2: Short Papers), pp. 686–693 (2018)
Angell, R., Monath, N., Mohan, S., Yadav, N., McCallum, A.: Clustering-based inference for biomedical entity linking. In: Proceedings of NAACL 2021 (2021)
Barba, E., Procopio, L., Navigli, R.: ExtEnD: extractive entity disambiguation. In: Proceedings of ACL 2022 (Volume 1: Long Papers), Dublin, Ireland (2022)
Beigman Klebanov, B., Leong, C.W., Flor, M.: Supervised word-level metaphor detection: experiments with concreteness and reweighting of examples. In: Proceedings of the Third Workshop on Metaphor in NLP, pp. 11–20 (Jun 2015)
Byrd, J., Lipton, Z.: What is the effect of importance weighting in deep learning? In: Proceedings of ICML 2019, vol. 97, pp. 872–881. PMLR (09–15 Jun 2019)
Chen, S., Wang, J., Jiang, F., Lin, C.Y.: Improving entity linking by modeling latent entity type information. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 7529–7537 (2020)
Crestani, F., Lalmas, M., Van Rijsbergen, C.J., Campbell, I.: “Is this document relevant?... Probably”: a survey of probabilistic models in information retrieval. ACM Comput. Surv. 30(4), 528–552 (1998)
De Cao, N., Izacard, G., Riedel, S., Petroni, F.: Autoregressive entity retrieval. In: ICLR 2021. OpenReview.net (2021)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL 2019, pp. 4171–4186 (2019)
Gillick, D., Kulkarni, S., Lansing, L., Presta, A., Baldridge, J., Ie, E., Garcia-Olano, D.: Learning dense representations for entity retrieval. In: Proceedings of CoNLL 2019, pp. 528–537 (Nov 2019)
Hoffart, J., et al.: Robust disambiguation of named entities in text. In: Proceedings of EMNLP 2011, pp. 782–792 (Jul 2011)
Hu, X., Wu, X., Shu, Y., Qu, Y.: Logical form generation via multi-task learning for complex question answering over knowledge bases. In: Proceedings of the 29th International Conference on Computational Linguistics, pp. 1687–1696 (Oct 2022)
van Hulst, J.M., Hasibi, F., Dercksen, K., Balog, K., de Vries, A.P.: REL: an entity linker standing on the shoulders of giants. In: Proceedings of SIGIR 2020 (2020)
Khalid, M.A., Jijkoun, V., de Rijke, M.: The impact of named entity normalization on information retrieval for question answering. In: Advances in Information Retrieval, pp. 705–710. Springer, Berlin, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78646-7_83
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) ICLR 2015 (2015). http://arxiv.org/abs/1412.6980
Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv:1907.11692 [cs] (Jul 2019)
Logeswaran, L., Chang, M.W., Lee, K., Toutanova, K., Devlin, J., Lee, H.: Zero-shot entity linking by reading entity descriptions. In: Proceedings of ACL 2019, pp. 3449–3460. Association for Computational Linguistics, Florence, Italy (Jul 2019)
Martinez-Rodriguez, J.L., Lopez-Arevalo, I., Rios-Alvarado, A.B.: OpenIE-based approach for knowledge graph construction from text. Expert Syst. Appl. 113, 339–355 (2018)
Rijhwani, S., Preotiuc-Pietro, D.: Temporally-informed analysis of named entity recognition. In: Proceedings of ACL 2020, pp. 7605–7617 (Jul 2020)
Su, Y., Zhang, H., Song, Y., Zhang, T.: Rare and zero-shot word sense disambiguation using Z-reweighting. In: Proceedings of ACL 2022 (Volume 1: Long Papers), pp. 4713–4723 (May 2022)
Wang, D., Liu, C., Zhu, Z., Liu, J., Hu, H., Shen, S., Li, B.: Construction and application of pre-trained models of Siku Quanshu in orientation to digital humanities. Library Tribune 42(06) (2022)
Wang, J., Jatowt, A., Yoshikawa, M.: TimeBERT: extending pre-trained language representations with temporal information. arXiv: 2204.13032 (2022)
Wang, S., Zhuang, S., Zuccon, G.: BERT-based dense retrievers require interpolation with BM25 for effective passage retrieval. In: Proceedings of SIGIR 2021 (2021)
Wu, L., Petroni, F., Josifoski, M., Riedel, S., Zettlemoyer, L.: Scalable zero-shot entity linking with dense entity retrieval. In: Proceedings of EMNLP 2020 (2020)
Yamada, I., et al.: Wikipedia2Vec: an efficient toolkit for learning and visualizing the embeddings of words and entities from Wikipedia. In: Proceedings of EMNLP 2020: System Demonstrations, pp. 23–30. Association for Computational Linguistics (2020)
Yamada, I., Washio, K., Shindo, H., Matsumoto, Y.: Global entity disambiguation with BERT. In: Proceedings of NAACL 2022, pp. 3264–3271 (2022)
Zaporojets, K., Kaffee, L.A., Deleu, J., Demeester, T., Develder, C., Augenstein, I.: TempEL: linking dynamically evolving and newly emerging entities. In: NeurIPS 2022 Datasets and Benchmarks Track (2022)
Zhang, W., Hua, W., Stratos, K.: EntQA: entity linking as question answering. In: ICLR 2022. OpenReview.net (2022)
Acknowledgements
This research is supported by the NSFC project “the Construction of the Knowledge Graph for the History of Chinese Confucianism” (Grant No. 72010107003).
Appendix A. Implementation Details
Details of our implementation are listed below. Unless otherwise specified, we adopt the same hyperparameters as in the original works for the baseline methods. Apart from GPT 3.5-ZS, all baselines below use candidate entities generated by our fully trained retriever, since it performs better than the entity retrieval methods commonly used in English NED benchmarks.
GPT 3.5-ZS. We use gpt-3.5-turbo with the following prompt:
We manually extract the answer when the response contains irrelevant information; only exact matches are counted as correct.
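Since the exact prompt is not reproduced here, the snippet below is only a rough sketch of how such a zero-shot query can be issued with the legacy (pre-1.0) OpenAI Python client; the prompt wording, candidate formatting, and function name are illustrative assumptions, not the prompt used in our experiments.

```python
# Illustrative sketch only: not the exact GPT 3.5-ZS prompt used in the paper.
import openai

def zero_shot_ned(mention: str, context: str, candidates: list) -> str:
    """Ask gpt-3.5-turbo to pick the entity that `mention` refers to (hypothetical helper)."""
    prompt = (
        f"Context: {context}\n"
        f"Mention: {mention}\n"
        f"Candidate entities: {', '.join(candidates)}\n"
        "Answer with the name of the correct entity only."
    )
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for evaluation
    )
    return response["choices"][0]["message"]["content"].strip()
```

Responses containing extra text are then mapped to a candidate name by hand, as described above.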
diaNED. We use Wikipedia2Vec [26] to obtain pretrained Chinese word and entity embeddings. The temporal vector dimension is set to 2500 so as to cover the years 500 BC to 2000 AD, and the document creation time is replaced by the approximate year described by the document. We use regular expressions to extract year expressions from Chinese Wikipedia.
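As a rough sketch of the temporal representation and year extraction described above (one dimension per year from 500 BC to 2000 AD, i.e. 2500 dimensions), under the simplifying assumptions that BC years are stored as negative integers and the temporal vector is one-hot; the regex and function names are ours:

```python
import re
import numpy as np

START_YEAR, END_YEAR = -500, 2000   # 500 BC .. 2000 AD -> 2500 dimensions

# Simple pattern for year expressions in Chinese Wikipedia, e.g. "前221年" (221 BC), "1066年" (1066 AD).
YEAR_PATTERN = re.compile(r"(前)?(\d{1,4})年")

def parse_year(expr: str) -> int:
    """Map a Chinese year expression to a signed year (negative = BC)."""
    bc, digits = YEAR_PATTERN.search(expr).groups()
    return -int(digits) if bc else int(digits)

def temporal_vector(year: int) -> np.ndarray:
    """One-hot temporal vector over 500 BC .. 2000 AD; year y maps to index y + 500."""
    vec = np.zeros(END_YEAR - START_YEAR)
    if START_YEAR <= year < END_YEAR:
        vec[year - START_YEAR] = 1.0
    return vec

assert temporal_vector(parse_year("前500年")).argmax() == 0   # 500 BC -> first dimension
```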
GENRE. We use bart-base-chinese (see footnote 6) to initialize the parameters.
ExtEnD. We use SikuRoBERTa for re-implementation because there is no Longformer pretrained on ancient Chinese.
LUKE. We use SikuRoBERTa to initialize the parameters.
Ours. See Table 5 for the major hyperparameters of our model. We use SikuRoBERTa to initialize the parameters. See Fig. 3 for a plot of our re-weighting function with the hyperparameters from Table 5.
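The actual re-weighting function and its hyperparameters are given in Table 5 and plotted in Fig. 3, and are not reproduced in this appendix text. Purely to illustrate the idea of penalizing highly frequent mention-entity pairs, a generic inverse-power weighting might look like the sketch below; the exponent, weight floor, and names are placeholders rather than our actual settings.

```python
from collections import Counter

def sample_weight(pair_count: int, alpha: float = 0.5, floor: float = 0.1) -> float:
    """Down-weight training samples whose mention-entity pair is frequent (illustrative only)."""
    return max(floor, pair_count ** -alpha)

# A pair seen 100 times gets weight 0.1; a pair seen once keeps weight 1.0.
pair_counts = Counter({("孔子", "Confucius"): 100, ("子思", "Zisi"): 1})
weights = {pair: sample_weight(count) for pair, count in pair_counts.items()}
print(weights)
```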
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.