Abstract
Named entity disambiguation (NED) is a fundamental task in NLP. Although numerous NED methods have been proposed in recent years, they ignore the fact that many real-world corpora, such as historical documents and news articles, are diachronic by nature and vary greatly over time. As a consequence, most current methods fail to fully exploit the temporal information in corpora and knowledge bases. To address this issue, we propose a novel model that integrates temporal features into a pretrained language model to make it time-aware, together with a new sample re-weighting scheme for diachronic NED that penalizes highly frequent mention-entity pairs to improve performance on rare and unseen entities. We also present WikiCMAG and WikiSM, two new NED datasets annotated on ancient Chinese historical records. Experiments show that our model outperforms existing methods by large margins, demonstrating the effectiveness of integrating diachronic information and of our re-weighting scheme. Our model also achieves competitive performance in out-of-distribution (OOD) settings. WikiSM is publicly available at https://github.com/PKUDHC/WikiSM.
Notes
- 1.
- 2. In real scenarios, the time a piece of text describes is easy to obtain, since most texts have known sources and metadata that help with time identification.
- 3. If more than one entity has the same minimum edit distance, one of them is chosen at random (see the sketch after this list).
- 4. The name comes from that of their public GitHub repository.
- 5. Here, k-shot means that the correct entity occurs k times in the training data.
- 6.
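Note 3 describes how a single candidate is chosen when several entities are equally close to the mention. A minimal sketch of this selection rule, assuming a plain character-level Levenshtein distance (the function names are ours, not from the paper's code):

```python
import random

def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution (0 if chars match)
        prev = curr
    return prev[-1]

def pick_candidate(mention: str, candidates: list) -> str:
    """Return the candidate entity closest to the mention; ties are broken at random."""
    scored = [(edit_distance(mention, c), c) for c in candidates]
    best = min(d for d, _ in scored)
    return random.choice([c for d, c in scored if d == best])
```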
References
Agarwal, D., Angell, R., Monath, N., McCallum, A.: Entity linking via explicit mention-mention coreference modeling. In: Proceedings of NAACL 2022 (2022)
Agarwal, P., Strötgen, J., Del Corro, L., Hoffart, J., Weikum, G.: DiaNED: time-aware named entity disambiguation for diachronic corpora. In: Proceedings of ACL 2018 (Volume 2: Short Papers), pp. 686–693 (2018)
Angell, R., Monath, N., Mohan, S., Yadav, N., McCallum, A.: Clustering-based inference for biomedical entity linking. In: Proceedings of NAACL 2021 (2021)
Barba, E., Procopio, L., Navigli, R.: ExtEnD: extractive entity disambiguation. In: Proceedings of ACL 2022 (Volume 1: Long Papers), Dublin, Ireland (2022)
Beigman Klebanov, B., Leong, C.W., Flor, M.: Supervised word-level metaphor detection: experiments with concreteness and reweighting of examples. In: Proceedings of the Third Workshop on Metaphor in NLP, pp. 11–20 (Jun 2015)
Byrd, J., Lipton, Z.: What is the effect of importance weighting in deep learning? In: Proceedings of ICML 2019, vol. 97, pp. 872–881. PMLR (09–15 Jun 2019)
Chen, S., Wang, J., Jiang, F., Lin, C.Y.: Improving entity linking by modeling latent entity type information. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 7529–7537 (2020)
Crestani, F., Lalmas, M., Van Rijsbergen, C.J., Campbell, I.: “Is this document relevant?... Probably”: a survey of probabilistic models in information retrieval. ACM Comput. Surv. 30(4), 528–552 (1998)
De Cao, N., Izacard, G., Riedel, S., Petroni, F.: Autoregressive entity retrieval. In: ICLR 2021. OpenReview.net (2021)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL 2019, pp. 4171–4186 (2019)
Gillick, D., Kulkarni, S., Lansing, L., Presta, A., Baldridge, J., Ie, E., Garcia-Olano, D.: Learning dense representations for entity retrieval. In: Proceedings of CoNLL 2019, pp. 528–537 (Nov 2019)
Hoffart, J., et al.: Robust disambiguation of named entities in text. In: Proceedings of EMNLP 2011, pp. 782–792 (Jul 2011)
Hu, X., Wu, X., Shu, Y., Qu, Y.: Logical form generation via multi-task learning for complex question answering over knowledge bases. In: Proceedings of the 29th International Conference on Computational Linguistics, pp. 1687–1696 (Oct 2022)
van Hulst, J.M., Hasibi, F., Dercksen, K., Balog, K., de Vries, A.P.: REL: an entity linker standing on the shoulders of giants. In: Proceedings of SIGIR 2020 (2020)
Khalid, M.A., Jijkoun, V., de Rijke, M.: The impact of named entity normalization on information retrieval for question answering. In: Advances in Information Retrieval, pp. 705–710. Springer, Berlin, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78646-7_83
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) ICLR 2015 (2015). http://arxiv.org/abs/1412.6980
Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv:1907.11692 [cs] (Jul 2019)
Logeswaran, L., Chang, M.W., Lee, K., Toutanova, K., Devlin, J., Lee, H.: Zero-shot entity linking by reading entity descriptions. In: Proceedings of ACL 2019, pp. 3449–3460. Association for Computational Linguistics, Florence, Italy (Jul 2019)
Martinez-Rodriguez, J.L., Lopez-Arevalo, I., Rios-Alvarado, A.B.: OpenIE-based approach for knowledge graph construction from text. Expert Syst. Appl. 113, 339–355 (2018)
Rijhwani, S., Preotiuc-Pietro, D.: Temporally-informed analysis of named entity recognition. In: Proceedings of ACL 2020, pp. 7605–7617 (Jul 2020)
Su, Y., Zhang, H., Song, Y., Zhang, T.: Rare and zero-shot word sense disambiguation using Z-reweighting. In: Proceedings of ACL 2022 (Volume 1: Long Papers), pp. 4713–4723 (May 2022)
Wang, D., Liu, C., Zhu, Z., Liu, J., Hu, H., Shen, S., Li, B.: Construction and application of pre-trained models of Siku Quanshu in orientation to digital humanities. Library Tribune 42(06) (2022)
Wang, J., Jatowt, A., Yoshikawa, M.: TimeBERT: extending pre-trained language representations with temporal information. arXiv: 2204.13032 (2022)
Wang, S., Zhuang, S., Zuccon, G.: BERT-based dense retrievers require interpolation with BM25 for effective passage retrieval. In: Proceedings of SIGIR 2021 (2021)
Wu, L., Petroni, F., Josifoski, M., Riedel, S., Zettlemoyer, L.: Scalable zero-shot entity linking with dense entity retrieval. In: Proceedings of EMNLP 2020 (2020)
Yamada, I., et al.: Wikipedia2Vec: an efficient toolkit for learning and visualizing the embeddings of words and entities from Wikipedia. In: Proceedings of EMNLP 2020: System Demonstrations, pp. 23–30. Association for Computational Linguistics (2020)
Yamada, I., Washio, K., Shindo, H., Matsumoto, Y.: Global entity disambiguation with BERT. In: Proceedings of NAACL 2022, pp. 3264–3271 (2022)
Zaporojets, K., Kaffee, L.A., Deleu, J., Demeester, T., Develder, C., Augenstein, I.: TempEL: linking dynamically evolving and newly emerging entities. In: NeurIPS 2022 Datasets and Benchmarks Track (2022)
Zhang, W., Hua, W., Stratos, K.: EntQA: entity linking as question answering. In: ICLR 2022. OpenReview.net (2022)
Acknowledgements
This research is supported by the NSFC project “the Construction of the Knowledge Graph for the History of Chinese Confucianism” (Grant No. 72010107003).
Appendix A. Implementation Details
Details of our implementation are listed below. Unless otherwise specified, we adopt the same hyperparameters as in the original works for the baseline methods. Apart from GPT 3.5-ZS, all baselines below use candidate entities generated by our fully trained retriever, since it performs better than the entity retrieval methods commonly used in English NED benchmarks.
GPT 3.5-ZS. We use gpt-3.5-turbo with the following prompt:
We manually extract the answer when the response contains irrelevant information; only exact matches are counted as correct.
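Since the exact prompt is not reproduced here, the snippet below is only a rough sketch of how such a zero-shot query can be issued with the legacy (pre-1.0) OpenAI Python client; the prompt wording, candidate formatting, and function name are illustrative assumptions, not the prompt used in our experiments.

```python
# Illustrative sketch only: not the exact GPT 3.5-ZS prompt used in the paper.
import openai

def zero_shot_ned(mention: str, context: str, candidates: list) -> str:
    """Ask gpt-3.5-turbo to pick the entity that `mention` refers to (hypothetical helper)."""
    prompt = (
        f"Context: {context}\n"
        f"Mention: {mention}\n"
        f"Candidate entities: {', '.join(candidates)}\n"
        "Answer with the name of the correct entity only."
    )
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for evaluation
    )
    return response["choices"][0]["message"]["content"].strip()
```

Responses containing extra text are then mapped to a candidate name by hand, as described above.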
diaNED. We use Wikipedia2Vec [26] to obtain pretrained Chinese word and entity embeddings. The temporal vector dimension is set to 2500 so as to cover the years 500 BC to 2000 AD, and the document creation time is replaced by the approximate year described by the document. We use regular expressions to extract year expressions from Chinese Wikipedia.
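As a rough sketch of the temporal representation and year extraction described above (one dimension per year from 500 BC to 2000 AD, i.e. 2500 dimensions), under the simplifying assumptions that BC years are stored as negative integers and the temporal vector is one-hot; the regex and function names are ours:

```python
import re
import numpy as np

START_YEAR, END_YEAR = -500, 2000   # 500 BC .. 2000 AD -> 2500 dimensions

# Simple pattern for year expressions in Chinese Wikipedia, e.g. "前221年" (221 BC), "1066年" (1066 AD).
YEAR_PATTERN = re.compile(r"(前)?(\d{1,4})年")

def parse_year(expr: str) -> int:
    """Map a Chinese year expression to a signed year (negative = BC)."""
    bc, digits = YEAR_PATTERN.search(expr).groups()
    return -int(digits) if bc else int(digits)

def temporal_vector(year: int) -> np.ndarray:
    """One-hot temporal vector over 500 BC .. 2000 AD; year y maps to index y + 500."""
    vec = np.zeros(END_YEAR - START_YEAR)
    if START_YEAR <= year < END_YEAR:
        vec[year - START_YEAR] = 1.0
    return vec

assert temporal_vector(parse_year("前500年")).argmax() == 0   # 500 BC -> first dimension
```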
GENRE. We use bart-base-chinese (see footnote 6) to initialize the parameters.
ExtEnD. We use SikuRoBERTa for re-implementation because there is no Longformer pretrained on ancient Chinese.
LUKE. We use SikuRoBERTa to initialize the parameters.
Ours. See Table 5 for the major hyperparameters of our model. We use SikuRoBERTa to initialize the parameters. See Fig. 3 for a plot of our re-weighting function with the hyperparameters from Table 5.
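The actual re-weighting function and its hyperparameters are given in Table 5 and plotted in Fig. 3, and are not reproduced in this appendix text. Purely to illustrate the idea of penalizing highly frequent mention-entity pairs, a generic inverse-power weighting might look like the sketch below; the exponent, weight floor, and names are placeholders rather than our actual settings.

```python
from collections import Counter

def sample_weight(pair_count: int, alpha: float = 0.5, floor: float = 0.1) -> float:
    """Down-weight training samples whose mention-entity pair is frequent (illustrative only)."""
    return max(floor, pair_count ** -alpha)

# A pair seen 100 times gets weight 0.1; a pair seen once keeps weight 1.0.
pair_counts = Counter({("孔子", "Confucius"): 100, ("子思", "Zisi"): 1})
weights = {pair: sample_weight(count) for pair, count in pair_counts.items()}
print(weights)
```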
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.