research-article

Open access

Just Accepted

Domain Ontology-Driven Knowledge Graph Generation from Text

Authors:

Shaoxiong Zhan,

Wolfgang Mayer,

Zaiwen Feng

ACM Transactions on Probabilistic Machine Learning

Accepted on 09 November 2024

https://doi.org/10.1145/3708478

Online AM: 18 December 2024

Abstract

A knowledge graph is a unified, standardized representation of information extracted from text. In knowledge extraction and representation research, named entity recognition and relation extraction provide effective building blocks for knowledge graph generation. The challenge, however, lies in extracting domain-specific knowledge from rich, general-purpose text corpora and generating the corresponding domain knowledge graphs to support domain-specific reasoning, question answering, and decision-making. A hierarchical domain knowledge representation model (i.e., a domain ontology) offers a solution to this problem. We therefore propose an end-to-end approach to domain knowledge graph generation from text, based on domain ontology embedding and pre-trained language models, that comprises a domain node recognition phase and a domain relation extraction phase. We evaluated our domain ontology-driven model on the Wikidata-TekGen and DBpedia-WebNLG datasets. The results indicate that our approach, which uses pre-trained language models with fewer parameters than the baseline models, contributes significantly to domain knowledge graph generation without requiring prompts.
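
To make the two phases concrete, the following is a minimal illustrative sketch of how an ontology can constrain triple generation. It is not the authors' method: the abstract describes ontology embeddings combined with pre-trained language models, whereas this sketch replaces both with hand-written lookups. The toy ontology (`SUBCLASS_OF`, `RELATIONS`), the function names, and the sample inputs are all assumptions made for illustration. The sketch assumes node types have already been assigned (standing in for domain node recognition) and keeps only candidate triples whose subject and object types satisfy the relation's domain and range (standing in for domain relation extraction).

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy ontology: a class hierarchy plus relation signatures.
SUBCLASS_OF: Dict[str, str] = {"Film": "CreativeWork", "Director": "Person"}
RELATIONS: Dict[str, Tuple[str, str]] = {"directedBy": ("CreativeWork", "Person")}


def is_a(cls: Optional[str], ancestor: str) -> bool:
    """Return True if cls equals ancestor or is one of its descendants."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)  # walk one level up the hierarchy
    return False


def filter_triples(candidates: List[Triple], node_types: Dict[str, str]) -> List[Triple]:
    """Keep candidate triples that respect the ontology's domain and range."""
    kept = []
    for subj, rel, obj in candidates:
        signature = RELATIONS.get(rel)
        if signature is None:
            continue  # relation is not defined in the domain ontology
        domain, range_ = signature
        if is_a(node_types.get(subj), domain) and is_a(node_types.get(obj), range_):
            kept.append((subj, rel, obj))
    return kept


# Node types as a node recognition step might assign them (assumed input).
node_types = {"Inception": "Film", "Christopher Nolan": "Director"}
candidates = [
    ("Inception", "directedBy", "Christopher Nolan"),
    ("Christopher Nolan", "directedBy", "Inception"),  # violates domain/range
    ("Inception", "starring", "Christopher Nolan"),    # relation not in ontology
]
valid = filter_triples(candidates, node_types)
```

In this sketch only the first candidate survives, because `Film` is a subclass of `CreativeWork` and `Director` is a subclass of `Person`. A learned model would score candidates rather than apply a hard filter, but the ontology plays the same role of restricting which nodes and relations are admissible.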

Index Terms

Domain Ontology-Driven Knowledge Graph Generation from Text
1. Computing methodologies
  1. Artificial intelligence
2. Information systems
  1. Information retrieval

Published In

ACM Transactions on Probabilistic Machine Learning (Just Accepted)

EISSN: 2836-8924

Copyright © 2024 held by the owner/author(s).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Online AM: 18 December 2024

Accepted: 09 November 2024

Revised: 16 August 2024

Received: 09 March 2024
