research-article

Enhancing multimodal knowledge graph representation learning through triple contrastive learning

Authors:

Jinzhuo Wang

IJCAI '24: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence

Article No.: 659, Pages 5963 - 5971

https://doi.org/10.24963/ijcai.2024/659

Published: 03 August 2024

Abstract

Multimodal knowledge graphs incorporate multimodal information rather than pure symbols, which significantly enhances the representation of knowledge graphs and their capacity to understand the world. Despite these advances, existing multimodal fusion techniques still face significant challenges in representing modalities and fully integrating the diverse attributes of entities, particularly when dealing with more than one modality. To address this issue, this article proposes a Knowledge Graph Multimodal Representation Learning (KG-MRI) method. The method uses foundation models to represent different modalities and incorporates a triple contrastive learning model and a dual-phase training strategy to effectively fuse the different modalities with knowledge graph embeddings. We conducted comprehensive comparisons with several knowledge graph embedding methods to validate the effectiveness of our KG-MRI model. Furthermore, validation on a real-world Non-Alcoholic Fatty Liver Disease (NAFLD) cohort demonstrated that the vector representations learned through our methodology have enhanced representational capabilities and can remove batch effects, showing promise for broader applications in complex multimodal environments.
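
The abstract does not spell out how the triple contrastive objective is built, so the following is only an illustrative sketch and not the authors' KG-MRI implementation. It assumes that "triple" can be read as three pairwise InfoNCE terms: one between a structural knowledge graph embedding and each of two modality embeddings, and one between the two modalities. The function names (`info_nce`, `triple_contrastive_loss`), the temperature value, and the toy vectors are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: row i of `anchors` should match row i of `positives`.

    All other rows of `positives` act as in-batch negatives.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[i]
    return total / len(a)


def triple_contrastive_loss(kg, mod_a, mod_b, temperature=0.1):
    """Sum of three pairwise InfoNCE terms (an assumed reading of "triple")."""
    return (
        info_nce(kg, mod_a, temperature)
        + info_nce(kg, mod_b, temperature)
        + info_nce(mod_a, mod_b, temperature)
    )


# Toy entities: two entities, each with a KG embedding and two modality embeddings.
kg = [[1.0, 0.0], [0.0, 1.0]]
mod_a = [[2.0, 0.0], [0.0, 3.0]]
mod_b = [[1.0, 0.1], [0.1, 1.0]]

aligned = triple_contrastive_loss(kg, mod_a, mod_b)
misaligned = triple_contrastive_loss(kg, mod_a[::-1], mod_b)
print(aligned < misaligned)
```

In this toy setup the loss is small when the rows of all three views describe the same entity, and it grows when one view is permuted, which is the behaviour a fusion objective of this kind relies on. How such a loss would be scheduled across the two training phases mentioned in the abstract is not specified there, so it is left out of the sketch.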

Index Terms

Enhancing multimodal knowledge graph representation learning through triple contrastive learning
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations
    2. Natural language processing
      1. Information extraction
  2. Machine learning
    1. Learning paradigms
      1. Multi-task learning
    2. Machine learning approaches
      1. Learning latent representations
      2. Neural networks

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

IJCAI '24: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence

August 2024

8859 pages

ISBN:978-1-956792-04-1

Editor:
Kate Larson

Copyright © 2024 International Joint Conferences on Artificial Intelligence.

Sponsors

International Joint Conferences on Artificial Intelligence (IJCAI)

Publisher

Unknown publishers

Qualifiers

Research-article
Research
Refereed limited
