research-article

Exploiting Pre-Trained Language Models for Black-Box Attack against Knowledge Graph Embeddings

Published: 29 November 2024

Abstract

Despite emerging research on adversarial attacks against knowledge graph embedding (KGE) models, most existing work focuses on white-box settings. White-box attacks, however, are harder to apply in practice than black-box attacks because they require access to model parameters, which are unlikely to be available. In this article, we propose a novel black-box attack method that requires access only to the knowledge graph data, making it more realistic for real-world attack scenarios. Specifically, we use pre-trained language models (PLMs) to encode the text features of knowledge graphs, an aspect neglected by previous research. We then use these encoded text features to identify the most influential triples, from which we construct the corrupted triples used in the attack. To improve the transferability of the attack, we further propose fine-tuning the PLM by enriching triple embeddings with structural information. Extensive experiments on two knowledge graph datasets demonstrate the effectiveness of the proposed method.
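The abstract does not specify how influence is scored or how corrupted triples are built, so the following is only a minimal sketch of the general idea: rank candidate triples by the similarity of their text embeddings to a target triple, then corrupt a chosen triple by swapping its tail. Everything here is an assumption made for illustration. `embed_text` is a toy hashed bag-of-words encoder standing in for a real PLM, cosine similarity is an assumed influence proxy, and the function names `rank_by_similarity` and `corrupt_tail` are hypothetical rather than taken from the article.

```python
import hashlib
import math
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail) as surface text


def embed_text(text: str, dim: int = 64) -> List[float]:
    """Toy stand-in for a PLM encoder (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors that are already L2-normalised."""
    return sum(x * y for x, y in zip(a, b))


def triple_text(triple: Triple) -> str:
    """Verbalise a triple by joining head, relation and tail with spaces."""
    return " ".join(triple)


def rank_by_similarity(target: Triple, candidates: List[Triple]) -> List[Tuple[float, Triple]]:
    """Score each candidate by text similarity to the target, highest first.

    Similarity is used here as an assumed proxy for influence.
    """
    target_vec = embed_text(triple_text(target))
    scored = [(cosine(target_vec, embed_text(triple_text(c))), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def corrupt_tail(triple: Triple, replacement_tail: str) -> Triple:
    """Build a corrupted triple by replacing the tail entity (hypothetical strategy)."""
    head, relation, _ = triple
    return (head, relation, replacement_tail)


if __name__ == "__main__":
    target = ("Paris", "capital of", "France")
    candidates = [
        ("Paris", "capital of", "France"),
        ("Paris", "located in", "France"),
        ("Tokyo", "capital of", "Japan"),
    ]
    for score, triple in rank_by_similarity(target, candidates):
        print(f"{score:.3f}  {triple}")
    print(corrupt_tail(candidates[1], "Japan"))
```

In a real setting the toy encoder would be replaced by a PLM, and the ranking would use only knowledge graph data, in keeping with the black-box setting described above.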

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 19, Issue 1
January 2025
431 pages
EISSN: 1556-472X
DOI: 10.1145/3703003

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 November 2024
Online AM: 04 September 2024
Accepted: 04 August 2024
Revised: 30 May 2024
Received: 21 March 2024
Published in TKDD Volume 19, Issue 1

Author Tags

  1. Knowledge Graph
  2. Adversarial Attack
  3. Language Model

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
