research-article

Aspect-Enhanced Explainable Recommendation with Multi-modal Contrastive Learning

Published: 02 January 2025 Publication History

Abstract

Explainable recommender systems (ERS) aim to enhance users’ trust by offering personalized recommendations with transparent explanations. This transparency gives users a clear understanding of the rationale behind each recommendation, fostering confidence in the system’s outputs. The explanations are generally presented in natural language, a familiar and intuitive form that makes them accessible to users. Recently, there has been an increasing focus on leveraging reviews as a rich source of information for both modeling user-item preferences and generating textual explanations, two tasks that can be performed simultaneously in a multi-task framework. Despite the progress made in these review-based recommender systems, the integration of implicit feedback derived from user-item interactions with user-written text reviews has yet to be fully explored. To fill this gap, we propose a model named SERMON (Aspect-enhanced Explainable Recommendation with Multi-modal Contrast Learning). Our model applies multimodal contrastive learning to facilitate reciprocal learning across the two modalities, thereby enhancing the modeling of user preferences. Moreover, our model incorporates aspect information extracted from the reviews, which provides two significant enhancements. First, the quality of the generated explanations is improved by incorporating aspect characteristics into the explanations generated by a pre-trained model with controlled text generation ability. Second, the commonly used user-item interactions are transformed into user-item-aspect interactions, which we refer to as interaction triples, resulting in a more nuanced representation of user preference. To validate the effectiveness of our model, we conduct extensive experiments on three real-world datasets. The experimental results show that our model outperforms state-of-the-art baselines, with a 2.0% improvement in prediction accuracy and a substantial 24.5% enhancement in explanation quality on the TripAdvisor dataset.
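The abstract does not specify the contrastive objective, so the following is only a minimal sketch of how reciprocal learning across the two modalities (interaction embeddings and review-text embeddings) is commonly set up, using a symmetric InfoNCE-style loss. The function names, the temperature value, and the toy embeddings are all assumptions made for illustration; they are not taken from SERMON.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(anchors, candidates, temperature=0.1):
    """Mean cross-entropy of selecting candidates[i] as the match for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [
            sum(x * y for x, y in zip(anchor, cand)) / temperature
            for cand in candidates
        ]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_z - logits[i]
    return total / len(anchors)


def symmetric_contrastive_loss(interaction_emb, review_emb, temperature=0.1):
    """Average of the interaction-to-review and review-to-interaction losses.

    Row i of each input is assumed to describe the same user-item pair
    (hypothetical setup; the pairing scheme in the paper may differ).
    """
    a = [normalize(v) for v in interaction_emb]
    b = [normalize(v) for v in review_emb]
    return 0.5 * (info_nce(a, b, temperature) + info_nce(b, a, temperature))


# Toy embeddings (made up): three user-item pairs, three dimensions.
interaction = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
review_aligned = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
review_shuffled = review_aligned[1:] + review_aligned[:1]

aligned_loss = symmetric_contrastive_loss(interaction, review_aligned)
shuffled_loss = symmetric_contrastive_loss(interaction, review_shuffled)
print(aligned_loss < shuffled_loss)
```

Averaging the two directions is what makes the learning reciprocal in this sketch: each modality is pulled toward its counterpart for the same pair and pushed away from the other pairs in the batch.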

Cited By

  • (2025) Enhancing Personalized Explainable Recommendations with Transformer Architecture and Feature Handling. Electronics 14:5 (998). DOI: 10.3390/electronics14050998. Online publication date: 28-Feb-2025

Information

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 16, Issue 1
February 2025
592 pages
EISSN: 2157-6912
DOI: 10.1145/3703021
  • Editor: Huan Liu

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 January 2025
Online AM: 19 June 2024
Accepted: 02 June 2024
Revised: 09 May 2024
Received: 30 June 2023
Published in TIST Volume 16, Issue 1

Author Tags

  1. Recommender systems
  2. review-based recommendation
  3. multimodal contrastive learning

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Guangdong Basic and Applied Basic Research Foundation
  • Guangdong Provincial Key Laboratory of Popular High Performance Computers
  • Guangdong Province Engineering Center of China-made High Performance Data Computing System, Shenzhen Fundamental Research-General Project

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 608
  • Downloads (Last 6 weeks): 167
Reflects downloads up to 05 Mar 2025
