DOI: 10.1145/3626772.3658596
research-article

Multimodality Invariant Learning for Multimedia-Based New Item Recommendation

Published: 11 July 2024

Abstract

Multimedia-based recommendation provides personalized item suggestions by learning users' content preferences. With the proliferation of digital devices and apps, huge numbers of new items are created over time, and providing recommendations for them quickly at inference time is challenging. Worse, real-world items exhibit varying degrees of modality missing (e.g., many short videos are uploaded without text descriptions). Although many efforts have been devoted to multimedia-based recommendation, existing methods either cannot handle new multimedia items or assume modality completeness in the modeling process.
In this paper, we highlight the necessity of tackling the modality missing issue for new item recommendation. We argue that users' inherent content preference is stable and should be kept invariant across arbitrary modality missing environments. We therefore approach the problem from the novel perspective of invariant learning. The challenge is how to construct environments from finite user behavior training data so that the model generalizes to any modality missing pattern. To tackle this issue, we propose a novel Multimodality Invariant Learning reCommendation (a.k.a. MILK) framework. Specifically, MILK first designs a cross-modality alignment module to keep semantic consistency from pretrained multimedia item features. It then constructs multi-modal heterogeneous environments with cyclic mixup to augment the training data, mimicking any modality missing for invariant user preference learning. Extensive experiments on three real datasets verify the superiority of our proposed framework. The code is available at https://github.com/HaoyueBai98/MILK.
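
To make the idea of mixup-built environments more concrete, here is a minimal sketch in plain Python. It is not the MILK implementation: the abstract does not specify the cyclic mixup or the invariance loss, so this sketch swaps in ordinary mixup (a convex combination of two modality embeddings) and a variance-of-risks penalty in the style of risk extrapolation. All function names, the fixed grid of mixing weights, the squared-error loss, and the `beta` weight are assumptions made for illustration, and the two modality embeddings are assumed to be already aligned in a shared space.

```python
from statistics import mean, pvariance


def mix_modalities(visual, textual, lam):
    """Plain mixup: convex combination of two aligned modality embeddings."""
    return [lam * v + (1.0 - lam) * t for v, t in zip(visual, textual)]


def build_environments(visual, textual, lams=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return one fused item embedding per mixing weight (hypothetical grid).

    lam = 1.0 keeps only the visual embedding (mimics a missing text modality);
    lam = 0.0 keeps only the textual embedding (mimics a missing image).
    """
    return {lam: mix_modalities(visual, textual, lam) for lam in lams}


def score(user, item):
    """Dot-product preference score between a user and an item embedding."""
    return sum(u * i for u, i in zip(user, item))


def squared_error(user, item, label):
    """Per-environment risk; squared error is an assumption of this sketch."""
    return (score(user, item) - label) ** 2


def invariance_objective(user, envs, label, beta=1.0):
    """Mean risk plus beta times the variance of risks across environments.

    The variance term is zero only when every environment yields the same
    risk, which is the sense of "invariant" used in this sketch.
    """
    risks = [squared_error(user, item, label) for item in envs.values()]
    return mean(risks) + beta * pvariance(risks)


# Toy item with orthogonal visual and textual embeddings.
envs = build_environments([1.0, 0.0], [0.0, 1.0])
balanced_user = [1.0, 1.0]   # relies equally on both modalities
visual_user = [1.0, 0.0]     # relies on the visual modality only
print(invariance_objective(balanced_user, envs, label=1.0))
print(invariance_objective(visual_user, envs, label=1.0))
```

In this toy setting, the user vector that depends on a single modality is penalized because its risk changes as that modality fades out, while the balanced vector scores the item identically in every environment.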

    Published In

    SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2024
    3164 pages
    ISBN:9798400704314
    DOI:10.1145/3626772

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. invariant learning
    2. modality missing
    3. multimedia-based recommendation

    Qualifiers

    • Research-article

    Conference

    SIGIR 2024

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
