DOI: 10.1145/3240508.3240617
Research article

Temporal Hierarchical Attention at Category- and Item-Level for Micro-Video Click-Through Prediction

Published: 15 October 2018

Abstract

Micro-video sharing has gained great popularity in recent years, which calls for effective recommendation algorithms to help users find micro-videos that interest them. Compared with traditional online videos (e.g., YouTube), micro-videos contributed by grass-roots users and shot on smartphones are much shorter (tens of seconds) and more often lack tags or descriptive text, making micro-video recommendation a challenging task. In this paper, we investigate how to model a user's historical behaviors so as to predict the user's click-through of micro-videos. Inspired by recent deep network-based methods, we propose a Temporal Hierarchical Attention at Category- and Item-Level (THACIL) network for user behavior modeling. First, we use temporal windows to capture the short-term dynamics of user interests. Second, we leverage a category-level attention mechanism to characterize a user's diverse interests, as well as an item-level attention mechanism for fine-grained profiling of user interests. Third, we adopt forward multi-head self-attention to capture the long-term correlation within user behaviors. The proposed THACIL network was tested on MicroVideo-1.7M, a new dataset of 1.7 million micro-videos drawn from real data of a micro-video sharing service in China. Experimental results demonstrate the effectiveness of the proposed method in comparison with state-of-the-art solutions.
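
To make the idea of attention at two levels concrete, here is a minimal sketch in plain Python. It is not the THACIL architecture: the abstract does not specify the scoring function, the embeddings, or how temporal windows and self-attention are combined. The sketch assumes scaled dot-product scoring against a candidate-video embedding, and the names `attention_pool`, `hierarchical_interest`, and the toy `history` data are hypothetical. Items are first pooled within each category (item level), then the category summaries are pooled again (category level).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors, query):
    """Weighted average of `vectors`, weighted by similarity to `query`.

    Assumption: scaled dot-product scores; the paper may score differently.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(v, query) / scale for v in vectors])
    pooled = [
        sum(w * v[d] for w, v in zip(weights, vectors))
        for d in range(len(query))
    ]
    return pooled, weights


def hierarchical_interest(history, candidate):
    """Item-level pooling inside each category, then category-level pooling."""
    by_category = {}
    for category, vector in history:
        by_category.setdefault(category, []).append(vector)

    categories = list(by_category)
    # Item level: one summary vector per category.
    category_vectors = [
        attention_pool(by_category[c], candidate)[0] for c in categories
    ]
    # Category level: combine the per-category summaries.
    interest, weights = attention_pool(category_vectors, candidate)
    return interest, dict(zip(categories, weights))


# Toy click history: (category, 2-d embedding). Values are illustrative only.
history = [
    ("pets", [1.0, 0.0]),
    ("pets", [0.9, 0.1]),
    ("food", [0.0, 1.0]),
    ("food", [0.1, 0.9]),
]
candidate = [1.0, 0.0]  # embedding of the micro-video being scored

interest, category_weights = hierarchical_interest(history, candidate)
```

With this toy data the candidate resembles the "pets" items, so the "pets" category receives the larger category-level weight and dominates the pooled interest vector. A real model would learn the embeddings and attention parameters rather than use fixed vectors.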

    Information & Contributors

    Information

    Published In

    MM '18: Proceedings of the 26th ACM international conference on Multimedia
    October 2018
    2167 pages
    ISBN: 9781450356657
    DOI: 10.1145/3240508
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. attention mechanism
    2. click-through prediction
    3. micro-video
    4. recommendation algorithm
    5. user modeling

    Qualifiers

    • Research-article

    Conference

    MM '18
    Sponsor:
    MM '18: ACM Multimedia Conference
    October 22 - 26, 2018
    Seoul, Republic of Korea

    Acceptance Rates

    MM '18 Paper Acceptance Rate 209 of 757 submissions, 28%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 77
    • Downloads (last 6 weeks): 12
    Reflects downloads up to 01 Dec 2024

    Cited By

    • (2024) SimCEN: Simple Contrast-enhanced Network for CTR Prediction. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 2311-2320. DOI: 10.1145/3664647.3681203. Online publication date: 28-Oct-2024
    • (2024) MultiLoRA: Multi-Directional Low Rank Adaptation for Multi-Domain Recommendation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 2148-2157. DOI: 10.1145/3627673.3679549. Online publication date: 21-Oct-2024
    • (2024) Orthogonal Hyper-category Guided Multi-interest Elicitation for Micro-video Matching. 2024 IEEE International Conference on Multimedia and Expo (ICME), pp. 1-6. DOI: 10.1109/ICME57554.2024.10687880. Online publication date: 15-Jul-2024
    • (2024) Optimizing Personalized E-Commerce Micro-Video Recommendation with Self-Adaption Generative Gating Graph. 2024 5th International Conference on Computer Engineering and Application (ICCEA), pp. 1095-1106. DOI: 10.1109/ICCEA62105.2024.10604194. Online publication date: 12-Apr-2024
    • (2024) Multimodal semantic enhanced representation network for micro-video event detection. Knowledge-Based Systems, Vol. 301, Article 112255. DOI: 10.1016/j.knosys.2024.112255. Online publication date: Oct-2024
    • (2024) Modeling multi-behavior sequence via HyperGRU contrastive network for micro-video recommendation. Knowledge-Based Systems, Vol. 295, Article 111841. DOI: 10.1016/j.knosys.2024.111841. Online publication date: Jul-2024
    • (2024) A holistic view on positive and negative implicit feedback for micro-video recommendation. Knowledge-Based Systems, Vol. 284, Article 111299. DOI: 10.1016/j.knosys.2023.111299. Online publication date: Jan-2024
    • (2024) Temporal Diversity-Aware Micro-Video Recommendation with Long- and Short-Term Interests Modeling. Neural Processing Letters, Vol. 56, No. 3. DOI: 10.1007/s11063-024-11652-7. Online publication date: 3-Jun-2024
    • (2024) Context-aware focal alignment network for micro-video multi-label classification. Pattern Analysis and Applications, Vol. 27, No. 4. DOI: 10.1007/s10044-024-01376-8. Online publication date: 14-Nov-2024
    • (2024) Multi-trends Enhanced Dynamic Micro-video Recommendation. Artificial Intelligence, pp. 430-441. DOI: 10.1007/978-981-99-8850-1_35. Online publication date: 4-Feb-2024
