research-article

Temporal Hierarchical Attention at Category- and Item-Level for Micro-Video Click-Through Prediction

Authors:

Yan Li

MM '18: Proceedings of the 26th ACM international conference on Multimedia

Pages 1146 - 1153

https://doi.org/10.1145/3240508.3240617

Published: 15 October 2018

Abstract

Micro-video sharing has gained great popularity in recent years, which calls for effective recommendation algorithms to help users find micro-videos that interest them. Compared with traditional online videos (e.g., YouTube), micro-videos contributed by grass-roots users and shot on smartphones are much shorter (tens of seconds) and more often lack tags or descriptive text, making micro-video recommendation a challenging task. In this paper, we investigate how to model a user's historical behaviors so as to predict the user's click-through of micro-videos. Inspired by recent deep network-based methods, we propose a Temporal Hierarchical Attention at Category- and Item-Level (THACIL) network for user behavior modeling. First, we use temporal windows to capture the short-term dynamics of user interests. Second, we leverage a category-level attention mechanism to characterize a user's diverse interests, as well as an item-level attention mechanism for fine-grained profiling of user interests. Third, we adopt forward multi-head self-attention to capture the long-term correlation within user behaviors. We tested the proposed THACIL network on MicroVideo-1.7M, a new dataset of 1.7 million micro-videos drawn from real data of a micro-video sharing service in China. Experimental results demonstrate the effectiveness of the proposed method in comparison with state-of-the-art solutions.
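The windowing and attention ideas named in the abstract can be illustrated with a small sketch. This is not the THACIL network: it has no learned parameters, no category-level attention, and no multi-head self-attention. It only shows, under simplifying assumptions, how a click history might be split into temporal windows and pooled with dot-product attention, first over items within each window and then over windows. All function names, the fixed window size, and the toy vectors are hypothetical.

```python
import math

def split_into_windows(history, window_size):
    """Split a chronologically ordered click history into consecutive windows."""
    return [history[i:i + window_size] for i in range(0, len(history), window_size)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_pool(vectors, query):
    """Weighted average of vectors; weights are a softmax of dot products with the query."""
    weights = softmax([dot(v, query) for v in vectors])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(len(query))]
    return pooled, weights

def user_representation(history, query, window_size):
    """Two-level pooling: items within each window, then the windows themselves."""
    windows = split_into_windows(history, window_size)
    window_vectors = [attention_pool(w, query)[0] for w in windows]
    return attention_pool(window_vectors, query)

# Toy data (assumed): four clicked items as 2-d embeddings, oldest first.
history = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
candidate = [1.0, 0.0]  # embedding of the micro-video being scored
user_vec, window_weights = user_representation(history, candidate, window_size=2)
```

In this toy run the first window, whose items resemble the candidate, receives the larger window-level weight. A real model would learn the embeddings and attention parameters from click data rather than use raw dot products.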

Index Terms

Temporal Hierarchical Attention at Category- and Item-Level for Micro-Video Click-Through Prediction
1. Information systems
  1. World Wide Web
    1. Web searching and information discovery
      1. Personalization

Information & Contributors

Information

Published In

MM '18: Proceedings of the 26th ACM international conference on Multimedia

October 2018

2167 pages

ISBN:9781450356657

DOI:10.1145/3240508

General Chairs:
Susanne Boll, University of Oldenburg, Germany
Kyoung Mu Lee, Seoul National University, Korea
Jiebo Luo, University of Rochester, USA
Wenwu Zhu, Tsinghua University, China

Program Chairs:
Hyeran Byun, Yonsei University, Korea
Chang Wen Chen, State Univ. of New York at Buffalo, USA
Rainer Lienhart, University of Augsburg, Germany
Tao Mei, JD AI, China

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MM '18

Sponsor:

SIGMM

MM '18: ACM Multimedia Conference

October 22 - 26, 2018

Seoul, Republic of Korea

Acceptance Rates

MM '18 Paper Acceptance Rate 209 of 757 submissions, 28%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 45
Total Downloads: 904
Downloads (last 12 months): 77
Downloads (last 6 weeks): 12

Reflects downloads up to 01 Dec 2024

Citations

Cited By

Li H, Sang L, Zhang Y, Zhang Y, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D. (2024). SimCEN: Simple Contrast-enhanced Network for CTR Prediction. Proceedings of the 32nd ACM International Conference on Multimedia, 2311-2320. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3681203

Song Z, Zhang W, Deng L, Zhang J, Bian K, Cui B, Serra E, Spezzano F. (2024). MultiLoRA: Multi-Directional Low Rank Adaptation for Multi-Domain Recommendation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2148-2157. Online publication date: 21-Oct-2024.
https://dl.acm.org/doi/10.1145/3627673.3679549

Li B, Jin B, Yu Y, Zheng Y, Song J, Zhuo W, Xiang T. (2024). Orthogonal Hyper-category Guided Multi-interest Elicitation for Micro-video Matching. 2024 IEEE International Conference on Multimedia and Expo (ICME), 1-6. Online publication date: 15-Jul-2024.
https://doi.org/10.1109/ICME57554.2024.10687880

Chen P, Tan Y. (2024). Optimizing Personalized E-Commerce Micro-Video Recommendation with Self-Adaption Generative Gating Graph. 2024 5th International Conference on Computer Engineering and Application (ICCEA), 1095-1106. Online publication date: 12-Apr-2024.
https://doi.org/10.1109/ICCEA62105.2024.10604194

Li Y, Liu X, Zhang L, Tian H, Jing P. (2024). Multimodal semantic enhanced representation network for micro-video event detection. Knowledge-Based Systems, 301, 112255. Online publication date: Oct-2024.
https://doi.org/10.1016/j.knosys.2024.112255

Gu P, Hu H, Xu G. (2024). Modeling multi-behavior sequence via HyperGRU contrastive network for micro-video recommendation. Knowledge-Based Systems, 295, 111841. Online publication date: Jul-2024.
https://doi.org/10.1016/j.knosys.2024.111841

Gu P, Hu H. (2024). A holistic view on positive and negative implicit feedback for micro-video recommendation. Knowledge-Based Systems, 284, 111299. Online publication date: Jan-2024.
https://doi.org/10.1016/j.knosys.2023.111299

Gu P, Hu H, Wang D, Yu D, Xu G. (2024). Temporal Diversity-Aware Micro-Video Recommendation with Long- and Short-Term Interests Modeling. Neural Processing Letters, 56:3. Online publication date: 3-Jun-2024.
https://doi.org/10.1007/s11063-024-11652-7

Yuan B, Yao W, Jing P, Zhang J, Tsang K, Wang S. (2024). Context-aware focal alignment network for micro-video multi-label classification. Pattern Analysis and Applications, 27:4. Online publication date: 14-Nov-2024.
https://doi.org/10.1007/s10044-024-01376-8

Lu Y, Huang Y, Zhang S, Han W, Chen H, Fan W, Lai J, Zhao Z, Wu F. (2024). Multi-trends Enhanced Dynamic Micro-video Recommendation. Artificial Intelligence, 430-441. Online publication date: 4-Feb-2024.
https://doi.org/10.1007/978-981-99-8850-1_35
