research-article

Visual Event Recognition in Videos by Learning from Web Data

Authors: Lixin Duan, Dong Xu, Ivor Wai-Hung Tsang, Jiebo Luo

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 34, Issue 9

Pages 1667 - 1680

https://doi.org/10.1109/TPAMI.2011.265

Published: 01 September 2012

Abstract

We propose a visual event recognition framework for consumer videos by leveraging a large amount of loosely labeled web videos (e.g., from YouTube). Observing that consumer videos generally contain large intraclass variations within the same type of events, we first propose a new method, called Aligned Space-Time Pyramid Matching (ASTPM), to measure the distance between any two video clips. Second, we propose a new transfer learning method, referred to as Adaptive Multiple Kernel Learning (A-MKL), in order to 1) fuse the information from multiple pyramid levels and features (i.e., space-time features and static SIFT features) and 2) cope with the considerable variation in feature distributions between videos from the two domains (i.e., the web video domain and the consumer video domain). For each pyramid level and each type of local feature, we first train a set of SVM classifiers on the combined training set from the two domains by using multiple base kernels of different kernel types and parameters, which are then fused with equal weights to obtain a prelearned average classifier. In A-MKL, for each event class we learn an adapted target classifier based on multiple base kernels and the prelearned average classifiers from this event class or all the event classes by minimizing both the structural risk functional and the mismatch between the data distributions of the two domains. Extensive experiments demonstrate the effectiveness of our proposed framework, which requires only a small number of labeled consumer videos by leveraging web data. We also conduct an in-depth investigation of various aspects of the proposed method A-MKL, such as an analysis of the combination coefficients of the prelearned classifiers, the convergence of the learning algorithm, and the performance variation when using different proportions of labeled consumer videos. Moreover, we show that A-MKL using the prelearned classifiers from all the event classes leads to better performance than A-MKL using the prelearned classifiers only from each individual event class.
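
As a rough illustration of two ingredients named in the abstract, the following Python sketch shows (a) equal-weight fusion of several base decision functions into an average classifier and (b) a kernel-based measure of the mismatch between two data distributions. The abstract does not state which mismatch criterion is minimized, so the squared Maximum Mean Discrepancy (MMD) used here is an assumption, as are the Gaussian base kernels, the bandwidth values, and all function names. This is a minimal sketch under those assumptions, not the authors' implementation.

```python
import math


def gaussian_kernel(x, y, gamma):
    """Gaussian (RBF) base kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_kernel(x, y, gammas, weights=None):
    """Linear combination of base kernels; equal weights if none are given."""
    if weights is None:
        weights = [1.0 / len(gammas)] * len(gammas)
    return sum(w * gaussian_kernel(x, y, g) for w, g in zip(weights, gammas))


def mean_kernel(xs, ys, gammas, weights=None):
    """Average combined-kernel value over all pairs drawn from xs and ys."""
    total = sum(combined_kernel(x, y, gammas, weights) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, gammas, weights=None):
    """Biased estimate of the squared MMD between two samples.

    Assumed stand-in for the 'mismatch between data distributions'
    mentioned in the abstract.
    """
    return (mean_kernel(source, source, gammas, weights)
            + mean_kernel(target, target, gammas, weights)
            - 2.0 * mean_kernel(source, target, gammas, weights))


def average_classifier(decision_functions):
    """Fuse base decision functions with equal weights."""
    def fused(x):
        return sum(f(x) for f in decision_functions) / len(decision_functions)
    return fused


# Hypothetical toy features standing in for web and consumer videos.
web = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
consumer = [(2.0, 2.0), (2.1, 2.0), (2.0, 2.1)]
gammas = [0.5, 1.0, 2.0]  # illustrative bandwidths only

mismatch = mmd_squared(web, consumer, gammas)
fused = average_classifier([lambda x: 1.0, lambda x: -0.5])
```

In this toy setup the mismatch is zero when both samples are identical and grows as the two samples move apart. In an adaptive scheme the kernel weights would be learned jointly with the classifier rather than fixed to equal values as they are here.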

Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 34, Issue 9, September 2012 issue, 206 pages

ISSN: 0162-8828

Publisher

IEEE Computer Society, United States
