DOI: 10.1109/CVPR.2014.288
Article

Video Event Detection by Inferring Temporal Instance Labels

Published: 23 June 2014

Abstract

Video event detection allows intelligent indexing of video content based on events. Traditional approaches extract features from video frames or shots, then quantize and pool the features to form a single vector representation for the entire video. Though simple and efficient, the final pooling step may discard temporally local information, which is important for indicating which part of a long video signifies the presence of the event. In this work, we propose a novel instance-based video event detection approach. We represent each video as multiple 'instances', defined as video segments of different temporal intervals. The objective is to learn an instance-level event detection model from video-level labels only. To solve this problem, we propose a large-margin formulation that treats the instance labels as latent variables and simultaneously infers the instance labels and the instance-level classification model. Our framework infers optimal solutions under the assumption that positive videos contain a large number of positive instances while negative videos contain as few as possible. Extensive experiments on large-scale video event datasets demonstrate significant performance gains. The proposed method is also useful for explaining detection results, because it localizes the temporal segments in a video that are responsible for a positive detection.
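The idea of inferring instance labels jointly with an instance-level classifier can be illustrated with a small alternating scheme. The sketch below is not the paper's formulation or solver: it assumes a linear classifier trained by plain subgradient descent on the hinge loss, and it assumes a known fraction of positive instances per positive video (`pos_fraction`). All function and parameter names are hypothetical.

```python
def score(w, b, x):
    """Linear instance score w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train_linear_svm(X, y, epochs=200, lr=0.1, reg=0.01):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [reg * wi for wi in w], 0.0
        for x, label in zip(X, y):
            if label * score(w, b, x) < 1:  # instance violates the margin
                for j in range(dim):
                    grad_w[j] -= label * x[j] / n
                grad_b -= label / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def infer_instance_labels(scores, video_label, pos_fraction):
    """Positive video: the top-scoring fraction gets +1. Negative video: all -1."""
    labels = [-1] * len(scores)
    if video_label > 0:
        k = max(1, round(pos_fraction * len(scores)))
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        for i in ranked[:k]:
            labels[i] = 1
    return labels

def fit_instance_model(videos, video_labels, pos_fraction=0.5, rounds=5):
    """Alternate between training the classifier and re-inferring instance labels."""
    X = [x for video in videos for x in video]
    # Assumption: start by giving every instance the label of its video.
    y = [label for video, label in zip(videos, video_labels) for _ in video]
    w, b = train_linear_svm(X, y)
    for _ in range(rounds):
        y = []
        for video, label in zip(videos, video_labels):
            scores = [score(w, b, x) for x in video]
            y.extend(infer_instance_labels(scores, label, pos_fraction))
        w, b = train_linear_svm(X, y)
    return w, b

def localize(video, w, b):
    """Return the index of the highest-scoring instance (segment) in a video."""
    scores = [score(w, b, x) for x in video]
    return max(range(len(scores)), key=lambda i: scores[i])
```

Here each video is a list of instance feature vectors and each video label is +1 or -1. Calling `localize` on a video that was detected as positive returns the segment the learned model scores highest, which mirrors the explanatory use described in the abstract.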



Published In

CVPR '14: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition
June 2014
4302 pages
ISBN: 9781479951185

Publisher

IEEE Computer Society

United States

Author Tags

  1. Multiple Instance Learning
  2. Proportion SVM
  3. Video Event Detection

Qualifiers

  • Article

Cited By

  • (2020) Learning from label proportions. Proceedings of the 34th International Conference on Neural Information Processing Systems, pp. 22256-22267. DOI: 10.5555/3495724.3497590. Online publication date: 6-Dec-2020
  • (2020) Eliciting User Preferences for Personalized Explanations for Video Summaries. Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization, pp. 98-106. DOI: 10.1145/3340631.3394862. Online publication date: 7-Jul-2020
  • (2019) Visual Content Recognition by Exploiting Semantic Feature Map with Attention and Multi-task Learning. ACM Transactions on Multimedia Computing, Communications, and Applications 15:1s, pp. 1-22. DOI: 10.1145/3231739. Online publication date: 5-Feb-2019
  • (2019) Exploiting Negative Evidence for Deep Latent Structured Models. IEEE Transactions on Pattern Analysis and Machine Intelligence 41:2, pp. 337-351. DOI: 10.1109/TPAMI.2017.2788435. Online publication date: 1-Feb-2019
  • (2017) DECK. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, pp. 4032-4038. DOI: 10.5555/3298023.3298154. Online publication date: 4-Feb-2017
  • (2017) Learning Semantic Feature Map for Visual Content Recognition. Proceedings of the 25th ACM international conference on Multimedia, pp. 1291-1299. DOI: 10.1145/3123266.3123379. Online publication date: 23-Oct-2017
  • (2017) Real-time Action Recognition Based on Key Frame Detection. Proceedings of the 9th International Conference on Machine Learning and Computing, pp. 272-277. DOI: 10.1145/3055635.3056569. Online publication date: 24-Feb-2017
  • (2017) Discriminative Multi-instance Multitask Learning for 3D Action Recognition. IEEE Transactions on Multimedia 19:3, pp. 519-529. DOI: 10.1109/TMM.2016.2626959. Online publication date: 1-Mar-2017
  • (2017) Hierarchical Latent Concept Discovery for Video Event Detection. IEEE Transactions on Image Processing 26:5, pp. 2149-2162. DOI: 10.1109/TIP.2017.2670782. Online publication date: 1-May-2017
  • (2017) Revealing Event Saliency in Unconstrained Video Collection. IEEE Transactions on Image Processing 26:4, pp. 1746-1758. DOI: 10.1109/TIP.2017.2658957. Online publication date: 1-Apr-2017