Video Anomaly Detection via Progressive Learning of Multiple Proxy Tasks

Authors:

Jianxin Liao

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 4719 - 4728

https://doi.org/10.1145/3664647.3680871

Published: 28 October 2024

Abstract

Learning multiple proxy tasks is a popular training strategy in semi-supervised video anomaly detection (VAD). However, the traditional approach of learning multiple proxy tasks simultaneously is prone to suboptimal solutions, and simply executing the proxy tasks sequentially cannot ensure continuous performance improvement. In this paper, we thoroughly investigate how task composition and training order affect performance. We find that ensuring continuous performance improvement in multi-task learning requires optimization objectives that differ across training phases yet remain continuous from one phase to the next. To this end, we propose a training strategy based on progressive learning to enhance multi-task learning in VAD, in which the learning objectives of earlier phases contribute to training in subsequent phases. Specifically, we decompose video anomaly detection into three phases: perception, comprehension, and inference, continuously refining the learning objectives to enhance model performance. In these three phases, we train the model on the visual task, the semantic task, and the open-set task in turn. The model learns different levels of features and focuses on different types of anomalies in each phase. Extensive experiments demonstrate the effectiveness of our method and show that the benefits of progressive learning are not tied to specific proxy tasks.
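The phase-by-phase schedule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Phase` class, the `train_progressively` function, the three scalar parameters, and the toy quadratic objectives are all hypothetical and chosen only so that each phase's objective depends on what the previous phase learned. The gradient in each phase updates only that phase's parameter, which amounts to assuming that earlier features are held fixed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


@dataclass
class Phase:
    name: str                          # phase name taken from the abstract
    task: str                          # proxy task trained in this phase
    grad: Callable[[Params], Params]   # gradient of a toy objective (hypothetical)
    steps: int = 50


def train_progressively(
    params: Params, phases: List[Phase], lr: float = 0.1
) -> Tuple[Params, List[str]]:
    """Run the phases in order; each starts from the parameters left by the last."""
    params = dict(params)  # copy so the caller's initial values are untouched
    order = []
    for phase in phases:
        for _ in range(phase.steps):
            for key, value in phase.grad(params).items():
                params[key] -= lr * value  # plain gradient descent step
        order.append(phase.name)
    return params, order


# Toy objectives (assumptions, not the paper's losses):
#   perception:    (visual - 1)^2
#   comprehension: (semantic - visual)^2, with visual held fixed
#   inference:     (open_set - semantic)^2, with semantic held fixed
PHASES = [
    Phase("perception", "visual task",
          lambda p: {"visual": 2 * (p["visual"] - 1.0)}),
    Phase("comprehension", "semantic task",
          lambda p: {"semantic": 2 * (p["semantic"] - p["visual"])}),
    Phase("inference", "open-set task",
          lambda p: {"open_set": 2 * (p["open_set"] - p["semantic"])}),
]

INIT = {"visual": 0.0, "semantic": 0.0, "open_set": 0.0}

progressive, order = train_progressively(INIT, PHASES)
reversed_run, _ = train_progressively(INIT, list(reversed(PHASES)))
```

In this toy setting, the perception, comprehension, inference order drives all three parameters close to the target, whereas the reversed order leaves `open_set` and `semantic` at their initial values because the quantities they depend on have not been learned yet. This illustrates only why training order can matter in principle and says nothing about the paper's reported results.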

Index Terms

Video Anomaly Detection via Progressive Learning of Multiple Proxy Tasks
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Activity recognition and understanding
        Scene anomaly detection

Information & Contributors

Information

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs:
Jianfei Cai (Monash University, Australia)
Mohan Kankanhalli (NUS, Singapore)
Balakrishnan Prabhakaran (UT Dallas, USA)
Susanne Boll (University of Oldenburg, Germany)

Program Chairs:
Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia)
Liang Zheng (Australian National University, Australia)
Vivek K. Singh (Rutgers University, USA)
Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands)
Lexing Xie (Australian National University, Australia)
Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China under Grants
Beijing University of Posts and Telecommunications-China Mobile Research Institute Joint Innovation Center
the BUPT Excellent Ph.D. Students Foundation
the Ministry of Education and China Mobile Joint Fund
Project funded by China Postdoctoral Science Foundation

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 0
Total Downloads: 205
Downloads (Last 12 months): 205
Downloads (Last 6 weeks): 146

Reflects downloads up to 05 Mar 2025
