DOI: 10.1145/3503161.3548199
research-article

Hierarchical Scene Normality-Binding Modeling for Anomaly Detection in Surveillance Videos

Published: 10 October 2022

Abstract

Anomaly detection in surveillance videos is an important topic in the multimedia community. It requires efficient extraction of scene context and the capture of temporal information as a basis for decisions. From the perspective of hierarchical modeling, we parse the surveillance scene from global to local and propose a Hierarchical Scene Normality-Binding Modeling framework (HSNBM) to handle anomaly detection. For the static background hierarchy, we design a Region Clustering-driven Multi-task Memory Autoencoder (RCM-MemAE), which simultaneously performs region segmentation and scene reconstruction. The normal prototypes of each local region are stored, and the frame reconstruction error is subsequently amplified by global memory augmentation. For the dynamic foreground object hierarchy, we employ a Scene-Object Binding Frame Prediction module (SOB-FP) that binds all foreground objects in the frame to the prototypes stored in the background hierarchy according to their positions, thus fully exploiting the normality relationship between foreground and background. The bound features are then fed into the decoder to predict the future movement of the objects. With the binding mechanism between foreground and background, HSNBM effectively integrates the "reconstruction" and "prediction" tasks and builds a semantic bridge between the two hierarchies. Finally, HSNBM fuses the anomaly scores of the two hierarchies to make a comprehensive decision. Extensive empirical studies on three standard video anomaly detection datasets demonstrate the effectiveness of the proposed HSNBM framework.
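As a rough illustration of two ideas in the abstract (reading a feature back from stored normal prototypes, and fusing the anomaly scores of the two hierarchies), the following Python sketch may help. It is not the authors' implementation. The cosine-similarity softmax addressing, the min-max normalization, the weighted-sum fusion rule, and every function and parameter name (`memory_read`, `fuse_scores`, `weight`) are assumptions made for illustration only; the abstract does not specify these details.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def memory_read(query, prototypes):
    """Rebuild `query` as a convex combination of stored normal prototypes.

    Assumed addressing scheme: softmax over cosine similarities.
    """
    weights = softmax([cosine(query, p) for p in prototypes])
    return [
        sum(w * p[i] for w, p in zip(weights, prototypes))
        for i in range(len(query))
    ]


def squared_error(a, b):
    """Sum of squared differences, used here as a reconstruction error."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def min_max(scores):
    """Rescale a score sequence to [0, 1]; constant input maps to zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(background_errors, foreground_errors, weight=0.5):
    """Hypothetical fusion: weighted sum of normalized per-frame errors."""
    bg = min_max(background_errors)
    fg = min_max(foreground_errors)
    return [weight * b + (1.0 - weight) * f for b, f in zip(bg, fg)]


if __name__ == "__main__":
    prototypes = [[1.0, 0.0], [0.0, 1.0]]  # toy "normal" prototypes
    for query in ([1.0, 0.0], [-1.0, 0.0]):
        rebuilt = memory_read(query, prototypes)
        print(query, squared_error(query, rebuilt))
    print(fuse_scores([0.0, 1.0, 2.0], [2.0, 2.0, 4.0]))
```

Because the read-out is always a mixture of normal prototypes, a query that lies far from every prototype is rebuilt poorly, so its error is larger than that of a query close to a prototype. The fusion step then combines one error sequence per hierarchy into a single per-frame score.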

Supplementary Material

MP4 File (MM22-fp1861.mp4)
In this presentation video, we present our work on the Hierarchical Scene Normality-Binding Modeling framework (HSNBM) for video anomaly detection. The presentation starts with the context and related work, then goes over the technical details of the proposed framework, specifically the RCM-MemAE and the SOB-FP. Next, we present qualitative and quantitative experimental results and conclude with a summary of the contributions and innovations. We conducted extensive experiments on three public datasets (Ped2, Avenue, and ShanghaiTech).

    Information & Contributors

    Information

    Published In

    MM '22: Proceedings of the 30th ACM International Conference on Multimedia
    October 2022
    7537 pages
    ISBN:9781450392037
    DOI:10.1145/3503161
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. background
    2. foreground
    3. hierarchical modeling
    4. video anomaly detection

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Program for Cheung Kong Scholars and Innovative Research Team in University
    • the State Key Program of National Natural Science of China
    • the National Key Research and Development Program of China
    • Fund for Foreign Scholars in University Research and Teaching Programs
    • Key Research and Development Program in Shaanxi Province of China
    • the Key Scientific Technological Innovation Research Project by Ministry of Education

    Conference

    MM '22
    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 121
    • Downloads (last 6 weeks): 16
    Reflects downloads up to 11 Dec 2024

    Cited By

    • (2024) Cefdet: Cognitive Effectiveness Network Based on Fuzzy Inference for Action Detection. Proceedings of the 32nd ACM International Conference on Multimedia, 7985-7994. DOI: 10.1145/3664647.3681226. Online publication date: 28-Oct-2024
    • (2024) Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models. ACM Computing Surveys 56:7, 1-38. DOI: 10.1145/3645101. Online publication date: 9-Apr-2024
    • (2024) Cognition Guided Video Anomaly Detection Framework for Surveillance Services. IEEE Transactions on Services Computing 17:5, 2109-2123. DOI: 10.1109/TSC.2024.3407588. Online publication date: Sep-2024
    • (2024) A Knowledge-Based Hierarchical Causal Inference Network for Video Action Recognition. IEEE Transactions on Multimedia 26, 9135-9149. DOI: 10.1109/TMM.2024.3386339. Online publication date: 1-Jan-2024
    • (2024) AMP-Net: Appearance-Motion Prototype Network Assisted Automatic Video Anomaly Detection System. IEEE Transactions on Industrial Informatics 20:2, 2843-2855. DOI: 10.1109/TII.2023.3298476. Online publication date: Feb-2024
    • (2024) Multi-Grained Gradual Inference Model for Multimedia Event Extraction. IEEE Transactions on Circuits and Systems for Video Technology 34:10, 10507-10520. DOI: 10.1109/TCSVT.2024.3402242. Online publication date: Oct-2024
    • (2024) Context-aware Video Anomaly Detection in Long-Term Datasets. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 4002-4011. DOI: 10.1109/CVPRW63382.2024.00404. Online publication date: 17-Jun-2024
    • (2024) Fast video anomaly detection via context-aware shortcut exploration and abnormal feature distance learning. Pattern Recognition, 110877. DOI: 10.1016/j.patcog.2024.110877. Online publication date: Aug-2024
    • (2024) Memory-enhanced spatial-temporal encoding framework for industrial anomaly detection system. Expert Systems with Applications 250, 123718. DOI: 10.1016/j.eswa.2024.123718. Online publication date: Sep-2024
    • (2023) Learning Causality-inspired Representation Consistency for Video Anomaly Detection. Proceedings of the 31st ACM International Conference on Multimedia, 203-212. DOI: 10.1145/3581783.3612393. Online publication date: 26-Oct-2023
