research-article

Learning dual updatable memory modules for video anomaly detection

Authors:

Xiaoru Liu

Multimedia Systems, Volume 31, Issue 1

https://doi.org/10.1007/s00530-024-01597-1

Published: 05 December 2024

Abstract

We propose a novel video anomaly detection method that leverages two updatable memory modules to learn and update prototypical patterns of normal and abnormal data within an autoencoder (AE) framework. To enhance the robustness of the model, we employ a pseudo anomaly synthesizer to generate synthetic anomalies from normal data, and train the AE to minimize the reconstruction loss on pseudo anomalies while maximizing it on normal data. The memory modules are optimized using a feature compactness loss and a separateness loss to refine the representation of details, and skip connections are incorporated to prevent the recording of only the most prototypical patterns. Additionally, a memory loss is proposed to enhance the distinction between the two memory modules, thereby enabling effective anomaly detection. Experimental results demonstrate the efficacy of our approach, underscoring the importance of the two updatable memory modules in achieving state-of-the-art performance in video anomaly detection. Our code is available at https://github.com/SVIL2024/Memup.git.
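As an illustration of the memory mechanics the abstract refers to, the following is a minimal sketch of one memory module: a read operation plus a feature compactness loss and a separateness loss. It is not the authors' implementation. The function names, the dot-product addressing, the distance-based choice of the nearest item, and the margin value are all assumptions borrowed from a common formulation of memory-guided autoencoders. The pseudo anomaly synthesizer and the memory loss between the two modules are not sketched, because the abstract does not define them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def read_memory(query, memory):
    """Return the softmax-weighted average of memory items for one query.

    Assumption: addressing weights come from dot-product similarity.
    """
    weights = softmax([dot(query, item) for item in memory])
    return [
        sum(w * item[d] for w, item in zip(weights, memory))
        for d in range(len(query))
    ]


def compactness_loss(query, memory):
    """Squared distance from the query to its nearest memory item."""
    return min(sq_dist(query, item) for item in memory)


def separateness_loss(query, memory, margin=1.0):
    """Triplet-style hinge: the nearest item should be closer than the
    second-nearest item by at least `margin` (hypothetical value)."""
    dists = sorted(sq_dist(query, item) for item in memory)
    return max(dists[0] - dists[1] + margin, 0.0)


# Toy memory with three 2-D items and one query feature (made-up values).
memory = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
query = [0.9, 0.1]
read = read_memory(query, memory)
```

In this toy setup the query sits close to the first item, so the compactness term is small and the separateness hinge is inactive. A query equidistant from two items, such as `[0.5, 0.5]`, incurs the full margin as a penalty, which is the pressure that keeps memory items apart.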



Information

Published In

Multimedia Systems Volume 31, Issue 1

Feb 2025

1408 pages

© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 05 December 2024

Accepted: 22 November 2024

Received: 11 June 2024

Funding Sources

National Natural Science Foundation of China
Science and Technology Research Project of the Department of Education of Liaoning Province
Social Science Planning Fund of Liaoning Province
