research-article

Deep-Cascade: Cascading 3D Deep Neural Networks for Fast Anomaly Detection and Localization in Crowded Scenes

Published: 01 April 2017

Abstract

This paper proposes a fast and reliable method for anomaly detection and localization in video data showing crowded scenes. Time-efficient anomaly localization is an ongoing challenge and is the subject of this paper. We propose a cubic-patch-based method, characterized by a cascade of classifiers, which makes use of an advanced feature-learning approach. Our cascade of classifiers has two main stages. In the first stage, a light but deep 3D auto-encoder operates on small cubic patches for early identification of “many” normal cubic patches. The remaining candidates of interest are then carefully resized and evaluated in the second stage by a more complex and deeper 3D convolutional neural network (CNN). We divide the deep auto-encoder and the CNN into multiple sub-stages, which operate as cascaded classifiers. Shallow layers of the cascaded deep networks (designed as Gaussian classifiers, acting as weak single-class classifiers) detect “simple” normal patches, such as background patches; more complex normal patches are detected at deeper layers. It is shown that the proposed novel technique (a cascade of two cascaded classifiers) performs comparably to current top-performing detection and localization methods on standard benchmarks, but in general outperforms them with respect to required computation time.
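
The early-rejection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hand-crafted features below stand in for the learned auto-encoder and CNN representations, and the patch size (8×8×4), the distance threshold, and all names (`GaussianStage`, `run_cascade`, `cheap_features`, `motion_features`) are assumptions made for illustration. Each stage fits a Gaussian to features of normal training patches and accepts a patch as normal when its squared Mahalanobis distance is small; only patches that no stage accepts remain as anomaly candidates.

```python
import numpy as np


class GaussianStage:
    """One weak single-class classifier: a Gaussian fitted to normal-only features."""

    def __init__(self, extract, threshold):
        self.extract = extract      # hypothetical feature extractor for this stage
        self.threshold = threshold  # hypothetical cutoff on squared Mahalanobis distance

    def fit(self, normal_patches):
        feats = np.stack([self.extract(p) for p in normal_patches])
        self.mean = feats.mean(axis=0)
        # Small ridge term keeps the covariance matrix invertible.
        cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
        self.inv_cov = np.linalg.inv(cov)
        return self

    def is_normal(self, patch):
        d = self.extract(patch) - self.mean
        return float(d @ self.inv_cov @ d) <= self.threshold


def cheap_features(patch):
    # Stand-in for a shallow representation: global intensity statistics.
    return np.array([patch.mean(), patch.std()])


def motion_features(patch):
    # Stand-in for a deeper representation: frame-to-frame change statistics.
    diff = np.abs(np.diff(patch, axis=2))
    return np.array([diff.mean(), diff.max()])


def run_cascade(patches, stages):
    """Return indices of patches that no stage accepted as normal."""
    remaining = list(range(len(patches)))
    for stage in stages:
        # Patches accepted as normal here are never seen by later stages.
        remaining = [i for i in remaining if not stage.is_normal(patches[i])]
    return remaining


# Synthetic demo data (an assumption, not from the paper): low-variance
# "normal" patches plus one bright, noisy "anomalous" patch.
rng = np.random.default_rng(0)
train = [rng.normal(0.5, 0.05, size=(8, 8, 4)) for _ in range(200)]
stages = [
    GaussianStage(cheap_features, threshold=9.0).fit(train),
    GaussianStage(motion_features, threshold=9.0).fit(train),
]
test_patches = [rng.normal(0.5, 0.05, size=(8, 8, 4)) for _ in range(20)]
test_patches.append(rng.normal(5.0, 1.0, size=(8, 8, 4)))
print("anomaly candidates:", run_cascade(test_patches, stages))
```

Because a patch is dropped as soon as any stage accepts it as normal, the more expensive later stages only process the patches that earlier stages could not dismiss, which is where a cascade saves computation time.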

Published In

IEEE Transactions on Image Processing, Volume 26, Issue 4
April 2017
526 pages

Publisher

IEEE Press
