Abstract
We present a survey of recent, efficient space-time methods for action recognition, selecting the methods with the highest accuracy on challenging datasets such as HMDB51, UCF101 and Hollywood2. The survey focuses on the two main space-time approaches: hand-crafted features and deep-learned features. We explain the selected pipelines intuitively and review good practices used in state-of-the-art methods, including the best descriptors, encoding methods, deep architectures and classifiers; the strongest methods are described in more detail. Finally, we discuss how these methods can be improved in both speed and accuracy and propose directions for further work.
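For context, the hand-crafted branch reviewed in the survey typically follows a descriptor-encoding-classifier pipeline, for example dense-trajectory descriptors encoded as Fisher vectors and classified with a linear SVM. The sketch below is only an illustration of that generic pipeline, not the authors' implementation; the use of scikit-learn, the toy random data standing in for extracted descriptors, and the parameter choices (number of GMM components, SVM C) are assumptions made for the example.

    # Minimal sketch of a hand-crafted action-recognition pipeline:
    # local descriptors -> Fisher vector encoding -> linear SVM.
    # Descriptor extraction (e.g. HOG/HOF/MBH along dense trajectories)
    # is assumed to have been done already.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.svm import LinearSVC

    def fisher_vector(descriptors, gmm):
        """Encode a set of local descriptors (T x D) as a Fisher vector of length 2*K*D."""
        T, D = descriptors.shape
        gamma = gmm.predict_proba(descriptors)          # soft assignments, T x K
        w, mu = gmm.weights_, gmm.means_                # shapes: (K,), (K, D)
        sigma = np.sqrt(gmm.covariances_)               # (K, D), diagonal covariances
        fv = []
        for k in range(gmm.n_components):
            diff = (descriptors - mu[k]) / sigma[k]     # normalized residuals, T x D
            g = gamma[:, k:k + 1]
            grad_mu = (g * diff).sum(axis=0) / (T * np.sqrt(w[k]))
            grad_sigma = (g * (diff ** 2 - 1)).sum(axis=0) / (T * np.sqrt(2 * w[k]))
            fv.extend([grad_mu, grad_sigma])
        fv = np.concatenate(fv)
        fv = np.sign(fv) * np.sqrt(np.abs(fv))          # power (signed square-root) normalization
        return fv / (np.linalg.norm(fv) + 1e-12)        # L2 normalization

    # Toy usage: random "descriptors" stand in for per-video trajectory features.
    rng = np.random.default_rng(0)
    videos = [rng.normal(size=(200, 32)) for _ in range(20)]
    labels = rng.integers(0, 2, size=20)
    gmm = GaussianMixture(n_components=8, covariance_type='diag', random_state=0)
    gmm.fit(np.vstack(videos))                          # codebook learned on pooled descriptors
    X = np.stack([fisher_vector(v, gmm) for v in videos])
    clf = LinearSVC(C=100.0).fit(X, labels)             # linear SVM on the encoded videos

The deep-learning branch replaces the hand-crafted descriptors and encoding with features learned by convolutional networks (for example two-stream or 3D-convolutional architectures), but the final classification stage is often a linear classifier of the same kind.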
Acknowledgments
This work has been supported by the National Centre for Research and Development (project UOD-DEM-1-183/001 “Intelligent video analysis system for behavior and event recognition in surveillance networks”).
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wojciechowski, S. et al. (2016). Selected Space-Time Based Methods for Action Recognition. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, TP. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9622. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49390-8_41
DOI: https://doi.org/10.1007/978-3-662-49390-8_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-49389-2
Online ISBN: 978-3-662-49390-8