
Selected Space-Time Based Methods for Action Recognition

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2016)

Abstract

We present a survey of recent and efficient space-time methods for action recognition, selecting the methods that achieve the highest accuracy on challenging datasets such as HMDB51, UCF101, and Hollywood2. The survey focuses on two main space-time based approaches: hand-crafted features and deep-learned features. We explain the selected pipelines intuitively and review good practices used in state-of-the-art methods, including the best descriptors, encoding methods, deep architectures, and classifiers. The best-performing methods are identified and some of them are explained in more detail. Finally, we discuss how these methods can be improved in both speed and accuracy and propose directions for further work.
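To make the hand-crafted branch concrete, below is a minimal sketch (not the authors' implementation) of a typical pipeline of this kind in Python with scikit-learn: local space-time descriptors are quantized into a bag-of-visual-words histogram and classified with a linear SVM. The extract_descriptors function and the toy video names are hypothetical placeholders; state-of-the-art systems typically use HOG/HOF/MBH descriptors computed along improved dense trajectories and Fisher vector encoding rather than a plain k-means histogram.

# A minimal sketch of a hand-crafted space-time pipeline:
# local descriptors -> bag-of-visual-words encoding -> linear SVM.
# The descriptor extraction step is a hypothetical placeholder;
# real systems use e.g. HOG/HOF/MBH along dense trajectories.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC


def extract_descriptors(video):
    """Placeholder: return an (n_descriptors, dim) array of local
    space-time descriptors for one video (e.g. HOG/HOF/MBH)."""
    rng = np.random.default_rng(abs(hash(video)) % (2**32))
    return rng.normal(size=(200, 96))  # dummy data for illustration


def encode_bovw(descriptors, codebook):
    """Quantize descriptors against the codebook and return an
    L2-normalized histogram of visual-word counts."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)


# Toy training set: video identifiers with action labels.
train_videos = ["run_01", "run_02", "jump_01", "jump_02"]
train_labels = [0, 0, 1, 1]

# 1. Pool descriptors from training videos and learn a visual codebook.
all_desc = np.vstack([extract_descriptors(v) for v in train_videos])
codebook = KMeans(n_clusters=64, n_init=10, random_state=0).fit(all_desc)

# 2. Encode each video as a fixed-length histogram.
X_train = np.array([encode_bovw(extract_descriptors(v), codebook)
                    for v in train_videos])

# 3. Train a linear SVM on the encoded videos.
clf = LinearSVC(C=1.0).fit(X_train, train_labels)

# 4. Classify a new video.
x_test = encode_bovw(extract_descriptors("walk_01"), codebook)
print("predicted class:", clf.predict([x_test])[0])

The same encode-then-classify structure carries over to the deep-learning branch, where frame-level or trajectory-pooled CNN activations replace the hand-crafted descriptors before encoding and classification.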



Acknowledgments

This work has been supported by the National Centre for Research and Development (project UOD-DEM-1-183/001 “Intelligent video analysis system for behavior and event recognition in surveillance networks”).

Author information

Corresponding author

Correspondence to Marek Kulbacki.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wojciechowski, S. et al. (2016). Selected Space-Time Based Methods for Action Recognition. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, T.P. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9622. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49390-8_41

  • DOI: https://doi.org/10.1007/978-3-662-49390-8_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-49389-2

  • Online ISBN: 978-3-662-49390-8

  • eBook Packages: Computer Science, Computer Science (R0)
