DOI: 10.1007/978-3-319-42996-0_10
Article

Joint Recognition and Segmentation of Actions via Probabilistic Integration of Spatio-Temporal Fisher Vectors

Published: 19 April 2016

Abstract

We propose a hierarchical approach to multi-action recognition that performs joint classification and segmentation. A given video containing several consecutive actions is processed via a sequence of overlapping temporal windows. Each frame in a temporal window is represented through selective low-level spatio-temporal features which efficiently capture relevant local dynamics. Features from each window are represented as a Fisher vector, which captures first and second order statistics. Instead of directly classifying each Fisher vector, it is converted into a vector of class probabilities. The final classification decision for each frame is then obtained by integrating the class probabilities at the frame level, which exploits the overlapping of the temporal windows. Experiments were performed on two datasets: s-KTH (a stitched version of the KTH dataset to simulate multi-actions) and the challenging CMU-MMAC dataset. On s-KTH, the proposed approach achieves an accuracy of 85.0%, significantly outperforming two recent approaches based on GMMs and HMMs which obtained 78.3% and 71.2%, respectively. On CMU-MMAC, the proposed approach achieves an accuracy of 40.9%, outperforming the GMM and HMM approaches which obtained 33.7% and 38.4%, respectively. Furthermore, the proposed system is on average 40 times faster than the GMM based approach.
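The frame-level integration step described in the abstract, where each frame's decision averages the class-probability vectors of all overlapping windows covering it, can be sketched as below. This is a minimal illustration, not the paper's implementation: the window classifier is stubbed out, and the window length and stride values are assumed for the example rather than taken from the paper.

```python
import numpy as np

def integrate_window_probabilities(n_frames, window_probs, windows):
    """Average class-probability vectors over all windows covering each frame.

    n_frames:     total number of frames in the video
    window_probs: (n_windows, n_classes) class probabilities, one row per window
    windows:      list of (start, end) frame ranges, end exclusive
    Returns a per-frame class label obtained by argmax of the averaged probabilities.
    """
    n_classes = window_probs.shape[1]
    acc = np.zeros((n_frames, n_classes))
    counts = np.zeros(n_frames)
    for (start, end), probs in zip(windows, window_probs):
        acc[start:end] += probs        # accumulate evidence from each covering window
        counts[start:end] += 1
    acc /= counts[:, None]             # average over the overlapping windows
    return acc.argmax(axis=1)          # final per-frame classification decision

# Example: 6 frames, 2 classes, windows of length 4 with stride 2 (illustrative values)
windows = [(0, 4), (2, 6)]
window_probs = np.array([[0.9, 0.1],   # first window votes for class 0
                         [0.2, 0.8]])  # second window votes for class 1
labels = integrate_window_probabilities(6, window_probs, windows)
```

In this toy example, frames covered by only one window take that window's vote, while frames 2–3, which lie in the overlap, average both probability vectors; this averaging of soft votes is what smooths decisions near action boundaries.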


Cited By

  • (2017) Real-time Action Recognition Based on Key Frame Detection. In: Proceedings of the 9th International Conference on Machine Learning and Computing, pp. 272–277. DOI: 10.1145/3055635.3056569. Online publication date: 24-Feb-2017


Published In

Revised Selected Papers of the PAKDD 2016 Workshops on Trends and Applications in Knowledge Discovery and Data Mining - Volume 9794
April 2016
269 pages
ISBN:9783319429953
  • Editors: Huiping Cao, Jinyan Li, Ruili Wang

Publisher

Springer-Verlag

Berlin, Heidelberg
