ScoringNet: Learning Key Fragment for Action Quality Assessment with Ranking Loss in Skilled Sports

  • Conference paper
Computer Vision – ACCV 2018 (ACCV 2018)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11366)

Abstract

Automatically scoring athletes’ performance in skilled sports has drawn increasing attention from the academic community. However, extracting effective features and predicting reasonable scores for a long skilled-sport video remain challenging. In this paper, we introduce ScoringNet, a novel network consisting of key fragment segmentation (KFS) and score prediction (SP), to address these two problems. To obtain effective features, we design KFS to identify key fragments and remove irrelevant ones through semantic video segmentation. A 3D convolutional neural network then extracts features from each key fragment. In score prediction, we fuse a ranking loss into the traditional loss function so that the predictions are more reasonable in terms of both score value and ranking. Through deep learning, we narrow the gap between the predictions and the ground-truth scores while making the predictions satisfy the ranking constraint. Extensive experiments show that our method achieves state-of-the-art results on three datasets.
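The abstract does not give the exact form of the fused loss, so the following is only a minimal sketch of one common way to combine a regression term with a ranking term: mean squared error plus a pairwise hinge penalty over every pair of videos. The function names, the weight `alpha`, and the `margin` are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def mse_loss(pred, target):
    """Mean squared error between predicted and ground-truth scores."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pairwise_ranking_loss(pred, target, margin=0.0):
    """Hinge penalty for pairs whose predicted order contradicts the ground truth.

    Assumption: a pairwise hinge form; the paper may define its ranking
    loss differently. Pairs with tied ground-truth scores are skipped.
    """
    penalties = []
    for i, j in combinations(range(len(pred)), 2):
        if target[i] == target[j]:
            continue
        # sign is +1 when sample i should score higher than sample j.
        sign = 1.0 if target[i] > target[j] else -1.0
        penalties.append(max(0.0, margin - sign * (pred[i] - pred[j])))
    return sum(penalties) / len(penalties) if penalties else 0.0


def combined_loss(pred, target, alpha=1.0, margin=0.0):
    """Regression term plus a weighted ranking term (alpha is hypothetical)."""
    return mse_loss(pred, target) + alpha * pairwise_ranking_loss(pred, target, margin)


if __name__ == "__main__":
    truth = [85.0, 75.0, 95.0]
    ordered = [80.0, 70.0, 90.0]      # every prediction off by 5, order preserved
    misordered = [70.0, 80.0, 90.0]   # first two videos swapped in rank
    print(combined_loss(ordered, truth))
    print(combined_loss(misordered, truth))
```

With a zero margin, predictions that preserve the ground-truth order incur only the regression term, while a swapped pair adds a penalty proportional to how far apart the two mis-ordered predictions are. This is the behaviour described above: the score values and the ranking are both constrained.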

This work was partially supported by the 973 Program under contract No. 2015CB351802 and by the Natural Science Foundation of China under contracts Nos. 61390511, 61472398, and 61532018.


Author information

Corresponding author

Correspondence to Xilin Chen.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, Y., Chai, X., Chen, X. (2019). ScoringNet: Learning Key Fragment for Action Quality Assessment with Ranking Loss in Skilled Sports. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11366. Springer, Cham. https://doi.org/10.1007/978-3-030-20876-9_10

  • DOI: https://doi.org/10.1007/978-3-030-20876-9_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20875-2

  • Online ISBN: 978-3-030-20876-9

  • eBook Packages: Computer Science, Computer Science (R0)
