Abstract
Automatically scoring athletes' performance in skilled sports has drawn increasing attention from the academic community. However, extracting effective features from long sports videos and predicting reasonable scores remain challenging. In this paper, we introduce ScoringNet, a novel network consisting of key fragment segmentation (KFS) and score prediction (SP), to address these two problems. To obtain effective features, KFS uses semantic video segmentation to extract key fragments and discard irrelevant ones; a 3D convolutional neural network then extracts features from each key fragment. In score prediction, we fuse a ranking loss into the traditional regression loss so that the predictions are reasonable in terms of both score values and ranking. Through deep learning, we narrow the gap between predictions and ground-truth scores while making the predictions satisfy the ranking constraint. Extensive experiments show that our method achieves state-of-the-art results on three datasets.
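To make the fused objective concrete, the following is a minimal sketch of how a regression loss and a pairwise ranking loss might be combined during score prediction. The exact formulation used by ScoringNet is not reproduced here; the hinge-style pairwise term, the `margin`, and the `rank_weight` trade-off are assumptions for illustration only.

```python
import torch
import torch.nn as nn


class ScoreWithRankingLoss(nn.Module):
    """Hypothetical combined loss: regression term plus pairwise ranking term.

    The regression term pulls predicted scores toward the judges' scores,
    while the ranking term penalises pairs of samples whose predicted
    ordering contradicts the ground-truth ordering.
    """

    def __init__(self, margin: float = 0.1, rank_weight: float = 0.5):
        super().__init__()
        self.margin = margin          # assumed hinge margin
        self.rank_weight = rank_weight  # assumed weight of the ranking term
        self.mse = nn.MSELoss()

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # pred, target: 1-D tensors of shape (batch_size,)
        # Regression term: keep predicted values close to ground-truth scores.
        loss_value = self.mse(pred, target)

        # Pairwise ranking term: for every pair whose ground-truth scores are
        # ordered, penalise predictions that violate the same ordering by
        # more than `margin` (a hinge-style ranking loss).
        diff_pred = pred.unsqueeze(0) - pred.unsqueeze(1)      # pred[j] - pred[i]
        diff_true = target.unsqueeze(0) - target.unsqueeze(1)  # target[j] - target[i]
        pair_mask = (diff_true > 0).float()                    # pairs with target[j] > target[i]
        rank_violation = torch.clamp(self.margin - diff_pred, min=0.0)
        loss_rank = (pair_mask * rank_violation).sum() / pair_mask.sum().clamp(min=1.0)

        return loss_value + self.rank_weight * loss_rank


# Usage sketch: scores predicted for a batch of videos vs. judges' scores.
criterion = ScoreWithRankingLoss(margin=0.1, rank_weight=0.5)
loss = criterion(torch.tensor([7.2, 8.5, 6.1]), torch.tensor([7.0, 9.0, 6.5]))
```

The ranking term only adds a constraint on relative order, so it can be weighted independently of the regression term without changing the scale of the predicted scores.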
This work was partially supported by the 973 Program under contract No. 2015CB351802 and the Natural Science Foundation of China under contracts Nos. 61390511, 61472398, and 61532018.