DOI: 10.1145/3422844.3423054
research-article

Self-Supervised Small Soccer Player Detection and Tracking

Published: 12 October 2020

Abstract

In a soccer game, the information provided by detecting and tracking players brings crucial clues for analyzing and understanding tactical aspects of the game, including individual and team actions. State-of-the-art tracking algorithms achieve impressive results in the scenarios they have been trained on, but they fail in challenging ones such as soccer games. This is frequently due to the players' small relative size and the similar appearance of players on the same team. Although a straightforward solution would be to retrain these models on a more specific dataset, the lack of publicly available annotated datasets of this kind calls for other effective solutions. In this work, we propose a self-supervised pipeline that detects and tracks low-resolution soccer players under different recording conditions without any need for ground-truth data. Extensive quantitative and qualitative experimental results are presented to evaluate its performance. We also present a comparison with several state-of-the-art methods, showing that both the proposed detector and the proposed tracker achieve top-tier results, in particular in the presence of small players. Code is available at "https://github.com/samuro95/Self-Supervised-Small-Soccer-Player-Detection-Tracking".
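
To make the idea of training without ground-truth data more concrete, the following is a minimal sketch of one common self-training ingredient: keeping only confident detections from an existing detector and using them as pseudo-labels. It is not taken from the paper or its repository. The names `Detection`, `select_pseudo_labels`, `build_pseudo_labelled_set`, the `teacher` callable, and the 0.8 threshold are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    """One bounding box (x1, y1, x2, y2) with a detector confidence score."""
    box: Tuple[float, float, float, float]
    score: float


def select_pseudo_labels(detections: Sequence[Detection],
                         min_score: float = 0.8) -> List[Detection]:
    """Keep only detections confident enough to serve as training targets.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [d for d in detections if d.score >= min_score]


def build_pseudo_labelled_set(frames: Sequence[object],
                              teacher: Callable[[object], Sequence[Detection]],
                              min_score: float = 0.8):
    """Run a (hypothetical) teacher detector on unlabelled frames.

    Returns a list of (frame_index, kept_detections) pairs; frames with no
    confident detection are skipped so they do not contribute empty targets.
    """
    dataset = []
    for frame_index, frame in enumerate(frames):
        kept = select_pseudo_labels(teacher(frame), min_score)
        if kept:
            dataset.append((frame_index, kept))
    return dataset
```

A student detector could then be trained on the returned pairs in place of manual annotations; how the authors actually generate and use their training signal is described in the paper and the linked code, not here.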

Supplementary Material

ZIP File (mmsport42aux.zip)
The supplementary material contains a 4-page PDF file with additional comments, explanations, and experiments.

Information & Contributors

Information

Published In

MMSports '20: Proceedings of the 3rd International Workshop on Multimedia Content Analysis in Sports
October 2020
66 pages
ISBN: 9781450381499
DOI: 10.1145/3422844
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cnn
  2. multi-player tracking
  3. neural networks
  4. player detection
  5. self-supervised
  6. single camera
  7. small object detection
  8. soccer

Qualifiers

  • Research-article

Funding Sources

  • H2020-MSCA-RISE-2017
  • RED2018-102511-T
  • ENS Paris-Saclay
  • MICINN/FEDER UE project

Conference

MM '20

Acceptance Rates

Overall Acceptance Rate 29 of 49 submissions, 59%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 67
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 18 Jan 2025

Cited By

  • (2024) Multimodal Shot Prediction Based on Spatial-Temporal Interaction between Players in Soccer Videos. Applied Sciences 14:11 (4847). DOI: 10.3390/app14114847. Online publication date: 3-Jun-2024
  • (2024) The Eye in the Sky—A Method to Obtain On-Field Locations of Australian Rules Football Athletes. AI 5:2 (733-745). DOI: 10.3390/ai5020038. Online publication date: 16-May-2024
  • (2024) FootyVision: Multi-Object Tracking, Localisation, and Augmentation of Players and Ball in Football Video. Proceedings of the 2024 9th International Conference on Multimedia and Image Processing (15-25). DOI: 10.1145/3665026.3665029. Online publication date: 20-Apr-2024
  • (2024) Focus bank: an innovative mechanism for improving the performance of focus-and-detect algorithms in tracking multiple soccer players. 2024 9th International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS) (616-624). DOI: 10.1109/ICIIBMS62405.2024.10792772. Online publication date: 21-Nov-2024
  • (2024) TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (3357-3366). DOI: 10.1109/CVPRW63382.2024.00340. Online publication date: 17-Jun-2024
  • (2024) SoccerNet-Depth: a Scalable Dataset for Monocular Depth Estimation in Sports Videos. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (3280-3282). DOI: 10.1109/CVPRW63382.2024.00333. Online publication date: 17-Jun-2024
  • (2024) MV-Soccer: Motion-Vector Augmented Instance Segmentation for Soccer Player Tracking. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (3245-3255). DOI: 10.1109/CVPRW63382.2024.00330. Online publication date: 17-Jun-2024
  • (2024) Perspective Transform Based YOLO With Weighted Intersect Fusion for Forecasting the Possession Sequence of the Live Football Game. IEEE Access 12 (75542-75558). DOI: 10.1109/ACCESS.2024.3402370. Online publication date: 2024
  • (2024) EIoU-distance loss: an automated team-wise player detection and tracking with jersey colour recognition in soccer. Connection Science 36:1. DOI: 10.1080/09540091.2023.2291991. Online publication date: 3-Feb-2024
  • (2024) DCTracker. Knowledge-Based Systems 304:C. DOI: 10.1016/j.knosys.2024.112528. Online publication date: 25-Nov-2024
