A novel finetuned YOLOv6 transfer learning model for real-time object detection

Chhaya Gupta¹, Nasib Singh Gill¹, Preeti Gulia¹ & Jyotir Moy Chatterjee²

A Correction to this article was published on 09 May 2023

This article has been updated

Abstract

Object detection and object recognition are among the most important applications of computer vision. Performing object detection well requires a model with high detection accuracy, but increasing accuracy typically increases the model's size and computation cost, which makes deep learning difficult to deploy in embedded environments. To address this problem, this research proposes a transfer-learning-based model for real-time object detection that improves the effectiveness of the YOLO algorithm, using YOLOv6 as the baseline. The study proposes a pruning and finetuning algorithm together with a transfer learning algorithm to improve the model's detection accuracy and inference speed. The paper also describes how the proposed model identifies all objects in a scene (indoor as well as outdoor) and provides voice output to warn the user about nearby and faraway objects. The Google Text-to-Speech (gTTS) library is used to generate the audio feedback. The model is trained on the MS-COCO dataset and compared with the TensorFlow Single Shot Detector, Faster R-CNN, Mask R-CNN, YOLOv4, and the baseline YOLOv6 model. After pruning the YOLOv6 baseline model by 30%, 40%, and 50%, the finetuned YOLOv6 framework achieves 37.8% higher average precision (AP) with 1235 frames per second (FPS).
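
The abstract does not say which criterion is used to decide what is pruned, so the following is only a minimal sketch of one common approach, assuming filters are ranked by the L1 norm of their weights and the weakest 30%, 40%, or 50% are dropped before finetuning. The function names (`l1_norms`, `select_filters_to_keep`) and the toy layer are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def l1_norms(filters: Sequence[Sequence[float]]) -> List[float]:
    """Return the L1 norm (sum of absolute weights) of each filter."""
    return [sum(abs(w) for w in f) for f in filters]


def select_filters_to_keep(
    filters: Sequence[Sequence[float]], prune_percent: int
) -> List[int]:
    """Return the indices of the filters that survive pruning.

    Assumption: the `prune_percent` weakest filters (lowest L1 norm) are
    removed. The paper may use a different ranking criterion.
    """
    if not 0 <= prune_percent < 100:
        raise ValueError("prune_percent must be in the range 0..99")
    norms = l1_norms(filters)
    # Integer arithmetic avoids floating-point surprises in the count.
    n_prune = len(filters) * prune_percent // 100
    # Rank filter indices from weakest to strongest; ties are broken by index.
    ranked = sorted(range(len(filters)), key=lambda i: (norms[i], i))
    pruned = set(ranked[:n_prune])
    # Preserve the original filter order among the survivors.
    return [i for i in range(len(filters)) if i not in pruned]


if __name__ == "__main__":
    # Toy layer with ten two-weight filters (illustrative values only).
    toy_layer = [[0.1 * (i + 1), -0.05 * (i + 1)] for i in range(10)]
    for percent in (30, 40, 50):
        kept = select_filters_to_keep(toy_layer, percent)
        print(f"prune {percent}%: keep {len(kept)} of {len(toy_layer)} filters")
```

In a real pipeline the surviving indices would be used to rebuild a narrower layer, after which the network is finetuned to recover accuracy; that step depends on the training framework and is left out here.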

Data availability

Data will be made available on appropriate request.

Change history

09 May 2023
A Correction to this paper has been published: https://doi.org/10.1007/s11554-023-01313-8

Author information

Authors and Affiliations

Department of Computer Science and Applications, Maharshi Dayanand University, Rohtak, India
Chhaya Gupta, Nasib Singh Gill & Preeti Gulia
Department of IT, Lord Buddha Education Foundation (Asia Pacific University), Kathmandu, Nepal
Jyotir Moy Chatterjee

Contributions

All authors contributed equally to this manuscript.

Corresponding author

Correspondence to Jyotir Moy Chatterjee.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: the affiliation of author Jyotir Moy Chatterjee was stated incorrectly. It has been corrected.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gupta, C., Gill, N.S., Gulia, P. et al. A novel finetuned YOLOv6 transfer learning model for real-time object detection. J Real-Time Image Proc 20, 42 (2023). https://doi.org/10.1007/s11554-023-01299-3

Received: 18 October 2022
Accepted: 25 March 2023
Published: 10 April 2023
DOI: https://doi.org/10.1007/s11554-023-01299-3
