Abstract
You only look once (YOLO) is a state-of-the-art object detection model with a novel architecture that balances model complexity against inference time. Among YOLO versions, YOLOv7 employs a lightweight backbone network, E-ELAN, that allows it to learn more efficiently without affecting the gradient path. However, YOLOv7 models struggle to classify objects whose classes share a similar shape and texture, as is common for personal protective equipment (PPE). For instance, the Glass and NoGlass PPE objects appear almost identical when the image is captured from a distance. To mitigate this issue and further improve the classification performance of YOLOv7, a modified version, referred to as the contrastive-based model, is introduced in this work. The basic idea is to add a contrastive loss branch that assists the YOLOv7 model in differentiating classes by pushing apart instances of different classes in the embedding space. To validate the effectiveness of the implemented contrastive-based YOLO, it was evaluated on two datasets: the public CHV dataset and our own indoor-collected dataset, named JRCAI. The datasets contain 12 different PPE classes, and notably, we annotated both datasets for the studied 12 PPE objects. The experimental results show that the proposed model outperforms the standard YOLOv7 model by 2% in mAP@0.5. Furthermore, the proposed model outperforms other YOLO variants as well as cutting-edge object detection models such as YOLOv8, Faster-RCNN, and DAB-DETR.
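The contrastive branch described above can be illustrated with a supervised contrastive loss over instance embeddings: same-class instances are pulled together while instances of different classes are pushed apart. The following is a minimal PyTorch sketch, not the authors' implementation; the function name, temperature value, and the toy Glass/NoGlass class ids are illustrative assumptions.

# Minimal sketch (illustrative only) of a supervised contrastive loss
# that pushes apart embeddings of different classes.
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """embeddings: (N, D) per-instance features; labels: (N,) class ids."""
    z = F.normalize(embeddings, dim=1)          # unit-length embeddings
    sim = z @ z.T / temperature                 # pairwise cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)      # exclude self-pairs
    # Positives are other instances carrying the same class label.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    per_anchor = -(log_prob * pos_mask).sum(dim=1) / pos_counts
    # Average over anchors that actually have at least one positive.
    return per_anchor[pos_mask.any(dim=1)].mean()

# Toy usage: 8 embeddings from two visually similar PPE classes,
# e.g. 0 = Glass, 1 = NoGlass (hypothetical ids).
feats = torch.randn(8, 128)
labels = torch.tensor([0, 0, 1, 1, 0, 1, 0, 1])
print(supervised_contrastive_loss(feats, labels).item())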
Data availability
The CHV dataset is publicly available, and its annotations are available upon request.
References
Nan Y, Zhang H, Zeng Y, Zheng J, Ge Y (2023) Intelligent detection of multi-class pitaya fruits in target picking row based on WGB-YOLO network. Comput Electron Agric 208:107780. https://doi.org/10.1016/j.compag.2023.107780
Dang F, Chen D, Lu Y, Li Z (2023) YOLOWeeds: a novel benchmark of YOLO object detectors for multi-class weed detection in cotton production systems. Comput Electron Agric 205:107655. https://doi.org/10.1016/j.compag.2023.107655
Wang X, Zhao Q, Jiang P, Zheng Y, Yuan L, Yuan P (2022) LDS-YOLO: a lightweight small object detection method for dead trees from shelter forest. Comput Electron Agric 198:107035. https://doi.org/10.1016/j.compag.2022.107035
Su Y, Liu Q, Xie W, Hu P (2022) YOLO-LOGO: a transformer-based YOLO segmentation model for breast mass detection and segmentation in digital mammograms. Comput Methods Programs Biomed 221:106903. https://doi.org/10.1016/j.cmpb.2022.106903
Salman ME, Çakirsoy Çakar G, Azimjonov J, Kösem M, Cedimoğlu İH (2022) Automated prostate cancer grading and diagnosis system using deep learning-based Yolo object detection algorithm. Expert Syst Appl 201:117148. https://doi.org/10.1016/j.eswa.2022.117148
Souza BJ, Stefenon SF, Singh G, Freire RZ (2023) Hybrid-YOLO for classification of insulators defects in transmission lines based on UAV. Int J Electr Power Energy Syst 148:108982. https://doi.org/10.1016/j.ijepes.2023.108982
Zhao C, Shu X, Yan X, Zuo X, Zhu F (2023) RDD-YOLO: a modified YOLO for detection of steel surface defects. Measurement 214:112776. https://doi.org/10.1016/j.measurement.2023.112776
Putra YC, Wijayanto AW (2023) Automatic detection and counting of oil palm trees using remote sensing and object-based deep learning. Rem Sens Appl Soc Environ 29:100914. https://doi.org/10.1016/j.rsase.2022.100914
Li R, Shen Y (2023) YOLOSR-IST: a deep learning method for small target detection in infrared remote sensing images based on super-resolution and YOLO. Sign Process 208:108962. https://doi.org/10.1016/j.sigpro.2023.108962
Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788
Wang C-Y, Bochkovskiy A, Liao H-YM (2023) YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7464–7475
Jocher G, Chaurasia A, Qiu J (2023) YOLO by Ultralytics. https://github.com/ultralytics/ultralytics
Xia X, Chai X, Li Z, Zhang N, Sun T (2023) MTYOLOX: multi-transformers-enabled YOLO for tree-level apple inflorescences detection and density mapping. Comput Electron Agric 209:107803. https://doi.org/10.1016/j.compag.2023.107803
Yu G, Cai R, Su J, Hou M, Deng R (2023) U-YOLOv7: a network for underwater organism detection. Ecol Inform 75:102108. https://doi.org/10.1016/j.ecoinf.2023.102108
Ye G, Qu J, Tao J, Dai W, Mao Y, Jin Q (2023) Autonomous surface crack identification of concrete structures based on the YOLOv7 algorithm. J Build Eng 73:106688. https://doi.org/10.1016/j.jobe.2023.106688
Zhu B, Xiao G, Zhang Y, Gao H (2023) Multi-classification recognition and quantitative characterization of surface defects in belt grinding based on YOLOv7. Measurement 216:112937. https://doi.org/10.1016/j.measurement.2023.112937
Park M, Tran DQ, Bak J, Park S (2023) Small and overlapping worker detection at construction sites. Autom Constr 151:104856. https://doi.org/10.1016/j.autcon.2023.104856
Wang Z, Wu Y, Yang L, Thirunavukarasu A, Evison C, Zhao Y (2021) Fast personal protective equipment detection for real construction sites using deep learning approaches. Sensors 21(10):3478. https://doi.org/10.3390/s21103478
Lee J-Y, Choi W-S, Choi S-H (2023) Verification and performance comparison of CNN-based algorithms for two-step helmet-wearing detection. Expert Syst Appl 225:120096. https://doi.org/10.1016/j.eswa.2023.120096
Iannizzotto G, Lo Bello L, Patti G (2021) Personal Protection equipment detection system for embedded devices based on DNN and fuzzy logic. Exp Syst Appl 184:115447. https://doi.org/10.1016/j.eswa.2021.115447
Ke X, Chen W, Guo W (2022) 100+ FPS detector of personal protective equipment for worker safety: a deep learning approach for green edge computing. Peer-Peer Netw Appl 15(2):950–972. https://doi.org/10.1007/s12083-021-01258-4
Xu ZP, Zhang Y, Cheng J, Ge G (2022) Safety helmet wearing detection based on YOLOv5 of attention mechanism. J Phys Conf Ser 2213(1):012038. https://doi.org/10.1088/1742-6596/2213/1/012038
Chen W, Li C, Guo H (2023) A lightweight face-assisted object detection model for welding helmet use. Expert Syst Appl 221:119764. https://doi.org/10.1016/j.eswa.2023.119764
Chen Z, Zhang F, Liu H, Wang L, Zhang Q, Guo L (2023) Real-time detection algorithm of helmet and reflective vest based on improved YOLOv5. J Real-Time Image Process 20(1):4. https://doi.org/10.1007/s11554-023-01268-w
Wei Z, Wu N, Li F, Wang K, Zhang W (2023) MoCo4SRec: a momentum contrastive learning framework for sequential recommendation. Expert Syst Appl 223:119911. https://doi.org/10.1016/j.eswa.2023.119911
He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9729–9738
Xu C, Li D, Yang M (2022) Adversarial momentum-contrastive pre-training. Patt Recognit Lett 160:172–179. https://doi.org/10.1016/j.patrec.2022.07.005
Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, PMLR, pp 1597–1607
Heidler K et al (2023) Self-supervised audiovisual representation learning for remote sensing data. Int J Appl Earth Obs Geoinformation 116:103130. https://doi.org/10.1016/j.jag.2022.103130
Yang X, Zhang Z, Cui R (2022) TimeCLR: a self-supervised contrastive learning framework for univariate time series representation. Knowl Based Syst 245:108606. https://doi.org/10.1016/j.knosys.2022.108606
Grill J-B et al (2020) Bootstrap your own latent-a new approach to self-supervised learning. Adv Neural Inf Process Syst 33:21271–21284
Wei L, Xie L, Zhou W, Li H, Tian Q (2023) Exploring the diversity and invariance in yourself for visual pre-training task. Patt Recognit 139:109437. https://doi.org/10.1016/j.patcog.2023.109437
Chen X, He K (2021) Exploring simple siamese representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 15750–15758
Li K et al (2023) DeAF: a multimodal deep learning framework for disease prediction. Comput Biol Med 156:106715. https://doi.org/10.1016/j.compbiomed.2023.106715
van den Oord A, Li Y, Vinyals O (2018) Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748
Liu Y et al (2022) Contrastive predictive coding with transformer for video representation learning. Neurocomputing 482:154–162. https://doi.org/10.1016/j.neucom.2021.11.031
Zhu S, Zheng W, Pang H (2023) CPAE: contrastive predictive autoencoder for unsupervised pre-training in health status prediction. Comput Methods Programs Biomed 234:107484. https://doi.org/10.1016/j.cmpb.2023.107484
Li J, Zhou P, Xiong C, Hoi SC (2020) Prototypical contrastive learning of unsupervised representations. arXiv preprint arXiv:2005.04966
Lopez-Martin M, Sanchez-Esguevillas A, Arribas JI, Carro B (2022) Supervised contrastive learning over prototype-label embeddings for network intrusion detection. Inf Fusion 79:200–228. https://doi.org/10.1016/j.inffus.2021.09.014
Wang Z, Wu Y, Yang L. Real-time PPE detection & open dataset. https://github.com/ZijianWang-ZW/PPE_detection
Liu S et al (2022) DAB-DETR: dynamic anchor boxes are better queries for DETR. https://doi.org/10.48550/ARXIV.2201.12329
Koeshidayatullah A, Al-Azani S, Baraboshkin EE, Alfarraj M (2022) FaciesViT: vision transformer for an improved core lithofacies prediction. Front Earth Sci 10:992442. https://doi.org/10.3389/feart.2022.992442
Acknowledgements
The authors would like to acknowledge the support received from the Saudi Data and AI Authority (SDAIA) and King Fahd University of Petroleum and Minerals (KFUPM) under the SDAIA-KFUPM Joint Research Center for Artificial Intelligence Grant JRC-AI-UCG-01.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest in this work.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Samma, H., Al-Azani, S., Luqman, H. et al. Contrastive-based YOLOv7 for personal protective equipment detection. Neural Comput & Applic 36, 2445–2457 (2024). https://doi.org/10.1007/s00521-023-09212-6