RDD-YOLOv5: Road Defect Detection Algorithm with Self-Attention Based on Unmanned Aerial Vehicle Inspection
Figure 1. The Road Defect Dataset contains six scenarios: countryside road, campus road, feeder road, arterial, highway, and expressway. The variety and complexity of the scenarios improve the robustness of the model.
Figure 2. Statistical patterns of the Road Defect Dataset. (a) Pie chart of the percentage of the four defect classes. (b) Histogram of the instance aspect ratio distribution for the Road Defect Dataset, China Drone, CrackTree200, and CrackForest Dataset. (c) Line chart of the distribution of instances in individual images across the datasets. (d) Box plot of the proportion of road defect categories within images.
Figure 3. Structure of RDD-YOLOv5. SW Block is the Swin Transformer block, which contains shifted-window multi-head self-attention. EVC Module is the explicit vision center, which focuses on capturing global and local information. CBG, the basic module, combines a convolution layer, batch normalization, and GELU activation.
Figure 4. Structure of the SW Block in RDD-YOLOv5. The SW Block consists of window-based and shifted-window multi-head self-attention. The figure simplifies how residual connections are applied to each module.
Figure 5. Structure of the two modules of the EVC Module. (a) Structure of the lightweight MLP. $X_{out1}$ is the first output, produced by the lightweight MLP. (b) Structure of the LVC. $\oplus$ denotes channel-wise addition, $\otimes$ denotes channel-wise multiplication, and $X_{out2}$ is the output of the LVC.
Figure 6. Comparison of the SiLU, ELU, and GELU activations and their derivatives, and the structure of CBG. (a) The SiLU, ELU, and GELU activation functions. (b) The derivatives of SiLU, ELU, and GELU. (c) Structure of CBG (Convolution + Batch Normalization + GELU).
Figure 7. Diagram of the UAV flight altitude and flight speed. The blue and orange lines represent the road lines. The green lines represent the road green belt. (a) Camera shooting height diagram. (b) Schematic of continuous UAV capture and image overlap.
Figure 8. Performance of the strategies applied to YOLOv5s in the ablation experiments. (a) mAP@0.5. (b) mAP@0.5:0.95. The panels show the contribution of the individual modules under each metric.
Figure 9. Evaluation metrics for the different modules in the ablation experiments at various confidence levels. (a) Precision. (b) Recall. (c) Relationship between precision and recall. (d) F1 score of the ablation models, F1 = 2PR/(P + R).
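The three activation functions compared in Figure 6 can be written out directly. The sketch below is illustrative only: it assumes the exact erf-based form of GELU and α = 1 for ELU (the caption does not say which variants are plotted), and it estimates the derivatives of panel (b) with a central finite difference rather than a closed form.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def silu(x: float) -> float:
    """SiLU (swish): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def elu(x: float, alpha: float = 1.0) -> float:
    """ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def derivative(f, x: float, h: float = 1e-5) -> float:
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Tabulate each function and its estimated slope on a few sample points.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    values = ", ".join(
        f"{name}={f(x):+.4f} (slope {derivative(f, x):+.4f})"
        for name, f in (("GELU", gelu), ("SiLU", silu), ("ELU", elu))
    )
    print(f"x={x:+.1f}: {values}")
```

All three functions are smooth and take small negative values for negative inputs, which is the property the figure contrasts.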
Abstract
1. Introduction
- A road defect dataset is built. It includes the defect categories common to road datasets and covers multiple road backgrounds and traffic conditions, with precise manual annotations. To ensure its validity and generality, several data augmentation techniques suited to road defect images taken from the UAV perspective are applied to the dataset.
- An improved YOLOv5 algorithm named RDD-YOLOv5 (Road Defect Detection YOLOv5) is proposed. To handle the complexity and diversity of road defects, the BottleneckCSP (C3) module is replaced with a self-attention module called SW Block, and a spatial explicit vision center called EVC Block is added to the neck, so that the network captures long-range dependencies and aggregates local critical regions. Finally, the activation function is replaced with GELU to improve the network's fitting ability.
- A UAV flight platform is established, including an accurate mathematical model for flight altitude and flight speed, to improve the efficiency of the UAV and the quality of the road defect images it collects. Several training techniques further improve performance: the anchors are recalculated to improve localization accuracy for the abnormally shaped bounding boxes in the dataset developed in this study, and label smoothing is applied to mitigate the impact of manual annotation errors during training, which improves precision.
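The label smoothing mentioned in the last contribution can be sketched as follows. This is a minimal illustration of the standard uniform formulation, (1 − ε)·y + ε/K, not necessarily the exact variant used in training; the smoothing factor ε = 0.1 and the example probabilities are assumptions, and K = 4 follows the four defect classes of the dataset.

```python
import math


def smooth_labels(one_hot, eps=0.1):
    """Uniform label smoothing: (1 - eps) * y + eps / K for K classes."""
    k = len(one_hot)
    return [(1.0 - eps) * y + eps / k for y in one_hot]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


hard = [0.0, 1.0, 0.0, 0.0]                 # annotated class: index 1 of 4
soft = smooth_labels(hard, eps=0.1)         # eps = 0.1 is an assumed value
confident = [0.001, 0.997, 0.001, 0.001]    # hypothetical over-confident prediction

print("smoothed target:", soft)
print("loss with hard target:", cross_entropy(hard, confident))
print("loss with soft target:", cross_entropy(soft, confident))
```

With the smoothed target, an over-confident prediction is no longer rewarded with a near-zero loss, which is why a mislabeled sample pulls the model less strongly toward the wrong class.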
2. Related Work
2.1. Detection Carrier
2.2. Detection Method
3. Dataset
3.1. Dataset Collection
3.2. Dataset Characteristic
3.3. Dataset Augmentation
- Gaussian blur
- Gaussian noise
- Poisson noise
- Brightness adjustment
- Hue adjustment
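Three of the augmentations listed above can be sketched on a small grayscale image stored as nested lists. This is only an illustration under assumed parameters (a 3×3 binomial blur kernel, σ = 10 for the noise, a brightness factor of 1.2); the settings used for the dataset are not given here, and Poisson noise and hue adjustment are left out to keep the sketch short.

```python
import random


def clip(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def gaussian_blur_3x3(img):
    """Blur with the kernel [[1,2,1],[2,4,2],[1,2,1]] / 16, replicating edges."""
    h, w = len(img), len(img[0])
    weights = (1, 2, 1)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index at the border
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index at the border
                    acc += weights[dy + 1] * weights[dx + 1] * img[yy][xx]
            row.append(clip(acc / 16))
        out.append(row)
    return out


def add_gaussian_noise(img, sigma=10.0, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma to every pixel."""
    rng = random.Random(seed)
    return [[clip(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def adjust_brightness(img, factor=1.2):
    """Scale every pixel by factor; values above 255 saturate."""
    return [[clip(p * factor) for p in row] for row in img]


image = [[0, 0, 0], [0, 160, 0], [0, 0, 0]]
print(gaussian_blur_3x3(image))
print(add_gaussian_noise(image))
print(adjust_brightness(image))
```

These operations change pixel values only, so the bounding-box annotations of an image can be reused unchanged for its augmented copies.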
4. Methods
4.1. Introduction of YOLOv5
4.2. Overview of RDD-YOLOv5
4.3. SW Block (Swin Transformer)
4.4. EVC Block (Explicit Vision Center Block)
4.5. Convolution Batch Normalization GELU (CBG)
5. Experiments
5.1. Flight Setup
5.1.1. Flight Altitude
5.1.2. Flight Speed
5.2. Operation Environment
5.3. Experimental Tricks
5.3.1. K-Means++
5.3.2. Label Smoothing
5.4. Evaluation Metrics
5.5. Ablation Experiments
5.6. Comparison Experiments
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
From | n | Params | Module | Arguments |
---|---|---|---|---|
−1 | 1 | 3520 | CBG | 64, 6, 2, 2 |
−1 | 1 | 18,560 | CBG | 128, 64, 3, 2 |
−1 | 1 | 18,816 | C3 | 128 |
−1 | 1 | 73,984 | CBG | 256, 3, 2 |
−1 | 2 | 115,712 | C3 | 256 |
−1 | 1 | 295,424 | CBG | 512, 3, 2 |
−1 | 3 | 625,152 | C3 | 512 |
−1 | 1 | 1,180,672 | CBG | 1024, 3, 2 |
−1 | 2 | 2,109,456 | SW Block | 1024 |
−1 | 1 | 656,896 | SPPF | 1024, 5 |
−1 | 1 | 131,584 | CBG | 512, 1, 1 |
−1 | 1 | 0 | Upsample | None |
−1 | 1 | 4,287,680 | EVC Block | 256, 256 |
[−1, 6] | 1 | 0 | Concat | 1 |
−1 | 1 | 361,984 | C3 | 512, 256, 1, False |
−1 | 1 | 33,024 | CBG | 256, 128, 1, 1 |
−1 | 1 | 0 | Upsample | None |
−1 | 1 | 1,077,952 | EVC Block | 128, 128 |
[−1, 4] | 1 | 0 | Concat | 1 |
−1 | 1 | 90,880 | C3 | 256, 128, 1, False |
−1 | 1 | 147,712 | CBG | 128, 128, 3, 2 |
[−1, 15] | 1 | 0 | Concat | 1 |
−1 | 1 | 296,448 | C3 | 256, 256, 1, False |
−1 | 1 | 590,336 | CBG | 256, 256, 3, 2 |
[−1, 10] | 1 | 0 | Concat | 1 |
−1 | 1 | 1,182,720 | C3 | 512, 512, 1, False |
[19, 22, 25] | 1 | 16,182 | Detect | anchors |
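The Params column of the layer table can be checked by hand for the CBG rows. The sketch below assumes that each CBG is a bias-free k×k convolution followed by batch normalization with one weight and one bias per channel, and that GELU adds no parameters. It also assumes the YOLOv5s width multiplier of 0.5, so that the backbone rows use half of the listed channel counts; the (input, output) channel pairs are inferred on that basis and are not stated in the table.

```python
def conv_bn_params(c_in: int, c_out: int, k: int) -> int:
    """Learnable parameters of a bias-free k x k convolution plus batch normalization."""
    conv_weights = c_in * c_out * k * k
    bn_weights_and_biases = 2 * c_out
    return conv_weights + bn_weights_and_biases


# (input channels, output channels, kernel size, Params value reported in the table)
cbg_rows = [
    (3, 32, 6, 3520),        # stem CBG on an RGB input
    (32, 64, 3, 18560),
    (64, 128, 3, 73984),
    (128, 256, 3, 295424),
    (256, 512, 3, 1180672),
    (512, 256, 1, 131584),   # 1 x 1 CBG after SPPF
    (256, 128, 1, 33024),
    (128, 128, 3, 147712),
    (256, 256, 3, 590336),
]

for c_in, c_out, k, reported in cbg_rows:
    computed = conv_bn_params(c_in, c_out, k)
    status = "matches" if computed == reported else "differs from"
    print(f"{c_in:>3} -> {c_out:>3}, k={k}: {computed:>9,} {status} {reported:,}")
```

Under these assumptions every CBG row reproduces the reported count, which suggests that replacing SiLU with GELU leaves the parameter budget of the convolutional layers unchanged.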
Infrastructure | Traffic Signs | Traffic Lights | Street Lights | Roadside Trees | Utility Poles |
---|---|---|---|---|---|
Altitude (m) | 4.5 | 3~7 | 6~12 | <15 | 12~17 |
Camera Lens | 24 mm | 35 mm | 50 mm |
---|---|---|---|
GSD | H/55 | H/80 | H/114 |
Road Classes | Sensor Size (mm) | Focal Length (mm) | Flight Altitude (m) | Flight Speed (m/s) | t (s) |
---|---|---|---|---|---|
Low-grade roads | 17.3 × 13 | 24 | 5~8 | 1.2 | 2 |
High-grade roads | 17.3 × 13 | 24 | 15~20 | 1.2 | 2 |
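The relationship between altitude, speed, and image overlap in the flight table can be sketched with pinhole-camera geometry. This is a rough illustration, not the paper's flight model: it assumes a nadir-pointing camera over a flat road, that the 13 mm sensor side lies along the flight direction, and that t is the interval between consecutive shots.

```python
def ground_footprint(altitude_m: float, sensor_mm: float, focal_mm: float) -> float:
    """Ground distance covered by one sensor side (pinhole model, nadir view)."""
    return altitude_m * sensor_mm / focal_mm


def forward_overlap(altitude_m, sensor_along_mm, focal_mm, speed_mps, interval_s):
    """Fraction of a frame shared with the next one; negative values mean gaps."""
    footprint = ground_footprint(altitude_m, sensor_along_mm, focal_mm)
    return 1.0 - speed_mps * interval_s / footprint


SENSOR_W_MM, SENSOR_H_MM = 17.3, 13.0   # sensor size from the table
FOCAL_MM = 24.0                         # focal length from the table
SPEED_MPS, INTERVAL_S = 1.2, 2.0        # flight speed and t from the table

for altitude in (5.0, 8.0, 15.0, 20.0):  # altitude range limits from the table
    width = ground_footprint(altitude, SENSOR_W_MM, FOCAL_MM)
    overlap = forward_overlap(altitude, SENSOR_H_MM, FOCAL_MM, SPEED_MPS, INTERVAL_S)
    print(f"H={altitude:>4.1f} m: swath width {width:5.2f} m, forward overlap {overlap:6.1%}")
```

At a fixed speed and interval the overlap grows with altitude, so the low-altitude flights over low-grade roads are the ones where consecutive frames share the least ground.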
Object Scale | Anchor Size |
---|---|
Large | (19, 27), (59, 14), (17, 64) |
Medium | (39, 64), (24, 122), (132, 23) |
Small | (32, 280), (361, 44), (54, 316) |
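Anchor sizes such as those in the table above are obtained by clustering the (width, height) pairs of the annotated boxes. The sketch below shows one common way to do this, K-Means++ seeding followed by Lloyd iterations with 1 − IoU as the distance; the distance choice, the iteration count, and the toy boxes are assumptions made for illustration, not details taken from the paper.

```python
import random


def iou_wh(a, b):
    """IoU of two boxes given as (w, h) and aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeanspp_init(boxes, k, rng):
    """K-Means++ seeding: sample new centers in proportion to squared distance.

    Requires at least k distinct boxes so that some weight is always positive.
    """
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centers) ** 2 for b in boxes]
        centers.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centers


def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs and return k anchors sorted by area."""
    rng = random.Random(seed)
    centers = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centers[i]))
            clusters[best].append(b)
        new_centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]              # keep the old center of an empty cluster
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:            # assignments are stable
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


# Hypothetical box sizes in pixels: compact, tall-and-thin, and wide-and-flat defects.
toy_boxes = [(18, 26), (20, 28), (19, 27), (24, 120), (26, 124),
             (25, 122), (360, 44), (362, 46), (361, 42)]
print(kmeans_anchors(toy_boxes, k=3))
```

Using IoU rather than Euclidean distance keeps elongated boxes, such as long cracks, from being averaged together with compact ones of similar area.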
Model | K-Means++ | SW | EVC | GELU | Precision (%) | Recall (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) |
---|---|---|---|---|---|---|---|---|
YOLOv5 | 🗴 | 🗴 | 🗴 | 🗴 | 94.61 | 85.88 | 89.17 | 60.16 |
YOLOv5 | ✓ | 🗴 | 🗴 | 🗴 | 96.15 | 87.71 | 90.89 | 62.37 |
YOLOv5 | 🗴 | ✓ | 🗴 | 🗴 | 96.47 | 87.48 | 90.10 | 60.22 |
YOLOv5 | 🗴 | 🗴 | ✓ | 🗴 | 95.93 | 87.19 | 91.16 | 65.66 |
YOLOv5 | 🗴 | 🗴 | 🗴 | ✓ | 95.81 | 86.26 | 89.91 | 62.93 |
YOLOv5 | ✓ | ✓ | ✓ | ✓ | 96.32 | 88.85 | 91.68 | 64.12 |
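The F1 score defined in the caption of Figure 9, F1 = 2PR/(P + R), can be recomputed from the precision and recall columns of the ablation table above. The helper below is a plain transcription of that formula; the function name is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall (%) from the first and last rows of the ablation table.
baseline = f1_score(94.61, 85.88)
full_model = f1_score(96.32, 88.85)

print(f"baseline YOLOv5: {baseline:.2f}")              # about 90.03
print(f"all four strategies enabled: {full_model:.2f}")  # about 92.43
```

The value for the full model agrees with the F1 score of 92.43 listed for RDD-YOLOv5 in the comparison table. For the baseline the formula gives about 90.03, slightly below the 90.48 listed there, which may reflect a different confidence threshold or rounding of the inputs.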
Model | mAP@0.5 (%) | mAP@0.5:0.95 (%) | F1 Score (%) | Volume (MB) |
---|---|---|---|---|
Mask R-CNN | \ | \ | 51.50 | 311.9 |
Faster R-CNN | 63.46 | \ | \ | 90.7 |
SSD | 57.12 | \ | \ | 92.1 |
YOLOv3-tiny | 79.74 | 40.78 | 79.99 | 16.6 |
YOLOv4-cfp-tiny | 68.45 | 34.33 | 60.00 | 45.1 |
YOLOv5 | 89.17 | 60.16 | 90.48 | 14.3 |
YOLOv7 | 91.42 | 65.78 | 92.68 | 73.1 |
RDD-YOLOv5 | 91.68 | 64.12 | 92.43 | 24.0 |
Dataset | Model | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Volume (MB) |
---|---|---|---|---|
CrackTree200 | Mask R-CNN | \ | 59.10 | 311.7 |
CrackTree200 | YOLOv4-cfp-tiny | 64.76 | 37.38 | 22.4 |
CrackTree200 | YOLOv5 | 61.76 | 34.42 | 13.7 |
CrackTree200 | RDD-YOLOv5 | 65.32 | 34.59 | 25.8 |
CrackForest Dataset | Mask R-CNN | \ | 59.10 | 311.8 |
CrackForest Dataset | YOLOv4-cfp-tiny | 62.00 | 20.17 | 22.4 |
CrackForest Dataset | YOLOv5 | 58.78 | 37.01 | 13.7 |
CrackForest Dataset | RDD-YOLOv5 | 63.54 | 37.94 | 25.8 |
ChinaDrone | Mask R-CNN | \ | 38.00 | 311.8 |
ChinaDrone | YOLOv4-cfp-tiny | 63.56 | 35.42 | 22.4 |
ChinaDrone | YOLOv5 | 60.45 | 38.39 | 13.7 |
ChinaDrone | RDD-YOLOv5 | 62.76 | 40.35 | 25.9 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Jiang, Y.; Yan, H.; Zhang, Y.; Wu, K.; Liu, R.; Lin, C. RDD-YOLOv5: Road Defect Detection Algorithm with Self-Attention Based on Unmanned Aerial Vehicle Inspection. Sensors 2023, 23, 8241. https://doi.org/10.3390/s23198241