Enhancing Bolt Object Detection via AIGC-Driven Data Augmentation for Automated Construction Inspection
Figure 1. Schematic diagram of the Stable Diffusion model architecture.
Figure 2. The structure of the ViT model.
Figure 3. Flowchart of the training process of the CLIP model.
Figure 4. Structural diagram of the U-Net model.
Figure 5. Workflow chart of the VAE decoder.
Figure 6. Workflow chart of Stable Diffusion.
Figure 7. Text prompt in Stable Diffusion WebUI (Version 1.10.0).
Figure 8. LoRA fine-tuning model process.
Figure 9. The fine-tuning result of Dreambooth.
Figure 10. The effect diagram of LoRA fine-tuning.
Figure 11. Images of data augmentation.
Figure 12. Diagram of the training process of YOLO (the precision-recall curves for the 9 groups).
Figure 13. The confusion matrix for the training process of YOLO.
Abstract
1. Introduction
2. The Fabrication of a Bolt Target Detection Dataset Based on AIGC Technology
2.1. A Brief Introduction to AIGC
2.2. Image Generation Algorithm
2.3. A Comprehensive Analysis of Stable Diffusion
- (1) Contrastive Language–Image Pre-training Model
- (2) U-Net-Based Image Generator
- (3) VAE Decoding Unit
2.4. The Processing Procedure for Stable Diffusion
2.5. Fine-Tuning of Stable Diffusion Based on LoRA
2.6. Production of Bolt Target Detection Dataset
3. The YOLO Algorithm and Its Performance Validation
3.1. Experimental Configuration and Parameters
3.2. Evaluation Indexes
3.3. Experimental Environment and Hyperparameters
3.4. Experimental Results
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References

Component | Specification |
---|---|
CPU | Intel Core i7-13650HX (Intel Corporation, Santa Clara, CA, USA) |
GPU | NVIDIA GeForce RTX 4060 8 GB (NVIDIA Corporation, Santa Clara, CA, USA) |
Memory | 32 GB |
Operating system | Windows 10 (Microsoft Corporation, Redmond, WA, USA) |
Development language | Python |

Hyperparameter | Value |
---|---|
Epochs | 100 |
Optimizer | AdamW |
Initial learning rate | 0.001 |
Batch size | 16 |
Image size | 640 × 640 |
Learning rate decay strategy | Cosine Annealing |
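
The hyperparameter table above can be written down as a training configuration. The following is a minimal sketch, assuming the detectors are trained with the third-party Ultralytics package (the source does not state which implementation was used). The argument names `lr0`, `imgsz`, `batch`, and `cos_lr` are Ultralytics training settings; the file names `bolts.yaml` and `yolov8n.pt` are hypothetical placeholders.

```python
"""Sketch: map the hyperparameter table onto Ultralytics-style training arguments."""

# Values copied from the hyperparameter table; key names assume Ultralytics.
HYPERPARAMETERS = {
    "epochs": 100,
    "optimizer": "AdamW",
    "lr0": 0.001,    # initial learning rate
    "batch": 16,     # batch size
    "imgsz": 640,    # square 640 x 640 input
    "cos_lr": True,  # cosine annealing learning-rate schedule
}


def build_train_kwargs(data_yaml: str) -> dict:
    """Return keyword arguments for one training run.

    `data_yaml` is a hypothetical path to a dataset description file.
    """
    return {"data": data_yaml, **HYPERPARAMETERS}


def train(weights: str, data_yaml: str):
    """Run training; needs the `ultralytics` package to be installed."""
    # Imported lazily so the configuration can be inspected without the package.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**build_train_kwargs(data_yaml))


# Build (but do not run) the configuration for a hypothetical dataset file.
kwargs = build_train_kwargs("bolts.yaml")
print(kwargs)
```

Under these assumptions, a run would be started with `train("yolov8n.pt", "bolts.yaml")`; keeping the settings in one dictionary makes it easy to reuse identical hyperparameters across all detectors and datasets.
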

Group | Dataset | Model |
---|---|---|
Group 1 | Original | YOLOv5 |
Group 2 | DB-SD Augmented | YOLOv5 |
Group 3 | LoRA-SD Augmented | YOLOv5 |
Group 4 | Original | YOLOv8 |
Group 5 | DB-SD Augmented | YOLOv8 |
Group 6 | LoRA-SD Augmented | YOLOv8 |
Group 7 | Original | YOLOv11 |
Group 8 | DB-SD Augmented | YOLOv11 |
Group 9 | LoRA-SD Augmented | YOLOv11 |
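
The nine groups above form a full 3 × 3 grid of detector and dataset. A minimal sketch of how such a grid could be enumerated is shown below; the variable names are illustrative and not taken from the source.

```python
from itertools import product

# Detectors and datasets as listed in the experimental grouping table.
DETECTORS = ["YOLOv5", "YOLOv8", "YOLOv11"]
DATASETS = ["Original", "DB-SD Augmented", "LoRA-SD Augmented"]

# The detector varies slowest and the dataset fastest, which reproduces
# the numbering of Groups 1 to 9 in the table.
GROUPS = {
    f"Group {i}": {"model": model, "dataset": dataset}
    for i, (model, dataset) in enumerate(product(DETECTORS, DATASETS), start=1)
}

for name, cfg in GROUPS.items():
    print(f"{name}: {cfg['dataset']} -> {cfg['model']}")
```
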

Model | Dataset | Epochs | AP (Bolt) | AP (Corrosion) | AP (Loosened) | mAP |
---|---|---|---|---|---|---|
YOLOv5n | Original | 100 | 0.961 | 0.857 | 0.955 | 0.924 |
YOLOv5n | DB-SD Augmented | 100 | 0.961 | 0.847 | 0.953 | 0.920 |
YOLOv5n | LoRA-SD Augmented | 100 | 0.952 | 0.981 | 0.927 | 0.953 |
YOLOv8n | Original | 100 | 0.941 | 0.872 | 0.945 | 0.919 |
YOLOv8n | DB-SD Augmented | 100 | 0.964 | 0.808 | 0.916 | 0.896 |
YOLOv8n | LoRA-SD Augmented | 100 | 0.965 | 0.989 | 0.930 | 0.961 |
YOLOv11n | Original | 100 | 0.946 | 0.868 | 0.937 | 0.917 |
YOLOv11n | DB-SD Augmented | 100 | 0.951 | 0.869 | 0.933 | 0.918 |
YOLOv11n | LoRA-SD Augmented | 100 | 0.970 | 0.995 | 0.947 | 0.970 |
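
The mAP column in the results table is consistent with the unweighted mean of the three per-class AP values. The sketch below recomputes it from the tabulated numbers and derives the change in mAP relative to the original dataset. It assumes plain averaging over classes and works only with the rounded figures printed in the table, so small differences in the third decimal place are expected.

```python
# Per-class AP copied from the results table: (Bolt, Corrosion, Loosened).
AP = {
    ("YOLOv5n", "Original"): (0.961, 0.857, 0.955),
    ("YOLOv5n", "DB-SD Augmented"): (0.961, 0.847, 0.953),
    ("YOLOv5n", "LoRA-SD Augmented"): (0.952, 0.981, 0.927),
    ("YOLOv8n", "Original"): (0.941, 0.872, 0.945),
    ("YOLOv8n", "DB-SD Augmented"): (0.964, 0.808, 0.916),
    ("YOLOv8n", "LoRA-SD Augmented"): (0.965, 0.989, 0.930),
    ("YOLOv11n", "Original"): (0.946, 0.868, 0.937),
    ("YOLOv11n", "DB-SD Augmented"): (0.951, 0.869, 0.933),
    ("YOLOv11n", "LoRA-SD Augmented"): (0.970, 0.995, 0.947),
}

# mAP values as reported in the same table, in the same row order.
REPORTED_MAP = dict(zip(AP, (0.924, 0.920, 0.953, 0.919, 0.896,
                             0.961, 0.917, 0.918, 0.970)))


def mean_ap(aps):
    """Unweighted mean of per-class AP values."""
    return sum(aps) / len(aps)


def gain_over_original(model, dataset):
    """Difference in reported mAP between `dataset` and the original dataset."""
    return REPORTED_MAP[(model, dataset)] - REPORTED_MAP[(model, "Original")]


for (model, dataset), aps in AP.items():
    computed = mean_ap(aps)
    delta = gain_over_original(model, dataset)
    print(f"{model:9s} {dataset:18s} computed mAP={computed:.4f} "
          f"reported={REPORTED_MAP[(model, dataset)]:.3f} "
          f"change vs. original={delta:+.3f}")
```

With the tabulated values, LoRA-SD augmentation raises the reported mAP by 0.029 (YOLOv5n), 0.042 (YOLOv8n), and 0.053 (YOLOv11n), while DB-SD augmentation changes it by -0.004, -0.023, and +0.001, respectively.
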
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wu, J.; Han, B.; Zhang, Y.; Huang, C.; Qiu, S.; Feng, W.; Liu, Z.; Zou, C. Enhancing Bolt Object Detection via AIGC-Driven Data Augmentation for Automated Construction Inspection. Buildings 2025, 15, 819. https://doi.org/10.3390/buildings15050819