Multiscale object detection based on channel and data enhancement at construction sites

Hengyou Wang¹,³ (ORCID: orcid.org/0000-0001-6693-0161),
Yanfei Song¹,
Lianzhi Huo²,
Linlin Chen¹,³ &
Qiang He¹,³


Abstract

Object detection based on computer vision plays an important role in safety monitoring at large-scene construction sites. However, current object detection algorithms typically perform poorly on small targets. In this study, an enhanced multiscale object detection algorithm is developed to address the poor detection performance caused by scale changes at construction sites. First, a scale-aware automatic data augmentation scheme is defined to learn a data augmentation strategy. Then, to mitigate the information loss caused by channel reduction in the feature pyramid network, we propose a method based on subpixel convolution that performs channel enhancement and upsampling, and we add a bottom-up path that enhances the complete feature hierarchy with the accurate localization signals of the lower layers. Experimental results show that the proposed algorithm achieves better accuracy on the MOCS construction-site data set and the MS COCO data set. For example, compared with the Faster R-CNN detector with a ResNet-50 backbone, the average accuracy increased by 8.0% on MOCS and 1.5% on MS COCO. In particular, the average accuracy on small targets increased by 10.3% and 3.4%, respectively.
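The abstract does not give the layer design, so the following is only a minimal sketch of the rearrangement step that subpixel convolution relies on (often called pixel shuffle), not the authors' network. It assumes the common convention in which r*r input channels are interleaved into one output channel that is r times larger in each spatial dimension; the function name `pixel_shuffle` and the nested-list tensor layout are illustrative choices, and the learned convolution that would normally precede this step is omitted.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Assumed convention (hypothetical helper, not taken from the paper):
    out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w]
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    height, width = len(x[0]), len(x[0][0])
    channels_out = channels_in // (r * r)

    # Allocate the upsampled output, filled with zeros.
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]

    for c in range(channels_out):
        for i in range(r):          # row offset inside each r x r cell
            for j in range(r):      # column offset inside each r x r cell
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Four 1x1 channels become one 2x2 channel; every input value is kept.
demo = pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2)
print(demo)  # [[[1, 2], [3, 4]]]
```

Because the step only moves values from the channel axis to the spatial axes, it upsamples without discarding channel information, which is the property the abstract appeals to when it contrasts the method with channel reduction.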



Acknowledgements

This study was supported in part by the National Natural Science Foundation of China (Nos. 62072024, 41971396, and 61971290), the Research Ability Enhancement Program for Young Teachers of Beijing University of Civil Engineering and Architecture (No. X21024), the Outstanding Youth Program of Beijing University of Civil Engineering and Architecture, the BUCEA Post Graduate Innovation Project, and the R&D Program of Beijing Municipal Education Commission (Nos. KM202110016001, KM202210016002).

Author information

Authors and Affiliations

School of Science, Beijing University of Civil Engineering and Architecture, Beijing, 100044, China
Hengyou Wang, Yanfei Song, Linlin Chen & Qiang He
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China
Lianzhi Huo
Institute of Big Data Modeling and Technology, Beijing University of Civil Engineering and Architecture, Beijing, 100044, China
Hengyou Wang, Linlin Chen & Qiang He

Corresponding author

Correspondence to Hengyou Wang.

Additional information

Communicated by B-K Bao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, H., Song, Y., Huo, L. et al. Multiscale object detection based on channel and data enhancement at construction sites. Multimedia Systems 29, 49–58 (2023). https://doi.org/10.1007/s00530-022-00983-x

Received: 04 December 2021
Accepted: 10 July 2022
Published: 28 July 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s00530-022-00983-x
