
Gaussian guided IoU: A better metric for balanced learning on object detection

Published: 01 June 2022

Abstract

Most anchor-based detectors use intersection over union (IoU) to assign targets to anchors during training. However, IoU pays insufficient attention to the proximity of the anchor's centre to the centre of the ground-truth box, which causes two problems: (1) most slender objects are assigned only one anchor, so they receive insufficient supervision during training; (2) IoU cannot accurately represent how well the receptive field of the feature at the anchor's centre aligns with the object. As a result, some well-aligned features are discarded while poorly aligned ones are used, reducing the model's localisation accuracy. To address these issues, we first design a Gaussian guided IoU (GGIoU), which prioritises the proximity of the anchor's centre to the centre of the ground-truth box. We then propose GGIoU-balanced learning methods, comprising a GGIoU-guided assignment strategy and a GGIoU-balanced localisation loss. This method assigns multiple anchors to each slender object and favours features that are well aligned with the objects during training. Extensive experiments show that GGIoU-balanced learning solves the aforementioned problems and significantly improves the performance of the detection model.
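The exact GGIoU formulation is not reproduced on this page, so the following Python sketch only illustrates the idea the abstract describes: plain IoU modulated by a Gaussian of the distance between the anchor centre and the ground-truth centre. The Gaussian parameterisation (sigma_scale, per-axis normalisation by the ground-truth width and height) and the positive-sample threshold in assign_anchors are assumptions for illustration, not the authors' definitions.

```python
import math

def iou(box_a, box_b):
    """Plain IoU between two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def ggiou(anchor, gt, sigma_scale=0.5):
    """Hypothetical GGIoU: IoU weighted by a Gaussian of the offset
    between the anchor centre and the ground-truth centre. Each axis
    is normalised by the ground-truth box size, so an anchor sitting
    on the long axis of a slender object is penalised only mildly."""
    acx, acy = (anchor[0] + anchor[2]) / 2, (anchor[1] + anchor[3]) / 2
    gcx, gcy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    gw = max(gt[2] - gt[0], 1e-6)
    gh = max(gt[3] - gt[1], 1e-6)
    dx = (acx - gcx) / (sigma_scale * gw)
    dy = (acy - gcy) / (sigma_scale * gh)
    return iou(anchor, gt) * math.exp(-0.5 * (dx * dx + dy * dy))

def assign_anchors(anchors, gt, pos_thr=0.5):
    """Toy GGIoU-guided assignment: every anchor whose GGIoU with the
    ground-truth box exceeds pos_thr becomes a positive sample, so a
    slender object can collect several positives along its length."""
    return [i for i, a in enumerate(anchors) if ggiou(a, gt) >= pos_thr]
```

Because the Gaussian factor equals 1 when the two centres coincide and decays with the normalised centre distance, GGIoU never exceeds plain IoU and explicitly rewards centre alignment, which is the property the abstract attributes to it.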



Published In

IET Computer Vision, Volume 16, Issue 6, September 2022, 88 pages
EISSN: 1751-9640
DOI: 10.1049/cvi2.v16.6
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Publisher

John Wiley & Sons, Inc., United States
