Correlation loss: enforcing correlation between classification and localization

Authors: Fehmi Kahraman, Emre Akbas

AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence

Article No.: 121, Pages 1087 - 1095

https://doi.org/10.1609/aaai.v37i1.25190

Published: 07 February 2023

Abstract

Object detectors are conventionally trained by a weighted sum of classification and localization losses. Recent studies (e.g., predicting IoU with an auxiliary head, Generalized Focal Loss, Rank & Sort Loss) have shown that forcing these two loss terms to interact with each other in non-conventional ways creates a useful inductive bias and improves performance. Inspired by these works, we focus on the correlation between classification and localization and make two main contributions: (i) We analyze the effects of correlation between the classification and localization tasks in object detectors. We identify why correlation affects the performance of various NMS-based and NMS-free detectors, devise measures to evaluate the effect of correlation, and use them to analyze common detectors. (ii) Motivated by our observations, e.g., that NMS-free detectors can also benefit from correlation, we propose Correlation Loss, a novel plug-in loss function that improves the performance of various object detectors by directly optimizing correlation coefficients. For example, Correlation Loss on Sparse R-CNN, an NMS-free method, yields a 1.6 AP gain on COCO and a 1.8 AP gain on the Cityscapes dataset. Our best model on Sparse R-CNN reaches 51.0 AP without test-time augmentation on COCO test-dev, reaching state-of-the-art.
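To make the idea of "directly optimizing correlation coefficients" concrete, the following is a minimal sketch and not the authors' implementation. It assumes the Pearson coefficient (the abstract does not say which coefficient is used), computes it between the classification scores and the IoUs of positive detections, and adds `1 - correlation` to the usual weighted sum. The function names and the `weight` value are hypothetical, chosen only for illustration.

```python
import math


def pearson_correlation(xs, ys, eps=1e-12):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # eps guards against division by zero when one input is constant.
    return cov / (math.sqrt(var_x * var_y) + eps)


def correlation_penalty(cls_scores, ious):
    """Hypothetical auxiliary term: near 0 when scores and IoUs are
    perfectly correlated, near 2 when they are perfectly anti-correlated."""
    return 1.0 - pearson_correlation(cls_scores, ious)


def total_loss(cls_loss, loc_loss, cls_scores, ious, weight=0.2):
    """Conventional loss sum plus the correlation term.

    `weight` is an illustrative value, not one taken from the paper.
    """
    return cls_loss + loc_loss + weight * correlation_penalty(cls_scores, ious)


# Scores that rise and fall with IoU incur almost no penalty ...
aligned = correlation_penalty([0.9, 0.6, 0.3], [0.8, 0.5, 0.2])
# ... while scores that move against IoU incur close to the maximum.
reversed_order = correlation_penalty([0.9, 0.6, 0.3], [0.2, 0.5, 0.8])
```

In an actual detector this term would have to be computed on tensors inside an automatic-differentiation framework so that gradients reach the classification and localization heads; the pure-Python version above only illustrates the quantity being optimized.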

Published In

AAAI'23/IAAI'23/EAAI'23 proceedings, February 2023, 16496 pages

ISBN: 978-1-57735-880-0

Copyright © 2023 Association for the Advancement of Artificial Intelligence.

Sponsor: Association for the Advancement of Artificial Intelligence

Publisher: AAAI Press

Qualifiers: Research-article, Research, Refereed limited
