
Unbiased scene graph generation using the self-distillation method

Original article · Published in The Visual Computer

Abstract

Scene graph generation (SGG) aims to build a structural representation of an image from its object instances and the relations between object pairs. Because the dataset labels follow a long-tail distribution, SGG models must adopt a debiasing method during training. In this paper, we propose integrating a novel self-distillation method into existing SGG models, and the experimental results show competitive debiasing performance. Further analysis of its effectiveness through causal inference theory indicates that our method can be regarded as a new intervention method.
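The full text of the article is not reproduced here, so the sketch below illustrates only the generic idea of self-distillation, not the authors' specific formulation: the model's own temperature-softened predicate distribution serves as a soft teacher target, and a KL-divergence term penalizes the student's deviation from it. The function names, temperature value, and NumPy formulation are illustrative assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened predicate
    distributions, averaged over relation pairs in the batch."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return float(kl.mean())
```

In a typical self-distillation setup this term is added to the standard cross-entropy loss, and the teacher logits come from the same network (e.g. an earlier snapshot or an exponential moving average of its weights) rather than a separately trained model.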





Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62077009 and 62177006) and partially supported by the Guangdong Provincial Natural Science Foundation (Grant No. 2214050002868) and the Zhuhai Science and Technology Planning Project (Grant No. ZH22036201210161PWC).

Author information

Corresponding author

Correspondence to Jun He.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest, commercial or associative, in connection with the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sun, B., Hao, Z., Yu, L. et al. Unbiased scene graph generation using the self-distillation method. Vis Comput 40, 2381–2390 (2024). https://doi.org/10.1007/s00371-023-02924-9

