Abstract
Scene graph generation (SGG) aims to build a structured representation of an image from its object instances and the relations between object pairs. Because the dataset labels follow a long-tailed distribution, SGG models must adopt a debiasing method during learning. In this paper, we propose integrating a novel self-distillation method into existing SGG models, and experimental results show competitive debiasing performance. Further analysis of its effectiveness using causal inference theory indicates that our method can be considered a new intervention method.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant Nos. 62077009, 62177006) and partially supported by the Guangdong Provincial Natural Science Foundation (Grant No. 2214050002868) and the Zhuhai Science and Technology Planning Project (Grant No. ZH22036201210161PWC).
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest with regard to this work. We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
About this article
Cite this article
Sun, B., Hao, Z., Yu, L. et al. Unbiased scene graph generation using the self-distillation method. Vis Comput 40, 2381–2390 (2024). https://doi.org/10.1007/s00371-023-02924-9
DOI: https://doi.org/10.1007/s00371-023-02924-9