Interpretable Attention Guided Network for Fine-Grained Visual Classification

  • Conference paper
Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12664)

Abstract

Fine-grained visual classification (FGVC) is challenging, yet more critical than traditional classification tasks. It requires distinguishing subcategories that differ only by inherently subtle intra-class object variations. Previous works focus on enhancing feature representation using multiple granularities and discriminative regions obtained through attention strategies or bounding boxes. However, these methods rely heavily on deep neural networks, which lack interpretability. We propose an Interpretable Attention Guided Network (IAGN) for fine-grained visual classification. The contributions of our method include: i) an attention guided framework that guides the network to extract discriminative regions in an interpretable way; ii) a progressive training mechanism that distills knowledge stage by stage to fuse features of various granularities; iii) the first interpretable FGVC method with competitive performance on several standard FGVC benchmark datasets.
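As a rough illustration of what stage-by-stage knowledge distillation can look like, the following minimal sketch uses the generic temperature-softened distillation loss (KL divergence between softened class distributions). It is not the authors' implementation: the function names, the temperature value, and the choice to treat each later stage as the teacher of the preceding one are all assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def progressive_distillation(stage_logits, temperature=2.0):
    """Sum the loss over consecutive stage pairs.

    Assumption for this sketch: the later stage in each pair acts as the
    teacher and the earlier stage as the student.
    """
    return sum(
        distillation_loss(student, teacher, temperature)
        for student, teacher in zip(stage_logits, stage_logits[1:])
    )


# Hypothetical logits for three stages over three classes.
stages = [[1.0, 0.5, 0.2], [2.0, 0.4, 0.1], [3.0, 0.2, 0.0]]
total_loss = progressive_distillation(stages)
```

The loss is zero when two consecutive stages produce identical logits and grows as their softened predictions diverge, which is what lets a later stage pull an earlier one toward its predictions.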

This work was supported in part by the National Natural Science Foundation of China under Grants 62076016 and 61672079, and by the Shenzhen Science and Technology Program KQTD2016112515134654. Baochang Zhang is the corresponding author and is also with the Shenzhen Academy of Aerospace Technology, Shenzhen, China.

Author information

Corresponding author

Correspondence to Baochang Zhang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, Z., Duan, X., Zhao, B., Lü, J., Zhang, B. (2021). Interpretable Attention Guided Network for Fine-Grained Visual Classification. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12664. Springer, Cham. https://doi.org/10.1007/978-3-030-68799-1_4

  • DOI: https://doi.org/10.1007/978-3-030-68799-1_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68798-4

  • Online ISBN: 978-3-030-68799-1

  • eBook Packages: Computer Science, Computer Science (R0)
