
Residual Attention Encoding Neural Network for Terrain Texture Classification

  • Conference paper
Pattern Recognition (ACPR 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12047)


Abstract

Terrain texture classification plays an important role in computer vision applications such as robot navigation and autonomous driving. Traditional methods based on hand-crafted features often perform sub-optimally because they cannot efficiently model complex terrain variations. In this paper, we propose a residual attention encoding network (RAENet) for terrain texture classification. Specifically, RAENet incorporates a stack of residual attention blocks (RABs) followed by an encoding block (EB). By generating attention feature maps jointly with residual learning, RAB differs from commonly used blocks that combine features from the current layer with only the immediately preceding layer. Instead, RAB connects all preceding layers to the current layer, which not only minimizes the information loss incurred during convolution but also increases the weights of features that help distinguish between classes. EB then adopts an orderless encoder that preserves invariance to spatial layout, extracting feature details before classification. The effectiveness of RAENet is evaluated on two terrain texture datasets, where it achieves state-of-the-art performance.
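The two building blocks described in the abstract can be sketched roughly as follows. This is a minimal NumPy sketch under our own assumptions: the 1x1 projection standing in for a convolution, the sigmoid attention gate, and the codeword-based soft-assignment encoder are illustrative choices, not the authors' exact design.

```python
import numpy as np

def conv_proxy(x, w):
    """Stand-in for a conv layer: a 1x1 projection with ReLU.
    x: (H, W, C_in), w: (C_in, C_out)."""
    return np.maximum(x @ w, 0.0)

def residual_attention_block(features, w_conv, w_att):
    """Combine ALL preceding feature maps (dense connectivity),
    then gate the new features with a learned attention map."""
    x = np.concatenate(features, axis=-1)        # (H, W, sum of C_i)
    f = conv_proxy(x, w_conv)                    # new features
    att = 1.0 / (1.0 + np.exp(-(x @ w_att)))     # sigmoid attention in (0, 1)
    return f * att + f                           # residual form: (1 + att) * f

def orderless_encode(x, codewords):
    """Encoding block: aggregate residuals to codewords over all
    spatial positions, discarding the spatial layout."""
    v = x.reshape(-1, x.shape[-1])               # (H*W, C)
    r = v[:, None, :] - codewords[None, :, :]    # residual to each codeword
    a = np.exp(-np.sum(r ** 2, axis=-1))         # soft-assignment weights
    a /= a.sum(axis=1, keepdims=True)
    return (a[..., None] * r).sum(axis=0).ravel()  # orderless descriptor

rng = np.random.default_rng(0)
h = w = 8
feats = [rng.standard_normal((h, w, 4)) for _ in range(3)]  # 3 earlier layers
w_conv = rng.standard_normal((12, 16)) * 0.1
w_att = rng.standard_normal((12, 16)) * 0.1
out = residual_attention_block(feats, w_conv, w_att)
code = orderless_encode(out, rng.standard_normal((8, 16)))
print(out.shape, code.shape)  # (8, 8, 16) (128,)
```

Because the encoder sums over all spatial positions, the resulting descriptor does not depend on where a texture patch appears in the image, which is the spatial-layout invariance the EB is designed to provide.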

This work is partially supported by the National Natural Science Foundation of China under Grant Nos. 61872188, U1713208, 61602244, 61672287, 61702262, and 61773215.



Author information


Corresponding author

Correspondence to Zhong Jin.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Song, X., Yang, J., Jin, Z. (2020). Residual Attention Encoding Neural Network for Terrain Texture Classification. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds.) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol 12047. Springer, Cham. https://doi.org/10.1007/978-3-030-41299-9_5


  • DOI: https://doi.org/10.1007/978-3-030-41299-9_5


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41298-2

  • Online ISBN: 978-3-030-41299-9

  • eBook Packages: Computer Science (R0)
