Satellite Image Scene Classification via ConvNet With Context Aggregation

Zhao Zhou, Yingbin Zheng, Hao Ye, Jian Pu & Gufei Sun

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11165)

Included in the following conference series:

Pacific Rim Conference on Multimedia

Abstract

Scene classification is a fundamental problem in understanding high-resolution remote sensing imagery. Recently, convolutional neural networks (ConvNets) have achieved remarkable performance in different tasks, and significant efforts have been made to develop various representations for satellite image scene classification. In this paper, we present a novel representation based on a ConvNet with context aggregation. The proposed two-pathway ResNet (ResNet-TP) architecture adopts ResNet [1] as its backbone, and the two pathways allow the network to model both local details and regional context. The ResNet-TP based representation is generated by global average pooling on the last convolutional layers of both pathways. Experiments on two scene classification datasets, UCM Land Use and NWPU-RESISC45, show that the proposed mechanism achieves promising improvements over state-of-the-art methods.
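
The pooling step described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy feature maps, and the choice to join the two pooled vectors by concatenation are all assumptions made for illustration, since the abstract only states that global average pooling is applied to the last convolutional layers of both pathways.

```python
from typing import List

# A feature map is stored as channels x height x width (nested lists).
FeatureMap = List[List[List[float]]]


def global_average_pool(feature_map: FeatureMap) -> List[float]:
    """Average each channel over all of its spatial positions."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def two_pathway_representation(local_map: FeatureMap,
                               context_map: FeatureMap) -> List[float]:
    """Pool both pathways and join the results.

    Assumption: the pooled vectors are concatenated; the abstract does
    not say how the two pathways are combined.
    """
    return global_average_pool(local_map) + global_average_pool(context_map)


# Hypothetical toy outputs of the last convolutional layer of each pathway.
local_map = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]
context_map = [[[4.0, 4.0], [4.0, 4.0]]]

representation = two_pathway_representation(local_map, context_map)
print(representation)  # [4.0, 1.0, 4.0]
```

Because each channel collapses to a single number, the length of the representation depends only on the channel counts of the two pathways, not on the spatial size of the input image.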

Notes

1. The download link can be found at https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py.
2. Stochastic gradient descent (SGD) is used with a batch size of 64 and a momentum of 0.9. The learning rate is initially set to 0.01 and is divided by 10 every 30 epochs.
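
The learning rate schedule in note 2 can be written out as a short sketch. The function name and the `SGD_SETTINGS` dictionary are hypothetical; only the values (batch size 64, momentum 0.9, initial rate 0.01, division by 10 every 30 epochs) come from the note.

```python
def step_learning_rate(epoch: int,
                       base_lr: float = 0.01,
                       step_size: int = 30,
                       factor: float = 10.0) -> float:
    """Return the learning rate for a zero-based epoch index.

    The rate starts at base_lr and is divided by `factor` once for
    every completed block of `step_size` epochs.
    """
    return base_lr / (factor ** (epoch // step_size))


# Remaining optimizer settings quoted in the note (hypothetical container).
SGD_SETTINGS = {"batch_size": 64, "momentum": 0.9, "base_lr": 0.01}

for epoch in (0, 29, 30, 59, 60):
    print(epoch, step_learning_rate(epoch, SGD_SETTINGS["base_lr"]))
```

Assuming training is done in PyTorch, which the note does not state explicitly, the same settings would typically map to `torch.optim.SGD(params, lr=0.01, momentum=0.9)` combined with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)`.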

References

He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Neural Information Processing Systems (NIPS), pp. 1097–1105 (2012)
He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1904–1916 (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (ICLR) (2015)
Szegedy, C., et al.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248–255 (2009)
Liu, Q., Hang, R., Song, H., Li, Z.: Learning multiscale deep features for high-resolution satellite image scene classification. IEEE Trans. Geosci. Remote Sens. 56(1), 117–126 (2018)
Scott, G.J., England, M.R., Starms, W.A., Marcum, R.A., Davis, C.H.: Training deep convolutional neural networks for land-cover classification of high-resolution imagery. IEEE Geosci. Remote Sens. Lett. 14(4), 549–553 (2017)
Han, X., Zhong, Y., Cao, L., Zhang, L.: Pre-trained AlexNet architecture with pyramid pooling and supervision for high spatial resolution remote sensing image scene classification. Remote Sens. 9(8), 848 (2017)
Xia, G.S., et al.: AID: a benchmark data set for performance evaluation of aerial scene classification. IEEE Trans. Geosci. Remote Sens. 55(7), 3965–3981 (2017)
Han, X., Zhong, Y., Cao, L., Zhang, L.: Pre-trained AlexNet architecture with pyramid pooling and supervision for high spatial resolution remote sensing image scene classification. Remote Sens. 9(8), 848 (2017)
Cheng, G., Han, J., Lu, X.: Remote sensing image scene classification: benchmark and state of the art. Proc. IEEE 105(10), 1865–1883 (2017)
Cheng, G., Li, Z., Yao, X., Guo, L., Wei, Z.: Remote sensing image scene classification using bag of convolutional features. IEEE Geosci. Remote Sens. Lett. 14(10), 1735–1739 (2017)
Wang, G., Fan, B., Xiang, S., Pan, C.: Aggregating rich hierarchical features for scene classification in remote sensing imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 10(9), 4104–4115 (2017)
Liu, Y., Huang, C.: Scene classification via triplet networks. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 11(1), 220–237 (2018)
Jia, Y., et al.: Caffe: convolutional architecture for fast feature embedding. In: ACM International Conference on Multimedia (MM), pp. 675–678 (2014)
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Yang, Y., Newsam, S.D.: Bag-of-visual-words and spatial extensions for land-use classification. In: SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 270–279 (2010)
Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In: International Conference on Learning Representations (ICLR) (2016)
Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018)
Lea, C., Flynn, M., Vidal, R., Reiter, A., Hager, G.: Temporal convolutional networks for action segmentation and detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Xu, B., Ye, H., Zheng, Y., Wang, H., Luwang, T., Jiang, Y.G.: Dense dilated network for few shot action recognition. In: ACM International Conference on Multimedia Retrieval (ICMR), pp. 379–387 (2018)
Zheng, Y., Ye, H., Wang, L., Pu, J.: Learning multiviewpoint context-aware representation for RGB-D scene classification. IEEE Signal Process. Lett. 25(1), 30–34 (2018)
Gupta, A., Rush, A.M.: Dilated convolutions for modeling long-distance genomic dependencies. arXiv preprint arXiv:1710.01278 (2017)
Cusano, C., Napoletano, P., Schettini, R.: Remote sensing image classification exploiting multiple kernel learning. IEEE Geosci. Remote Sens. Lett. 12(11), 2331–2335 (2015)

Acknowledgments

This work was supported in part by grants from the National Natural Science Foundation of China (No. 61602459) and the Science and Technology Commission of Shanghai Municipality (No. 17511101902 and No. 18511103103).

Author information

Authors and Affiliations

Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, China
Zhao Zhou, Yingbin Zheng & Hao Ye
East China Normal University, Shanghai, China
Jian Pu
ZhongAn Technology, Shanghai, China
Gufei Sun

Corresponding author

Correspondence to Yingbin Zheng.

Editor information

Editors and Affiliations

Hefei University of Technology, Hefei, China
Richang Hong
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
University of Tokyo, Tokyo, Japan
Toshihiko Yamasaki
Hefei University of Technology, Hefei, China
Meng Wang
City University of Hong Kong, Hong Kong, Hong Kong
Chong-Wah Ngo

About this paper

Cite this paper

Zhou, Z., Zheng, Y., Ye, H., Pu, J., Sun, G. (2018). Satellite Image Scene Classification via ConvNet With Context Aggregation. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11165. Springer, Cham. https://doi.org/10.1007/978-3-030-00767-6_31

DOI: https://doi.org/10.1007/978-3-030-00767-6_31
Published: 19 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00766-9
Online ISBN: 978-3-030-00767-6
eBook Packages: Computer Science, Computer Science (R0)
