Sequence Image Interpolation via Separable Convolution Network
"> Figure 1
<p>Location of unmanned aerial vehicle (UAV) dataset.</p> "> Figure 2
<p>Location of Landsat-8 dataset.</p> "> Figure 3
<p>Overview of our separable convolution network architecture.</p> "> Figure 4
<p>Distribution of training, testing, and validation samples in first set of experiment: (<b>A</b>) unmanned aerial vehicle (UAV) and (<b>B</b>) Landsat-8 images; areas inside red and green box and remainder of images show distribution of training, testing, and validation samples, respectively.</p> "> Figure 5
<p>Visual effect of training and testing images in second set of experiment (<math display="inline"><semantics> <mrow> <msub> <mi>I</mi> <mn>4</mn> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mi>I</mi> <mn>5</mn> </msub> </mrow> </semantics></math> show visual effect of training image; <math display="inline"><semantics> <mrow> <msubsup> <mi>I</mi> <mn>4</mn> <mo>′</mo> </msubsup> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msubsup> <mi>I</mi> <mn>5</mn> <mo>′</mo> </msubsup> </mrow> </semantics></math> show visual effect of testing image).</p> "> Figure 6
<p>Strategy for generating missing data in third set of experiment: (<b>A</b>) available UAV images from 2017 to 2019; (<b>B</b>) one generation strategy. Red, green, and cyan-blue curves show lines of first, second, and third level of interpolated result.</p> "> Figure 7
<p>Visual effect, detailed information, and pixel error between block interpolated result and reference block using (<b>A</b>) UAV and (<b>B</b>) Landsat-8 datasets with (<b>a</b>) initial image (<b>b</b>,<b>d</b>,<b>f</b>) <span class="html-italic">ℓ<sub>mse</sub></span> loss and (<b>c</b>,<b>e</b>,<b>g</b>) <span class="html-italic">ℓ<sub>c</sub></span> loss.</p> "> Figure 8
<p>Visual effect and pixel error between block interpolated result and reference block using (<b>A</b>) UAV and (<b>B</b>) Landsat-8 datasets with (<b>a</b>) initial image (<b>b</b>,<b>d</b>) our proposed method and (<b>c</b>,<b>e</b>) the method of Meyer et al.</p> "> Figure 9
<p>Visual effect and pixel error between scene interpolated result and reference scene image using (<b>a</b>) initial image (<b>b</b>,<b>d</b>) <span class="html-italic">ℓ<sub>mse</sub></span> loss and (<b>c</b>,<b>e</b>) <span class="html-italic">ℓ<sub>c</sub></span> loss (1, 2, and 3 band composite).</p> "> Figure 10
<p>Visual effect and pixel error between scene interpolated result and reference scene image using (<b>a</b>) initial image (<b>b</b>,<b>d</b>) <span class="html-italic">ℓ<sub>mse</sub></span> loss and (<b>c</b>,<b>e</b>) <span class="html-italic">ℓ<sub>c</sub></span> loss (1, 2, and 4 band composite).</p> "> Figure 11
<p>Spectral curves between scene interpolated result and reference scene image using different loss functions at different coordinates in June 2019.</p> "> Figure 12
<p>Spectral curves between scene interpolated result and reference scene image using different loss functions at different coordinates in July 2019.</p> "> Figure 13
<p>Spectral curves between scene interpolated result and reference scene image using different loss functions at different coordinates in August 2019.</p> "> Figure 14
<p>Interpolated result of UAV images using <span class="html-italic">ℓ<sub>c</sub></span> loss from 2017 to 2019 according to interpolation strategy in <a href="#remotesensing-13-00296-t002" class="html-table">Table 2</a>: existing images and interpolated results in (<b>A</b>) 2019 sequence, (<b>B</b>) 2018 sequence, and (<b>C</b>) 2017 sequence.</p> "> Figure 15
<p>(<b>a</b>) Initial image (<b>b</b>–<b>d</b>) visual effect and (<b>e</b>–<b>g</b>) pixel error between block interpolated results and reference block using stacks of 1 × 1, 2 × 2, and 3 × 3 convolution layers.</p> "> Figure 16
<p>Visual effect and pixel error between block interpolated result and reference block using (<b>a</b>) initial image (<b>b</b>,<b>d</b>) average pooling and (<b>c</b>,<b>e</b>) maximum pooling.</p> "> Figure 17
<p>(<b>a</b>) Initial image (<b>b</b>–<b>d</b>) visual effect and (<b>e</b>–<b>g</b>) pixel error between block interpolated result and reference block using testing block pairs April–May, April–October, and April–December 2018.</p> "> Figure 18
<p>(<b>a</b>) Initial image (<b>b</b>–<b>d</b>) visual effect and (<b>e</b>–<b>g</b>) pixel error between block interpolated result and reference block using separable convolution kernel sizes 11, 13, and 15.</p> ">
Abstract
1. Introduction
- We use an adaptive, data-driven model for inter-scene spectral transformation of remote-sensing images, providing a robust interpolation approach for filling in missing remote-sensing images.
- We verify experimentally that the proposed data-driven, spatially adaptive convolution network can simulate missing remote-sensing image scenes at specified acquisition times and produce remote-sensing sequences at equal time intervals. This allows remote-sensing sequences to be processed under a unified framework, rather than requiring different processing logic for each sequence because of differing time intervals (the core separable-convolution operation is sketched after this list).
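The contributions above build on spatially adaptive (separable) convolution for frame interpolation (Niklaus et al., cited in the references). As a minimal sketch of that core operation, assuming PyTorch, and with all names (`apply_separable_kernels`, `kv`, `kh`) ours rather than the authors':

```python
import torch
import torch.nn.functional as F

def apply_separable_kernels(image: torch.Tensor,
                            kv: torch.Tensor,
                            kh: torch.Tensor) -> torch.Tensor:
    """Filter each pixel's k x k neighbourhood with the separable kernel
    formed by the outer product of its 1D vertical and horizontal kernels.

    image: (B, C, H, W); kv, kh: (B, k, H, W), predicted per pixel.
    """
    b, c, h, w = image.shape
    k = kv.shape[1]
    # Gather a k x k patch around every pixel -> (B, C*k*k, H*W).
    patches = F.unfold(image, kernel_size=k, padding=k // 2)
    patches = patches.view(b, c, k, k, h, w)
    # Per-pixel 2D kernel as the outer product of the two 1D kernels.
    kernel = kv.unsqueeze(2) * kh.unsqueeze(1)        # (B, k, k, H, W)
    return (patches * kernel.unsqueeze(1)).sum(dim=(2, 3))

# Interpolation then blends both temporal neighbours, each filtered with
# its own predicted kernels (the kernel-prediction network is not shown):
# I_mid = apply_separable_kernels(I1, kv1, kh1) \
#       + apply_separable_kernels(I2, kv2, kh2)
```

Because the 2D kernel is the outer product of two 1D kernels, the network only has to predict 2k values per pixel instead of k², which is what makes large, per-pixel kernels tractable.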
2. Related Studies
3. Materials and Methods
3.1. Datasets
3.2. Theoretical Model
3.3. Architecture of the Model
3.4. Loss Functions
3.5. Evaluation Indicator
4. Experiments and Results
4.1. Training Strategy
4.2. Testing Strategy
4.3. Experimental Details
4.4. Results
5. Discussion
5.1. Stacked Convolution Layers
5.2. Pooling Type
5.3. Temporal Gap between Testing Blocks and Model Requirements
5.4. Separable Convolution Kernel Size
6. Conclusions
- (1) The proposed separable convolution network model provides a new method for interpolating remote-sensing images, especially high-spatial-resolution images. The model better captures and simulates the complex, diverse, and nonlinear spectral transformations between images of different dates, and therefore yields better interpolated images.
- (2) In the separable convolution network, the ℓc loss produces clearer images than the ℓmse loss. Stacks of 3 × 3 convolutional layers with ReLU, max pooling, and a separable convolution kernel of size 11 gave the best interpolated results. Experiments showed that the proposed separable convolution network can interpolate images to fill in missing parts of image sequences and produce complete remote-sensing sequences (the two losses are sketched after this list).
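To make the loss comparison concrete: ℓmse below follows its standard definition, while the exact form of ℓc is not reproduced in this excerpt, so the second function is only a hedged stand-in showing a common feature-space (perceptual-style) loss of the kind that trades a small RMSE increase for sharper detail. All names are illustrative.

```python
import torch
import torch.nn.functional as F

def l_mse(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Standard per-pixel mean-squared-error loss.
    return F.mse_loss(pred, target)

def l_feature(pred, target, feature_extractor):
    # Hedged stand-in (NOT necessarily the paper's l_c): compare deep
    # features rather than raw pixels; such losses typically yield
    # sharper images than MSE at slightly higher per-pixel RMSE.
    return F.mse_loss(feature_extractor(pred), feature_extractor(target))
```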
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Guo, H.; Wang, L.; Liang, D. Big Earth Data from space: A new engine for Earth science. Sci. Bull. 2016, 61, 505–513. [Google Scholar] [CrossRef]
- Vatsavai, R.R.; Ganguly, A.; Chandola, V.; Stefanidis, A.; Klasky, S.; Shekhar, S. Spatio-temporal data mining in the era of big spatial data: Algorithms and applications. In Proceedings of the 1st ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, Redondo Beach, CA, USA, 6 November 2012. [Google Scholar]
- Li, R.; Zhang, X.; Liu, B. Review on Filtering and Reconstruction Algorithms of Remote Sensing Time Series Data. J. Remote Sens. 2009, 13, 335–341. [Google Scholar]
- Lu, Y.; Liu, J.; Jin, S. Image Processing on Time Series. Crim. Technol. 2004, 2, 41–42. [Google Scholar]
- Dosovitskiy, A.; Brox, T. Generating images with perceptual similarity metrics based on deep networks. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 658–666. [Google Scholar]
- Niklaus, S.; Mai, L.; Liu, F. Video frame interpolation via adaptive convolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- Seaquist, J.W.; Chappell, A.; Eklundh, L. Exploring and Improving NOAA AVHRR NDVI Image Quality for African Drylands. Geosci. Remote Sens. Symp. 2002, 4, 2006–2008. [Google Scholar]
- Berterretche, M.; Hudak, A.T.; Cohen, W.B.; Maiersperger, T.K.; Gower, S.T.; Dungan, J. Comparison of regression and geostatistical methods for mapping Leaf Area Index (LAI) with Landsat ETM+ data over a boreal forest. Remote Sens. Environ. 2005, 96, 49–61. [Google Scholar] [CrossRef] [Green Version]
- Bhattacharjee, S.; Mitra, P.; Ghosh, S.K. Spatial Interpolation to Predict Missing Attributes in GIS Using Semantic Kriging. IEEE Trans. Geosci. Remote Sens. 2013, 52, 4771–4780. [Google Scholar] [CrossRef]
- Zhou, H.; Wang, N.; Huang, Y. Comparison and analysis of remotely sensed time series of reconstruction models at various intervals. J. Geo Inf. Sci. 2016, 18, 1410–1417. [Google Scholar]
- Chen, J.; Jönsson, P.; Tamura, M.; Gu, Z.; Matsushita, B.; Eklundh, L. A simple method for reconstructing a high-quality NDVI time-series data set based on the Savitzky–Golay filter. Remote Sens. Environ. 2004, 91, 332–344. [Google Scholar] [CrossRef]
- Hua, D.; Yu, H. Objective Evaluation Method for Image Enhancement Quality Based on Visual Information Fidelity. Micro Comput. Inf. 2012, 28, 173–175. [Google Scholar]
- Crosson, W.L.; Al-Hamdan, M.Z.; Hemmings, S.N.; Wade, G.M. A daily merged MODIS Aqua-Terra land surface temperature dataset for the conterminous United States. Remote Sens. Environ. 2012, 119, 315–324. [Google Scholar] [CrossRef]
- Hao, G.; Wu, B.; Zhang, L.; Fu, D.; Li, Y. Application and Analysis of ESTARFM Model in Spatio-Temporal Variation of Serincuo Lake Area, Tibet (1976–2014). J. Geo-Inf. Sci. 2016, 18, 833–846. [Google Scholar]
- Peng, J.; Luo, W.; Ning, X.; Zou, Z. Fusion of remote sensing images based on the STARFM model. Cent. South For. Surv. Plan. 2018, 37, 32–37. [Google Scholar] [CrossRef]
- Xu, W.; Chen, R.; Huang, B.; Zhang, X.; Liu, C. Single Image Super-Resolution Based on Global Dense Feature Fusion Convolutional Network. Sensors 2019, 19, 316. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Alvarez-Vanhard, E.; Houet, T.; Mony, C.; Lecoq, L.; Corpetti, T. Can UAVs fill the gap between in situ surveys and satellites for habitat mapping? Remote Sens. Environ. 2020, 243, 111780. [Google Scholar] [CrossRef]
- Dong, C.; Loy, C.C.; He, K.; Tang, X. Image Super-Resolution Using Deep Convolutional Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 295–307. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Zeiler, M.D.; Taylor, G.W.; Fergus, R. Adaptive deconvolutional networks for mid and high level feature learning. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2018–2025. [Google Scholar]
- Shi, W.; Caballero, J.; Huszar, F.; Totz, J.; Aitken, A.P.; Bishop, R.; Rueckert, D.; Wang, Z. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1874–1883. [Google Scholar]
- Bishop, C.M. Pattern Recognition and Machine Learning; Springer: New York, NY, USA, 2006. [Google Scholar]
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Lu, C.-G. A generalization of Shannon's information theory. Int. J. Gen. Syst. 1999, 28, 453–490. [Google Scholar] [CrossRef]
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
- Meyer, S.; Cornillère, V.; Djelouah, A.; Schroers, C.; Gross, M. Deep Video Color Propagation. arXiv 2018, arXiv:1808.03232. [Google Scholar]
- Collobert, R.; Kavukcuoglu, K.; Farabet, C. Torch7: A MatLab-Like Environment for Machine Learning. BigLearn, NIPS Workshop. no. EPFL-CONF-192376. 2011. Available online: https://infoscience.epfl.ch/record/192376/ (accessed on 15 October 2020).
- Huang, X.; Shi, J.; Yang, J.; Yao, J. Evaluation of Color Image Quality Based on Mean Square Error and Peak Signal-to-Noise Ratio of Color Difference. Acta Photonica Sin. 2007, S1, 295–298. [Google Scholar]
- Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Aitken, A.P.; Tejani, A.; Totz, J.; Wang, Z.; Shi, W. Photo-realistic single image super-resolution using a generative adversarial network. arXiv 2016, arXiv:1609.04802. [Google Scholar]
- Zhu, J.; Krähenbühl, P.; Shechtman, E.; Efros, A.A. Generative visual manipulation on the natural image manifold. Eur. Conf. Comput. Vis. 2016, 9909, 597–613. [Google Scholar]
- Sajjadi, M.S.M.; Schölkopf, B.; Hirsch, M. EnhanceNet: Single image super-resolution through automated texture synthesis. arXiv 2016, arXiv:1612.07919. [Google Scholar]
Table 1. Image names and acquisition dates of the UAV and Landsat-8 datasets.

| Dataset | Image Names | Image Dates |
|---|---|---|
| UAV | I4 | April 2019 |
| | I5 | May 2019 |
| | I6 | June 2019 |
| | I7 | July 2019 |
| | I8 | August 2019 |
| Landsat-8 | I4 | April 2013 |
| | I7 | July 2013 |
| | I9 | September 2013 |
| | I11 | November 2013 |
| | I12 | December 2013 |
Table 2. Interpolation strategy for generating missing UAV images from 2017 to 2019; point colors correspond to the curve colors in Figure 6.

| Color of Points | Training Triplet Images | Testing Images | Output Images |
|---|---|---|---|
| Red | April, May, January 2019 | April, May 2018 | January 2018 |
| | April, May, March 2019 | | March 2018 |
| | April, May, June 2019 | | June 2018 |
| | April, May, July 2019 | | July 2018 |
| | April, May, August 2019 | | August 2018 |
| Green | July, October, November 2017 | July, October 2018 | November 2018 |
| | July, August, September 2018 | July, August 2019 | September 2019 |
| | July, August, October 2018 | | October 2019 |
| | July, August, November 2018 | | November 2019 |
| | July, August, December 2018 | | December 2019 |
| Cyan-blue | October, November, January 2018 | October, November 2017 | January 2017 |
| | October, November, February 2018 | | February 2017 |
| | October, November, March 2018 | | March 2017 |
| | October, November, April 2018 | | April 2017 |
| | October, November, May 2018 | | May 2017 |
| | October, November, June 2018 | | June 2017 |
| | October, November, August 2018 | | August 2017 |
| | October, November, September 2018 | | September 2017 |
| | October, November, December 2018 | | December 2017 |
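As a hedged illustration of how the triplet strategy in Table 2 can be assembled (our own sketch; the `images` mapping and month keys are illustrative, not the authors' data pipeline):

```python
from typing import Dict, Iterable, Tuple
import numpy as np

Strategy = Iterable[Tuple[Tuple[str, str], str]]

def make_triplets(images: Dict[str, np.ndarray], strategy: Strategy):
    # Pair the two fixed input months with each target month.
    for (month_a, month_b), target in strategy:
        yield images[month_a], images[month_b], images[target]

# Red group of Table 2: inputs April and May 2019, targets the other
# 2019 months; the trained model then takes April and May 2018 as
# testing inputs to output the corresponding 2018 months.
red_2019: Strategy = [
    (("2019-04", "2019-05"), "2019-01"),
    (("2019-04", "2019-05"), "2019-03"),
    (("2019-04", "2019-05"), "2019-06"),
    (("2019-04", "2019-05"), "2019-07"),
    (("2019-04", "2019-05"), "2019-08"),
]
```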
Table 3. Entropy and RMSE between block interpolated results and reference blocks on the UAV and Landsat-8 datasets (two interpolated results per reference block; result and reference thumbnails omitted).

| Dataset | Block | Entropy | RMSE (Pixel) |
|---|---|---|---|
| UAV | I6 | 3.719 | 1.052 |
| | | 3.723 | 1.077 |
| | I7 | 3.441 | 1.070 |
| | | 3.450 | 1.294 |
| | I8 | 3.498 | 1.116 |
| | | 3.508 | 1.429 |
| Landsat-8 | I9 | 3.143 | 0.817 |
| | | 3.145 | 1.112 |
| | I11 | 3.842 | 1.233 |
| | | 3.846 | 1.321 |
| | I12 | 3.545 | 1.040 |
| | | 3.550 | 1.476 |
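For reference, the two indicators reported in these tables can be computed as below. This is our own minimal sketch; the 8-bit grey-level range and base-2 logarithm are assumptions, not details taken from the paper.

```python
import numpy as np

def image_entropy(img: np.ndarray, levels: int = 256) -> float:
    # Shannon entropy (bits) of the grey-level histogram; higher values
    # indicate more detail, i.e. less blurring.
    hist, _ = np.histogram(img, bins=levels, range=(0, levels - 1))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def rmse(pred: np.ndarray, ref: np.ndarray) -> float:
    # Per-pixel root-mean-square error against the reference block.
    diff = pred.astype(np.float64) - ref.astype(np.float64)
    return float(np.sqrt((diff ** 2).mean()))
```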
Table 4. Entropy and RMSE between block interpolated results and reference blocks on the UAV and Landsat-8 datasets for the two compared methods (two interpolated results per reference block; thumbnails omitted).

| Dataset | Block | Entropy | RMSE (Pixel) |
|---|---|---|---|
| UAV | I6 | 3.719 | 1.052 |
| | | 3.701 | 1.320 |
| | I7 | 3.441 | 1.070 |
| | | 3.440 | 1.888 |
| | I8 | 3.498 | 1.116 |
| | | 3.477 | 2.369 |
| Landsat-8 | I9 | 3.143 | 0.817 |
| | | 3.125 | 1.550 |
| | I11 | 3.842 | 1.233 |
| | | 3.572 | 1.957 |
| | I12 | 3.545 | 1.040 |
| | | 3.541 | 1.769 |
Table 5. Entropy and RMSE between scene interpolated results and reference scene images (two interpolated results per reference scene; thumbnails omitted).

| Scene | Entropy | RMSE (Pixel) |
|---|---|---|
| I6 | 3.650 | 1.124 |
| | 3.656 | 1.163 |
| I7 | 3.346 | 1.017 |
| | 3.346 | 1.381 |
| I8 | 3.494 | 1.210 |
| | 3.506 | 1.550 |
Table 6. RMSE between block interpolated results and reference blocks for stacks of 1 × 1, 2 × 2, and 3 × 3 convolution layers (thumbnails omitted).

| Stacked Layers | Block | RMSE (Pixel) |
|---|---|---|
| 1 × 1 | I6 | 1.375 |
| | I7 | 1.556 |
| | I8 | 1.666 |
| 2 × 2 | I6 | 1.258 |
| | I7 | 1.431 |
| | I8 | 1.465 |
| 3 × 3 | I6 | 1.077 |
| | I7 | 1.294 |
| | I8 | 1.429 |
Table 7. RMSE between block interpolated results and reference blocks using average and maximum pooling (thumbnails omitted).

| Pooling Type | Block | RMSE (Pixel) |
|---|---|---|
| Average pooling | I6 | 1.326 |
| | I7 | 1.492 |
| | I8 | 1.700 |
| Maximum pooling | I6 | 1.077 |
| | I7 | 1.294 |
| | I8 | 1.429 |
Table 8. RMSE between block interpolated results and reference blocks for different temporal gaps between testing blocks (thumbnails omitted).

| Testing Image Dates | Block | RMSE (Pixel) |
|---|---|---|
| April, May 2018 | I6 | 1.341 |
| April, October 2018 | | 3.912 |
| April, December 2018 | | 3.989 |
| April, May 2018 | I7 | 1.498 |
| April, October 2018 | | 5.096 |
| April, December 2018 | | 5.271 |
| April, May 2018 | I8 | 1.653 |
| April, October 2018 | | 5.313 |
| April, December 2018 | | 5.568 |
Table 9. RMSE between block interpolated results and reference blocks for separable convolution kernel sizes 11, 13, and 15 (thumbnails omitted).

| Kernel Size | Block | RMSE (Pixel) |
|---|---|---|
| 11 | I6 | 1.077 |
| | I7 | 1.294 |
| | I8 | 1.429 |
| 13 | I6 | 1.178 |
| | I7 | 1.446 |
| | I8 | 1.765 |
| 15 | I6 | 1.217 |
| | I7 | 1.656 |
| | I8 | 1.868 |
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Jin, X.; Tang, P.; Houet, T.; Corpetti, T.; Alvarez-Vanhard, E.G.; Zhang, Z. Sequence Image Interpolation via Separable Convolution Network. Remote Sens. 2021, 13, 296. https://doi.org/10.3390/rs13020296