RegGAN: An End-to-End Network for Building Footprint Generation with Boundary Regularization
Figure 1. The building masks extracted by FC-DenseNet [3] and RegGAN at large scale and two zoomed-in areas.
Figure 2. Overview of the RegGAN.
Figure 3. Aerial images in the ISPRS dataset (GSD: 5 cm/pixel).
Figure 4. Aerial imagery from the INRIA dataset (GSD: 30 cm/pixel).
Figure 5. Results obtained from (a) FCN-8s [11], (b) U-Net [13], (c) SegNet [12], (d) FC-DenseNet [3], (e) HRNet [14], (f) HA U-Net [34], (g) EPUNet [35], (h) ESFNet [36], (i) two-stage method [6], and (j) RegGAN. (k,l) are the corresponding remote sensing image and ground reference on the ISPRS dataset (GSD: 5 cm/pixel).
Figure 6. Results obtained from (a) FCN-8s [11], (b) U-Net [13], (c) SegNet [12], (d) FC-DenseNet [3], (e) HRNet [14], (f) HA U-Net [34], (g) EPUNet [35], (h) ESFNet [36], (i) two-stage method [6], and (j) RegGAN. (k,l) are the corresponding remote sensing image and ground reference on the INRIA dataset (GSD: 30 cm/pixel).
Figure 7. Results obtained from (a) RegGAN (no regularized loss), (b) RegGAN (no multiscale discriminator), and (c) RegGAN. (d,e) are the corresponding remote sensing image and ground reference on the ISPRS dataset (GSD: 5 cm/pixel).
Figure 8. Results obtained from (a) RegGAN (no regularized loss), (b) RegGAN (no multiscale discriminator), and (c) RegGAN. (d,e) are the corresponding remote sensing image and ground reference on the INRIA dataset (GSD: 30 cm/pixel).
Figure 9. Time efficiency of different methods.
Abstract
1. Introduction
2. Related Work
2.1. Semantic Segmentation of Building Footprints
2.2. Regularization of Building Footprints
3. Methodology
3.1. Overview of RegGAN
3.2. Objective Function of RegGAN
4. Experiments
4.1. Dataset
4.2. Experiment Setup
4.3. Training Details
4.4. Evaluation Metrics
4.4.1. Mask Metrics
4.4.2. Boundary Metrics
5. Results
6. Discussion
6.1. Ablation Study
6.2. Time Efficiency of Different Methods
7. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
1. Zhang, Y. Optimisation of building detection in satellite images by combining multispectral classification and texture filtering. ISPRS J. Photogramm. Remote Sens. 1999, 54, 50–60.
2. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep learning in remote sensing: A comprehensive review and list of resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36.
3. Jégou, S.; Drozdzal, M.; Vazquez, D.; Romero, A.; Bengio, Y. The one hundred layers tiramisu: Fully convolutional densenets for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21 July 2017; pp. 11–19.
4. Ling, F.; Li, X.; Xiao, F.; Fang, S.; Du, Y. Object-based sub-pixel mapping of buildings incorporating the prior shape information from remotely sensed imagery. Int. J. Appl. Earth Obs. Geoinf. 2012, 18, 283–292.
5. Wei, S.; Ji, S.; Lu, M. Toward automatic building footprint delineation from aerial images using cnn and regularization. IEEE Trans. Geosci. Remote Sens. 2019, 58, 2178–2189.
6. Zorzi, S.; Bittner, K.; Fraundorfer, F. Machine-learned Regularization and Polygonization of Building Segmentation Masks. In Proceedings of the 2020 International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021.
7. Huang, X.; Zhang, L. A multidirectional and multiscale morphological index for automatic building extraction from multispectral GeoEye-1 imagery. Photogramm. Eng. Remote Sens. 2011, 77, 721–732.
8. Karantzalos, K.; Argialas, D. A region-based level set segmentation for automatic detection of human-made objects from aerial and satellite images. Photogramm. Eng. Remote Sens. 2009, 75, 667–677.
9. Cote, M.; Saeedi, P. Automatic rooftop extraction in nadir aerial imagery of suburban regions using corners and variational level set evolution. IEEE Trans. Geosci. Remote Sens. 2012, 51, 313–328.
10. Senaras, C.; Ozay, M.; Vural, F.T.Y. Building detection with decision fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2013, 6, 1295–1304.
11. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
12. Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
13. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2015; pp. 234–241.
14. Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y.; Liu, D.; Mu, Y.; Tan, M.; Wang, X.; et al. Deep high-resolution representation learning for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3349–3364.
15. Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Convolutional neural networks for large-scale remote-sensing image classification. IEEE Trans. Geosci. Remote Sens. 2016, 55, 645–657.
16. Bischke, B.; Helber, P.; Folz, J.; Borth, D.; Dengel, A. Multi-task learning for segmentation of building footprints with deep neural networks. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 1480–1484.
17. Li, Q.; Shi, Y.; Huang, X.; Zhu, X.X. Building Footprint Generation by Integrating Convolution Neural Network With Feature Pairwise Conditional Random Field (FPCRF). IEEE Trans. Geosci. Remote Sens. 2020, 58, 7502–7519.
18. Xie, Y.; Zhu, J.; Cao, Y.; Feng, D.; Hu, M.; Li, W.; Zhang, Y.; Fu, L. Refined Extraction Of Building Outlines From High-Resolution Remote Sensing Imagery Based on a Multifeature Convolutional Neural Network and Morphological Filtering. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 1842–1855.
19. Xia, L.; Zhang, X.; Zhang, J.; Wu, W.; Gao, X. Refined extraction of buildings with the semantic edge-assisted approach from very high-resolution remotely sensed imagery. Int. J. Remote Sens. 2020, 41, 8352–8365.
20. Zhao, K.; Kang, J.; Jung, J.; Sohn, G. Building Extraction From Satellite Images Using Mask R-CNN With Building Boundary Regularization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 247–251.
21. Marcos, D.; Tuia, D.; Kellenberger, B.; Zhang, L.; Bai, M.; Liao, R.; Urtasun, R. Learning deep structured active contours end-to-end. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8877–8885.
22. Kass, M.; Witkin, A.; Terzopoulos, D. Snakes: Active contour models. Int. J. Comput. Vis. 1988, 1, 321–331.
23. Cheng, D.; Liao, R.; Fidler, S.; Urtasun, R. Darnet: Deep active ray network for building segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 7431–7439.
24. Chen, Q.; Wang, L.; Waslander, S.L.; Liu, X. An end-to-end shape modeling framework for vectorized building outline generation from aerial images. ISPRS J. Photogramm. Remote Sens. 2020, 170, 114–126.
25. Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660.
26. Tang, M.; Perazzi, F.; Djelouah, A.; Ben Ayed, I.; Schroers, C.; Boykov, Y. On regularized losses for weakly-supervised cnn segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 507–522.
27. Tang, M.; Djelouah, A.; Perazzi, F.; Boykov, Y.; Schroers, C. Normalized cut loss for weakly-supervised cnn segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1818–1827.
28. Boykov, Y.; Veksler, O.; Zabih, R. Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 1222–1239.
29. Shi, J.; Malik, J. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 888–905.
30. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein gan. arXiv 2017, arXiv:1701.07875.
31. ISPRS. Available online: http://www2.isprs.org/commissions/comm3/wg4/semantic-labeling.html (accessed on 1 June 2019).
32. Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Can Semantic Labeling Methods Generalize Benchmark to Any City? The Inria Aerial Image Labeling. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017.
33. Marmanis, D.; Schindler, K.; Wegner, J.D.; Galliani, S.; Datcu, M.; Stilla, U. Classification with an edge: Improving semantic image segmentation with boundary detection. ISPRS J. Photogramm. Remote Sens. 2018, 135, 158–172.
34. Xu, L.; Liu, Y.; Yang, P.; Chen, H.; Zhang, H.; Wang, D.; Zhang, X. HA U-Net: Improved Model for Building Extraction From High Resolution Remote Sensing Imagery. IEEE Access 2021, 9, 101972–101984.
35. Guo, H.; Shi, Q.; Marinoni, A.; Du, B.; Zhang, L. Deep building footprint update network: A semi-supervised method for updating existing building footprint from bi-temporal remote sensing images. Remote Sens. Environ. 2021, 264, 112589.
36. Lin, J.; Jing, W.; Song, H.; Chen, G. ESFNet: Efficient network for building extraction from high-resolution aerial images. IEEE Access 2019, 7, 54285–54294.
37. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
38. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
39. Li, Q.; Mou, L.; Hua, Y.; Shi, Y.; Zhu, X.X. Building Footprint Generation Through Convolutional Neural Networks With Attraction Field Representation. IEEE Trans. Geosci. Remote Sens. 2021.
40. Kokkinos, I. Boundary detection using F-measure-, filter- and feature- (F3) boost. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2010; pp. 650–663.
41. Sobel, I. An Isotropic 3 × 3 Gradient Operator, Machine Vision for Three-Dimensional Scenes; Freeman, H., Ed.; Academic Press: Cambridge, MA, USA, 1990; pp. 376–379.
42. Wang, S.; Bai, M.; Mattyus, G.; Chu, H.; Luo, W.; Yang, B.; Liang, J.; Cheverie, J.; Fidler, S.; Urtasun, R. Torontocity: Seeing the world with a million eyes. arXiv 2016, arXiv:1612.00423.
43. Arkin, E.M.; Chew, L.P.; Huttenlocher, D.P.; Kedem, K.; Mitchell, J.S. An Efficiently Computable Metric for Comparing Polygonal Shapes; Technical Report; Cornell Univ: Ithaca, NY, USA, 1991.
44. Fan, H.; Zipf, A.; Fu, Q.; Neis, P. Quality assessment for building footprints data on OpenStreetMap. Int. J. Geogr. Inf. Sci. 2014, 28, 700–719.

Method | Mask F1-Score | Mask IoU | Boundary SIM | Boundary F-Measure |
---|---|---|---|---|
FCN-8s [11] | 81.82 | 69.23 | 52.80 | 18.71 |
U-Net [13] | 85.37 | 74.48 | 58.11 | 19.32 |
SegNet [12] | 87.81 | 78.28 | 54.84 | 17.11 |
FC-DenseNet [3] | 88.34 | 79.11 | 58.91 | 20.76 |
HRNet [14] | 85.82 | 75.16 | 55.77 | 17.96 |
HA U-Net [34] | 88.09 | 79.00 | 59.20 | 20.59 |
EPUNet [35] | 88.52 | 79.41 | 58.63 | 16.77 |
ESFNet [36] | 88.65 | 80.23 | 57.76 | 19.67 |
Two-stage method [6] | 87.86 | 78.35 | 64.01 | 19.56 |
RegGAN | 90.40 | 82.48 | 65.94 | 22.27 |

Method | Mask F1-Score | Mask IoU | Boundary SIM | Boundary F-Measure |
---|---|---|---|---|
FCN-8s [11] | 84.79 | 73.60 | 68.96 | 27.01 |
U-Net [13] | 84.83 | 73.66 | 69.48 | 28.98 |
SegNet [12] | 84.43 | 73.05 | 68.68 | 28.16 |
FC-DenseNet [3] | 84.66 | 73.41 | 67.94 | 28.96 |
HRNet [14] | 81.52 | 68.81 | 66.02 | 23.75 |
HA U-Net [34] | 84.28 | 72.82 | 69.18 | 26.64 |
EPUNet [35] | 83.90 | 72.26 | 68.38 | 25.21 |
ESFNet [36] | 83.65 | 71.90 | 68.35 | 24.63 |
Two-stage method [6] | 84.59 | 73.29 | 69.73 | 29.56 |
RegGAN | 86.74 | 76.50 | 71.44 | 32.17 |

Method | Mask F1-Score | Mask IoU | Boundary SIM | Boundary F-Measure |
---|---|---|---|---|
RegGAN (no regularized loss) | 88.91 | 80.03 | 63.40 | 21.51 |
RegGAN (no multiscale discriminator) | 87.71 | 78.12 | 63.29 | 17.18 |
RegGAN | 90.40 | 82.48 | 65.94 | 22.27 |

Method | Mask F1-Score | Mask IoU | Boundary SIM | Boundary F-Measure |
---|---|---|---|---|
RegGAN (no regularized loss) | 85.60 | 74.83 | 69.51 | 29.20 |
RegGAN (no multiscale discriminator) | 83.56 | 71.77 | 69.78 | 27.49 |
RegGAN | 86.74 | 76.50 | 71.44 | 32.17 |
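
The mask columns in the tables above report F1-score and IoU. As a minimal sketch only, assuming the conventional pixel-wise definitions (the exact evaluation code is not reproduced here), both can be derived from true-positive, false-positive, and false-negative counts over binary building masks. The function name `mask_metrics`, the toy masks, and the convention of returning 1.0 when both masks are empty are illustrative assumptions.

```python
def mask_metrics(pred, ref):
    """Return (f1, iou) for two binary masks given as nested lists of 0/1.

    Assumes the standard pixel-wise definitions:
      F1  = 2*TP / (2*TP + FP + FN)
      IoU = TP / (TP + FP + FN)
    """
    tp = fp = fn = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1          # building pixel predicted correctly
            elif p and not r:
                fp += 1          # predicted building, reference background
            elif r and not p:
                fn += 1          # missed building pixel
    denom = tp + fp + fn
    if denom == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / denom
    return f1, iou


# Hypothetical 2 x 3 toy masks (1 = building, 0 = background).
pred = [[1, 1, 0],
        [0, 1, 0]]
ref = [[1, 0, 0],
       [0, 1, 1]]
f1, iou = mask_metrics(pred, ref)
print(f"F1 = {f1:.4f}, IoU = {iou:.4f}")
```

Under these definitions the two scores are tied by F1 = 2·IoU / (1 + IoU) when both come from the same confusion counts, so they rank methods identically on a single set of counts.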
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, Q.; Zorzi, S.; Shi, Y.; Fraundorfer, F.; Zhu, X.X. RegGAN: An End-to-End Network for Building Footprint Generation with Boundary Regularization. Remote Sens. 2022, 14, 1835. https://doi.org/10.3390/rs14081835