Abstract
In this paper, we propose an unsupervised, unified multi-domain image-to-image translation model for weather domain translation. Most existing multi-domain image-to-image translation methods can translate fine details such as facial attributes. However, a translation model between weather domains, e.g., sunny-to-snowy or sunny-to-rainy, must cope with a large domain gap. To address this challenging problem, we propose WeatherGAN, built on our proposed UResNet generator. Our model consists of the UResNet generator, a PatchGAN discriminator, and a VGG perceptual encoder. UResNet combines U-Net and ResNet to exploit the strengths of each: preserving the context of the input and generating realistic images. The PatchGAN discriminator encourages the generator to produce realistic images of the target domain by criticizing patch-wise details. We also leverage the VGG perceptual encoder as a loss network, which guides the generator to minimize the perceptual distance between the input and generated images, enhancing output quality. Through extensive experiments on the Alps, YouTube driving (our benchmark dataset), and BDD datasets, we demonstrate that WeatherGAN produces more convincing target-domain results than the baselines. In addition, we conduct a data augmentation experiment to show the usability of images generated by WeatherGAN: the overall object detection performance of YOLOv3 on the BDD dataset improves when our generated images are added.
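The two loss ideas named in the abstract can be illustrated numerically. The sketch below is not the paper's implementation: it assumes LSGAN-style least-squares targets for the patch discriminator and a plain mean-squared distance in feature space for the perceptual term; the function names and the 4x4 score map are illustrative only.

```python
import numpy as np

def patch_adversarial_loss(score_map: np.ndarray, is_real: bool) -> float:
    """Patch-wise adversarial loss (least-squares variant, assumed here):
    every entry of the discriminator's score map judges one local image
    patch, and the loss averages the per-patch errors against the
    real (1.0) or fake (0.0) target."""
    target = 1.0 if is_real else 0.0
    return float(np.mean((score_map - target) ** 2))

def perceptual_distance(feat_a: np.ndarray, feat_b: np.ndarray) -> float:
    """Perceptual distance between two feature maps (e.g. activations a
    pretrained VGG encoder would produce for the input image and the
    translated image): mean squared difference in feature space."""
    return float(np.mean((feat_a - feat_b) ** 2))

# A 4x4 score map standing in for a patch discriminator's output grid:
scores = np.full((4, 4), 0.9)  # close to the "real" target
print(patch_adversarial_loss(scores, is_real=True))   # small loss
print(patch_adversarial_loss(scores, is_real=False))  # large loss
```

Averaging over the score grid is what makes the criticism patch-wise: each entry penalizes unrealistic texture in its own receptive field, rather than one scalar judging the whole image.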
References
Alami Mejjati Y, Richardt C, Tompkin J, Cosker D, Kim KI (2018) Unsupervised attention-guided image-to-image translation. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 3693–3703
Anoosheh A, Agustsson E, Timofte R, Van Gool L (2018) Combogan: Unrestrained scalability for image domain translation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp 896–8967
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. In: Precup D, Teh YW (eds) Proceedings of the 34th International Conference on Machine Learning, vol 70, PMLR, pp 214–223
Bińkowski M, Sutherland DJ, Arbel M, Gretton A (2018) Demystifying mmd gans. In: International conference on learning representations
Chen W, Hays J (2018) Sketchygan: Towards diverse and realistic sketch to image synthesis. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 9416–9425. https://doi.org/10.1109/CVPR.2018.00981
Chen Q, Koltun V (2017) Photographic image synthesis with cascaded refinement networks. In: The IEEE international conference on computer vision (ICCV)
Choi Y, Choi M, Kim M, Ha J, Kim S, Choo J (2018) Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 8789–8797
Gatys LA, Ecker AS, Bethge M (2016) Image style transfer using convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2414–2423
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in neural information processing systems, vol 27, Curran Associates Inc, pp 2672–2680
Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville A (2017) Improved training of wasserstein gans. In: Advances in neural information processing systems, pp 5767–5777
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 770–778
Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S (2017) Gans trained by a two time-scale update rule converge to a local nash equilibrium. In: Advances in neural information processing systems, pp 6626–6637
Hoffman J, Tzeng E, Park T, Zhu J-Y, Isola P, Saenko K, Efros AA, Darrell T (2017) Cycada: Cycle-consistent adversarial domain adaptation, arXiv:1711.03213
Hong W, Wang Z, Yang M, Yuan J (2018) Conditional generative adversarial network for structured domain adaptation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 1335–1344. https://doi.org/10.1109/CVPR.2018.00145
Huang X, Liu M-Y, Belongie S, Kautz J (2018) Multimodal unsupervised image-to-image translation. In: Proceedings of the european conference on computer vision (ECCV), pp 172–189
Isokane T, Okura F, Ide A, Matsushita Y, Yagi Y (2018) Probabilistic plant modeling via multi-view image-to-image translation. In: The IEEE conference on computer vision and pattern recognition (CVPR)
Isola P, Zhu J, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 5967–5976
Johnson J, Alahi A, Fei-Fei L (2016) Perceptual losses for real-time style transfer and super-resolution. In: ECCV
Kazemi H, Soleymani S, Taherkhani F, Iranmanesh S, Nasrabadi N (2018) Unsupervised image-to-image translation using domain-specific variational information bound. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 10348–10358
Kohl S, Romera-Paredes B, Meyer C, De Fauw J, Ledsam JR, Maier-Hein K, Eslami SMA, Jimenez Rezende D, Ronneberger O (2018) A probabilistic u-net for segmentation of ambiguous images. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 6965–6975
Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, Shi W (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 105–114. https://doi.org/10.1109/CVPR.2017.19
Lee H-Y, Tseng H-Y, Huang J-B, Singh M, Yang M-H (2018) Diverse image-to-image translation via disentangled representations. In: Proceedings of the European conference on computer vision (ECCV), pp 35–51
Liu AH, Liu Y-C, Yeh Y-Y, Wang Y-CF (2018) A unified feature disentangler for multi-domain image translation and manipulation. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 2590–2599
Mo S, Cho M, Shin J (2019) Instagan: Instance-aware image-to-image translation. In: International conference on learning representations. https://openreview.net/forum?id=ryxwJhC9YX
Redmon J, Farhadi A (2018) Yolov3: An incremental improvement, arXiv:1804.02767
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: MICCAI
Siddiquee MMR, Zhou Z, Tajbakhsh N, Feng R, Gotway MB, Bengio Y, Liang J (2019) Learning fixed points in generative adversarial networks: From image-to-image translation to disease detection and localization. In: Proceedings of the IEEE international conference on computer vision, pp 191–200
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition, arXiv:1409.1556
Tulyakov S, Liu M-Y, Yang X, Kautz J (2018) Mocogan: Decomposing motion and content for video generation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1526–1535
Wang T, Liu M, Zhu J, Tao A, Kautz J, Catanzaro B (2018) High-resolution image synthesis and semantic manipulation with conditional gans. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 8798–8807
Wang Y, Tao X, Qi X, Shen X, Jia J (2018) Image inpainting via generative multi-column convolutional neural networks. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 331–340
Wang Y, van de Weijer J, Herranz L (2018) Mix and match networks: Encoder-decoder alignment for zero-pair image translation. In: The IEEE conference on computer vision and pattern recognition (CVPR)
Yi Z, Zhang H, Tan P, Gong M (2017) Dualgan: Unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE international conference on computer vision, pp 2849–2857
Yu J, Lin Z, Yang J, Shen X, Lu X, Huang TS (2018) Generative image inpainting with contextual attention. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 5505–5514
Yu F, Xian W, Chen Y, Liu F, Liao M, Madhavan V, Darrell T (2018) Bdd100k: A diverse driving video database with scalable annotation tooling, arXiv:1805.04687
Zhang R, Isola P, Efros AA, Shechtman E, Wang O (2018) The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 586–595
Zhu J, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: 2017 IEEE international conference on computer vision (ICCV), pp 2242–2251
Zhu J-Y, Zhang R, Pathak D, Darrell T, Efros AA, Wang O, Shechtman E (2017) Toward multimodal image-to-image translation. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30, Curran Associates Inc, pp 465–476
Acknowledgements
This project was partly supported by the National Research Foundation of Korea grant funded by the Korea government (MSIT) (No. 2022R1A2B5B02001467) and the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (No. 2018-0-00769: Neuromorphic Computing Software Platform for Artificial Intelligence Systems, No. 2020-0-01361: Artificial Intelligence Graduate School Program (YONSEI UNIVERSITY)).
Cite this article
Hwang, S., Jeon, S., Ma, YS. et al. WeatherGAN: Unsupervised multi-weather image-to-image translation via single content-preserving UResNet generator. Multimed Tools Appl 81, 40269–40288 (2022). https://doi.org/10.1007/s11042-022-12934-9