Abstract
In this paper, we propose an unsupervised, unified multi-domain image-to-image translation model for weather domain translation. Most existing multi-domain image-to-image translation methods can translate fine details such as facial attributes. However, a translation model between weather domains, e.g., sunny-to-snowy or sunny-to-rainy, must cope with a large domain gap. To address this challenging problem, we propose WeatherGAN, built on our proposed UResNet generator. Our model consists of the UResNet generator, a PatchGAN discriminator, and a VGG perceptual encoder. UResNet combines U-Net and ResNet to exploit the strengths of each: preserving the context of the input and generating realistic images. The PatchGAN discriminator encourages the generator to produce realistic images of the target domain by criticizing patch-wise details. We also leverage the VGG perceptual encoder as a loss network, which guides the generator to minimize the perceptual distance between the input and generated images, enhancing output quality. Through extensive experiments on the Alps, YouTube driving (our benchmark dataset), and BDD datasets, we demonstrate that WeatherGAN produces more convincing target-domain results than the baselines. In addition, we conduct a data augmentation experiment to show the usability of images generated by WeatherGAN: the overall object detection performance of YOLOv3 on the BDD dataset improves when our generated images are added.
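The two loss ideas named in the abstract can be illustrated numerically. The sketch below is not the paper's implementation: it assumes LSGAN-style least-squares targets for the patch discriminator and a plain mean-squared distance in feature space for the perceptual term; the function names and the 4x4 score map are illustrative only.

```python
import numpy as np

def patch_adversarial_loss(score_map: np.ndarray, is_real: bool) -> float:
    """Patch-wise adversarial loss (least-squares variant, assumed here):
    every entry of the discriminator's score map judges one local image
    patch, and the loss averages the per-patch errors against the
    real (1.0) or fake (0.0) target."""
    target = 1.0 if is_real else 0.0
    return float(np.mean((score_map - target) ** 2))

def perceptual_distance(feat_a: np.ndarray, feat_b: np.ndarray) -> float:
    """Perceptual distance between two feature maps (e.g. activations a
    pretrained VGG encoder would produce for the input image and the
    translated image): mean squared difference in feature space."""
    return float(np.mean((feat_a - feat_b) ** 2))

# A 4x4 score map standing in for a patch discriminator's output grid:
scores = np.full((4, 4), 0.9)  # close to the "real" target
print(patch_adversarial_loss(scores, is_real=True))   # small loss
print(patch_adversarial_loss(scores, is_real=False))  # large loss
```

Averaging over the score grid is what makes the criticism patch-wise: each entry penalizes unrealistic texture in its own receptive field, rather than one scalar judging the whole image.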
References
Alami Mejjati Y, Richardt C, Tompkin J, Cosker D, Kim KI (2018) Unsupervised attention-guided image-to-image translation. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 3693–3703
Anoosheh A, Agustsson E, Timofte R, Van Gool L (2018) Combogan: Unrestrained scalability for image domain translation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp 896–8967
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. In: Precup D, Teh YW (eds) Proceedings of the 34th International Conference on Machine Learning, vol 70, PMLR, pp 214–223
Bińkowski M, Sutherland DJ, Arbel M, Gretton A (2018) Demystifying mmd gans. In: International conference on learning representations
Chen W, Hays J (2018) Sketchygan: Towards diverse and realistic sketch to image synthesis. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 9416–9425. https://doi.org/10.1109/CVPR.2018.00981
Chen Q, Koltun V (2017) Photographic image synthesis with cascaded refinement networks. In: The IEEE international conference on computer vision (ICCV)
Choi Y, Choi M, Kim M, Ha J, Kim S, Choo J (2018) Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 8789–8797
Gatys LA, Ecker AS, Bethge M (2016) Image style transfer using convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2414–2423
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in neural information processing systems, vol 27, Curran Associates Inc, pp 2672–2680
Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville A (2017) Improved training of wasserstein gans. In: Advances in neural information processing systems, pp 5767–5777
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 770–778
Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S (2017) Gans trained by a two time-scale update rule converge to a local nash equilibrium. In: Advances in neural information processing systems, pp 6626–6637
Hoffman J, Tzeng E, Park T, Zhu J-Y, Isola P, Saenko K, Efros AA, Darrell T (2017) Cycada: Cycle-consistent adversarial domain adaptation, arXiv:1711.03213
Hong W, Wang Z, Yang M, Yuan J (2018) Conditional generative adversarial network for structured domain adaptation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 1335–1344. https://doi.org/10.1109/CVPR.2018.00145
Huang X, Liu M-Y, Belongie S, Kautz J (2018) Multimodal unsupervised image-to-image translation. In: Proceedings of the european conference on computer vision (ECCV), pp 172–189
Isokane T, Okura F, Ide A, Matsushita Y, Yagi Y (2018) Probabilistic plant modeling via multi-view image-to-image translation. In: The IEEE conference on computer vision and pattern recognition (CVPR)
Isola P, Zhu J, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 5967–5976
Johnson J, Alahi A, Fei-Fei L (2016) Perceptual losses for real-time style transfer and super-resolution. In: ECCV
Kazemi H, Soleymani S, Taherkhani F, Iranmanesh S, Nasrabadi N (2018) Unsupervised image-to-image translation using domain-specific variational information bound. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 10348–10358
Kohl S, Romera-Paredes B, Meyer C, De Fauw J, Ledsam JR, Maier-Hein K, Eslami SMA, Jimenez Rezende D, Ronneberger O (2018) A probabilistic u-net for segmentation of ambiguous images. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 6965–6975
Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, Shi W (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 105–114. https://doi.org/10.1109/CVPR.2017.19
Lee H-Y, Tseng H-Y, Huang J-B, Singh M, Yang M-H (2018) Diverse image-to-image translation via disentangled representations. In: Proceedings of the European conference on computer vision (ECCV), pp 35–51
Liu AH, Liu Y-C, Yeh Y-Y, Wang Y-CF (2018) A unified feature disentangler for multi-domain image translation and manipulation. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 2590–2599
Mo S, Cho M, Shin J (2019) Instagan: Instance-aware image-to-image translation. In: International conference on learning representations. https://openreview.net/forum?id=ryxwJhC9YX
Redmon J, Farhadi A (2018) Yolov3: An incremental improvement, arXiv:1804.02767
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: MICCAI
Siddiquee MMR, Zhou Z, Tajbakhsh N, Feng R, Gotway MB, Bengio Y, Liang J (2019) Learning fixed points in generative adversarial networks: From image-to-image translation to disease detection and localization. In: Proceedings of the IEEE international conference on computer vision, pp 191–200
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition, arXiv:1409.1556
Tulyakov S, Liu M-Y, Yang X, Kautz J (2018) Mocogan: Decomposing motion and content for video generation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1526–1535
Wang T, Liu M, Zhu J, Tao A, Kautz J, Catanzaro B (2018) High-resolution image synthesis and semantic manipulation with conditional gans. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 8798–8807
Wang Y, Tao X, Qi X, Shen X, Jia J (2018) Image inpainting via generative multi-column convolutional neural networks. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 331–340
Wang Y, van de Weijer J, Herranz L (2018) Mix and match networks: Encoder-decoder alignment for zero-pair image translation. In: The IEEE conference on computer vision and pattern recognition (CVPR)
Yi Z, Zhang H, Tan P, Gong M (2017) Dualgan: Unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE international conference on computer vision, pp 2849–2857
Yu J, Lin Z, Yang J, Shen X, Lu X, Huang TS (2018) Generative image inpainting with contextual attention. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 5505–5514
Yu F, Xian W, Chen Y, Liu F, Liao M, Madhavan V, Darrell T (2018) Bdd100k: A diverse driving video database with scalable annotation tooling, arXiv:1805.04687
Zhang R, Isola P, Efros AA, Shechtman E, Wang O (2018) The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 586–595
Zhu J, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: 2017 IEEE international conference on computer vision (ICCV), pp 2242–2251
Zhu J-Y, Zhang R, Pathak D, Darrell T, Efros AA, Wang O, Shechtman E (2017) Toward multimodal image-to-image translation. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30, Curran Associates Inc, pp 465–476
Acknowledgements
This project was partly supported by the National Research Foundation of Korea grant funded by the Korea government (MSIT) (No. 2022R1A2B5B02001467) and the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (No. 2018-0-00769: Neuromorphic Computing Software Platform for Artificial Intelligence Systems, No. 2020-0-01361: Artificial Intelligence Graduate School Program (YONSEI UNIVERSITY)).
Cite this article
Hwang, S., Jeon, S., Ma, YS. et al. WeatherGAN: Unsupervised multi-weather image-to-image translation via single content-preserving UResNet generator. Multimed Tools Appl 81, 40269–40288 (2022). https://doi.org/10.1007/s11042-022-12934-9