
WeatherGAN: Unsupervised multi-weather image-to-image translation via single content-preserving UResNet generator

Published in: Multimedia Tools and Applications

Abstract

In this paper, we propose an unsupervised, unified multi-domain image-to-image translation model for weather-domain translation. Most existing multi-domain image-to-image translation methods can translate fine details such as facial attributes; a translation model between weather domains, e.g., sunny-to-snowy or sunny-to-rainy, however, has to bridge a much larger domain gap. To address this challenging problem, we propose WeatherGAN, built on a proposed UResNet generator. Our model consists of the UResNet generator, a PatchGAN discriminator, and a VGG perceptual encoder. UResNet combines U-Net and ResNet to exploit the strengths of each model: preserving input context information and generating realistic images. The PatchGAN discriminator encourages the generator to produce realistic images of the target domain by criticizing patch-wise details. We also leverage the VGG perceptual encoder as a loss network, which guides the generator to minimize the perceptual distance between the input image and the generated image, enhancing the quality of the outputs. Through extensive experiments on the Alps, YouTube driving (our benchmark dataset), and BDD datasets, we demonstrate that WeatherGAN produces more satisfactory target-domain results than the baselines. In addition, we conduct a data augmentation task to show the usability of the images generated by WeatherGAN: augmenting with our results improves the overall object detection performance of YOLO v3 on the BDD dataset.
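The abstract describes two of the model's components concretely enough to sketch: a generator that wraps a ResNet-style residual bottleneck in U-Net-style skip connections, and a PatchGAN discriminator that scores patch-wise realism. Below is a minimal PyTorch sketch of how such a pairing might be wired; the layer widths, depths, and normalization choices are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block: conv-norm-relu-conv-norm with an identity shortcut."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch),
        )

    def forward(self, x):
        return x + self.body(x)

class UResNetGenerator(nn.Module):
    """U-Net-style skip connections around a ResNet bottleneck (hypothetical sizes)."""
    def __init__(self, in_ch=3, base=32, n_res=4):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 4, 2, 1), nn.ReLU(inplace=True))
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 4, 2, 1), nn.ReLU(inplace=True))
        self.res = nn.Sequential(*[ResBlock(base * 2) for _ in range(n_res)])
        # Decoder inputs are doubled by concatenated skip features.
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(base * 4, base, 4, 2, 1), nn.ReLU(inplace=True))
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(base * 2, in_ch, 4, 2, 1), nn.Tanh())

    def forward(self, x):
        e1 = self.enc1(x)                                # base x H/2 x W/2
        e2 = self.enc2(e1)                               # 2*base x H/4 x W/4
        b = self.res(e2)                                 # residual bottleneck
        d2 = self.dec2(torch.cat([b, e2], dim=1))        # skip connection from enc2
        return self.dec1(torch.cat([d2, e1], dim=1))     # skip connection from enc1

class PatchDiscriminator(nn.Module):
    """PatchGAN-style critic: outputs a grid of scores, one per receptive-field patch."""
    def __init__(self, in_ch=3, base=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base, base * 2, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 2, 1, 4, 1, 1),  # 1-channel score map, not a single scalar
        )

    def forward(self, x):
        return self.net(x)
```

The skip connections carry the input's spatial layout (roads, buildings, sky) directly to the decoder, while the residual bottleneck does the heavy lifting of the appearance change; this is the "content-preserving" division of labor the abstract attributes to combining U-Net and ResNet. The third component, the VGG perceptual loss, would compare VGG feature maps of input and output images, which is omitted here to keep the sketch self-contained.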


Figs. 1–8 appear in the full article.


References

  1. Alami Mejjati Y, Richardt C, Tompkin J, Cosker D, Kim KI (2018) Unsupervised attention-guided image-to-image translation. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 3693–3703

  2. Anoosheh A, Agustsson E, Timofte R, Van Gool L (2018) Combogan: Unrestrained scalability for image domain translation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 896–8967

  3. Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. In: Precup D, Teh YW (eds) Proceedings of the 34th International Conference on Machine Learning, vol 70, PMLR, pp 214–223

  4. Bińkowski M, Sutherland DJ, Arbel M, Gretton A (2018) Demystifying mmd gans. In: International conference on learning representations

  5. Chen W, Hays J (2018) Sketchygan: Towards diverse and realistic sketch to image synthesis. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 9416–9425. https://doi.org/10.1109/CVPR.2018.00981

  6. Chen Q, Koltun V (2017) Photographic image synthesis with cascaded refinement networks. In: The IEEE international conference on computer vision (ICCV)

  7. Choi Y, Choi M, Kim M, Ha J, Kim S, Choo J (2018) Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 8789–8797

  8. Gatys LA, Ecker AS, Bethge M (2016) Image style transfer using convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2414– 2423

  9. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in neural information processing systems, vol 27, Curran Associates Inc, pp 2672–2680

  10. Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville A (2017) Improved training of wasserstein gans. In: Advances in neural information processing systems, pp 5767–5777

  11. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 770–778

  12. Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S (2017) Gans trained by a two time-scale update rule converge to a local nash equilibrium. In: Advances in neural information processing systems, pp 6626–6637

  13. Hoffman J, Tzeng E, Park T, Zhu J-Y, Isola P, Saenko K, Efros AA, Darrell T (2017) Cycada: Cycle-consistent adversarial domain adaptation, arXiv:1711.03213

  14. Hong W, Wang Z, Yang M, Yuan J (2018) Conditional generative adversarial network for structured domain adaptation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 1335–1344. https://doi.org/10.1109/CVPR.2018.00145

  15. Huang X, Liu M-Y, Belongie S, Kautz J (2018) Multimodal unsupervised image-to-image translation. In: The european conference on computer vision (ECCV)

  17. Isokane T, Okura F, Ide A, Matsushita Y, Yagi Y (2018) Probabilistic plant modeling via multi-view image-to-image translation. In: The IEEE conference on computer vision and pattern recognition (CVPR)

  18. Isola P, Zhu J, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 5967–5976

  19. Johnson J, Alahi A, Fei-Fei L (2016) Perceptual losses for real-time style transfer and super-resolution. In: ECCV

  20. Kazemi H, Soleymani S, Taherkhani F, Iranmanesh S, Nasrabadi N (2018) Unsupervised image-to-image translation using domain-specific variational information bound. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 10348–10358

  21. Kohl S, Romera-Paredes B, Meyer C, De Fauw J, Ledsam JR, Maier-Hein K, Eslami SMA, Jimenez Rezende D, Ronneberger O (2018) A probabilistic u-net for segmentation of ambiguous images. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R. (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 6965–6975

  22. Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, Shi W (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 105–114. https://doi.org/10.1109/CVPR.2017.19

  23. Lee H-Y, Tseng H-Y, Huang J-B, Singh M, Yang M-H (2018) Diverse image-to-image translation via disentangled representations. In: Proceedings of the European conference on computer vision (ECCV), pp 35–51

  24. Liu AH, Liu Y-C, Yeh Y-Y, Wang Y-CF (2018) A unified feature disentangler for multi-domain image translation and manipulation. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R. (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 2590–2599

  25. Mo S, Cho M, Shin J (2019) Instagan: Instance-aware image-to-image translation. In: International conference on learning representations. https://openreview.net/forum?id=ryxwJhC9YX

  26. Redmon J, Farhadi A (2018) Yolov3: An incremental improvement, arXiv:1804.02767

  27. Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: MICCAI

  28. Siddiquee MMR, Zhou Z, Tajbakhsh N, Feng R, Gotway MB, Bengio Y, Liang J (2019) Learning fixed points in generative adversarial networks: From image-to-image translation to disease detection and localization. In: Proceedings of the IEEE international conference on computer vision, pp 191–200

  29. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition, arXiv:1409.1556

  30. Tulyakov S, Liu M-Y, Yang X, Kautz J (2018) Mocogan: Decomposing motion and content for video generation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1526–1535

  31. Wang T, Liu M, Zhu J, Tao A, Kautz J, Catanzaro B (2018) High-resolution image synthesis and semantic manipulation with conditional gans. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 8798–8807

  32. Wang Y, Tao X, Qi X, Shen X, Jia J (2018) Image inpainting via generative multi-column convolutional neural networks. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates Inc, pp 331–340

  33. Wang Y, van de Weijer J, Herranz L (2018) Mix and match networks: Encoder-decoder alignment for zero-pair image translation. In: The IEEE conference on computer vision and pattern recognition (CVPR)

  34. Yi Z, Zhang H, Tan P, Gong M (2017) Dualgan: Unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE international conference on computer vision, pp 2849–2857

  35. Yu J, Lin Z, Yang J, Shen X, Lu X, Huang TS (2018) Generative image inpainting with contextual attention. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 5505– 5514

  36. Yu F, Xian W, Chen Y, Liu F, Liao M, Madhavan V, Darrell T (2018) Bdd100k: A diverse driving video database with scalable annotation tooling, arXiv:1805.04687

  37. Zhang R, Isola P, Efros AA, Shechtman E, Wang O (2018) The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 586–595

  38. Zhu J, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: 2017 IEEE international conference on computer vision (ICCV), pp 2242– 2251

  39. Zhu J-Y, Zhang R, Pathak D, Darrell T, Efros AA, Wang O, Shechtman E (2017) Toward multimodal image-to-image translation. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30, Curran Associates Inc, pp 465–476


Acknowledgements

This project was partly supported by the National Research Foundation of Korea grant funded by the Korea government (MSIT) (No. 2022R1A2B5B02001467) and the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (No. 2018-0-00769: Neuromorphic Computing Software Platform for Artificial Intelligence Systems, No. 2020-0-01361: Artificial Intelligence Graduate School Program (YONSEI UNIVERSITY)).

Author information

Corresponding author

Correspondence to Hyeran Byun.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Hwang, S., Jeon, S., Ma, YS. et al. WeatherGAN: Unsupervised multi-weather image-to-image translation via single content-preserving UResNet generator. Multimed Tools Appl 81, 40269–40288 (2022). https://doi.org/10.1007/s11042-022-12934-9

