Staged Transformer Network with Color Harmonization for Image Outpainting

Bing Yu¹²,
Wangyidai Lv¹²,
Dongjin Huang¹² &
Youdong Ding¹²

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14496)

Included in the following conference series:

Computer Graphics International Conference

Abstract

Image outpainting aims to generate new, realistic-looking content beyond the original boundaries of a given image patch. Existing image outpainting methods tend to produce erroneous structures and unnatural colors when extrapolating a sub-image on all sides. To address this problem, we propose a Transformer-based staged image outpainting network. Specifically, we restructure the encoder-decoder architecture by adding hierarchical cross attention to the connection in each layer. We propose a staged expanding module that splits the extrapolation into vertical and horizontal steps so that the generated images have consistent contextual information and similar texture. We also present a color harmonization module that adjusts both local and global color information to make color transitions more natural. Our experiments show that the proposed method outperforms advanced existing methods on multiple datasets.
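To make the staged idea concrete, here is a minimal sketch of the control flow only: extend the image vertically, then extend that result horizontally, then apply a global color adjustment to the generated border. Everything in it is an assumption made for illustration. The function names are hypothetical, edge replication stands in for the Transformer generator, and the mean shift stands in for the color harmonization module, which in the paper also handles local color information. It is not the authors' implementation.

```python
# Illustrative sketch only. Images are 2D lists of grayscale floats.
# Edge replication is a placeholder for the learned generator.

def expand_vertical(img, pad):
    """Stage 1: add `pad` rows above and below by replicating the edge rows."""
    top = [list(img[0]) for _ in range(pad)]
    bottom = [list(img[-1]) for _ in range(pad)]
    return top + [list(row) for row in img] + bottom


def expand_horizontal(img, pad):
    """Stage 2: add `pad` columns left and right by replicating the edge columns."""
    return [[row[0]] * pad + list(row) + [row[-1]] * pad for row in img]


def staged_outpaint(img, pad):
    """Extrapolate on all sides in two steps: vertical first, then horizontal.

    The horizontal step sees the rows produced by the vertical step, so the
    corners are filled from already-extended context.
    """
    return expand_horizontal(expand_vertical(img, pad), pad)


def harmonize_global(out, pad):
    """Shift generated pixels so their mean matches the mean of the known region.

    A stand-in for global color adjustment only; known pixels are left untouched.
    """
    h, w = len(out), len(out[0])

    def is_known(r, c):
        return pad <= r < h - pad and pad <= c < w - pad

    known = [out[r][c] for r in range(h) for c in range(w) if is_known(r, c)]
    gen = [out[r][c] for r in range(h) for c in range(w) if not is_known(r, c)]
    if not known or not gen:
        return [list(row) for row in out]  # nothing to harmonize
    shift = sum(known) / len(known) - sum(gen) / len(gen)
    return [
        [v if is_known(r, c) else v + shift for c, v in enumerate(row)]
        for r, row in enumerate(out)
    ]


if __name__ == "__main__":
    patch = [[0.0, 1.0, 5.0]]
    extended = staged_outpaint(patch, pad=1)
    for row in harmonize_global(extended, pad=1):
        print(row)
```

Splitting the extrapolation into two one-dimensional steps means each step only has to continue content along a single axis. In a real system, each placeholder step would be replaced by a forward pass of the trained network.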

Acknowledgements

This work was supported by the Shanghai Natural Science Foundation of China under Grant No. 19ZR1419100.

Author information

Authors and Affiliations

Shanghai University, Shanghai, 200072, China
Bing Yu, Wangyidai Lv, Dongjin Huang & Youdong Ding

Corresponding author

Correspondence to Bing Yu.

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Bin Sheng
Shanghai Jiao Tong University, Shanghai, China
Lei Bi
University of Sydney, Sydney, NSW, Australia
Jinman Kim
MIRALab-CUI, University of Geneve, Carouge, Geneve, Switzerland
Nadia Magnenat-Thalmann
Swiss Federal Institute of Technology, Lausanne, Switzerland
Daniel Thalmann

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 43579 KB)

About this paper

Cite this paper

Yu, B., Lv, W., Huang, D., Ding, Y. (2024). Staged Transformer Network with Color Harmonization for Image Outpainting. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14496. Springer, Cham. https://doi.org/10.1007/978-3-031-50072-5_21

DOI: https://doi.org/10.1007/978-3-031-50072-5_21
Published: 29 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50071-8
Online ISBN: 978-3-031-50072-5
eBook Packages: Computer Science, Computer Science (R0)
