research-article

Synthesizing light field from a single image with variable MPI and two network fusion

Authors:

Nima Khademi Kalantari

ACM Transactions on Graphics (TOG), Volume 39, Issue 6

Article No.: 229, Pages 1 - 10

https://doi.org/10.1145/3414685.3417785

Published: 27 November 2020

Abstract

We propose a learning-based approach to synthesize a light field with a small baseline from a single image. We synthesize the novel view images by first using a convolutional neural network (CNN) to promote the input image into a layered representation of the scene. We extend the multiplane image (MPI) representation by allowing the disparity of the layers to be inferred from the input image. We show that, compared to the original MPI representation, our representation models the scenes more accurately. Moreover, we propose to handle the visible and occluded regions separately through two parallel networks. The synthesized images using these two networks are then combined through a soft visibility mask to generate the final results. To effectively train the networks, we introduce a large-scale light field dataset of over 2,000 unique scenes containing a wide range of objects. We demonstrate that our approach synthesizes high-quality light fields on a variety of scenes, better than the state-of-the-art methods.
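The rendering idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names (`shift_layer`, `render_mpi`, `fuse`), the integer-pixel shifts, the sign convention of the shift, and the back-to-front layer ordering are all assumptions made for illustration. In the paper the layer colors, alphas, per-layer disparities, and the visibility mask would come from the networks; here they are simply passed in as arrays. Each layer is translated in proportion to its disparity and the view offset, the layers are blended with the standard "over" operator, and two candidate images are blended with a soft mask.

```python
import numpy as np


def shift_layer(layer, dx, dy):
    """Translate an (H, W, C) layer by integer (dx, dy) pixels, zero-filling gaps."""
    h, w = layer.shape[:2]
    out = np.zeros_like(layer)
    if abs(dx) >= w or abs(dy) >= h:
        return out  # the layer moved completely out of the frame
    # Source and destination windows that remain inside the image.
    xs_src = slice(max(0, -dx), min(w, w - dx))
    xs_dst = slice(max(0, dx), min(w, w + dx))
    ys_src = slice(max(0, -dy), min(h, h - dy))
    ys_dst = slice(max(0, dy), min(h, h + dy))
    out[ys_dst, xs_dst] = layer[ys_src, xs_src]
    return out


def render_mpi(colors, alphas, disparities, u, v):
    """Render a novel view from a layered representation.

    colors:      (N, H, W, 3) layer colors, ordered back to front (assumption)
    alphas:      (N, H, W, 1) layer opacities in [0, 1]
    disparities: (N,) one disparity per layer; in a variable MPI these are
                 inferred per image rather than fixed in advance
    u, v:        view offset relative to the input view
    """
    out = np.zeros(colors.shape[1:], dtype=np.float64)
    for rgb, a, d in zip(colors, alphas, disparities):
        # Integer shift for simplicity; a real system would warp with sub-pixel accuracy.
        dx = int(round(float(d) * u))
        dy = int(round(float(d) * v))
        rgb_w = shift_layer(rgb, dx, dy)
        a_w = shift_layer(a, dx, dy)
        # "Over" compositing: the nearer layer covers what is behind it.
        out = rgb_w * a_w + out * (1.0 - a_w)
    return out


def fuse(visible_img, occluded_img, mask):
    """Blend two candidate images with a soft visibility mask in [0, 1]."""
    return mask * visible_img + (1.0 - mask) * occluded_img
```

For example, with a far layer at disparity 0 and a near layer at disparity 1, `render_mpi(colors, alphas, np.array([0.0, 1.0]), 1, 0)` moves the near content one pixel while the far layer stays in place, which is the parallax that distinguishes the sub-aperture views of a light field.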

Supplementary Material

MP4 File (a229-li.mp4), 286.59 MB

MP4 File (3414685.3417785.mp4), presentation video, 380.35 MB

Index Terms

Synthesizing light field from a single image with variable MPI and two network fusion
1. Computing methodologies
  1. Computer graphics
    1. Image manipulation
      1. Image-based rendering

Information & Contributors

Information

Published In

ACM Transactions on Graphics Volume 39, Issue 6

December 2020

1605 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/3414685

Editor:
Karol Myszkowski
MPI Informatik

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

TAMU T3

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 63
Total Downloads: 293
Downloads (Last 12 months): 41
Downloads (Last 6 weeks): 3

Reflects downloads up to 28 Dec 2024

Citations

Cited By

Wu X, Xu J, Wang C, Peng Y, Huang Q, Tompkin J, Xu W. (2024). Local Gaussian Density Mixtures for Unstructured Lumigraph Rendering. SIGGRAPH Asia 2024 Conference Papers, 1-11. DOI: 10.1145/3680528.3687659. Online publication date: 3-Dec-2024.
https://dl.acm.org/doi/10.1145/3680528.3687659
Zhang J, James D, Kaufman D. (2024). Progressive Dynamics for Cloth and Shell Animation. ACM Transactions on Graphics 43:4, 1-18. DOI: 10.1145/3658214. Online publication date: 19-Jul-2024.
https://dl.acm.org/doi/10.1145/3658214
Guo S, Hu J, Zhou K, Wang J, Song L, Xie R, Zhang W. (2024). Real-Time Free Viewpoint Video Synthesis System Based on DIBR and a Depth Estimation Network. IEEE Transactions on Multimedia 26, 6701-6716. DOI: 10.1109/TMM.2024.3355639. Online publication date: 18-Jan-2024.
https://dl.acm.org/doi/10.1109/TMM.2024.3355639
Lazri Z, Yeol Lee D, Su G. (2024). A Framework for Single-View Multi-Plane Image Inpainting. 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR), 536-541. DOI: 10.1109/MIPR62202.2024.00092. Online publication date: 7-Aug-2024.
https://doi.org/10.1109/MIPR62202.2024.00092
Garg A, Mallampali R, Joshi A, Govindarajan S, Mitra K. (2024). Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction. 2024 IEEE International Conference on Computational Photography (ICCP), 1-12. DOI: 10.1109/ICCP61108.2024.10644854. Online publication date: 22-Jul-2024.
https://doi.org/10.1109/ICCP61108.2024.10644854
Lee D, Su G, Yin P. (2024). OGRMPI: An Efficient Multiview Integrated Multiplane Image based on Occlusion Guided Residuals. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 794-802. DOI: 10.1109/CVPRW63382.2024.00084. Online publication date: 17-Jun-2024.
https://doi.org/10.1109/CVPRW63382.2024.00084
Habuchi S, Takahashi K, Tsutake C, Fujii T, Nagahara H. (2024). Time-Efficient Light-Field Acquisition Using Coded Aperture and Events. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 24923-24933. DOI: 10.1109/CVPR52733.2024.02354. Online publication date: 16-Jun-2024.
https://doi.org/10.1109/CVPR52733.2024.02354
Zhao M, Sheng H, Yang D, Wang S, Cong R, Cui Z, Chen R, Wang T, Wang S, Huang Y, Shen J. (2024). A survey for light field super-resolution. High-Confidence Computing, 100206. DOI: 10.1016/j.hcc.2024.100206. Online publication date: Jan-2024.
https://doi.org/10.1016/j.hcc.2024.100206
Pintore G, Jaspe-Villanueva A, Hadwiger M, Schneider J, Agus M, Marton F, Bettio F, Gobbetti E. (2024). Deep synthesis and exploration of omnidirectional stereoscopic environments from a single surround-view panoramic image. Computers and Graphics 119:C. DOI: 10.1016/j.cag.2024.103907. Online publication date: 1-Apr-2024.
https://dl.acm.org/doi/10.1016/j.cag.2024.103907
Wang X, Wu C, Yin S, Ni M, Wang J, Li L, Yang Z, Yang F, Wang L, Liu Z, Fang Y, Duan N, Elkind E. (2023). Learning 3D photography videos via self-supervised diffusion on single images. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 1506-1514. DOI: 10.24963/ijcai.2023/167. Online publication date: 19-Aug-2023.
https://dl.acm.org/doi/10.24963/ijcai.2023/167
