More Web Proxy on the site http://driver.im/

research-article

LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field

Authors:

Jingxuan HeAuthors Info & Claims

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 476 - 485

https://doi.org/10.1145/3664647.3681025

Published: 28 October 2024 Publication History

Abstract

Cinemagraph creates captivating video experience by combining elements of still photography and subtle motion. However, most existing cinemagraph video generation lacks depth information, being restricted within 2-dimensional (2D) image space. We advance cinemagraph from 2D image space to 3-dimensional (3D) space with high quality by proposing LoopGaussian. It is based on 3D Gaussian modeling, taking advantage of the 3D Gaussian Splatting (3D-GS) technique that has significantly improved the field of novel view synthesis. Here is a brief overview of our new approach: It employs 3D-GS to reconstruct 3D Gaussian point clouds from multi-view images of static scenes, where shape regularization is used to prevent blurring or artifacts caused by object deformation. To maintain local continuity between scenes, it then clusters the 3D Gaussian points by the proposed SuperGaussian algorithm using features acquired by an autoencoder tailored for 3D Gaussian. Similarities between clusters are used to derive an Eulerian motion field for describing velocities across the entire scene. The estimated Eulerian motion field drives the movement of the 3D Gaussian points, based on which a 3D Cinemagraph is generated through bidirectional animation. The resulting 3D Cinemagraph exhibits natural and seamlessly loopable dynamics. Experiment results validate the effectiveness of the proposed approach, demonstrating high-quality and visually appealing video generation.

References

[1]

Jiamin Bai, Aseem Agarwala, Maneesh Agrawala, and Ravi Ramamoorthi. 2013. Automatic cinemagraph portraits. In Computer Graphics Forum, Vol. 32. Wiley Online Library, 17--25.

[2]

Jonathan T Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P Srinivasan. 2021. Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5855--5864.

[3]

Ang Cao and Justin Johnson. 2023. Hexplane: A fast representation for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 130--141.

[4]

Jiazhong Cen, Jiemin Fang, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, and Qi Tian. 2023. Segment any 3d gaussians. arXiv preprint arXiv:2312.00860 (2023).

[5]

Guikun Chen and Wenguan Wang. 2024. A Survey on 3D Gaussian Splatting. arXiv preprint arXiv:2401.03890 (2024).

[6]

Jongwoo Choi, Kwanggyoon Seo, Amirsaman Ashtari, and Junyong Noh. 2024. StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN. arXiv preprint arXiv:2403.14186 (2024).

[7]

Yung-Yu Chuang, Dan B Goldman, Ke Colin Zheng, Brian Curless, David H Salesin, and Richard Szeliski. 2005. Animating pictures with stochastic motion textures. In ACM SIGGRAPH 2005 Papers. 853--860.

Digital Library

[8]

Devikalyan Das, Christopher Wewer, Raza Yunus, Eddy Ilg, and Jan Eric Lenssen. 2023. Neural parametric gaussians for monocular non-rigid object reconstruction. arXiv preprint arXiv:2312.01196 (2023).

[9]

Yuanxing Duan, Fangyin Wei, Qiyu Dai, Yuhang He, Wenzheng Chen, and Baoquan Chen. 2024. 4D Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes. arXiv preprint arXiv:2402.03307 (2024).

[10]

Elisabeth Flock. 2011. Cinemagraphs: What it looks like when a photo moves. The Washington Post (2011).

[11]

Sara Fridovich-Keil, Giacomo Meanti, Frederik Rahbæk Warburg, Benjamin Recht, and Angjoo Kanazawa. 2023. K-planes: Explicit radiance fields in space, time, and appearance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12479--12488.

[12]

Dan Guo, Kun Li, Bin Hu, Yan Zhang, and Meng Wang. 2024. Benchmarking Micro-action Recognition: Dataset, Method, and Application. IEEE Transactions on Circuits and Systems for Video Technology (2024).

[13]

Tavi Halperin, Hanit Hakim, Orestis Vantzos, Gershon Hochman, Netai Benaim, Lior Sassy, Michael Kupchik, Ofir Bibi, and Ohad Fried. 2021. Endless loops: detecting and animating periodic patterns in still images. ACM Transactions on graphics (TOG), Vol. 40, 4 (2021), 1--12.

Digital Library

[14]

Peter Hedman, Pratul P Srinivasan, Ben Mildenhall, Jonathan T Barron, and Paul Debevec. 2021. Baking neural radiance fields for real-time view synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5875--5884.

[15]

Aleksander Holynski, Brian L Curless, Steven M Seitz, and Richard Szeliski. 2021. Animating pictures with eulerian motion fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5810--5819.

[16]

Yi-Hua Huang, Yang-Tian Sun, Ziyi Yang, Xiaoyang Lyu, Yan-Pei Cao, and Xiaojuan Qi. 2023. SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes. arXiv preprint arXiv:2312.14937 (2023).

[17]

Tero Karras, Samuli Laine, and Timo Aila. 2019. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 4401--4410.

[18]

Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 2023. 3d gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, Vol. 42, 4 (2023), 1--14.

Digital Library

[19]

Hao Li, Dingwen Zhang, Yalun Dai, Nian Liu, Lechao Cheng, Jingfeng Li, Jingdong Wang, and Junwei Han. 2023. GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding. arXiv preprint arXiv:2311.11863 (2023).

[20]

Xingyi Li, Zhiguo Cao, Huiqiang Sun, Jianming Zhang, Ke Xian, and Guosheng Lin. 2023. 3d cinemagraphy from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 4595--4605.

[21]

Zhan Li, Zhang Chen, Zhong Li, and Yi Xu. 2023. Spacetime gaussian feature splatting for real-time dynamic view synthesis. arXiv preprint arXiv:2312.16812 (2023).

[22]

Jing Liao, Mark Finch, and Hugues Hoppe. 2015. Fast computation of seamless video loops. ACM Transactions on Graphics (TOG), Vol. 34, 6 (2015), 1--10.

Digital Library

[23]

Chih-Yang Lin, Yun-Wen Huang, and Timothy K Shih. 2019. Creating waterfall animation on a single image. Multimedia Tools and Applications, Vol. 78 (2019), 6637--6653.

Digital Library

[24]

Youtian Lin, Zuozhuo Dai, Siyu Zhu, and Yao Yao. 2023. Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle. arXiv:2312.03431 (2023).

[25]

Yangbin Lin, Cheng Wang, Dawei Zhai, Wei Li, and Jonathan Li. 2018. Toward better boundary preserved supervoxel segmentation for 3D point clouds. ISPRS journal of photogrammetry and remote sensing, Vol. 143 (2018), 39--47.

[26]

Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, and Karsten Kreis. 2023. Align your gaussians: Text-to-4d with dynamic 3d gaussians and composed diffusion models. arXiv preprint arXiv:2312.13763 (2023).

[27]

Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. 2020. Neural sparse voxel fields. Advances in Neural Information Processing Systems, Vol. 33 (2020), 15651--15663.

[28]

Jonathon Luiten, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. 2024. Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis. In 3DV.

[29]

Li Ma, Xiaoyu Li, Jing Liao, and Pedro V Sander. 2023. 3D Video Loops from Asynchronous Input. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 310--320.

[30]

Aniruddha Mahapatra and Kuldeep Kulkarni. 2022. Controllable animation of fluid elements in still images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 3667--3676.

[31]

Aniruddha Mahapatra, Aliaksandr Siarohin, Hsin-Ying Lee, Sergey Tulyakov, and Jun-Yan Zhu. 2023. Text-guided synthesis of eulerian cinemagraphs. ACM Transactions on Graphics (TOG), Vol. 42, 6 (2023), 1--13.

Digital Library

[32]

Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. 2021. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM, Vol. 65, 1 (2021), 99--106.

Digital Library

[33]

Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. 2022. Instant neural graphics primitives with a multiresolution hash encoding. ACM transactions on graphics (TOG), Vol. 41, 4 (2022), 1--15.

[34]

Margaret A Oliver and Richard Webster. 1990. Kriging: a method of interpolation for geographical information systems. International Journal of Geographical Information System, Vol. 4, 3 (1990), 313--332.

[35]

Jeremie Papon, Alexey Abramov, Markus Schoeler, and Florentin Worgotter. 2013. Voxel cloud connectivity segmentation-supervoxels for point clouds. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2027--2034.

Digital Library

[36]

Keunhong Park, Utkarsh Sinha, Jonathan T Barron, Sofien Bouaziz, Dan B Goldman, Steven M Seitz, and Ricardo Martin-Brualla. 2021. Nerfies: Deformable neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5865--5874.

[37]

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems, Vol. 32 (2019).

[38]

Albert Pumarola, Enric Corona, Gerard Pons-Moll, and Francesc Moreno-Noguer. 2021. D-nerf: Neural radiance fields for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10318--10327.

[39]

Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. 2017. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 652--660.

[40]

Christian Reiser, Songyou Peng, Yiyi Liao, and Andreas Geiger. 2021. KiloNeRF: Speeding Up Neural Radiance Fields With Thousands of Tiny MLPs. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 14335--14345.

[41]

Sara Sabour, Suhani Vora, Daniel Duckworth, Ivan Krasin, David J Fleet, and Andrea Tagliasacchi. 2023. Robustnerf: Ignoring distractors with robust losses. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 20626--20636.

[42]

Arno Schodl, Richard Szeliski, David H Salesin, and Irfan Essa. 2023. Video textures. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2. 557--570.

Digital Library

[43]

Jonathan Shade, Steven Gortler, Li-wei He, and Richard Szeliski. 1998. Layered depth images. In Proceedings of the 25th annual conference on Computer graphics and interactive techniques. 231--242.

Digital Library

[44]

James Tompkin, Fabrizio Pece, Kartic Subr, and Jan Kautz. 2011. Towards moment imagery: Automatic cinemagraphs. In 2011 Conference for Visual Media Production. IEEE, 87--93.

Digital Library

[45]

Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphaël Marinier, Marcin Michalski, and Sylvain Gelly. 2019. FVD: A new metric for video generation. (2019).

[46]

Dor Verbin, Peter Hedman, Ben Mildenhall, Todd Zickler, Jonathan T Barron, and Pratul P Srinivasan. 2022. Ref-nerf: Structured view-dependent appearance for neural radiance fields. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 5481--5490.

[47]

Frederik Warburg, Ethan Weber, Matthew Tancik, Aleksander Holynski, and Angjoo Kanazawa. 2023. Nerfbusters: Removing ghostly artifacts from casually captured nerfs. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 18120--18130.

[48]

Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, and Xinggang Wang. 2023. 4d gaussian splatting for real-time dynamic scene rendering. arXiv preprint arXiv:2310.08528 (2023).

[49]

Tianyi Xie, Zeshun Zong, Yuxin Qiu, Xuan Li, Yutao Feng, Yin Yang, and Chenfanfu Jiang. 2023. Physgaussian: Physics-integrated 3d gaussians for generative dynamics. arXiv preprint arXiv:2311.12198 (2023).

[50]

Ziyi Yang, Xinyu Gao, Wen Zhou, Shaohui Jiao, Yuqing Zhang, and Xiaogang Jin. 2023. Deformable 3d gaussians for high-fidelity monocular dynamic scene reconstruction. arXiv preprint arXiv:2309.13101 (2023).

[51]

Mei-Chen Yeh and Po-Yi Li. 2012. An approach to automatic creation of cinemagraphs. In Proceedings of the 20th ACM international conference on Multimedia. 1153--1156.

Digital Library

[52]

Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, and Angjoo Kanazawa. 2021. Plenoctrees for real-time rendering of neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5752--5761.

[53]

Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. 2018. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE conference on computer vision and pattern recognition. 586--595.

[54]

Tinghui Zhou, Richard Tucker, John Flynn, Graham Fyffe, and Noah Snavely. 2018. Stereo magnification: Learning view synthesis using multiplane images. arXiv preprint arXiv:1805.09817 (2018).

Cited By

Qiu SXie BLiu QHeng P(2025)Advancing Extended Reality with 3D Gaussian Splatting: Innovations and Prospects2025 IEEE International Conference on Artificial Intelligence and eXtended and Virtual Reality (AIxVR)10.1109/AIxVR63409.2025.00039(203-208)Online publication date: 27-Jan-2025
https://doi.org/10.1109/AIxVR63409.2025.00039
Li HGao YWu CZhang DDai YZhao CFeng HDing EWang JHan J(2024)GGRt: Towards Pose-Free Generalizable 3D Gaussian Splatting in Real-TimeComputer Vision – ECCV 202410.1007/978-3-031-73209-6_19(325-341)Online publication date: 1-Nov-2024
https://doi.org/10.1007/978-3-031-73209-6_19

Recommendations

ST-4DGS: Spatial-Temporally Consistent 4D Gaussian Splatting for Efficient Dynamic Scene Rendering
SIGGRAPH '24: ACM SIGGRAPH 2024 Conference Papers

Dynamic scene rendering at any novel view continues to be a difficult but important task, especially for high-fidelity rendering quality with efficient rendering speed. The recent 3D Gaussian Splatting, i.e., 3DGS, shows great success for static scene ...
4D Gaussian Splatting with Scale-aware Residual Field and Adaptive Optimization for Real-time Rendering of Temporally Complex Dynamic Scenes
MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Reconstructing dynamic scenes from video sequences is a highly promising task in the multimedia domain. While previous methods have made progress, they often struggle with slow rendering and managing temporal complexities such as significant motion and ...
3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting

In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS. The ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs:
Jianfei Cai
Monash University, Australia
,
Mohan Kankanhalli
NUS, Singapore
,
Balakrishnan Prabhakaran
UT Dallas, USA
,
Susanne Boll
University of Oldenburg, Germany
,
Program Chairs:
Ramanathan Subramanian
University of Canberra & IIT Ropar, Australia
,
Liang Zheng
Australian National University, Australia
,
Vivek K. Singh
Rutgers University, USA
,
Pablo Cesar
Centrum Wiskunde & Informatica, Netherlands
,
Lexing Xie
Australian National University, Australia
,
Dong Xu
University of Hong Kong, Hong Kong

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 October 2024

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Badges

Honorable Mention

Author Tags

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

2
Total Citations
View Citations
180
Total Downloads

Downloads (Last 12 months)180
Downloads (Last 6 weeks)103

Reflects downloads up to 03 Mar 2025

Other Metrics

View Author Metrics

Citations

Cited By

Qiu SXie BLiu QHeng P(2025)Advancing Extended Reality with 3D Gaussian Splatting: Innovations and Prospects2025 IEEE International Conference on Artificial Intelligence and eXtended and Virtual Reality (AIxVR)10.1109/AIxVR63409.2025.00039(203-208)Online publication date: 27-Jan-2025
https://doi.org/10.1109/AIxVR63409.2025.00039
Li HGao YWu CZhang DDai YZhao CFeng HDing EWang JHan J(2024)GGRt: Towards Pose-Free Generalizable 3D Gaussian Splatting in Real-TimeComputer Vision – ECCV 202410.1007/978-3-031-73209-6_19(325-341)Online publication date: 1-Nov-2024
https://doi.org/10.1007/978-3-031-73209-6_19

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten