GPS-Gaussian+: Generalizable Pixel-Wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views
Boyao Zhou1*, Shunyuan Zheng2*†, Hanzhang Tu1, Ruizhi Shao1, Boning Liu1, Shengping Zhang2✉, Liqiang Nie2, Yebin Liu1
1Tsinghua University 2Harbin Institute of Technology
*Equal contribution †Work done during an internship at Tsinghua University ✉Corresponding author
Project Page · Paper · Dataset
We present GPS-Gaussian+, a generalizable 3D Gaussian Splatting approach for human-centered scene rendering from sparse views in a feed-forward manner.
(Demo video: basketball.mp4)
To deploy and run GPS-Gaussian+, first create the conda environment:

```
conda env create --file environment.yml
conda activate gps_plus
```
Then, compile `diff-gaussian-rasterization` from the 3DGS repository:

```
git clone https://github.com/graphdeco-inria/gaussian-splatting --recursive
cd gaussian-splatting/
pip install -e submodules/diff-gaussian-rasterization
cd ..
```
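To confirm the extension built correctly, a minimal import check like the following can help (it assumes the standard module and class names shipped with the official 3DGS rasterizer):

```python
# Minimal sanity check: verify the compiled CUDA rasterizer imports cleanly.
import torch
from diff_gaussian_rasterization import GaussianRasterizationSettings, GaussianRasterizer

assert torch.cuda.is_available(), "diff-gaussian-rasterization requires a CUDA device"
print("diff-gaussian-rasterization is ready")
```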
(Optional) For training with the geometry regularization, install pytorch3d for `chamfer_distance`. Otherwise, set `if_chamfer = False` in `train.py`.
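For reference, this is how pytorch3d's `chamfer_distance` is typically invoked; the point-cloud tensors below are placeholders for illustration, not the variables used in `train.py`:

```python
import torch
from pytorch3d.loss import chamfer_distance

# Placeholder point clouds of shape (batch, num_points, 3); in training these
# would be the predicted and reference geometry.
pred_points = torch.rand(1, 4096, 3, device="cuda")
gt_points = torch.rand(1, 4096, 3, device="cuda")

# chamfer_distance returns (distance_loss, normals_loss); normals are unused here.
loss_chamfer, _ = chamfer_distance(pred_points, gt_points)
print(loss_chamfer.item())
```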
- You can download our captured THumanMV dataset from OneDrive. We provide 15 sequences of human performance captured with a 10-camera setup. In our experiments, we split the 10 cameras into 3 work sets: (1,2,3,4), (4,5,6,7) and (7,8,9,10).
- We provide `step_0rect.py` for source-view rectification and `step_1.py` for novel-view processing. To prepare the data, set the correct paths for `data_root` (raw data) and `processed_data_root` (processed data) in `step_0rect.py` and `step_1.py`. Then you can run, for example:
```
cd data_process
python step_0rect.py -i s1a1 -t train
python step_1.py -i s1a1 -t train
python step_0rect.py -i s3a5 -t val
python step_1.py -i s3a5 -t val
python step_0rect.py -i s1a6 -t test
python step_1.py -i s1a6 -t test
cd ..
```
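To process several sequences at once, a small driver script along these lines (run from `data_process/`; the sequence IDs are examples, substitute your own splits) may save typing:

```python
# batch_process.py -- convenience sketch, not part of the repo.
import subprocess

SPLITS = {
    "train": ["s1a1"],  # e.g. all 9 training sequences
    "val":   ["s3a5"],
    "test":  ["s1a6"],
}

for tag, sequences in SPLITS.items():
    for seq in sequences:
        for script in ("step_0rect.py", "step_1.py"):
            # Equivalent to: python step_0rect.py -i <seq> -t <tag>, etc.
            subprocess.run(["python", script, "-i", seq, "-t", tag], check=True)
```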
The processed dataset should be organized as follows:
```
processed_data_root
├── train/
│   ├── img/
│   │   ├── s1a1_s1_0000/
│   │   │   ├── 0.jpg
│   │   │   ├── 1.jpg
│   │   │   ├── 2.jpg
│   │   │   ├── 3.jpg
│   │   │   ├── 4.jpg
│   │   │   └── 5.jpg
│   │   └── ...
│   ├── mask/
│   │   ├── s1a1_s1_0000/
│   │   │   ├── 0.jpg
│   │   │   ├── 1.jpg
│   │   │   └── ...
│   │   └── ...
│   └── parameter/
│       ├── s1a1_s1_0000/
│       │   ├── 0_1.json
│       │   ├── 2_extrinsic.npy
│       │   ├── 2_intrinsic.npy
│       │   └── ...
│       └── ...
├── val/
│   ├── img/
│   ├── mask/
│   └── parameter/
└── test/
    └── s1a6_process/
        ├── img/
        ├── mask/
        └── parameter/
```
Note that `0.jpg` and `1.jpg` are the rectified input images, while `2.jpg` to `5.jpg` are images for supervision or evaluation. In particular, `4.jpg` and `5.jpg` are the original (unrectified) images of views 0 and 1.
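As a quick orientation to the `parameter/` folder, a sketch like the following reads one frame's camera files; the exact schema of `0_1.json` (rectification data for the source-view pair 0-1) is repo-specific, so treat the field contents as assumptions:

```python
import json
import numpy as np

frame_dir = "/PATH/TO/processed_data_root/train/parameter/s1a1_s1_0000"

# Per-view camera matrices stored as .npy (here for supervision view 2).
extrinsic = np.load(f"{frame_dir}/2_extrinsic.npy")
intrinsic = np.load(f"{frame_dir}/2_intrinsic.npy")

# Rectification parameters for source views 0 and 1 (assumed content).
with open(f"{frame_dir}/0_1.json") as f:
    rect_params = json.load(f)

print(extrinsic.shape, intrinsic.shape)
```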
We provide the pretrained checkpoint in OneDrive and 60-frame processed data in OneDrive. You can directly put the downloaded data into `/PATH/TO/processed_data_root/test/`. Then modify `local_data_root=/PATH/TO/processed_data_root/` in `stage.yaml`.
- For novel-view synthesis, set the checkpoint path in `test.py` and pick a target view from views 2-3:

```
python test.py -i example_data -v 2
```
- For free-view rendering, set the checkpoint path and `LOOP_NUM` (the number of rendered frames per work set) in `run_interpolation.py`:

```
python run_interpolation.py -i example_data
```
You can check the results in `experiments/gps_plus`.
Once you have prepared the training data of 9 sequences and at least one sequence as validation data, modify `train_data_root` and `val_data_root` in `stage.yaml`. Then run:

```
python train.py
```
If you would like to train our network on your own data, organize the dataset as above and set `inverse_depth_init` in `stage.yaml`. We use `inverse_depth_init = 0.3` in our experiments because the largest depth of the scene is around 3.33 meters.
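In other words, `inverse_depth_init` should be roughly the reciprocal of the largest depth in your capture volume; a one-line check (the 3.33 m figure is the example from our setup):

```python
# inverse_depth_init ~= 1 / (max scene depth in meters).
max_scene_depth_m = 3.33  # deepest point of the capture volume (example value)
inverse_depth_init = 1.0 / max_scene_depth_m
print(round(inverse_depth_init, 2))  # -> 0.3, the value used in our experiments
```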
If you find the code or the data useful for your research, please consider citing:

```
@article{zhou2024gps,
  title={GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views},
  author={Zhou, Boyao and Zheng, Shunyuan and Tu, Hanzhang and Shao, Ruizhi and Liu, Boning and Zhang, Shengping and Nie, Liqiang and Liu, Yebin},
  journal={arXiv preprint arXiv:2411.11363},
  year={2024}
}
```