Jingwei Xu1
·
Yikai Wang2
·
Yiqun Zhao3,6
·
Yanwei Fu5
·
Shenghua Gao3,4†
1 ShanghaiTech University
2 Nanyang Technological University
3 The University of Hong Kong
4 HKU Shanghai Intelligent Computing Research Center
5 Fudan University
6 Transcengram
(† denotes corresponding author)
Comparison video: all_cmp.mp4
# Run this command before getting started.
git submodule update --init --recursive
Waymo: Please follow this instruction.
Pandaset: Please follow this instruction.
The following datasets are not used in the paper, but we provide instructions and code support for them:
Kitti: Please follow this instruction.
nuScenes: Please follow this instruction.
conda create -n streetunveiler python=3.10
conda activate streetunveiler
# Assuming CUDA 12.1; please adjust the command below according to your CUDA version.
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt
git submodule update --init --recursive
pip install submodules/superpose3d
pip install submodules/sh_encoder
pip install submodules/simple-knn
pip install submodules/diff-surfel-rasterization
cd submodules/tiny-cuda-nn/bindings/torch
pip install .
# For Pandaset
cd submodules/pandaset-devkit/python
pip install .
We provide modules that make it easy to use the inpainting models: utils/zits_utils.py and utils/leftrefill_utils.py. Please download the pretrained models and put them under the corresponding directories before using them.
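For reference, here is a minimal sketch of how such a wrapper might be called from Python. The wrapper object and its inpaint() method are hypothetical illustrations, not the actual interface of utils/zits_utils.py or utils/leftrefill_utils.py; please check those files for the real API.
# Hypothetical usage sketch; the wrapper interface below is an assumption,
# not the actual API of utils/zits_utils.py or utils/leftrefill_utils.py.
import numpy as np
import torch
from PIL import Image

def inpaint_with_wrapper(wrapper, image_path, mask_path):
    # Load the RGB image and a binary mask (non-zero pixels = region to fill).
    image = np.array(Image.open(image_path).convert("RGB"))
    mask = np.array(Image.open(mask_path).convert("L")) > 127
    with torch.no_grad():
        # Assumed wrapper call: takes an HxWx3 uint8 image and an HxW bool mask,
        # returns the inpainted HxWx3 uint8 image.
        result = wrapper.inpaint(image, mask)
    return Image.fromarray(result)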
Under ./3rd_party/ZITS-PlusPlus, please follow its official instructions to download the pretrained model.
Alternatively, you may download and extract the model from this backup Google Drive link and put it under ./3rd_party/ZITS-PlusPlus/ckpts.
Or you may run the following command to download the model:
cd 3rd_party/ZITS-PlusPlus
mkdir ckpts
cd ckpts
wget https://huggingface.co/jingwei-xu-00/pretrained_backup_for_streetunveiler/resolve/main/ZITS%2B%2B/best_lsm_hawp.pth
wget https://huggingface.co/jingwei-xu-00/pretrained_backup_for_streetunveiler/resolve/main/ZITS%2B%2B/model_512.zip
unzip model_512.zip
The directory structure should look like this:
ZITS-PlusPlus
|-- ckpts
| |-- best_lsm_hawp.pth
| |-- model_512
| | |-- config.yml
| | |-- models
| | | `-- last.ckpt
| | |-- samples
| | `-- validation
| `-- model_512.zip
|
...
Finally, do:
cd 3rd_party/ZITS-PlusPlus/nms/cxx/src
source build.sh
Under ./3rd_party/LeftRefill, please follow its official instructions to download the pretrained model.
Alternatively, you may download the model from this backup Google Drive link and put it under ./3rd_party/LeftRefill/pretrained_models.
Or you may run the following command to download the model:
cd 3rd_party/LeftRefill
mkdir pretrained_models
cd pretrained_models
wget https://huggingface.co/jingwei-xu-00/pretrained_backup_for_streetunveiler/resolve/main/LeftRefill/512-inpainting-ema.ckpt
The directory structure should look like this:
LeftRefill
|-- pretrained_models
| `-- 512-inpainting-ema.ckpt
|
...
(Update: We wrap LeftRefill into a simple API at https://github.com/DavidXu-JJ/simple-leftrefill-inpainting. You may try this API if you want to use LeftRefill in your own project.)
Suppose your data is under data/waymo
and is organized as follows:
data
`-- waymo
|-- colmap
| |-- segment-10061305430875486848_1080_000_1100_000_with_camera_labels
| |-- segment-1172406780360799916_1660_000_1680_000_with_camera_labels
| |-- segment-14869732972903148657_2420_000_2440_000_with_camera_labels
| `-- segment-4058410353286511411_3980_000_4000_000_with_camera_labels
|-- processed
| |-- segment-10061305430875486848_1080_000_1100_000_with_camera_labels
| |-- segment-1172406780360799916_1660_000_1680_000_with_camera_labels
| |-- segment-14869732972903148657_2420_000_2440_000_with_camera_labels
| `-- segment-4058410353286511411_3980_000_4000_000_with_camera_labels
# --source_path/-s: path to the preprocessed data root (used to read the LiDAR data)
# --colmap_path/-c: path to the COLMAP-processed data root (used to read the SfM points)
# --model_path/-m: path to save the output model
# --resolution/-r: downscaling factor applied to the input images' resolution
# Waymo
python3 train.py -s ./data/waymo/processed/segment-1172406780360799916_1660_000_1680_000_with_camera_labels \
-c ./data/waymo/colmap/segment-1172406780360799916_1660_000_1680_000_with_camera_labels \
-m ./output_waymo/segment-1172406780360799916_1660_000_1680_000_with_camera_labels \
-r 4
# Pandaset
python3 train.py -s ./data/pandaset/raw \
-c ./data/pandaset/colmap/027 \
-m ./output_pandaset/027 \
-r 4
# Kitti
python3 train.py -s ./data/kitti/raw \
-c ./data/kitti/colmap/2011_09_26/2011_09_26_drive_0001_sync \
-m ./output_kitti/2011_09_26_drive_0001_sync \
-r [proper_resolution_scaling]
# nuScenes
python3 train.py -s ./data/nuscenes/raw \
-c ./data/nuscenes/colmap/scene-0001 \
-m ./output_nuscenes/scene-0001 \
-r [proper_resolution_scaling]
python3 render.py -m ./output_waymo/segment-1172406780360799916_1660_000_1680_000_with_camera_labels
# Example: sh unveil_preprocess.sh [model_path] [gpu_id]
sh unveil_prepare.sh ./output_waymo/segment-1172406780360799916_1660_000_1680_000_with_camera_labels 0
After running inpainting_pipeline/1_selection/1_instance_visualization.py, some visualizations will be saved under model_path/instance_workspace_0/instance_render.
You can get the instance id from the filename of each image, then select it as the object to remove when running inpainting_pipeline/2_condition_preparation/1_select_instance.py by setting --instance_id. (By default, --all is set, which removes all objects.)
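If it helps, you can also list the candidate instance ids programmatically. The sketch below assumes each visualization filename contains the numeric instance id; check the actual naming under instance_render and adjust the parsing accordingly.
# Sketch: collect instance ids from the visualization filenames.
# Assumes each file name under instance_render contains the numeric instance id,
# e.g. "12.png"; adapt the regex to the actual naming scheme.
import os
import re

def list_instance_ids(model_path, workspace="instance_workspace_0"):
    render_dir = os.path.join(model_path, workspace, "instance_render")
    ids = set()
    for name in os.listdir(render_dir):
        match = re.search(r"\d+", os.path.splitext(name)[0])
        if match:
            ids.add(int(match.group()))
    return sorted(ids)

print(list_instance_ids("./output_waymo/segment-1172406780360799916_1660_000_1680_000_with_camera_labels"))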
# Example: sh unveil.sh [model_path] [key_frame_list] [gpu_id]
sh unveil.sh ./output_waymo/segment-1172406780360799916_1660_000_1680_000_with_camera_labels "150 120 90 60 30 0" 0
key_frame_list
is the list of key frames, as discussed in Section A.2 of our supplementary material. The selection of key frames affects the unveiling quality and may differ for each scene.
# Example: sh eval_lpips_fid.sh [results_path] [gt_path] [gpu_id]
model_path=...
sh eval_lpips_fid.sh "$model_path/instance_workspace_0/final_renders" "$model_path/instance_workspace_0/gt" 0
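For reference, LPIPS and FID can also be computed directly with the off-the-shelf lpips and pytorch-fid packages. The sketch below is a simplified stand-in that assumes renders and ground-truth images share filenames; it is not necessarily identical to what eval_lpips_fid.sh does.
# Sketch: mean LPIPS over paired render/GT images and FID over the two folders,
# using the `lpips` and `pytorch-fid` packages. Simplified stand-in for
# eval_lpips_fid.sh, not its exact implementation.
import os
import torch
import lpips
from PIL import Image
from torchvision import transforms
from pytorch_fid import fid_score

def mean_lpips(results_dir, gt_dir, device="cuda"):
    loss_fn = lpips.LPIPS(net="alex").to(device)
    to_tensor = transforms.ToTensor()
    scores = []
    for name in sorted(os.listdir(results_dir)):
        pred = to_tensor(Image.open(os.path.join(results_dir, name)).convert("RGB"))
        gt = to_tensor(Image.open(os.path.join(gt_dir, name)).convert("RGB"))
        with torch.no_grad():
            # LPIPS expects inputs scaled to [-1, 1].
            d = loss_fn(pred.unsqueeze(0).to(device) * 2 - 1,
                        gt.unsqueeze(0).to(device) * 2 - 1)
        scores.append(d.item())
    return sum(scores) / len(scores)

model_path = "./output_waymo/segment-1172406780360799916_1660_000_1680_000_with_camera_labels"
results_dir = os.path.join(model_path, "instance_workspace_0", "final_renders")
gt_dir = os.path.join(model_path, "instance_workspace_0", "gt")
print("LPIPS:", mean_lpips(results_dir, gt_dir))
print("FID:", fid_score.calculate_fid_given_paths(
    [results_dir, gt_dir], batch_size=16, device="cuda", dims=2048))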
This project is built upon 3DGS and 2DGS. The data preprocessing for the Waymo dataset is mainly based on neuralsim. The implementation of the environment map is based on nr3d and torch-ngp. The inpainting models are based on LeftRefill and ZITS++.
We appreciate the authors for their great work.
@inproceedings{xu2025streetunveiler,
author = {Jingwei Xu and Yikai Wang and Yiqun Zhao and Yanwei Fu and Shenghua Gao},
title = {3D StreetUnveiler with Semantic-aware 2DGS - a simple baseline},
booktitle = {The International Conference on Learning Representations (ICLR)},
year = {2025},
}