GitHub - ChocoWu/PSG-4D-LLM: This is the project repo for 'PSG-4D-LLM'.
ChocoWu/PSG-4D-LLM

4D Panoptic Scene Graph Generation

(*Correspondence)

Motivation and Method

The recently introduced 4D Panoptic Scene Graph (4D-PSG) provides the most advanced representation to date for comprehensively modeling the dynamic 4D visual world. Unfortunately, current pioneering 4D-PSG research suffers from severe data scarcity and the resulting out-of-vocabulary problems; in addition, the pipeline nature of the benchmark generation method can lead to suboptimal performance. To address these challenges, this paper investigates a novel framework for 4D-PSG generation that leverages rich 2D visual scene annotations to enhance 4D scene learning. First, we introduce a 4D Large Language Model (4D-LLM) integrated with a 3D mask decoder for end-to-end generation of 4D-PSG. A chained SG inference mechanism is further designed to exploit LLMs' open-vocabulary capabilities to iteratively infer accurate and comprehensive object and relation labels. Most importantly, we propose a 2D-to-4D visual scene transfer learning framework, in which a spatial-temporal scene transcending strategy transfers dimension-invariant features from abundant 2D SG annotations to 4D scenes, compensating for data scarcity in 4D-PSG.

[Figure: framework]

Data

Please follow the instructions to prepare the datasets.

Training

Coming soon.

Citation

If you use PSG-4D-LLM in your project, please cite:

@inproceedings{wu2025psg4dllm,
    title={Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene},
    author={Shengqiong Wu and Hao Fei and Jingkang Yang and Xiangtai Li and Juncheng Li and Hanwang Zhang and Tat-Seng Chua},
    booktitle={CVPR},
    year={2025}
}

Acknowledgement

Our 4D-LLM is developed based on the codebases of NExT-Chat, Chat-UniVi, SA-Gate, and Sam2, and we would like to thank the developers of all of them.
