Bingyi Kang* · Yang Yue*
Rui Lu · Zhijie Lin · Yang Zhao · Kaixin Wang · Gao Huang · Jiashi Feng
*Equal Contribution, in alphabetical order
We conduct a systematic study to investigate whether video generation is able to learn physical laws from videos, leveraging data and model scaling.
DIrectory id_ood_data
contains code for generating training and evaluation data to test scaling abilities for in-distribution (in-dist) and out-of-distribution (ood) scenarios. It supports generating videos for scenarios including uniform motion, collision, and parabolic motion.
To generate collision videos at different data size levels:
# Collision videos with increasing data sizes
python3 two_balls_collision.py --data_name in_dist_v2 --data_size_level 1 --num_workers 64
python3 two_balls_collision.py --data_name in_dist_v2 --data_size_level 2 --num_workers 64
python3 two_balls_collision.py --data_name in_dist_v2 --data_size_level 3 --num_workers 64
To generate videos of uniform motion at different data size levels (e.g., 30k, 300k, 3M videos):
python3 one_ball_uniform_motion.py --data_name in_dist_v2 --data_size_level 0 --num_workers 64
python3 one_ball_uniform_motion.py --data_name in_dist_v2 --data_size_level 1 --num_workers 64
python3 one_ball_uniform_motion.py --data_name in_dist_v2 --data_size_level 2 --num_workers 64
To generate parabolic motion videos:
python3 one_ball_parabola.py --data_name in_dist_v2 --data_size_level 0 --num_workers 64
python3 one_ball_parabola.py --data_name in_dist_v2 --data_size_level 1 --num_workers 64
python3 one_ball_parabola.py --data_name in_dist_v2 --data_size_level 2 --num_workers 64
Note: The num_workers
parameter specifies the number of parallel threads used for data generation. Adjust this based on your available CPU resources.
To generate evaluation data for visualization across different scenarios:
# Collision videos for evaluation
python3 two_balls_collision.py --data_for_vis
# Uniform motion videos for evaluation
python3 one_ball_uniform_motion.py --data_for_vis
# Parabolic motion videos for evaluation
python3 one_ball_parabola.py --data_for_vis
We build combinatorial data generation on the Phyre codebase.
For install the python env, run
# create conda env
conda create --yes -n phyre python=3.9
source activate phyre
# install requirements
conda install -c conda-forge sed nodejs=12 thrift-cpp=0.11.0 wget pybind11=2.6 cmake boost=1.75 setuptools pip --yes
pip install matplotlib tqdm ipywidgets yapf==0.28.0
# install our project
cd combinatorial_data
pip install -e src/python
We put our 70 templates 10000:10069 here and complied bins here.
Run the following command to generate training data from 60 templates:
# Replace $ID with values 0, 1, 2, 3, 4 and 5, with each ID generating 10 templates, totally 60 templates
python3 data_generator_v2.py --num_workers 64 --run_id $ID --data_dir ./train
For scaling analysis, you can use a subset of the training data:
- 6 templates 8000 strong>: 10003, 10005, 10016, 10023, 10024, 10053
- 30 templates: Use the regular expression
100[0-5][02468]
to select templates.
To generate evaluation data from 10 reserved templates:
python3 data_generator_v2.py --num_workers 64 --run_id 6 --data_dir ./eval
Evaluation code to parse velocity and calculate error metrics from video data see here id_ood_data/evaluate.py
.
- Data generation code for in-depth analysis
Data Type | Train Data (30K/300K/3M) | Eval Data | Description |
---|---|---|---|
Uniform Motion | 30K, 300K, 3M | Eval | Eval data includes both in-distribution and out-of-distribution data |
Parabola | 30K, 300K, 3M | Eval | - |
Collision | 30K, 300K, 3M | Eval | - |
Combinatorial Data | In-template 6M templates00:59 | Out-of-template | In-template-6M includes train data (0:990 videos in each train template) and in-template eval data (990:1000 videos in each train template). Out-template refers to eval data from reserved 10 templates (templates60:69). |
The code has been reorganized, which may lead to errors or deviations from the original research results. If you encounter any issues, please report them by opening an issue. We will address any bugs promptly.
@article{kang2024how,
title={How Far is Video Generation from World Model? -- A Physical Law Perspective},
author={Kang, Bingyi and Yue, Yang and Lu, Rui and Lin, Zhijie and Zhao, Yang, and Wang, Kaixin and Gao, Huang and Feng Jiashi},
journal={arXiv preprint arXiv:2406.16860},
year={2024}
}