WACV 2025
This repository contains the code for the paper:
Incorporating Task Progress Knowledge for Subgoal Generation in Robotic Manipulation through Image Edits
conda create -n taksie python=3.9
conda activate taksie
pip install -r requirements.txt
Run the example inference script to test the subgoal generation:
python example_inference.py
Follow the instructions here to install the CALVIN environment.
Install JAX and jaxrl_m for the goal-conditioned policy:
# Follow the instructions here to install jax and jaxrl_m:
https://github.com/rail-berkeley/bridge_data_v2
Download the DGBC checkpoint from the provided link. Update the cfg/cfg.yaml
file by replacing the parent directory path with the path to the downloaded checkpoint.
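If you prefer to script the edit, here is a minimal sketch that rewrites the checkpoint path in cfg/cfg.yaml with PyYAML. The key name parent_dir is an assumption, so match it to the actual field name in the config file.

```python
# Sketch: point cfg/cfg.yaml at the downloaded DGBC checkpoint.
# "parent_dir" is an assumed key name; check cfg/cfg.yaml for the real field.
import yaml

cfg_path = "cfg/cfg.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["parent_dir"] = "/path/to/downloaded/dgbc_checkpoint"  # replace with your local path

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```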
Follow the instructions here to install the LIV framework. Additionally, download the fine-tuned LIV checkpoint for the CALVIN dataset:
- Download the fine-tuned checkpoint from Hugging Face.
- Replace the default checkpoint in ~/.liv/resnet50 with the downloaded fine-tuned checkpoint.
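A minimal sketch of swapping the checkpoint in place, assuming the default file inside ~/.liv/resnet50 is named model.pt (check your local directory for the actual file name):

```python
# Sketch: back up the default LIV checkpoint and drop in the fine-tuned one.
# "model.pt" is an assumed file name; match it to what is actually in ~/.liv/resnet50.
import shutil
from pathlib import Path

liv_dir = Path.home() / ".liv" / "resnet50"
default_ckpt = liv_dir / "model.pt"                               # assumed name
finetuned_ckpt = Path("/path/to/downloaded/liv_calvin_checkpoint.pt")  # your download

if default_ckpt.exists():
    shutil.move(str(default_ckpt), str(default_ckpt) + ".bak")   # keep a backup
shutil.copy(str(finetuned_ckpt), str(default_ckpt))
```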
Run the evaluation script with the appropriate configuration file:
python evaluate_calvin.py --running_config=cfg/cfg.yaml
Use the sample in example/example_trajectory, which contains a single CALVIN trajectory. Replace it with the full CALVIN task D dataset to train on the whole data.
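To see what the sample contains, you can peek at its language annotations. The sketch below assumes the standard CALVIN auto_lang_ann.npy layout (a pickled dict); the exact keys may differ across CALVIN versions.

```python
# Sketch: inspect the example trajectory's language annotations.
# auto_lang_ann.npy is assumed to store a pickled dict, as in standard CALVIN data.
import numpy as np

ann = np.load(
    "example/example_trajectory/lang_annotations/auto_lang_ann.npy",
    allow_pickle=True,
).item()
print(ann.keys())
```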
mkdir -p cache_features/clip
mkdir cache_features/r3m
python scripts/generate_clip_feature.py --dataset_path example/example_trajectory --output_dir_path cache_features/clip
python scripts/generate_r3m_feature.py --dataset_path example/example_trajectory --output_dir_path cache_features/r3m
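A quick sanity check that the cached features were written, assuming the scripts save .npy files (adjust the glob pattern if the actual output format differs):

```python
# Sketch: list a few cached R3M feature files and print their shapes.
# The .npy extension and per-file layout are assumptions about the caching scripts.
import glob
import numpy as np

for feat_file in sorted(glob.glob("cache_features/r3m/*.npy"))[:3]:
    feats = np.load(feat_file, allow_pickle=True)
    print(feat_file, getattr(feats, "shape", type(feats)))
```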
python scripts/select_keyframe.py --r3m_feature_dir cache_features/r3m --lang_annotations_path example/example_trajectory/lang_annotations/auto_lang_ann.npy --data_path example/example_trajectory --output_dir .
The selected keyframes are saved to output_dir/selected_keyframes.npy.
python scripts/generate_keyframe_segment.py --input_npy_path selected_keyframes.npy --output_npy_path keyframe_segment.npy
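Before training, you can confirm both outputs with a rough check like the one below; the internal structure of the arrays is specific to the scripts, so the printout stays generic.

```python
# Sketch: load the keyframe outputs and print their types/shapes.
# allow_pickle=True because the scripts may store dicts or object arrays.
import numpy as np

keyframes = np.load("selected_keyframes.npy", allow_pickle=True)
segments = np.load("keyframe_segment.npy", allow_pickle=True)
print("keyframes:", getattr(keyframes, "shape", type(keyframes)))
print("segments:", getattr(segments, "shape", type(segments)))
```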
Use accelerate for multi-GPU training. Adjust batch_size and gradient_accumulation_steps based on GPU memory. validation_step is set to 100 to monitor training closely; use a larger value for faster runs.
bash train_taksie.bash
Check wandb for the training logs.
Note on Training ControlNet:
Training an unconditional diffusion model first can help ControlNet converge faster and achieve better results. You can use the original training script from the official Diffusers repository, then train on our dataset by setting dataset_name to ShuaKang/calvin_d.
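To preview that dataset before training, a minimal sketch using the Hugging Face datasets library is shown below; the train split name and column layout are assumptions, so print them to confirm.

```python
# Sketch: pull the CALVIN task D dataset from the Hugging Face Hub and inspect it.
# The "train" split name is an assumption; print the dataset to see what is available.
from datasets import load_dataset

ds = load_dataset("ShuaKang/calvin_d", split="train")
print(ds)                # number of rows
print(ds.column_names)   # available fields
```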
@INPROCEEDINGS{10943942,
author={Kang, Xuhui and Kuo, Yen-Ling},
booktitle={2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
title={Incorporating Task Progress Knowledge for Subgoal Generation in Robotic Manipulation through Image Edits},
year={2025},
volume={},
number={},
pages={7490-7499},
doi={10.1109/WACV61041.2025.00728}}
Thank you for these excellent works!