[CVPR 2025] Official repository of "GigaHands: A Massive Annotated Dataset of Bimanual Hand Activities".
[Project Page] [Paper] [Video]
Authors: Rao Fu* · Dingxi Zhang* · Alex Jiang · Wanjia Fu · Austin Funk · Daniel Ritchie · Srinath Sridhar
The demo data contains 5 motion sequences. The file directory looks like this:
demo_data/
├── hand_pose/
│   ├── p<participant id>-<scene>-<sequence id>/
│   │   ├── bboxes/             # bounding boxes for 2D keypoint tracking
│   │   ├── keypoints_2d/       # 2D hand keypoints
│   │   ├── keypoints_3d/       # 3D hand keypoints (triangulated from multi-view 2D keypoints)
│   │   ├── keypoints_3d_mano/  # 3D hand keypoints (extracted from MANO params and normalized; smoother)
│   │   ├── mano_vid/           # visualizations of MANO parameters
│   │   ├── params/             # MANO parameters
│   │   ├── rgb_vid/            # raw multi-view videos
│   │   │   ├── brics-odrind-<camera id>-camx
│   │   │   │   ├── xxx.mp4
│   │   │   │   └── xxx.txt
│   │   │   └── ...
│   │   ├── repro_2d_vid/       # visualizations of 2D hand keypoints
│   │   ├── repro_3d_vid/       # visualizations of 3D hand keypoints
│   │   └── optim_params.txt    # camera parameters
│   └── ...
└── object_pose/
    ├── p<participant id>-<scene>-<sequence id>/
    │   ├── mesh            # reconstructed object mesh
    │   ├── pose            # object pose
    │   ├── render          # visualizations of object pose
    │   └── segmentation    # segmented object frames
    └── ...
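As a quick way to explore the demo layout, the sketch below walks demo_data/hand_pose/ and lists the annotation folders available for each sequence. It is a minimal sketch that only relies on the directory names shown above; the per-file formats inside each folder are not assumed here.

```python
from pathlib import Path

demo_root = Path("demo_data")

# Each sequence folder is named p<participant id>-<scene>-<sequence id>.
for seq_dir in sorted((demo_root / "hand_pose").glob("p*")):
    if not seq_dir.is_dir():
        continue
    # Annotation folders available for this sequence (bboxes, keypoints_3d, params, ...);
    # loose files such as optim_params.txt are skipped here.
    available = sorted(d.name for d in seq_dir.iterdir() if d.is_dir())
    print(seq_dir.name, "->", ", ".join(available))
```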
We store our dataset on Globus. You can download a demo sequence from here, all annotations from here, and access the raw data here.
[2025/05/23] For object poses, access our Globus repository here. Download each .tar.gz separately (each file contains 1000 motion sequences).
[2025/04/30] For multi-view RGB videos, access our Globus repository here. Download each .tar.gz separately (each file contains 10 views; 51 camera views in total).
[2025/04/02] We are pleased to release our full hand pose dataset, available for download here (including all keypoints_3d, keypoints_3d_mano, and params).
Complete text annotations are available here. We used the rewritten_annotation for model training.
More data coming soon! 🔜
The dataset directory should look like this:
./dataset/GigaHands/
├── hand_poses/
│   └── p<participant id>-<scene>/
│       ├── keypoints_3d/       # 3D hand keypoints (triangulated from multi-view 2D keypoints)
│       ├── keypoints_3d_mano/  # 3D hand keypoints (extracted from MANO params and normalized; smoother)
│       └── params/             # MANO parameters
├── object_poses/
│   └── <object name>
│       └── p<participant id>-<scene>_<sequence id>/
│           └── pose            # object 6DoF poses
└── annotations_v2.jsonl        # text annotations
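The text annotations ship as a JSONL file (one JSON object per line). Below is a minimal reading sketch; the exact field names are not documented here, so treat the rewritten_annotation key (mentioned above as the variant used for training) and any others as assumptions to verify against the downloaded file.

```python
import json
from pathlib import Path

anno_path = Path("./dataset/GigaHands/annotations_v2.jsonl")

# JSONL: one JSON object per line.
annotations = []
with anno_path.open("r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if line:
            annotations.append(json.loads(line))

print(f"loaded {len(annotations)} annotation records")
# The keys below are assumptions -- inspect one record to confirm the actual schema.
example = annotations[0]
print(sorted(example.keys()))
print(example.get("rewritten_annotation"))  # the text variant the authors say they used for training
```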
This code requires:
- Python 3.8
- conda (Anaconda3 or Miniconda3)
- A CUDA-capable GPU (one is enough)
- Create a virtual environment and install the necessary dependencies (a quick environment sanity check is sketched at the end of this setup section):
conda create -n gigahands python==3.8
conda activate gigahands
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=12.1 -c pytorch -c nvidia
conda install -c conda-forge ffmpeg
pip install -r requirements.txt
- Install EasyMocap
cd third-party/EasyMocap
python setup.py develop
- Download the MANO models and place the MANO_*.pkl files under body_models/smplh.
- Download the pretrained models by running bash dataset/download_pretrained_models.sh; the resulting directory should look like this:
./checkpoints/GigaHands/
./checkpoints/GigaHands/GPT/ # Text-to-motion generation model
./checkpoints/GigaHands/VQVAE/ # Motion autoencoder
./checkpoints/GigaHands/text_mot_match/ # Motion & Text feature extractors for evaluation
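As a quick sanity check of the environment created above (a minimal sketch; it only assumes the conda install commands from this section), you can verify that PyTorch and CUDA are visible:

```python
# Run inside the activated environment: conda activate gigahands
import torch
import torchvision

print("torch:", torch.__version__)              # expected 2.2.0 per the install command above
print("torchvision:", torchvision.__version__)  # expected 0.17.0
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```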
After downloading all hand pose annotations, run the script below to visualize them.
python visualize_hands.py
You will see videos of the MANO render results and reprojected keypoints in the visualizations directory.
Sampling results from customized descriptions:
python gen_motion_custom.py --resume-pth ./checkpoints/GigaHands/VQVAE/net_last.pth --resume-trans ./checkpoints/GigaHands/GPT/net_best_fid.pth --input-text ./input.txt
The results are saved in the output folder.
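The expected layout of input.txt is not documented here; below is a minimal sketch, assuming one free-form bimanual activity description per line (check gen_motion_custom.py for the format it actually parses):

```python
# Hypothetical example: write two custom descriptions to input.txt, one per line.
# The one-prompt-per-line format is an assumption, not confirmed by this README.
prompts = [
    "a person claps both hands twice",
    "the left hand holds a bowl while the right hand stirs with a spoon",
]
with open("input.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(prompts) + "\n")
```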
Training motion VQ-VAE:
python3 train_vq_hand.py \
--batch-size 256 \
--lr 2e-4 \
--total-iter 300000 \
--lr-scheduler 200000 \
--nb-code 512 \
--down-t 2 \
--depth 3 \
--dilation-growth-rate 3 \
--out-dir output \
--dataname GigaHands \
--vq-act relu \
--quantizer ema_reset \
--loss-vel 0.5 \
--recons-loss l1_smooth \
--exp-name VQVAE \
--window-size 128
Training T2M GPT model:
python3 train_t2m_trans_hand.py \
--exp-name GPT \
--batch-size 128 \
--num-layers 9 \
--embed-dim-gpt 1024 \
--nb-code 512 \
--n-head-gpt 16 \
--block-size 51 \
--ff-rate 4 \
--drop-out-rate 0.1 \
--resume-pth output/VQVAE/net_last.pth \
--vq-name VQVAE \
--out-dir output \
--total-iter 300000 \
--lr-scheduler 150000 \
--lr 0.0001 \
--dataname GigaHands \
--down-t 2 \
--depth 3 \
--quantizer ema_reset \
--eval-iter 10000 \
--pkeep 0.5 \
--dilation-growth-rate 3 \
--vq-act relu
- Release demo data
- Release hand pose data
- Release multi-view video data
- Release object pose data (13k) and meshes
- Release inference code for text-to-motion task
- Release training code for text-to-motion task
We appreciate help from:
- Public code such as EasyMocap, text-to-motion, TM2T, MDM, and T2M-GPT.
- This research was supported by AFOSR grant FA9550-21-1-0214, NSF CAREER grant #2143576, and ONR DURIP grant N00014-23-1-2804. We would like to thank the OpenAI Research Access Program for API support and extend our gratitude to Ellie Pavlick, Tianran Zhang, Carmen Yu, Angela Xing, Chandradeep Pokhariya, Sudarshan Harithas, Hongyu Li, Chaerin Min, Xindi Qu, Xiaoquan Liu, Hao Sun, Melvin He and Brandon Woodard.
This dataset is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License.
To view a copy of this license, visit https://creativecommons.org/licenses/by-nc/4.0/.
If you find our work useful in your research, please consider citing:
@article{fu2024gigahands,
  title={GigaHands: A Massive Annotated Dataset of Bimanual Hand Activities},
  author={Fu, Rao and Zhang, Dingxi and Jiang, Alex and Fu, Wanjia and Funk, Austin and Ritchie, Daniel and Sridhar, Srinath},
  journal={arXiv preprint arXiv:2412.04244},
  year={2024}
}