Dec. 10th, 2024: This repository contains the training and testing code for the AAAI'25 paper "ZeroMamba: Exploring Visual State Space Model for Zero-Shot Learning" (arXiv). We will release the full code in the coming days.
Results of our released models under various evaluation protocols on three datasets, in both the CZSL and GZSL settings:
Dataset | Acc(CZSL) | U(GZSL) | S(GZSL) | H(GZSL) |
---|---|---|---|---|
CUB | 80.0 | 72.1 | 76.4 | 74.2 |
SUN | 72.4 | 56.5 | 41.4 | 47.7 |
AWA2 | 71.9 | 67.9 | 87.6 | 76.5 |
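In the table, U and S are the top-1 accuracies on unseen and seen classes under GZSL, and H is their harmonic mean, H = 2·U·S / (U + S). The snippet below is a quick check against the rows above (a minimal sketch; small differences come from rounding U and S):

```python
# Harmonic mean of unseen (U) and seen (S) accuracies, used as the GZSL H metric.
def harmonic_mean(u: float, s: float) -> float:
    return 2 * u * s / (u + s)

print(round(harmonic_mean(72.1, 76.4), 1))  # CUB  -> 74.2
print(round(harmonic_mean(56.5, 41.4), 1))  # SUN  -> 47.8 (table reports 47.7, computed from unrounded U/S)
print(round(harmonic_mean(67.9, 87.6), 1))  # AWA2 -> 76.5
```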
Note: We highly recommend that you adhere to the following steps.
- Python & PyTorch

conda create -n zeromamba python=3.10.13
conda activate zeromamba
conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=11.8 -c pytorch -c nvidia
- Mamba dependencies
  - Download mamba_ssm-1.1.1, then install it:

pip install <your file path>

  - Download causal_conv1d-1.1.0, then install it:

pip install <your file path>
- Vision Mamba dependencies
cp -r ZeroMamba/VisionMambaModels/Vim/mamba_ssm <your env's site-packages path>
cd ZeroMamba/VisionMambaModels/VMamba/kernels/selective_scan && pip install .
- Other dependencies
git clone git@github.com:DingjieFu/ZeroMamba.git
cd ZeroMamba
pip install -r requirements.txt
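Once everything is installed, a quick import check like the sketch below can confirm the environment is usable (it only verifies that the core packages import and that CUDA is visible; it does not exercise the ZeroMamba code itself):

```python
# env_check.py -- minimal sanity check of the core dependencies (illustrative only)
import torch

print(f"PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")

for pkg in ("mamba_ssm", "causal_conv1d"):
    try:
        __import__(pkg)
        print(f"{pkg}: OK")
    except ImportError as err:
        print(f"{pkg}: missing ({err})")
```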
The directory structure:
ZeroMamba/
├── data
│ ├── attribute
│ ├── dataset
│ │ ├── AWA2
│ │ │ ├── Animals_with_Attributes2
│ │ │ └── ...
│ │ ├── CUB
│ │ │ ├── CUB_200_2011
│ │ │ └── ...
│ │ ├── SUN
│ │ │ ├── images
│ │ │ └── ...
│ │ ├── xlsa
│ │ └── ...
│ ├── w2v
│ └── ...
├── utils
└── ...
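Before training, it is worth checking that the datasets are laid out as above. The snippet below is a minimal sketch; the paths simply mirror the tree shown, so adjust the root if your data lives elsewhere:

```python
# data_layout_check.py -- verify the expected dataset folders exist (paths assumed from the tree above)
from pathlib import Path

root = Path("ZeroMamba/data/dataset")
expected = [
    root / "AWA2" / "Animals_with_Attributes2",
    root / "CUB" / "CUB_200_2011",
    root / "SUN" / "images",
    root / "xlsa",
]

for path in expected:
    print(f"{path}: {'found' if path.is_dir() else 'MISSING'}")
```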
Before training, download the pre-trained model here and place it under ./checkpoints. Then run the following commands (also provided in ./scripts/train.sh):
# AWA2
python train.py --model_name VMamba-S --model vmambav2_small_224 \
    --ckpt vssm_small_0229_ckpt_epoch_222.pth --cfg vmambav2_small_224.yaml \
    --dataset AWA2 --gamma 0.98 --input_size 448 --batch_size 32 \
    --backbone_lr 1e-3 --head_lr 1e-3 --head2_lr 1e-4 --loss_L1 0.0
# CUB
python train.py --model_name VMamba-S --model vmambav2_small_224 \
    --ckpt vssm_small_0229_ckpt_epoch_222.pth --cfg vmambav2_small_224.yaml \
    --dataset CUB --gamma 0.3 --input_size 448 --batch_size 32 \
    --backbone_lr 1e-3 --head_lr 1e-3 --head2_lr 1e-4 --loss_L1 1.0
# SUN
python train.py --model_name VMamba-S --model vmambav2_small_224 \
    --ckpt vssm_small_0229_ckpt_epoch_222.pth --cfg vmambav2_small_224.yaml \
    --dataset SUN --gamma 0.35 --input_size 448 --batch_size 32 \
    --backbone_lr 1e-3 --head_lr 1e-3 --head2_lr 1e-4 --loss_L1 0.2
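To launch all three training runs in one go, a small wrapper like the following also works (a convenience sketch: it simply shells out to train.py with the per-dataset settings copied from the commands above):

```python
# train_all.py -- run the three training commands above sequentially (convenience sketch)
import subprocess

COMMON = [
    "python", "train.py",
    "--model_name", "VMamba-S", "--model", "vmambav2_small_224",
    "--ckpt", "vssm_small_0229_ckpt_epoch_222.pth", "--cfg", "vmambav2_small_224.yaml",
    "--input_size", "448", "--batch_size", "32",
    "--backbone_lr", "1e-3", "--head_lr", "1e-3", "--head2_lr", "1e-4",
]

# Per-dataset gamma and L1 loss weights, copied from the commands above.
RUNS = {
    "AWA2": ("0.98", "0.0"),
    "CUB":  ("0.3",  "1.0"),
    "SUN":  ("0.35", "0.2"),
}

for dataset, (gamma, loss_l1) in RUNS.items():
    cmd = COMMON + ["--dataset", dataset, "--gamma", gamma, "--loss_L1", loss_l1]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```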
We provide trained models (Google Drive) on three datasets (CUB, SUN, and AWA2) for both the CZSL and GZSL settings. Download them and place them under ./checkpoints.
Then run the following commands (also provided in ./scripts/test.sh):
# AWA2
python test.py --dataset AWA2 --gamma 0.98
# CUB
python test.py --dataset CUB --gamma 0.3
# SUN
python test.py --dataset SUN --gamma 0.35
This project is partly based on VMamba (github). Thanks for their wonderful work.
If you find ZeroMamba useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.
@inproceedings{hou2025zeromamba,
title={ZeroMamba: Exploring Visual State Space Model for Zero-Shot Learning},
author={Hou, Wenjin and Fu, Dingjie and Li, Kun and Chen, Shiming and Fan, Hehe and Yang, Yi},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={39},
number={4},
pages={3527--3535},
year={2025}
}