This repository provides the code for the technical report [Arxiv], and also serves as a standalone suite for probing and evaluating future methods in interactive segmentation (IS).
The iSegProbe repository includes:
- Pipelines for training and evaluating interactive segmentation models, specifically adapted for probing individual model components (`train.py`, `evaluate.py`)
- Implementations of vision backbones, such as ViT, MaskCLIP, and DINOv2, tailored for the interactive segmentation task (`core.model.featurizers`)
- Implementations of multiple feature upsamplers, including LiFT, FeatUp, and LoftUp (`core.model.upsamplers`)
- Support for major IS datasets: GrabCut, DAVIS, SBD, Berkeley, COCO+LVIS, and more (`core.data`)
- Visualization utilities for plotting predictions and features, as well as recreating plots from the report
Developed and tested on Python 3.9, PyTorch 2.4.1, CUDA 12.4, and Ubuntu 20.04. To install the required dependencies, run:
pip install -r requirements.txt
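To quickly verify that your environment roughly matches the tested setup, you can check the installed versions, e.g.:

```python
# Check that the installed versions roughly match the tested setup
# (Python 3.9, PyTorch 2.4.1, CUDA 12.4).
import sys

import torch

print("Python:", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
```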
Download the dataset(s) relevant to your use case and specify the corresponding paths in `configs/main_cfg.yaml`.
📌 Note: Our experiments were conducted on SBD (train) and GrabCut, DAVIS, Berkeley and SBD (test). However, other datasets are fully supported and can be used with minimal effort.
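For a quick check that the configured paths actually exist on disk, a small sketch like the one below can help. It assumes `configs/main_cfg.yaml` is plain, flat YAML (nested entries or Hydra interpolations would need a small adjustment), and the path-detection heuristic is purely illustrative.

```python
# Inspect which entries in configs/main_cfg.yaml look like paths and whether they exist.
# Assumes a flat YAML file; key names are whatever the config actually defines.
from pathlib import Path

import yaml

with open("configs/main_cfg.yaml") as f:
    cfg = yaml.safe_load(f)

for key, value in cfg.items():
    if isinstance(value, str) and ("/" in value or "\\" in value):  # crude "is a path" heuristic
        status = "OK" if Path(value).expanduser().exists() else "MISSING"
        print(f"{key}: {value} [{status}]")
```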
Dataset | Description | Download Link |
---|---|---|
ADE20k | 22k images with 434k instances (total) | official site |
OpenImages | 944k images with 2.6M instances (total) | official site |
MS COCO | 118k images with 1.2M instances (train) | official site |
LVIS v1.0 | 100k images with 1.2M instances (total) | official site |
COCO+LVIS* | 99k images with 1.5M instances (train) | original LVIS images + combined annotations |
SBD | 8498 images with 20172 instances (train), 2857 images with 6671 instances (test) | official site |
GrabCut | 50 images with one object each (test) | GrabCut.zip (11 MB) |
Berkeley | 96 images with 100 instances (test) | Berkeley.zip (7 MB) |
DAVIS | 345 images with one object each (test) | DAVIS.zip (43 MB) |
Pascal VOC | 1449 images with 3417 instances (validation) | official site |
COCO_MVal | 800 images with 800 instances (test) | COCO_MVal.zip (127 MB) |
(*) - To prepare COCO+LVIS, first download the original LVIS v1.0 dataset. Then, download and unpack the pre-processed annotations provided by the RITM team, which combine COCO and LVIS. Place the annotations in the same folder as LVIS v1.0.
For an extended list of supported datasets, refer to the SimpleClick dataset collection: [link]
Download the upsampler weights and specify the corresponding paths in `configs/main_cfg.yaml`:
- LoftUp (DINOv2 S/14): [Google Drive Link]
- LiFT (DINOv2 S/14): [Google Drive Link]
For additional trained upsamplers, refer to the LoftUp repository: [link]
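As a quick sanity check after downloading, the weight files can be opened with plain PyTorch. This is only a sketch: the file name below is a placeholder, and the exact checkpoint structure depends on the upsampler.

```python
# Verify that a downloaded upsampler checkpoint is readable.
# The path is a placeholder; point it to the file you downloaded.
import torch

ckpt = torch.load("/path/to/upsampler_weights.pth", map_location="cpu")
if isinstance(ckpt, dict):
    print(f"Loaded a dict checkpoint with {len(ckpt)} top-level entries, e.g. {list(ckpt)[:5]}")
else:
    print(f"Loaded an object of type {type(ckpt).__name__}")
```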
Evaluation of the vision foundation model (and feature upsampler) involves two separate stages: (1) training the interactive segmentation model, and (2) performing the actual evaluation.
General training configurations are specified in `configs/train_cfg.yaml`. For a detailed explanation of the parameters, please refer directly to that file. Each training experiment (comprising the IS model, datasets, and other components) should be defined in a separate Python file, which is then referenced from `train_cfg.yaml`. Examples of such files can be found in the `models/` directory.
To launch the training process, you can either modify `train_cfg.yaml` accordingly and run:
python train.py
Or override specific arguments directly from the CLI using Hydra syntax, for example:
python train.py +exp.name=my_name +exp.model_path=/path/to/my/model
General evaluation configurations are specified in `configs/eval_cfg.yaml`. For a detailed explanation of the parameters, please refer directly to that file.
To launch the evaluation process, you can either modify `eval_cfg.yaml` accordingly and run:
python evaluate.py
Or override specific arguments directly from the CLI using Hydra syntax, for example:
python evaluate.py +checkpoint=/path/to/checkpoints +datasets=GrabCut,Berkeley,SBD,DAVIS
- Training logs can be visualized with TensorBoard and Weights & Biases. To enable TensorBoard, locate the folders with experiment outputs (this can also be a root folder containing multiple runs) and run:

  tensorboard --logdir=PATH_TO_LOG_DIR --port=6006

  To enable logging to W&B, set `wandb.log_wandb=true` in `train_cfg.yaml`.
- Separate Weights & Biases evaluation logging is available by setting `wandb=true` in `eval_cfg.yaml`.
To launch the Tkinter-based interactive demo, run:
python demo.py --checkpoint /path/to/ckpts
Demo Controls:
Key | Description |
---|---|
Left Mouse Button | Place a positive click |
Right Mouse Button | Place a negative click |
Scroll Wheel | Zoom the image in and out |
Right Mouse Button + Move Mouse | Move the image |
Space | Finish the current object mask |
- Some test images can be found in the `assets/test_imgs` folder.
- For a more detailed description of the demo parameters and functionality, refer to the RITM codebase.
- When launching the demo from a remote machine, you may need to have X11 (or XQuartz) installed and running on your local machine with proper X11 forwarding.
- If the demo exits incorrectly, the process might not terminate properly, leading to the following error on the next launch:
free(): invalid pointer
To resolve this, kill the demo process by running:
pkill -9 -f demo.py
In the `eval_cfg.yaml` file, the `vis_preds` flag is responsible for visualizing the model's predictions, while the `save_feats` flag controls whether raw features before and after the upsampler are saved. These saved features can be further visualized using the script `core.plots.plot_features.py`. Additionally, the script `core.plots.plot_iou_vs_clicks.py` can be used to compare the mean Intersection over Union (mIoU) as a function of the number of clicks.
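Outside of these scripts, a saved feature map can also be inspected ad hoc; the sketch below projects it to three channels with PCA for a quick visual check. It assumes the features are stored as a `torch`-loadable tensor of shape (C, H, W) (possibly with a leading batch dimension); the file path is a placeholder, and the `core.plots` scripts remain the recommended way to reproduce the report's figures.

```python
# Quick PCA-based RGB visualization of a saved feature map (assumed C x H x W).
# The file path is a placeholder; adjust it to wherever save_feats wrote the features.
import matplotlib.pyplot as plt
import torch

feats = torch.load("/path/to/saved_feats.pt", map_location="cpu").squeeze().float()
c, h, w = feats.shape
flat = feats.reshape(c, -1).T                          # (H*W, C)
flat = flat - flat.mean(dim=0, keepdim=True)           # center before PCA

# Project onto the top-3 principal components and map them to RGB.
_, _, v = torch.pca_lowrank(flat, q=3, center=False)   # v: (C, 3)
rgb = (flat @ v).reshape(h, w, 3)
rgb = (rgb - rgb.min()) / (rgb.max() - rgb.min() + 1e-8)

plt.imshow(rgb.numpy())
plt.axis("off")
plt.title("PCA projection of saved features")
plt.show()
```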
If you find this repository useful, please cite our papers:
@misc{huang2025loftuplearningcoordinatebasedfeature,
title={LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models},
author={Haiwen Huang and Anpei Chen and Volodymyr Havrylov and Andreas Geiger and Dan Zhang},
year={2025},
eprint={2504.14032},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2504.14032},
}
@misc{havrylov2025benchmarking,
title={Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation},
author={Volodymyr Havrylov and Haiwen Huang and Dan Zhang and Andreas Geiger},
year={2025},
eprint={2505.02075},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2505.02075},
}
This repository is based on SimpleClick and RITM, with most of the featurizers code adapted from FeatUp. We thank the authors of these open-source projects for their valuable contributions.