Official implementation of SurgSAM-2, a model that combines the Segment Anything Model 2 (SAM2) with an efficient frame pruning mechanism for real-time surgical video segmentation.
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning
Haofeng Liu, Erli Zhang, Junde Wu, Mingxuan Hong, Yueming Jin
NeurIPS 2024 Workshop AIM-FM
We introduce Surgical SAM 2 (SurgSAM-2), an innovative model that leverages the power of the Segment Anything Model 2 (SAM2), integrating it with an efficient frame pruning mechanism for real-time surgical video segmentation. The proposed SurgSAM-2
- dramatically reduces the memory usage and computational cost of SAM2, enabling real-time clinical application;
- achieves superior performance at 3× the FPS of SAM2 (86 FPS), making real-time surgical segmentation feasible in resource-constrained environments.
- Please download the training and validation sets used in our experiments:
- The original image data can be obtained from the official websites:
Follow the data preprocessing instructions provided in the ISINet repository.
After downloading, organize your data according to the following structure:
project_root/
└── datasets/
    └── VOS-Endovis18/
        ├── train/
        │   ├── JPEGImages/
        │   └── Annotations/
        └── valid/
            ├── JPEGImages/
            ├── Annotations/
            └── VOS/
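Before launching training, it can help to confirm that the folders above are in place. The following minimal Python sketch (not part of the release) simply checks the expected directory layout under datasets/VOS-Endovis18:

```python
from pathlib import Path

# Expected layout mirrored from the tree above (adjust the root if yours differs).
root = Path("datasets/VOS-Endovis18")
expected = [
    "train/JPEGImages", "train/Annotations",
    "valid/JPEGImages", "valid/Annotations", "valid/VOS",
]

missing = [p for p in expected if not (root / p).is_dir()]
if missing:
    raise SystemExit(f"Missing dataset folders: {missing}")
print("Dataset layout looks correct.")
```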
To train the model, run:
CUDA_VISIBLE_DEVICES=0 python training/train.py --config configs/sam2.1_training/sam2.1_hiera_s_endovis18_instrument
Download the pretrained weights from sam2.1_hiera_s_endo18 and place the file at project_root/checkpoints/sam2.1_hiera_s_endo18.pth.
python tools/vos_inference.py --sam2_cfg configs/sam2.1/sam2.1_hiera_s.yaml --sam2_checkpoint ./checkpoints/sam2.1_hiera_s_endo18.pth --output_mask_dir ./results/sam2.1/endovis_2018/instrument --input_mask_dir ./datasets/VOS-Endovis18/valid/VOS/Annotations_vos_instrument --base_video_dir ./datasets/VOS-Endovis18/valid/JPEGImages --gt_root ./datasets/VOS-Endovis18/valid/Annotations --gpu_id 0
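The command above writes the predicted masks to the directory given by --output_mask_dir. For a quick sanity check of the outputs, the sketch below computes a rough per-object IoU against the ground-truth annotations; it assumes both prediction and ground-truth masks are ID-indexed PNGs with matching relative paths, and it is not the official evaluation protocol.

```python
import numpy as np
from pathlib import Path
from PIL import Image

def binary_iou(pred, gt):
    """IoU between two boolean masks; returns 1.0 when both are empty."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return 1.0 if union == 0 else inter / union

pred_root = Path("results/sam2.1/endovis_2018/instrument")
gt_root = Path("datasets/VOS-Endovis18/valid/Annotations")

ious = []
for gt_path in sorted(gt_root.rglob("*.png")):
    pred_path = pred_root / gt_path.relative_to(gt_root)
    if not pred_path.exists():
        continue
    gt = np.array(Image.open(gt_path))
    pred = np.array(Image.open(pred_path))
    # Score every labelled instrument ID present in the ground truth (0 = background).
    for obj_id in np.unique(gt):
        if obj_id == 0:
            continue
        ious.append(binary_iou(pred == obj_id, gt == obj_id))

print(f"Mean IoU over {len(ious)} object-frames: {np.mean(ious):.4f}")
```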
Demo data from Endovis 2018 can be downloaded from 2018 demo data.
After downloading, arrange the files according to the following structure:
project_root/
└── datasets/
    └── endovis18/
        └── images/
            ├── seq_2/
            └── ...
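To try the demo frames programmatically, the minimal sketch below uses the upstream SAM 2 video-predictor API (build_sam2_video_predictor, init_state, add_new_points_or_box, propagate_in_video) with the config and checkpoint from the inference command above; the choice of seq_2, the object ID, and the click coordinates are illustrative assumptions rather than part of the release.

```python
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

# Config and checkpoint paths follow the inference command above.
predictor = build_sam2_video_predictor(
    "configs/sam2.1/sam2.1_hiera_s.yaml",
    "./checkpoints/sam2.1_hiera_s_endo18.pth",
)

with torch.inference_mode():
    # Directory of JPEG frames for one demo sequence (seq_2 is only an example).
    state = predictor.init_state(video_path="./datasets/endovis18/images/seq_2")

    # Prompt frame 0 with a single positive click; coordinates are placeholders.
    predictor.add_new_points_or_box(
        inference_state=state,
        frame_idx=0,
        obj_id=1,
        points=np.array([[600, 350]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32),
    )

    # Propagate the prompt through the rest of the video.
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        masks = (mask_logits > 0.0).cpu().numpy()  # boolean masks per object
```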
This research utilizes datasets from Endovis 2017 and Endovis 2018. If you wish to use these datasets, please request access through their respective official websites.
Our implementation builds upon the Segment Anything 2 framework. We extend our sincere appreciation to the authors for their outstanding work and significant contributions to the field of video segmentation.
@misc{liu2024surgicalsam2realtime,
title={Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning},
author={Haofeng Liu and Erli Zhang and Junde Wu and Mingxuan Hong and Yueming Jin},
year={2024},
eprint={2408.07931},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2408.07931},
}