[NeurIPS 2024 Workshop AIM-FM] Official code implementation for paper: Surgical SAM 2

License

Apache-2.0 (LICENSE) and BSD-3-Clause (LICENSE_cctorch)

jinlab-imvr/Surgical-SAM-2

Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning

Official implementation of SurgSAM-2, a model that combines the Segment Anything Model 2 (SAM2) with an efficient frame pruning mechanism for real-time surgical video segmentation.

Haofeng Liu, Erli Zhang, Junde Wu, Mingxuan Hong, Yueming Jin

NeurIPS 2024 Workshop AIM-FM

Overview

We introduce Surgical SAM 2 (SurgSAM-2), an innovative model that leverages the power of the Segment Anything Model 2 (SAM2), integrating it with an efficient frame pruning mechanism for real-time surgical video segmentation. The proposed SurgSAM-2

  • dramatically reduces memory usage and computational cost of SAM2 for real-time clinical application;
  • achieves superior performance at 3× the frame rate of SAM2 (86 FPS), making real-time surgical segmentation feasible in resource-constrained environments.
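This README does not spell out the pruning rule, so the following is only a minimal sketch of what pruning a frame memory bank could look like. It assumes a cosine-similarity criterion in which the memory frames least similar to the current frame are kept. That criterion, the function names `cosine_similarity` and `prune_memory`, and the use of flat feature vectors are all illustrative assumptions, not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def prune_memory(memory_feats, current_feat, keep):
    """Return the indices of the `keep` memory frames that are retained.

    Assumed rule (hypothetical, for illustration): score each memory frame
    by its similarity to the current frame and keep the least similar ones,
    on the idea that near-duplicates add little new information.
    """
    scores = [cosine_similarity(f, current_feat) for f in memory_feats]
    # Rank frame indices from least to most similar to the current frame.
    ranked = sorted(range(len(memory_feats)), key=lambda i: scores[i])
    # Restore temporal order for the retained frames.
    return sorted(ranked[:keep])
```

Keeping a fixed number of frames bounds the memory that attention has to process per step, which is the general reason pruning can reduce memory use and compute.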

[Figure: SurgSAM-2 architecture]

Dataset Acquisition and Preprocessing

Data Download

  1. Please download the training and validation sets used in our experiments:
    1. VOS-Endovis17
    2. VOS-Endovis18
  2. The original image data can be obtained from the official websites:
    1. Endovis17 Official Dataset
    2. Endovis18 Official Dataset

Data Preprocessing

Follow the data preprocessing instructions provided in the ISINet repository.

Dataset Structure

After downloading, organize your data according to the following structure:

project_root/
└── datasets/
    └── VOS-Endovis18/
        ├── train/
        │   ├── JPEGImages/
        │   └── Annotations/
        └── valid/
            ├── JPEGImages/
            ├── Annotations/
            └── VOS/
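Before training, it can help to confirm that the layout above is in place. The snippet below is a small sketch that checks for the directories listed in the tree. The helper name `missing_dataset_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Sub-directories from the dataset tree above, relative to project_root.
EXPECTED_DIRS = [
    "datasets/VOS-Endovis18/train/JPEGImages",
    "datasets/VOS-Endovis18/train/Annotations",
    "datasets/VOS-Endovis18/valid/JPEGImages",
    "datasets/VOS-Endovis18/valid/Annotations",
    "datasets/VOS-Endovis18/valid/VOS",
]


def missing_dataset_dirs(project_root):
    """Return the expected sub-directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs(".")
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("Dataset layout matches the expected structure.")
```

Run it from project_root. It only checks that the directories exist, not what they contain.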

Training

To train the model, run:

CUDA_VISIBLE_DEVICES=0 python training/train.py --config configs/sam2.1_training/sam2.1_hiera_s_endovis18_instrument

Evaluation

Download the pretrained weights from sam2.1_hiera_s_endo18. Place the file at project_root/checkpoints/sam2.1_hiera_s_endo18.pth.

python tools/vos_inference.py \
    --sam2_cfg configs/sam2.1/sam2.1_hiera_s.yaml \
    --sam2_checkpoint ./checkpoints/sam2.1_hiera_s_endo18.pth \
    --output_mask_dir ./results/sam2.1/endovis_2018/instrument \
    --input_mask_dir ./datasets/VOS-Endovis18/valid/VOS/Annotations_vos_instrument \
    --base_video_dir ./datasets/VOS-Endovis18/valid/JPEGImages \
    --gt_root ./datasets/VOS-Endovis18/valid/Annotations \
    --gpu_id 0
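Several arguments in the evaluation command share the same validation root, so it may be convenient to build the command from a few parameters. The sketch below builds the same argument list using only the flags shown above. The helper `build_eval_command` and its parameter names are hypothetical and not part of this repository.

```python
import shlex


def build_eval_command(checkpoint="./checkpoints/sam2.1_hiera_s_endo18.pth",
                       valid_root="./datasets/VOS-Endovis18/valid",
                       output_dir="./results/sam2.1/endovis_2018/instrument",
                       gpu_id=0):
    """Assemble the vos_inference.py argument list from the command above."""
    return [
        "python", "tools/vos_inference.py",
        "--sam2_cfg", "configs/sam2.1/sam2.1_hiera_s.yaml",
        "--sam2_checkpoint", checkpoint,
        "--output_mask_dir", output_dir,
        # The three dataset paths all derive from the validation root.
        "--input_mask_dir", f"{valid_root}/VOS/Annotations_vos_instrument",
        "--base_video_dir", f"{valid_root}/JPEGImages",
        "--gt_root", f"{valid_root}/Annotations",
        "--gpu_id", str(gpu_id),
    ]


if __name__ == "__main__":
    # Print the command for inspection; pass the list to subprocess.run to execute it.
    print(shlex.join(build_eval_command()))
```

With the default arguments, the printed command is the same as the one given above.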

Demo

Demo data from Endovis 2018 can be downloaded from 2018 demo data.

After downloading, arrange the files according to the following structure:

project_root/
└── datasets/
    └── endovis18/
        └── images/
            └── seq_2/
                └── ...

Acknowledgement

This research uses datasets from Endovis 2017 and Endovis 2018. If you wish to use these datasets, please request access through their respective official websites.

Our implementation builds upon the segment anything 2 framework. We extend our sincere appreciation to the authors for their outstanding work and significant contributions to the field of video segmentation.

Citation

@misc{liu2024surgicalsam2realtime,
  title={Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning},
  author={Haofeng Liu and Erli Zhang and Junde Wu and Mingxuan Hong and Yueming Jin},
  year={2024},
  eprint={2408.07931},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2408.07931},
}
