AlphaPatho - Powerful Computational Pathology Tool

Paper: An end-to-end and well-generalized weakly supervised whole slide image analysis for lung cancer classification using deep learning

Contents

Framework

Ecosystem

| Pretrained weights | Description         |
| ------------------ | ------------------- |
| Fold1              | Fold1 model weights |
| Fold2              | Fold2 model weights |
| Fold3              | Fold3 model weights |
| Fold4              | Fold4 model weights |
| Fold5              | Fold5 model weights |

Why AlphaPatho

  • Weakly supervised learning (no expensive pixel-level annotations)
  • End-to-end pipeline (iterative sampling and feature-aggregation module)
  • Data efficiency (AUCs above 0.9 with fewer than 200 training images)
  • Promising generalizability (AUCs of 0.93-0.97 on six heterogeneous external datasets)
  • Outperforms state-of-the-art methods

QuickStart

Install

System: Linux (CentOS). Dependencies:

  • PyTorch (version 1.7.0): model training and validation
  • OpenSlide (version 3.4.1): processing whole slide images
  • lmdb (version 1.3.0): building the LMDB dataset for faster data loading
  • StainTools: stain normalization, following https://github.com/Peter554/StainTools
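Before training, it can help to verify that the dependencies above are importable. A minimal sketch using only the standard library; the import names (`torch`, `openslide`, `lmdb`, `staintools`) are assumptions about how the listed packages are imported:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of import names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Assumed import names for the dependencies listed above.
required = ["torch", "openslide", "lmdb", "staintools"]
absent = missing_packages(required)
if absent:
    print("Missing dependencies:", ", ".join(absent))
else:
    print("All dependencies found.")
```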

$ git clone https://github.com/DeepLaboratory/AlphaPatho.git # clone AlphaPatho
$ cd AlphaPatho

Directory description:

├─ data.py          // dataset code
├─ mean_pixel.pkl   // mean pixel values used for data augmentation
├─ model.py         // model architecture
├─ run.py           // main code for training and validation
├─ run.sh           // run.py launcher
├─ test.py          // main code for inference or evaluation
├─ run_test.sh      // test.py launcher
├─ utils.py         // utility code
├─ tmp.py           // LMDB dataset building code

Usage

Train from scratch

  • Slide tiling: see the methods section of our paper. When tiling is done, the files should be organized as follows:

    ├─ slide1_folder
    │  ├─ tile1.png
    │  ├─ tile2.png
    │  ├─ ...
    ├─ slide2_folder
    │  ├─ tile1.png
    │  ├─ tile2.png
    │  ├─ ...
    ├─ ...
    
  • CSV for LMDB dataset generation: after slide tiling, generate a csv file used for LMDB dataset generation. The csv content is as follows:

    slides_name label
    xxx/slide1_folder 1
    xxx/slide2_folder 0
    ... ...
  • Building the LMDB dataset

    tmp.py is the main code. Modify the csv path and the LMDB save directory in tmp.py, then run it to generate the LMDB dataset.

  • Dataset preparation: prepare csv files for 5-fold cross-validation; the format of each csv file is as follows:

    slides_name label
    xxx/slide1_folder 1
    xxx/slide2_folder 0
    ... ...

    After csv preparation, we will have train.csv, val.csv, and test.csv for each fold.

  • Model training: run run.sh to start training.
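The tiling layout and the csv format above can be tied together with a short script. This is a sketch, not part of the repository; the `labels` mapping is a hypothetical placeholder to be filled from your own slide-level annotations:

```python
import csv
from pathlib import Path

def write_slide_csv(tile_root, labels, out_csv):
    """Write one space-delimited row per tiled slide folder: <path> <label>.

    tile_root: directory containing slide1_folder, slide2_folder, ...
    labels:    dict mapping folder name -> slide-level label (0/1)
    """
    rows = []
    for folder in sorted(Path(tile_root).iterdir()):
        if folder.is_dir() and folder.name in labels:
            rows.append((str(folder), labels[folder.name]))
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f, delimiter=" ")
        writer.writerow(["slides_name", "label"])
        writer.writerows(rows)
    return rows
```

One such csv can feed the LMDB builder (tmp.py); per-fold train/val/test splits follow the same format.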

Model inference or test

Prepare the datasets for testing, modify the parameters in run_test.sh, and run run_test.sh to start testing.
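The parameters to adjust in run_test.sh typically cover the checkpoint and data paths. The layout below is an illustrative assumption, not the script's actual contents; the flag names and file paths are hypothetical:

```shell
# Hypothetical run_test.sh layout -- adapt names and flags to the real script.
CKPT=weights/fold1.pth         # pretrained fold weights (see the table above)
TEST_CSV=folds/fold1/test.csv  # test split prepared during dataset preparation
python test.py --checkpoint "$CKPT" --csv "$TEST_CSV"
```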

Contribution

Maintainers
