(Image in: https://www.kaggle.com/datasets/sshikamaru/glaucoma-detection)
Glaucoma is a progressive eye disease and has been a leading cause of irreversible vision loss for over a decade. Early diagnosis and treatment are crucial to prevent further visual impairment. By providing an easy-to-use auxiliary diagnostic platform with high accuracy and efficiency, our software aims to help medical professionals predict the risk of glaucoma by analyzing patients' retinal scan images. This document describes the operating guide, software design flow, model training process, and future application prospects for software developers, testers, biomedical researchers, and end-users.
- Python Version: Python >= 3.10 is required.
- Dependencies: Install dependencies from `requirements.txt` using: `pip install -r requirements.txt`
- Unzip models: Decompress "model_weights_V3.zip" into the "models\weights" folder.
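
The decompression step above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `extract_weights` is hypothetical, and it assumes the archive sits in the project root and that its internal layout matches what the software expects under the weights folder.

```python
import zipfile
from pathlib import Path


def extract_weights(zip_path, dest_dir):
    """Extract every entry of zip_path into dest_dir and return the entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the weights folder if it is missing
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


if __name__ == "__main__":
    # File and folder names come from the setup step; the location is an assumption.
    archive_path = Path("model_weights_V3.zip")
    if archive_path.exists():
        names = extract_weights(archive_path, Path("models") / "weights")
        print(f"Extracted {len(names)} entries")
    else:
        print(f"{archive_path} not found; place it in the project root first")
```

Building the destination with `pathlib` avoids hard-coding the backslash separator, so the same call should work on Windows, Linux, and macOS.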
- Image Upload: Users can upload retinal scan images via a simple table.
- Prediction Type Selection: Users can choose the prediction type based on the input image. If the uploaded image is a raw, unprocessed retinal scan, select the 'Raw Image' type. If the uploaded image is a segmented retinal scan, select the 'Segmented Image' type.
- Prediction Conducting: The software will display the original image, the segmented image (if applicable), and the prediction results from various models.
- Image Preprocessing: The system receives the uploaded raw retinal images and performs necessary preprocessing steps, such as resizing and contrast enhancement, to standardize the images.
- Optic Disc and Vessel Segmentation: A neural network is used to segment the optic disc, optic cup, and blood vessels. The results are displayed in the "Segmented Disc and Cup" and "Segmented Blood Vessels" sections.
- Classification according to Two Features: A trained model (e.g., ResNet 18, ResNet 50, Xception, VGG16) is used to predict the health status of the optic disc and blood vessels. The prediction results from each model are displayed individually.
- Result Display: The final result is derived from the individual model predictions. In this example, three models predict "Negative" while one model predicts "Positive," so the final result is "Little Probability of Glaucoma."
- Image Preprocessing: The system receives the uploaded pre-cropped retinal images and performs necessary preprocessing steps, such as resizing and contrast enhancement, to standardize the images.
- Direct Classification according to Segmented Images: A trained model (e.g., ResNet 18, ResNet 50, Xception, VGG16) is used to predict the positive and negative status for glaucoma. The prediction results from each model are displayed individually.
- Result Display: The final result is derived from the individual model predictions. In this example, all four models predict "Negative," so the final result is "Healthy."
Number of Models Predicting Positive | Number of Models Predicting Negative | Final Prediction |
---|---|---|
4 | 0 | Glaucoma |
3 | 1 | Large Probability of Glaucoma |
2 | 2 | Possibly Glaucoma |
1 | 3 | Little Probability of Glaucoma |
0 | 4 | Healthy |
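
The voting rule in the table above can be written as a small lookup. This is an illustrative sketch rather than the project's actual code: the names `FINAL_LABELS` and `final_prediction` are hypothetical, and each model's output is assumed to be already reduced to a boolean (True for "Positive").

```python
# Final label keyed by the number of models that predict "Positive",
# copied from the table above.
FINAL_LABELS = {
    4: "Glaucoma",
    3: "Large Probability of Glaucoma",
    2: "Possibly Glaucoma",
    1: "Little Probability of Glaucoma",
    0: "Healthy",
}


def final_prediction(model_votes):
    """Map {model_name: is_positive} for exactly four models to the final label."""
    if len(model_votes) != 4:
        raise ValueError("expected predictions from exactly four models")
    positives = sum(bool(vote) for vote in model_votes.values())
    return FINAL_LABELS[positives]


votes = {"ResNet18": False, "ResNet50": False, "VGG16": False, "Xception": True}
print(final_prediction(votes))  # one positive vote out of four
```

With one positive vote and three negative votes, the lookup returns the "Little Probability of Glaucoma" row of the table.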
The detailed workflow of this software is described in the diagram below:
This project utilizes four deep learning models: ResNet18, ResNet50, VGG16, and Xception for glaucoma detection. Below is the key logic for training each model.
- Dependencies: Install the necessary libraries: `torch`, `torchvision`, `tensorflow`, `matplotlib`.
- Data Paths: Set up data directories and ensure that training, validation, and test datasets are organized by class.
- Preprocessing: Perform standard preprocessing: resize images to a fixed size, normalize images, and apply data augmentation.
- Data Loading: Use `ImageDataGenerator` in TensorFlow for VGG16. Use `DataLoader` in PyTorch with `torchvision.transforms` for ResNet18, ResNet50, and Xception.
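
Before training, it can help to confirm that the data directories really are organized by class, as the data path step above requires. The sketch below uses only the standard library and assumes a `root/split/class/image` layout with splits named `train`, `val`, and `test`; those split names, the accepted file suffixes, and the function name `count_images_per_class` are all assumptions, not taken from the project.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image formats


def count_images_per_class(root, splits=("train", "val", "test")):
    """Return {split: {class_name: image_count}} for a root/split/class layout."""
    counts = {}
    for split in splits:
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        counts[split] = {
            class_dir.name: sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()  # each subdirectory is treated as one class
        }
    return counts
```

One subdirectory per class is the layout that directory-based loaders such as `torchvision.datasets.ImageFolder` and Keras `flow_from_directory` expect, so a check like this can surface an empty or misnamed class folder before a training run starts.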
- ResNet18 & ResNet50 (PyTorch): Use pre-trained models, `CrossEntropyLoss`, the Adam optimizer, and save the best model.
- VGG16 (TensorFlow/Keras): Use `ImageDataGenerator` for data augmentation, `binary_crossentropy`, the Adam optimizer, and an early stopping callback.
- Xception (PyTorch): Use the Xception architecture implemented in PyTorch. Use `BCEWithLogitsLoss` for binary classification with the Adam optimizer, and save the best model.
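
The binary losses named above differ in what they take as input: PyTorch's `BCEWithLogitsLoss` is applied to raw model outputs (logits), while Keras `binary_crossentropy` by default expects probabilities, that is, outputs already passed through a sigmoid. The pure-Python sketch below shows, for a single sample, that the two formulations give the same value. The function names are illustrative, and this is not the libraries' implementation.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce_from_probability(p, y):
    """Binary cross-entropy for a predicted probability p and a label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def bce_with_logits(z, y):
    """Same loss computed from the logit z, using the numerically stable form
    max(z, 0) - z * y + log(1 + exp(-|z|))."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


logit, label = 2.0, 1
print(bce_with_logits(logit, label))
print(bce_from_probability(sigmoid(logit), label))
```

Both calls print the same loss up to floating-point rounding. The practical consequence is that a model trained with a logits-based loss should not end in a sigmoid layer, while a model trained with a probability-based loss should.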
- Learning Rate: Typically set to 0.0001; adjust based on the model’s convergence.
- Batch Size: Use 32, but adjust based on available GPU memory.
- Epochs: Train for 10-25 epochs depending on when the model converges.
- Hardware: Preferably use GPU for faster training.
- Evaluation: After each training epoch, evaluate the model on the validation set to track performance.
- Saving: Save the model weights whenever validation accuracy improves.
- TransUNet for Disc and Cup Segmentation: Use a Transformer-based U-Net architecture implemented in PyTorch. Use BCEWithLogitsLoss for binary segmentation with the Adam optimizer, a learning rate scheduler, and early stopping, and save the best model.
- ResUNet for Blood Vessels Segmentation: Use a ResUNet architecture implemented in PyTorch for retinal image segmentation. Use BCELoss for binary segmentation with the Adam optimizer, train for 25 epochs, and save the best model.
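
The "evaluate after each epoch, save the best model, stop early" pattern that recurs in the training notes above can be sketched independently of any framework. The class name `BestModelTracker` and the `patience` parameter are hypothetical; the source does not state how many epochs without improvement trigger early stopping, and the validation accuracies below are made-up numbers for illustration.

```python
class BestModelTracker:
    """Tracks the best validation accuracy and counts epochs without improvement."""

    def __init__(self, patience=5):
        self.patience = patience  # assumed value; not specified in the source
        self.best_accuracy = float("-inf")
        self.best_epoch = None
        self.epochs_without_improvement = 0

    def update(self, epoch, val_accuracy):
        """Record one epoch's validation accuracy; return True if weights should be saved."""
        if val_accuracy > self.best_accuracy:
            self.best_accuracy = val_accuracy
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
            return True
        self.epochs_without_improvement += 1
        return False

    @property
    def should_stop(self):
        return self.epochs_without_improvement >= self.patience


tracker = BestModelTracker(patience=2)
history = [0.71, 0.78, 0.77, 0.76, 0.80]  # made-up validation accuracies
saved_epochs = []
for epoch, accuracy in enumerate(history, start=1):
    if tracker.update(epoch, accuracy):
        saved_epochs.append(epoch)  # a real loop would write the model weights here
    if tracker.should_stop:
        break

print(saved_epochs, tracker.best_epoch, tracker.best_accuracy)
```

In this run the weights would be saved after epochs 1 and 2, and training stops after epoch 4 because two consecutive epochs failed to improve on 0.78. The later 0.80 is never reached, which illustrates the trade-off in choosing a small patience value.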
This software serves as an auxiliary diagnostic tool that helps doctors identify glaucoma in its early stages. It is intended to improve diagnostic accuracy and reduce the risk of incorrect diagnoses. By enabling earlier treatment, the software can contribute to better patient prognoses. It can also help patients who undergo regular retinal examinations monitor disease progression.
Subsequent software development could focus on the following aspects:
- Integrating more data types, such as optical coherence tomography, and physiological indicators like retinal nerve fiber layer thickness, to achieve higher prediction accuracy.
- Adding a function to provide appropriate treatment plans based on the patient's disease stage.
- Expanding the application range by developing prediction models for other eye diseases, such as age-related macular degeneration and diabetic retinopathy.
This document provides a detailed introduction to the development and usage of the Glaucoma Prediction Software from retinal scan images. By combining modern web technology and deep learning algorithms, the software is designed to offer medical professionals a powerful auxiliary diagnostic tool. With continuous technological iteration and model optimization, the software is expected to play a significant role in the early diagnosis and treatment of eye diseases.