Review

Deep Convolutional Neural Networks in Medical Image Analysis: A Review

1 Institute for Intelligent Systems, University of Johannesburg, Johannesburg 2006, South Africa
2 Center for Human-Compatible Artificial Intelligence (CHAI), Berkeley Institute for Data Science (BIDS), University of California, Berkeley, CA 94720, USA
3 Department of Physics, University of Nottingham, Nottingham NG7 2RD, UK
4 Wolfson School of Mechanical, Electrical and Manufacturing Engineering, Loughborough University, Loughborough LE11 3TU, UK
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Information 2025, 16(3), 195; https://doi.org/10.3390/info16030195
Submission received: 20 December 2024 / Revised: 14 February 2025 / Accepted: 26 February 2025 / Published: 3 March 2025

Abstract

Deep convolutional neural networks (CNNs) have revolutionized medical image analysis by enabling the automated learning of hierarchical features from complex medical imaging datasets. This review provides a focused analysis of CNN evolution and architectures as applied to medical image analysis, highlighting their application and performance in different medical fields, including oncology, neurology, cardiology, pulmonology, ophthalmology, dermatology, and orthopedics. The paper also explores challenges specific to medical imaging and outlines trends and future research directions. This review aims to serve as a valuable resource for researchers and practitioners in healthcare and artificial intelligence.

1. Introduction

Deep convolutional neural networks (CNNs) have significantly transformed numerous fields, including medical image analysis, where they have become the state-of-the-art approach for tasks such as disease detection, organ segmentation, and image enhancement [1,2,3,4]. Their ability to automatically learn hierarchical features from medical imaging data has enabled breakthroughs in diagnostic accuracy and patient care [5,6,7]. CNNs typically consist of convolutional layers that detect spatial hierarchies in images, pooling layers that reduce dimensionality while preserving critical features, and fully connected layers that synthesize these features into predictions [8,9]. This architecture allows CNNs to tackle complex image recognition tasks that are challenging for traditional machine learning (ML) approaches [10].
Medical image analysis requires robust algorithms capable of extracting subtle patterns from high-dimensional and often noisy datasets, a challenge that CNNs are uniquely equipped to address [11,12,13]. By leveraging their ability to learn intricate features directly from imaging modalities such as X-rays, computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), CNNs have enabled advancements in automated diagnostics, tumor detection, and precision medicine [14,15]. These networks are increasingly used in clinical workflows to augment radiologists’ expertise, improve accuracy, and reduce interpretation times.
Despite their success in medical applications, CNNs are not without limitations. Challenges such as the lack of large annotated medical datasets, model interpretability, and ethical concerns remain significant barriers to widespread adoption in clinical practice. Overcoming these issues requires continued innovation in architecture design, data augmentation techniques, and explainable AI frameworks [16]. Furthermore, the rapid evolution of CNN architectures, such as U-Net for segmentation tasks and DenseNet for feature reuse, highlights the need for a comprehensive review focused on their medical applications.
While several surveys and reviews have explored CNN applications in various fields, including medical imaging, many lack comprehensive coverage of recent advancements given the rapid pace of development in this domain. Additionally, some reviews fail to thoroughly identify and address emerging challenges that hinder the broader adoption of CNNs in clinical settings. To bridge these gaps, this study provides a thorough review of deep CNNs as applied to medical image analysis, highlighting their evolution, state-of-the-art architectures, and innovative use cases in healthcare. It examines techniques and strategies proposed to enhance performance and addresses the challenges inherent in medical imaging. By consolidating insights from the recent literature, this work aims to serve as a guide for researchers and practitioners in the development and application of CNNs to solve pressing medical challenges.
This review provides a comprehensive analysis of the evolution and applications of deep CNNs in medical image analysis, covering advancements across multiple medical domains. The main contributions of this study are as follows:
  • A systematic review of state-of-the-art CNN architectures, including U-Net, ResNet, DenseNet, and EfficientNet, highlighting their applications in medical imaging.
  • An extensive assessment of the performance of CNN-based models across different imaging modalities and medical fields, such as oncology, neurology, cardiology, pulmonology, ophthalmology, and dermatology.
  • A discussion of the key challenges in CNN-driven medical image analysis, including issues related to generalization across rare diseases, bias in AI models, interpretability, and privacy concerns, alongside potential mitigation strategies.
  • A forward-looking perspective on emerging research trends, including the integration of CNNs with synthetic data generation techniques such as diffusion models, multi-modal learning frameworks, and low-resource AI models for global healthcare applications.
By consolidating recent advancements and identifying areas for future exploration, this study serves as a valuable resource for researchers and practitioners aiming to enhance AI-driven medical imaging.
The rest of this paper is structured as follows: Section 2 presents the research methodology, while Section 3 examines related reviews. Section 4 examines the foundational components of CNNs. Section 5 reviews the evolution of CNN architectures and their applications in medical image analysis. Section 6 examines notable applications of CNNs in medical imaging. Section 7 outlines challenges in CNN-driven medical imaging, while Section 8 discusses emerging trends and future research directions to address these challenges. Finally, Section 9 concludes the study.

2. Methodology

To ensure a systematic and comprehensive review of CNNs in medical image analysis, we adopted the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology. This approach ensures transparency and reproducibility in literature selection by following a structured four-phase process: identification, screening, eligibility, and inclusion.

2.1. Literature Search Strategy

The literature search was conducted using established academic databases, including PubMed, IEEE Xplore, Scopus, Web of Science, and Google Scholar. The search covered publications from 2020 to 2025, reflecting the rapid evolution of CNN-based techniques in medical imaging. The primary search keywords included “Deep convolutional neural networks”, “CNN applications in medical imaging”, “Disease detection using CNNs”, “Medical image segmentation with CNNs”, “Multi-modal medical image analysis”, and “AI-based medical image enhancement”. Boolean operators (AND, OR) were applied to refine search results, ensuring comprehensive retrieval of relevant studies.

2.2. Inclusion and Exclusion Criteria

The following criteria were used to determine the relevance of retrieved studies:
  • Inclusion Criteria:
    - Peer-reviewed journal and conference papers published between 2020 and 2025.
    - Studies explicitly applying CNNs in medical image analysis.
    - Research presenting quantitative evaluation metrics (accuracy, AUC, sensitivity, specificity, etc.).
    - Papers focusing on disease detection, segmentation, image enhancement, multi-modal analysis, or novel CNN architectures.
  • Exclusion Criteria:
    - Studies focusing solely on CNN architecture development without medical applications.
    - Papers with insufficient experimental validation or no quantitative evaluation.
    - Review articles that lack substantial new insights beyond summarization.
The selection process was conducted following PRISMA guidelines. The initial database search identified 184 articles. After removing duplicates, 162 unique records were screened based on titles and abstracts. A total of 80 studies were deemed relevant for full-text assessment, of which 66 met all inclusion criteria and were included in this review.

2.3. Data Extraction and Synthesis

Key information was extracted from the selected studies, including:
  • Medical field: Neurology, cardiology, pulmonology, gastroenterology, ophthalmology, dermatology, oncology, orthopedics.
  • Task: Disease classification, segmentation, image reconstruction, multi-modal integration.
  • CNN architecture: AlexNet, ResNet, DenseNet, U-Net, EfficientNet, hybrid CNN models.
  • Performance metrics: Accuracy, AUC, sensitivity, specificity, DSC score.
  • Dataset: Publicly available or institutionally curated datasets (e.g., LIDC-IDRI, CBIS-DDSM, ISIC, ADNI).
The extracted data were synthesized into a structured analysis, categorizing studies based on medical application domains and CNN methodologies.

3. Related Reviews

Medical image analysis has significantly evolved with the application of CNNs, enabling state-of-the-art performance in various tasks such as disease detection, segmentation, and image enhancement. This section discusses notable reviews and surveys in the field, highlighting the contributions of various researchers and identifying areas requiring further attention.
Litjens et al. [17] provided one of the earliest comprehensive surveys on deep learning in medical image analysis, covering CNN applications across tasks like classification, segmentation, and detection. Their work laid a strong foundation for CNN-based research in medical imaging, especially for modalities such as MRI and CT. Similarly, Bernal et al. [2] reviewed the application of CNNs in brain imaging, focusing on tumor detection and structural segmentation. Their review highlighted the utilization of CNNs in analyzing neurological conditions, which remains a critical area of research.
Alzubaidi et al. [18] explored various deep learning architectures, including CNNs and their roles in handling medical image analysis tasks. While the study detailed advancements in model architectures, emerging technologies like self-supervised learning (SSL) and federated learning were not extensively explored. Recent studies have expanded on these earlier works by integrating CNNs with novel modalities and imaging frameworks. For example, Ferrag et al. [19] reviewed the integration of CNNs into Internet of Medical Things (IoMT) applications, which has the potential to enable real-time processing and analysis of medical images from connected devices. This intersection of CNNs with IoT-driven healthcare systems opens new possibilities for remote diagnostics and telemedicine. Similarly, Xu et al. [20] provided an in-depth evaluation of deep-learning-based segmentation methods, which have become increasingly important in medical diagnosis. The study focused on deep learning (DL) architectures, such as CNNs, recurrent neural networks (RNNs), generative adversarial networks (GANs), and autoencoders.
Furthermore, some research works have focused on specific applications of CNNs in niche imaging modalities. Zhang et al. [21] examined deep-learning-based methods for low-dose CT reconstruction, where noise and reduced radiation exposure present challenges. The study identified that CNNs are the dominant DL architectures used in medical imaging and demonstrated how they enhance image clarity while maintaining diagnostic quality. This study is crucial as it provides practical solutions for radiation-sensitive environments. Bhatia et al. [22] evaluated the integration of hybrid DL models for analyzing electrocardiograms (ECGs). Three hybrid DL models were studied: CNN with bidirectional long short-term memory (BiLSTM), CNN with LSTM, and CNN with gated recurrent unit (GRU). These hybrid approaches enhanced the detection of cardiovascular anomalies with improved accuracy, with the hybrid CNN-BiLSTM achieving the best performance.
Hermessi et al. [23] provided a comprehensive review of CNN-based frameworks for multi-modal medical image analysis. The study demonstrated how CNNs improve diagnostic performance in complex clinical scenarios by integrating information from different imaging modalities, such as CT, MRI, and PET scans. However, while multi-modal fusion frameworks have shown significant progress, further advancements are needed to address challenges in aligning heterogeneous data sources. Meanwhile, Ferdinand et al. [24] highlighted CNN-based techniques for medical image enhancement. The study demonstrated the importance of high-quality images for accurate diagnostics and discussed how CNNs improve image resolution and clarity through super-resolution and denoising approaches. In similar research, Bhutto et al. [25] studied the use of CNNs in enhancing medical images, focusing on CT and MRI images. The research showed the effectiveness of CNNs in removing noise and enhancing contrast in medical images.
CNNs have also been used for sensor-driven medical imaging applications, as explored by Palanisamy et al. [26]. Their study integrates CNN-based processing with IoT-enabled sensors for remote patient monitoring. This framework demonstrates potential in resource-constrained environments where real-time diagnostics are critical for timely interventions. The work by Mahmood et al. [27] reviews DL advancements for medical image segmentation and classification using techniques such as CNNs, RNNs, and autoencoders. The study highlights their role in identifying lesions and tumors from CT and MRI data, with emphasis on explainability and robustness for clinical adoption.
Furthermore, Yao et al. [28] provided a broad review of CNNs in medical image segmentation and evaluated their performance on different medical datasets, including ovarian tumors and liver segmentation datasets. The study also explored how Transformers combined with CNNs are transforming the field of medical image analysis. Similarly, Kshatri and Singh [29] reviewed the role of CNNs in MRI pre-processing, segmentation, and diagnosis. The study explores CNN-driven advancements, challenges, and large-scale retrieval methods to enhance MRI processing efficiency. This work provides insights into CNN applications in data preparation, segmentation, and post-processing.
While previous reviews have provided insights into CNN applications in medical imaging, they often focus on specific tasks, such as segmentation, disease classification, or detection, without extensively covering their broader impact across multiple medical specialties. Furthermore, many reviews primarily emphasize standard imaging modalities such as CT and MRI while overlooking advancements in integrating CNNs with multi-modal medical data, including histopathological images, genomic sequences, and real-time sensor-driven imaging. Additionally, limited attention has been given to CNN-driven synthetic data generation for augmenting datasets in rare disease diagnosis, an area that is critical for improving model generalization and mitigating data scarcity in medical imaging.
To address these gaps, this review provides a systematic and up-to-date analysis of CNN applications across various medical fields, including oncology, neurology, cardiology, pulmonology, ophthalmology, dermatology, and orthopedics. Unlike previous surveys, this study examines CNN applications beyond disease classification, encompassing areas such as segmentation, image enhancement, and the integration of CNNs with multi-modal and sensor-driven imaging frameworks. Furthermore, this review aims to discuss emerging methodologies such as federated learning, self-supervised learning, and explainable AI, highlighting their role in addressing privacy concerns, improving model interpretability, and ensuring clinical adoption. By consolidating recent advancements and assessing CNN performance across multiple domains, this study serves as a crucial resource for advancing AI-driven medical imaging research and implementation in real-world clinical settings.

4. Overview of CNNs and Their Building Blocks

CNNs are a specialized class of deep neural networks, designed to efficiently process grid-like data structures such as images. They excel in capturing spatial hierarchies and extracting features from input data using layers of learnable filters and operations [30]. Figure 1 illustrates the general architecture of a CNN, which progresses from raw image input through convolutional and pooling layers to produce high-level feature representations for classification or other tasks.
The CNN architecture is composed of several key building blocks, each playing a unique role in processing and learning from image data. These include convolutional layers for feature extraction, pooling layers for dimensionality reduction, and fully connected layers for decision-making. Additional components such as batch normalization and dropout are often incorporated to enhance training stability and model generalization. The main components of CNNs are discussed below.

4.1. Convolutional Layer

The convolutional layer is the fundamental unit of a CNN, responsible for detecting patterns and features in the input data. This is achieved by applying a set of learnable filters that activate specific features such as edges, textures, or shapes [32]. Each filter generates a feature map, representing various aspects of the input image. Mathematically, the convolution operation is defined as
$F(i, j) = (G * H)(i, j) = \sum_{m} \sum_{n} G(m, n)\, H(i - m, j - n),$
where $F(i, j)$ represents the output feature map at position $(i, j)$, $G(m, n)$ is the input image, and $H(i - m, j - n)$ denotes the filter applied to the input. The indices $m$ and $n$ correspond to the spatial dimensions of the filter, iterating over its width and height, respectively. The convolution operation effectively slides the filter across the input image, computing dot products between the filter weights and the corresponding pixel values to extract relevant patterns, enabling hierarchical feature learning in CNNs. The ReLU activation function is commonly applied after the convolution operation to introduce non-linearity, enabling the network to learn complex patterns [33].
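To make the operation concrete, below is a minimal NumPy sketch of the convolution above (a direct, unoptimized implementation for illustration only; the function name and filter are our own, and deep learning frameworks compute this far more efficiently, typically as cross-correlation, i.e., convolution without the filter flip):

```python
import numpy as np

def conv2d(G: np.ndarray, H: np.ndarray) -> np.ndarray:
    """Direct 2D convolution: F(i, j) = sum_m sum_n G(m, n) * H(i - m, j - n).

    G is the input image and H the learnable filter. The filter is flipped,
    which distinguishes true convolution from the cross-correlation most
    deep learning libraries actually compute. 'Valid' (no padding) output.
    """
    H_flipped = np.flip(H)                      # flip the filter in both axes
    gh, gw = G.shape
    kh, kw = H.shape
    out = np.zeros((gh - kh + 1, gw - kw + 1))  # valid-mode output size
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # dot product between the filter and the image patch beneath it
            out[i, j] = np.sum(G[i:i + kh, j:j + kw] * H_flipped)
    return out

image = np.random.rand(5, 5)
edge_filter = np.array([[1., 0., -1.]] * 3)              # simple vertical-edge filter
feature_map = np.maximum(conv2d(image, edge_filter), 0)  # ReLU non-linearity
```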

4.2. Pooling Layer

The pooling layer reduces the spatial dimensions of feature maps, decreasing computational complexity and making the network more robust to spatial variations in the input [34]. Max pooling, the most widely used pooling operation, selects the maximum value from a region of the feature map, as defined by
$P_{ij} = \max_{(a, b) \in N_{ij}} X_{ab},$
where $P_{ij}$ is the output at position $(i, j)$, $X_{ab}$ represents the elements in the pooling window, and $N_{ij}$ is the neighborhood covered by the pooling kernel [35].
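As an illustration, non-overlapping max pooling can be sketched in a few lines of NumPy (a sketch assuming the feature map dimensions are divisible by the pooling size):

```python
import numpy as np

def max_pool2d(X: np.ndarray, k: int = 2) -> np.ndarray:
    """Non-overlapping k x k max pooling: each output P[i, j] is the maximum
    over the corresponding k x k neighborhood N_ij of the feature map X."""
    h, w = X.shape
    assert h % k == 0 and w % k == 0, "sketch assumes divisible dimensions"
    # Reshape into (h/k, k, w/k, k) blocks, then take the max over each block.
    return X.reshape(h // k, k, w // k, k).max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool2d(fmap)   # shape (2, 2); halves each spatial dimension
```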

4.3. Fully Connected Layer

Fully connected (FC) layers integrate features learned by previous layers into the final prediction or classification. Each neuron in the FC layer connects to all activations from the preceding layer, forming a dense network that learns complex relationships among features. The output of an FC layer is computed as
$y = \sigma(\mathbf{W}\mathbf{x} + \mathbf{b}),$
where $\mathbf{x}$ is the input vector, $\mathbf{W}$ is the weight matrix, $\mathbf{b}$ is the bias vector, and $\sigma$ denotes the activation function [36].
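In code, the fully connected layer reduces to a single affine transformation followed by a non-linearity; the sketch below uses a sigmoid activation and illustrative dimensions:

```python
import numpy as np

def fully_connected(x, W, b):
    """Compute y = sigma(Wx + b), here with a sigmoid activation."""
    z = W @ x + b                    # affine transformation
    return 1.0 / (1.0 + np.exp(-z))  # element-wise sigmoid non-linearity

x = np.random.rand(128)              # flattened features from earlier layers
W = np.random.randn(10, 128) * 0.01  # weight matrix: 10 output neurons
b = np.zeros(10)                     # bias vector
y = fully_connected(x, W, b)         # class scores / predictions
```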

4.4. Batch Normalization

Batch normalization stabilizes training by normalizing the inputs to a layer for each mini-batch, ensuring zero mean and unit variance. This process mitigates the internal covariate shift, allowing higher learning rates and reducing sensitivity to initialization [37]. The normalized input is computed as
$\hat{x}^{(k)} = \dfrac{x^{(k)} - \mathrm{E}[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}] + \epsilon}},$
where $x^{(k)}$ is the input, $\mathrm{E}[x^{(k)}]$ is its mean over the mini-batch, $\mathrm{Var}[x^{(k)}]$ is its variance, and $\epsilon$ is a small constant for numerical stability.
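The normalization step can be sketched as follows (training-time statistics only; complete implementations also learn per-feature scale and shift parameters γ and β and track running statistics for use at inference):

```python
import numpy as np

def batch_norm(X: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize each feature k over the mini-batch to zero mean, unit variance.
    X has shape (batch_size, num_features)."""
    mean = X.mean(axis=0)            # E[x^(k)] per feature
    var = X.var(axis=0)              # Var[x^(k)] per feature
    return (X - mean) / np.sqrt(var + eps)

batch = np.random.randn(32, 64) * 3.0 + 5.0  # mini-batch of 32 samples
normalized = batch_norm(batch)               # ~zero mean, ~unit variance
```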

4.5. Dropout

Dropout is a regularization method that prevents overfitting by randomly deactivating a fraction of neurons during training [38]. This introduces robustness by reducing dependency on specific neurons, effectively forming an ensemble of subnetworks. The dropout operation can be expressed as
$r_j^{(l)} \sim \mathrm{Bernoulli}(p),$
where $r_j^{(l)}$ is the binary mask applied to neuron $j$ in layer $l$, and $p$ is the probability of retaining a neuron during training.
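A short sketch of dropout at training time is shown below; it uses the common “inverted dropout” variant, which rescales retained activations by 1/p so the expected activation is unchanged and no adjustment is needed at inference:

```python
import numpy as np

def dropout(a: np.ndarray, p: float = 0.5) -> np.ndarray:
    """Randomly zero activations; each unit is retained with probability p."""
    r = np.random.binomial(1, p, size=a.shape)  # r_j ~ Bernoulli(p) mask
    return a * r / p                            # inverted-dropout rescaling

activations = np.random.rand(256)
train_out = dropout(activations, p=0.8)  # training: ~20% of units dropped
test_out = activations                   # inference: dropout disabled
```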

5. Evolution of Deep CNNs and Architectures

The evolution of deep CNNs has significantly influenced advancements in image recognition tasks, including medical image analysis. Although the foundational concepts of neural networks were introduced in the 1980s and 1990s, the development of AlexNet by Krizhevsky et al. [39] in 2012 marked the beginning of deep CNNs as a dominant approach in computer vision. This architecture demonstrated the feasibility of training deep networks using GPUs and large datasets, achieving groundbreaking performance on the ImageNet large-scale visual recognition challenge (ILSVRC). Since then, innovative architectures such as U-Net, ResNet, and EfficientNet have introduced new paradigms, addressing challenges like vanishing gradients, computational inefficiency, and pixel-level segmentation. These architectures were selected for their wide adoption and demonstrated suitability for medical image analysis [8,17]. Below, we discuss the most notable CNN architectures and their relevance to medical imaging.

5.1. AlexNet

AlexNet [39] revolutionized the field by introducing ReLU activations, dropout regularization, and GPU-based training, enabling deeper networks to handle large datasets effectively. The architecture consists of five convolutional layers and three fully connected layers, incorporating max pooling and ReLU activations throughout. Figure 2 illustrates the architecture of AlexNet. The success of AlexNet demonstrated the potential of DL in image classification and significantly influenced its application in medical image analysis. Its features, such as dropout regularization, have become standard practices in CNN architectures for tasks like disease classification and organ segmentation [40].

5.2. VGGNet

VGGNet was developed by the Visual Geometry Group at the University of Oxford in 2014 [41]. It is known for its simplicity and systematic use of small 3 × 3 convolutional filters, which are stacked to increase network depth while maintaining computational feasibility. The architecture comprises 16 to 19 layers, organized into convolutional and pooling blocks, followed by fully connected layers. This consistent design allows for efficient feature extraction and hierarchical representation learning. Figure 3 illustrates the VGGNet architecture.
VGGNet has been widely adopted in medical imaging, particularly for disease classification and organ segmentation. For instance, it has been employed to classify diabetic retinopathy in retinal images and detect lung cancer in chest CT scans [42]. Its pre-trained weights on ImageNet make it an effective baseline for transfer learning, enabling researchers to adapt the model to various medical tasks with limited annotated data. Despite its effectiveness, the high computational cost of VGGNet, due to its large number of parameters, limits its suitability for resource-constrained environments. Nevertheless, VGGNet remains a valuable tool in medical image analysis, particularly for research and scenarios where computational resources are not a limiting factor. The straightforward design and robust feature extraction capabilities of VGGNet continue to make it a relevant choice in the field.

5.3. U-Net

U-Net is one of the most widely used CNN architectures for medical image segmentation [43]. Developed specifically for biomedical image analysis, U-Net employs an encoder–decoder structure with skip connections to ensure the precise localization of features. The encoder extracts high-level features through convolutional and pooling layers, while the decoder reconstructs the spatial details using up-sampling layers. Skip connections bridge corresponding encoder and decoder layers, allowing fine-grained details to be preserved, which is critical for pixel-level segmentation tasks.
U-Net has been instrumental in medical applications such as tumor segmentation, organ delineation, and lesion detection. For example, it has been used to segment brain tumors from MRI scans and delineate liver boundaries in CT images with high accuracy [44]. Variants such as 3D U-Net extend its application to volumetric data, enabling efficient analysis of 3D medical images like CT and MRI stacks. The architecture’s adaptability and effectiveness make it a cornerstone in medical image segmentation.
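To illustrate the encoder–decoder structure with skip connections, the following is a minimal single-level PyTorch sketch (the class name and channel counts are illustrative; the original U-Net uses four resolution levels with double convolutions at each stage):

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """One-level encoder-decoder with a skip connection, in the style of U-Net."""
    def __init__(self, in_ch=1, out_ch=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)                         # encoder: downsample
        self.bottleneck = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)   # decoder: upsample
        # The decoder conv sees 32 channels: 16 upsampled + 16 from the skip path.
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, out_ch, 1)                # pixel-wise prediction

    def forward(self, x):
        e = self.enc(x)                            # high-resolution features
        b = self.bottleneck(self.pool(e))          # low-resolution context
        d = self.up(b)                             # restore spatial size
        d = self.dec(torch.cat([d, e], dim=1))     # skip connection: concatenate
        return self.head(d)                        # per-pixel segmentation map

mask_logits = TinyUNet()(torch.randn(1, 1, 64, 64))  # output shape (1, 1, 64, 64)
```

The concatenation in the decoder is what preserves the fine-grained spatial detail that pooling would otherwise discard, which is why skip connections are critical for pixel-level segmentation.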

5.4. ResNet

ResNet [45], developed by Microsoft Research in 2015, introduced residual learning through skip connections, enabling the training of very deep networks without performance degradation. These connections mitigate the vanishing gradient problem by allowing the network to learn residual functions, making it feasible to train networks with over 150 layers. Figure 4 shows the ResNet architecture. ResNet has been instrumental in medical imaging, especially for tasks requiring detailed feature extraction, such as tumor detection and organ segmentation. For instance, ResNet-based models have been employed for detecting breast cancer from mammograms and identifying lung nodules from CT scans, achieving state-of-the-art accuracy in these applications [46].
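The core idea is captured by a single residual block, sketched below in PyTorch: the identity shortcut adds the input x to the learned residual F(x), so gradients can flow directly through the skip path (a minimal sketch; the full ResNet also uses strided and 1 × 1 projection shortcuts when dimensions change):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = ReLU(F(x) + x)."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(residual + x)   # identity shortcut bypasses the convs

out = ResidualBlock(64)(torch.randn(1, 64, 32, 32))  # same shape as the input
```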

5.5. DenseNet

DenseNet [47], proposed by researchers at Cornell University in 2017, introduced dense connectivity, where each layer receives inputs from all preceding layers and passes its outputs to all subsequent layers. This design improves gradient flow, reduces redundant feature maps, and enhances parameter efficiency. Figure 5 depicts the DenseNet architecture. DenseNet has proven highly effective in medical imaging tasks like image segmentation and disease diagnosis, where compact and efficient architectures are essential. For example, it has been used to classify diabetic retinopathy from fundus images and segment lung lesions in CT scans, demonstrating its versatility and effectiveness in diverse medical imaging applications [48].
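Dense connectivity can be sketched as repeated channel-wise concatenation, as in the minimal PyTorch block below (illustrative only; the full DenseNet adds batch normalization, 1 × 1 bottleneck convolutions, and transition layers between blocks):

```python
import torch
import torch.nn as nn

class TinyDenseBlock(nn.Module):
    """Each layer receives the concatenated outputs of all preceding layers."""
    def __init__(self, in_ch: int, growth: int = 12, n_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch + i * growth, growth, 3, padding=1), nn.ReLU()
            )
            for i in range(n_layers)
        ])

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # Dense connectivity: concatenate every earlier feature map.
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

out = TinyDenseBlock(16)(torch.randn(1, 16, 32, 32))  # 16 + 3*12 = 52 channels
```

Because each layer adds only a small number of new channels (the growth rate), parameters are reused efficiently while gradients reach every layer directly.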

5.6. EfficientNet

EfficientNet [49] addresses the challenge of balancing depth, width, and resolution in CNNs through compound scaling. It systematically scales these dimensions using a compound coefficient, ensuring efficient resource utilization. The architecture achieves state-of-the-art accuracy with significantly fewer parameters compared to earlier models. EfficientNet’s scalability and efficiency make it highly suitable for medical image analysis, particularly for tasks requiring high accuracy on resource-constrained platforms, such as mobile diagnostic tools [50]. For instance, it has been employed in detecting skin lesions from dermoscopic images, achieving superior performance while maintaining computational efficiency.
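Concretely, compound scaling fixes constants α, β, and γ (found by a small grid search) and scales network depth d, width w, and input resolution r with a single coefficient φ [49]:

```latex
d = \alpha^{\phi}, \qquad w = \beta^{\phi}, \qquad r = \gamma^{\phi},
\qquad \text{subject to } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2,
\quad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1.
```

Since computational cost grows with depth, the square of width, and the square of resolution, increasing φ by one roughly doubles the FLOPS, giving a predictable accuracy–cost trade-off.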

5.7. Summary of Architectures

The evolution of these architectures has significantly shaped the application of CNNs in medical imaging. From U-Net’s dominance in segmentation to EfficientNet’s resource-efficient scalability, each architecture has addressed specific challenges, advancing the capabilities of medical image analysis. Table 1 presents a summary of the various CNN architectures and typical applications in medical image analysis.

6. Applications of CNNs in Medical Image Analysis

This section discusses CNN applications in major medical fields, focusing on their diagnostic and prognostic capabilities across various imaging modalities.

6.1. Oncology

CNNs have been widely adopted for cancer detection, classification, and prognosis prediction due to their superior feature extraction and pattern recognition capabilities in medical imaging [51,52]. Various CNN architectures have demonstrated exceptional performance in detecting and classifying cancers across multiple imaging modalities, including mammography, CT, and histopathology.
In breast cancer detection, CNNs have achieved state-of-the-art performance by leveraging deep architectures for mammographic analysis. Sahu et al. [53] developed a transfer-learning-based CNN framework incorporating AlexNet, ResNet, and MobileNetV2 to classify breast cancer in mammograms, achieving an accuracy of 99.17% on the CBIS-DDSM dataset. Bouzar-Benlabiod et al. [54] proposed a U-Net-based segmentation approach integrated with a case-based reasoning (CBR) module for robust breast cancer classification, demonstrating significant improvements in feature localization and model interpretability; the model achieved an accuracy of 86.71% on the CBIS-DDSM dataset. Das et al. [55] used ResNet-50 to classify breast cancer from MRI images, achieving 92.01% accuracy on 1480 samples from the Kaggle database. Similarly, McKinney et al. [56] introduced a deep learning model that outperformed six expert radiologists in breast cancer detection from mammograms, highlighting the clinical applicability of CNN-based approaches. Furthermore, Mahoro et al. [57] employed a hybrid CNN–Transformer model to enhance breast cancer classification, achieving an accuracy of 97.26% on a large multi-institutional dataset.
In lung cancer detection, CNNs have been instrumental in identifying pulmonary nodules from CT scans. UrRehman et al. [58] introduced a dual-attention CNN model for lung nodule classification, achieving 94.69% sensitivity and 93.17% specificity on the LIDC-IDRI dataset. Similarly, Safta and Shaffie [59] designed an advanced 3D CNN framework tailored for volumetric CT data, improving the differentiation between benign and malignant nodules with a classification accuracy of 97.3%. Gayathiri et al. [60] used AlexNet to detect lung cancer from CT scan images of smokers’ lungs, achieving an overall accuracy of 90.8% in their comparative evaluation of machine learning techniques.
Histopathology-based CNN applications have also gained traction, particularly in automated tumor grading and biomarker prediction. Li et al. [61] employed a ResNet-based model for histopathological tumor mutational burden (TMB) estimation, demonstrating an AUC of 0.971 on a multi-center dataset. In colorectal cancer, Khan et al. [62] developed a multi-modal CNN incorporating AlexNet for the classification of liver cancer variants, achieving an accuracy of 96.06%. Additionally, Raju et al. [63] proposed a U-Net-based deep learning pipeline for colorectal cancer detection, reporting an accuracy of 92.3% on the CKHK-22 dataset.
Beyond detection and classification, CNNs have been employed in prognostic modeling and treatment response prediction. Kiran et al. [64] integrated histopathological imaging with genomic data using a CNN framework to predict melanoma treatment response, improving precision medicine strategies and achieving an accuracy of 92.5%. Similarly, Arshad Choudhry et al. [65] introduced a graph CNN model for multimodal brain tumor segmentation, achieving a sensitivity of 97%.

6.2. Neurology

CNNs have significantly advanced the automated detection and classification of neurological disorders, leveraging their ability to extract complex spatial features from neuroimaging modalities such as MRI, CT, and PET [66]. Recent studies have demonstrated the effectiveness of CNN-based approaches in diagnosing conditions such as Alzheimer’s disease (AD), Parkinson’s disease (PD), and stroke, achieving high diagnostic accuracy across multiple datasets.
In Alzheimer’s disease detection, CNNs have been instrumental in identifying structural abnormalities in brain imaging. Mahmood et al. [67] proposed a multi-modal CNN model integrating MRI and PET data for AD classification, achieving an accuracy of 98.59%. Similarly, Castellano et al. [68] developed an automated CNN-based framework for AD diagnosis, reporting an accuracy of 91.5% when distinguishing AD patients from healthy controls. Furthermore, El-Assy et al. [69] introduced a CNN model integrating imaging biomarkers for early AD prediction, achieving an accuracy of 95% and AUC of 0.93.
For Parkinson’s disease, CNNs have been widely adopted for automated classification based on MRI and other imaging modalities. Aggarwal et al. [70] applied a 1D CNN model to structural MRI scans, attaining a classification accuracy of 98.71% on a dataset of 120 subjects. Huang et al. [71] extended CNN-based classification to SPECT imaging. The study treated the 3D images as sequences of 2D slices and then used a 2D CNN pretrained on ImageNet, achieving an accuracy of over 60%. Additionally, Frasca et al. [72] developed hybrid CNN-LSTM architectures to capture temporal imaging patterns in Parkinson’s Disease progression, achieving a diagnostic accuracy of 96.8% on multi-modal datasets.
CNNs have also contributed to stroke detection and segmentation in neuroimaging. Kaya and Önal [73] introduced a U-Net-based segmentation model for brain stroke lesion detection from MRI scans, obtaining a precision of 95.06%. Moreover, Tahyudin et al. [74] implemented a ResNet-based CNN model for acute stroke detection in CT images, achieving an accuracy of 95% and AUC of 0.99, enhancing rapid clinical decision-making.
The application of CNNs in epilepsy detection has also shown promising results. Li et al. [75] employed a CNN-based model to analyze electroencephalogram (EEG) signals, achieving a classification accuracy of 99.03% in seizure detection. Similarly, Chen et al. [76] developed a CNN–Transformer model for early seizure prediction from EEG data, improving diagnostic reliability with a sensitivity of 76.9%. Kode et al. [77] used machine learning and deep learning techniques, including a 1D-CNN model, to classify epileptic seizures from EEG signals, achieving a top accuracy of 99% on the UCI Epileptic Seizure Recognition dataset. In another study, Patel et al. developed a hybrid deep learning model combining a 1D-CNN and stacked LSTM networks for automated epileptic seizure detection from EEG recordings, achieving 90% accuracy on the CHB-MIT dataset.

6.3. Cardiology

CNN-based approaches have demonstrated state-of-the-art performance in detecting structural and functional cardiac abnormalities, predicting disease progression, and supporting clinical decision making [78]. For example, Sadr et al. [79] developed a CNN-LSTM-based hybrid model to assess cardiovascular disease risk using echocardiographic frames, achieving an accuracy of 97.05%. Similarly, Deepika and Jaisankar [80] implemented a CNN-based automated myocardial infarction detection system using echocardiograms, achieving a sensitivity of 96.8% and a specificity of 94.2%. Their model outperformed traditional rule-based segmentation techniques in detecting infarcted regions. Rahman et al. [81] proposed a stacked CNN-LSTM architecture for congenital heart disease classification from echocardiogram sequences, obtaining an accuracy of 90.5% on a pediatric dataset.
Cardiac MRI-based CNN applications have demonstrated remarkable performance in segmenting heart structures and detecting cardiomyopathies. Germain et al. [82] utilized a 3D CNN model for automated segmentation of the left ventricle, achieving a Dice similarity coefficient (DSC) of 0.94, surpassing human interobserver variability. Similarly, El-Taraboulsi et al. [83] compared various CNN architectures for cardiac MRI segmentation and identified U-Net as the most effective, achieving an accuracy of 95.3% in delineating the myocardium.
CNNs have also been applied in coronary artery disease (CAD) detection using CTA and angiographic images. Nie et al. [84] proposed a cascade R-CNN model to detect calcified and non-calcified plaques in CTA scans, achieving an accuracy of 94.6%. Their model demonstrated superior generalization across multi-center datasets compared to the standard R-CNN and other methods.
Further advancing the application of CNNs in cardiology, Sadad et al. [85] developed a deep learning pipeline integrating CNNs and attention mechanisms to predict heart failure from chest radiographs. Their model achieved an accuracy of 92.7% and an F1-score of 91.5%, demonstrating the potential of CNNs for early cardiac dysfunction detection. Additionally, Luo et al. [86] utilized a novel lead-aware hierarchical CNN (LAH-CNN) to analyze ECGs for arrhythmia classification, reporting F-measures of 78.86% and 99.2% on the 12-lead CPSC and INCART databases, respectively.

6.4. Pulmonology

CNNs have demonstrated substantial improvements in pulmonary disease detection and classification by enhancing the analysis of chest radiographs (CXR), CT, and high-resolution computed tomography (HRCT) scans. For instance, CNN models have demonstrated high performance in pneumonia and tuberculosis detection using CXR images. Ren et al. [87] developed a multi-scale CNN for pneumonia diagnosis, achieving an accuracy of 95%. Rani and Gupta [88] introduced a VGG16-based model for TB detection in CXR images, reporting an accuracy of 98% and a precision of 98%. Prasetyo [89] utilized pretrained DCNN models, including VGG-16, VGG-19, ResNet-50, ResNet-101, and MobileNet, for pulmonary tuberculosis detection from chest radiographs, with VGG-16 achieving the highest accuracy of 99.524%.
For chronic obstructive pulmonary disease (COPD) detection, CNN-based models have proven effective in classifying disease severity and predicting exacerbations. Polat et al. [90] applied an Inception-V3 model to HRCT scans, achieving a classification accuracy of 97.98% in distinguishing mild, moderate, and severe COPD cases. Additionally, Zhang et al. [91] utilized a deep CNN-LSTM framework for predicting COPD exacerbations from CXR and patient metadata, reporting an accuracy of 99.01% and recall of 99.13%. Their study highlighted the advantages of multi-modal deep learning in respiratory disease prognosis.
Pulmonary embolism (PE) detection has also seen significant advancements with CNN applications. Pu et al. [92] designed a CNN-based automated PE detection system using CT pulmonary angiography (CTPA), achieving an AUC of 0.97 and a sensitivity of 95.3%. Their model effectively reduced radiologist interpretation time while maintaining diagnostic accuracy. Similarly, Vadhera and Sharma [93] implemented a hybrid CNN–Transformer model for detecting PE in HRCT scans, achieving a sensitivity of 93.24%.
Finally, CNNs have been employed in the early detection of interstitial lung diseases (ILD). Chunduri et al. [94] developed a CNN-based classifier for ILD subtyping in HRCT scans, obtaining robust performance. Their model demonstrated improved generalizability across multi-center datasets.

6.5. Ophthalmology

Ophthalmology focuses on diagnosing and treating eye-related diseases, including diabetic retinopathy (DR), glaucoma, age-related macular degeneration (AMD), and cataracts. Medical imaging techniques such as fundus photography, optical coherence tomography (OCT), and fluorescein angiography provide critical insights into ocular health. CNN-based models have been extensively used to automate disease detection and classification, improving diagnostic accuracy and reducing the workload of ophthalmologists.
Diabetic retinopathy (DR) detection has benefited from CNN-based methods applied to fundus imaging. Singh et al. [95] proposed a DenseNet-based model trained on the APTOS 2019 dataset, achieving an accuracy of 86% for DR severity grading. Similarly, Al-Antary and Arafa [96] introduced a multi-scale CNN that integrated fundus photography and OCT images, improving classification accuracy to 84.6% and sensitivity to 91% compared to single-modality approaches.
For glaucoma detection, CNNs have been leveraged to analyze the optic disc and retinal nerve fiber layer thickness. Gayatri and Biswal [97] developed a ResNet-based model to classify glaucomatous eyes using OCT scans, achieving an accuracy of 94%. Das and Nayak [98] developed a glaucoma screening method using an attention-guided CNN that highlights pathological regions, obtaining an accuracy of 84.91% and an AUC of 0.9454 and outperforming traditional handcrafted feature-based methods.
Furthermore, AMD classification has been enhanced with CNN-based approaches. Azizi et al. [99] introduced a hybrid CNN–Transformer model for early AMD detection in OCT images, achieving an accuracy of 94.9% on the NEH dataset. Li et al. [100] developed an InceptionV3-based model for identifying normal retinal conditions, obtaining a classification accuracy of 92.76%.
Cataract detection and grading using CNNs have shown promising results. Verma et al. [101] proposed a MobileNetV3-based CNN for cataract severity grading in mobile applications, achieving 98.67% accuracy on the Cataract Mobile Periocular Database (CMPD). Additionally, Zhang et al. [102] integrated CNNs with attention mechanisms for improved interpretability, reporting an accuracy of 97.89%, an AUC of 0.997, and a sensitivity of 97.58% in detecting cataracts on the ACRIMA dataset. They also introduced CataractNet, a CNN-based network designed for automatic cataract detection in fundus images; the model is optimized with the Adam optimizer, using small kernels and fewer training parameters to reduce computational cost. With data augmentation, CataractNet achieved 99.13% accuracy, outperforming state-of-the-art cataract detection approaches.
These advancements demonstrate the effectiveness of CNNs in ophthalmic imaging, particularly in disease screening, severity assessment, and multi-modal integration.

6.6. Dermatology

Dermatology involves the diagnosis and treatment of skin diseases, including melanoma, psoriasis, eczema, and other dermatological conditions [103,104]. CNNs have been extensively applied to classify skin lesions using dermoscopic images, enhancing early detection and improving diagnostic accuracy.
Melanoma detection has been a major focus in CNN-based dermatological studies. Toprak and Aruk [105] introduced a hybrid CNN model trained on the ISIC 2020 dataset. The hybrid approach employed DeepLabV3+ to segment skin lesions in dermoscopic images and three pre-trained models for feature extraction: MobileNetV2, EfficientNetB0, and DenseNet201. The extracted features were then concatenated, and the ReliefF algorithm was used to select the most relevant features for classification. The proposed approach obtained an accuracy of 94.42% on the ISIC-2019 dataset. Similarly, Armağan et al. [106] proposed an EfficientNetV2 model for multi-class skin lesion classification, obtaining a classification accuracy of 96.04%.
For automated skin lesion segmentation, Aghdam et al. [107] developed a U-Net-based approach enhanced with attention mechanisms, which improved boundary delineation in dermoscopic images, achieving a DSC of 92.4%. Similarly, Reddy et al. [108] applied an enhanced U-Net model, achieving a segmentation accuracy of 98%. Khasanah and Winnarto [109] investigated deep learning approaches for early melanoma detection using InceptionV3 and ResNet50. Their study, utilizing a dataset of 3297 images, found that ResNet50 achieved the highest classification accuracy at 87%.
CNNs have also been utilized in multi-modal dermatological imaging. Xiao et al. [110] fused dermoscopic and clinical images using a dual-branch CNN architecture, achieving a classification accuracy of 88.17% and AUC of 94.41% and demonstrating improved robustness over single-modality approaches. Additionally, Pintelas et al. [111] explored the use of CNN-based generative models for augmenting skin lesion datasets, leading to a melanoma classification accuracy of 92.9% when applied to an augmented dataset.

6.7. Orthopedics

Orthopedics is a medical specialty that focuses on diagnosing, treating, and preventing disorders of the musculoskeletal system, including bones, joints, ligaments, and muscles [112,113]. CNNs have gained significant traction in orthopedic imaging due to their ability to analyze radiographic, CT, and MRI data with high accuracy. CNN-based models have been extensively applied to fracture detection, particularly for wrist, hip, and ankle fractures. Tabarestani et al. [114] developed a Faster R-CNN model for fracture zone prediction, achieving an average precision of 66.82% on the MURA dataset.
Similarly, Chen et al. [115] applied a DenseNet-121 architecture to detect hip fractures, achieving an accuracy of 86.5%. For ankle fracture detection, Ashkani-Esfahani et al. [116] proposed a DCNN model trained on a large clinical dataset, achieving a sensitivity of 98.7%.
In knee osteoarthritis (OA) assessment, CNNs have been utilized to automate severity grading and progression prediction. Liu et al. [117] introduced a multi-modal CNN framework combining XGboost and ResNet50 to predict OA progression, improving early diagnosis and personalized treatment planning. In another study, Wirth et al. [118] developed a U-Net-based segmentation model to delineate cartilage damage in MRI scans, achieving a DSC of 92%. Yeh et al. [119] developed a ResNet50-based deep learning model to assist in diagnosing benign and malignant spinal fractures on MRI, achieving 92% accuracy and significantly improving sensitivity and specificity for less experienced clinicians. Xing et al. [120] developed a deep learning model using Faster R-CNN and DenseNet-121 to detect and classify femoral neck fractures from radiographs, achieving 94.1% accuracy and significantly improving diagnostic assistance and physician training outcomes.
CNNs have also been instrumental in spinal disorder diagnosis. For example, Iyer et al. [121] utilized a CNN-based ensemble model for vertebral compression fracture detection in spinal CT scans, which achieved accuracy and F1 scores of 81.05% and 80.74% for thoracic and 85.45% and 85.61% for lumbar spine, respectively. Furthermore, CNNs have also been leveraged for gait analysis and biomechanical assessment in orthopedic rehabilitation. We et al. [122] introduced a CNN-LSTM hybrid model for analyzing gait abnormalities using pressure-sensing insole data, achieving a classification accuracy of 97%.

6.8. Summary of CNN Applications in Medical Image Analysis

Table 2 summarizes the different CNN applications and their performance. This analysis of CNN applications across medical fields highlights the most effective architectures and their suitability for different tasks. In oncology, transfer-learning-based CNNs such as ResNet and MobileNetV2 have achieved high accuracy in breast cancer detection, exceeding 99% on mammographic datasets. U-Net and hybrid CNN–Transformer models have been particularly effective in histopathology-based tumor grading, offering superior feature localization and interpretability. Additionally, 3D CNNs have demonstrated strong performance in lung cancer classification using volumetric CT data. In neurology, multi-modal CNN frameworks integrating MRI and PET imaging have surpassed 98% accuracy in diagnosing Alzheimer’s and Parkinson’s diseases. Similarly, U-Net and ResNet architectures have exhibited high precision in stroke lesion segmentation and seizure detection from EEG signals.
Cardiology applications have leveraged CNN-LSTM hybrid models for heart disease risk assessment, myocardial infarction detection, and cardiac MRI segmentation, achieving sensitivities above 96%. In pulmonology, multi-scale CNNs have shown high accuracy in pneumonia and tuberculosis detection, while CNN-LSTM frameworks have effectively predicted COPD exacerbations with over 99% accuracy. Ophthalmology applications have seen advancements with DenseNet and Inception-based models excelling in diabetic retinopathy and cataract detection, achieving accuracy levels exceeding 97%. Similarly, dermatology has benefited from CNN-based segmentation models, such as U-Net with attention mechanisms, which have achieved a Dice similarity coefficient above 92% in skin lesion analysis.
In orthopedics, CNNs have been applied to fracture detection, gait analysis, and knee osteoarthritis grading. Faster R-CNN and DenseNet-121 models have demonstrated strong performance in detecting hip and vertebral fractures, while CNN-LSTM frameworks have been effective in gait abnormality classification. Across these fields, the robustness of CNN models is influenced by dataset diversity, the integration of multi-modal imaging, and the selection of domain-specific architectures.

7. Challenges in Medical Image Analysis

Despite significant advancements in medical image analysis using CNNs, several challenges remain unresolved. These challenges arise from the evolving complexity of clinical applications, the diversity of imaging modalities, and the limitations of current AI methodologies. Key challenges include the following:
  • Limited Generalization Across Rare Diseases: Most CNN models are trained on datasets representing common diseases and standard imaging protocols, leaving rare diseases and unconventional imaging scenarios underrepresented [127]. This lack of diversity limits the applicability of these models in real-world scenarios involving rare pathologies or imaging abnormalities. Current transfer learning approaches only partially mitigate this issue, as they still require domain-specific tuning that is resource-intensive and time-consuming.
  • Multi-Dimensional and Multi-Modal Data Fusion: Integrating multi-dimensional data (e.g., 3D imaging, temporal sequences) with multi-modal inputs such as CT, MRI, and PET scans, along with clinical or genomic data, remains a challenge [128,129]. While early attempts have demonstrated potential, the lack of robust architectures to handle the increasing complexity and volume of such data has impeded progress. In particular, effectively aligning temporal and spatial dimensions in multi-modal data fusion is an open problem that hinders applications like precision medicine.
  • Data Augmentation for Real-World Variability: While data augmentation techniques have improved generalization, they often fail to account for real-world variations such as scanner artifacts, low-resolution images, and extreme cases of noise or occlusion [130]. Techniques for domain-specific augmentations, especially in dynamic clinical environments, remain underexplored, leaving CNNs susceptible to performance degradation in non-ideal imaging conditions.
  • Privacy-Preserving Model Training: The increasing emphasis on data privacy and security, especially with the enforcement of regulations such as the General Data Protection Regulation (GDPR) in Europe and the Health Insurance Portability and Accountability Act (HIPAA) in the United States, has made collaborative model training across institutions more complex. Techniques like federated learning and differential privacy have shown promise but face significant limitations in terms of scalability, performance, and robustness to adversarial attacks in sensitive medical domains [131].
  • Dynamic Adaptability to Evolving Clinical Needs: CNN models lack the flexibility to adapt dynamically to changes in clinical workflows, disease trends, or imaging technologies. For example, the COVID-19 pandemic exposed gaps in AI systems that could not pivot quickly to handle new diagnostic needs. Developing adaptive models that continuously learn from new data without requiring complete retraining remains a major hurdle [132].
  • Bias in AI Systems: Bias in CNN-based models remains a persistent challenge due to skewed datasets that fail to represent diverse patient populations [133]. Recent studies highlight significant disparities in model performance across demographic groups, raising ethical and clinical concerns [134,135]. Addressing this bias requires novel strategies for fairness-aware training and validation, which are still in their infancy.
  • Legal, Ethical, and Regulatory Challenges in Medical AI: The adoption of CNN-based models in medical image analysis raises significant ethical and legal concerns, particularly regarding patient privacy, informed consent, and accountability in AI-driven diagnoses. Regulations such as GDPR and HIPAA impose strict guidelines on medical data usage, yet AI models often require extensive training on sensitive patient information. Ensuring compliance with these regulations while maintaining high model performance remains a critical challenge [131,136].
    Another major concern is algorithmic bias, where CNN models trained on imbalanced datasets may produce skewed outcomes across different demographic groups, leading to disparities in medical diagnoses and treatment recommendations [134,137]. Addressing this issue requires fairness-aware AI development practices, including bias mitigation techniques, transparency in model decisions, and rigorous validation across diverse populations. Moreover, legal accountability remains unclear in cases where CNN-based models provide incorrect diagnoses—determining liability between the model developers, healthcare institutions, and clinicians is an ongoing debate in AI ethics [138].
    Future research must focus on integrating explainability mechanisms, ethical AI guidelines, and robust validation frameworks to ensure trustworthiness and regulatory compliance. The development of AI auditing standards and legal frameworks specific to medical imaging AI is crucial to fostering responsible AI adoption in clinical settings.

8. Trends and Future Research Directions

Recent trends in medical image analysis reflect a shift towards addressing the unresolved challenges through innovative approaches. Below, we highlight recent trends and potential future directions in this domain.
  • Self-Supervised and Semi-Supervised Learning: Recent advances in SSL have shown potential in reducing dependence on annotated datasets by leveraging large-scale, unlabeled data for feature learning. Models such as Vision Transformers (ViTs) integrated with SSL are being explored for segmentation and classification tasks in medical imaging [139,140]. Future work could focus on combining SSL with domain-specific augmentation techniques to improve model performance in rare disease detection. A minimal contrastive pretraining sketch is given after this list.
  • Federated and Decentralized Learning Frameworks: Federated learning has gained popularity as a privacy-preserving approach for training CNN models across multiple institutions without sharing raw data. Recent trends include integrating federated learning with blockchain for enhanced security and transparency [141,142,143]. Future research could explore decentralized learning protocols that reduce communication overhead and ensure equitable model performance across diverse institutions. A FedAvg-style sketch is given after this list.
  • Explainable and Interpretable AI: The demand for explainable AI (XAI) models has led to the development of advanced visualization tools, such as attention-based mechanisms and counterfactual explanations [144]. Future directions may include integrating XAI with uncertainty quantification techniques to improve the reliability of AI-driven clinical decisions and enhance trust among healthcare professionals. A Grad-CAM sketch is given after this list.
  • Multi-Modal and Cross-Domain Learning: Several recent studies have focused on multi-modal learning frameworks that combine imaging data with genomics, clinical records, and wearable sensor data [128,129,145]. Future efforts could aim to standardize data formats and develop architectures capable of seamlessly integrating cross-domain inputs to enable holistic disease modeling and precision medicine. A two-branch late-fusion sketch is given after this list.
  • Synthetic Data Generation and Diffusion Models: Synthetic data generation is increasingly recognized as a crucial approach for addressing data scarcity and privacy concerns in medical imaging. Recent advances in deep generative modeling, such as diffusion models and generative adversarial networks (GANs), have enabled the creation of high-fidelity, anonymized medical images that closely resemble real-world patient data. These synthetic datasets have been applied to augment training samples, improve model generalization, and support rare disease classification [146]. Future research could explore the integration of diffusion models with federated learning frameworks to facilitate collaborative training while ensuring data privacy and regulatory compliance. A sketch of the diffusion training objective is given after this list.
  • Low-Resource AI Models for Global Health Applications: There is an increasing trend towards developing lightweight CNN models optimized for deployment in low-resource settings, such as rural clinics and developing countries. These models aim to balance computational efficiency with diagnostic accuracy [147,148]. Future work could focus on hardware–software co-design to ensure energy-efficient and robust AI systems for global health applications. A depthwise separable convolution sketch is given after this list.
  • Integration with Augmented Reality and Virtual Reality: The use of augmented reality (AR) and virtual reality (VR) in medical imaging, coupled with AI-driven analytics, is becoming popular in applications like surgical planning and education [149,150]. Future research could explore the integration of CNNs with AR/VR systems to enhance real-time visualization and decision-making during complex medical procedures.
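    To ground the self-supervised trend in something concrete, the sketch below (assuming PyTorch; the encoder and the two "augmented views" are toy stand-ins, not any published model) implements the NT-Xent contrastive objective used in SimCLR-style pretraining, where two augmented views of the same unlabeled image are pulled together in embedding space:

```python
# SimCLR-style NT-Xent contrastive loss (assuming PyTorch). The encoder and
# the two "augmented views" below are toy stand-ins.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """Contrastive loss: each embedding's positive is its other view."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2N, d), unit norm
    sim = z @ z.t() / tau                         # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))             # exclude self-similarity
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

encoder = torch.nn.Sequential(
    torch.nn.Conv2d(1, 16, 3, stride=2, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(16, 32),
)
view1 = torch.randn(8, 1, 64, 64)                 # stand-in augmented view 1
view2 = view1 + 0.05 * torch.randn_like(view1)    # stand-in augmented view 2
loss = nt_xent(encoder(view1), encoder(view2))
loss.backward()
```

    In practice the encoder would be a full CNN or ViT backbone, and the two views would come from domain-specific medical augmentations rather than additive noise.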
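    The canonical protocol behind many federated learning systems is federated averaging (FedAvg): each institution trains locally, and only weights, never raw data, are aggregated. The sketch below is a minimal single-machine simulation assuming PyTorch; fed_avg, the toy model, and the synthetic "institution" loaders are illustrative names, not a production framework.

```python
# Single-machine simulation of FedAvg (assuming PyTorch). Names such as
# fed_avg and the toy loaders are illustrative, not a production framework.
import copy
import torch

def fed_avg(global_model, client_loaders, rounds=5, lr=1e-3):
    """Each client trains locally; only weights (not data) are aggregated."""
    for _ in range(rounds):
        states, sizes = [], []
        for loader in client_loaders:
            local = copy.deepcopy(global_model)      # start from global weights
            opt = torch.optim.SGD(local.parameters(), lr=lr)
            for x, y in loader:                      # one local epoch
                opt.zero_grad()
                torch.nn.functional.cross_entropy(local(x), y).backward()
                opt.step()
            states.append(local.state_dict())
            sizes.append(len(loader.dataset))
        total = sum(sizes)
        avg = {k: sum((n / total) * s[k].float() for n, s in zip(sizes, states))
               for k in states[0]}                   # size-weighted average
        global_model.load_state_dict(avg)
    return global_model

# Two synthetic "institutions" with private data.
def toy_loader():
    xs, ys = torch.randn(32, 1, 8, 8), torch.randint(0, 2, (32,))
    return torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(xs, ys), batch_size=8)

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64, 2))
model = fed_avg(model, [toy_loader(), toy_loader()], rounds=2)
```

    Weighting each site by its dataset size keeps the aggregate unbiased when institutions contribute unequal amounts of data.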
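    Among the visualization tools mentioned above, Grad-CAM remains a widely used baseline for CNNs. The sketch below (assuming PyTorch; the small CNN is a placeholder) weights a convolutional layer's activations by the spatially pooled gradients of the target class score to produce a coarse localization heatmap:

```python
# Grad-CAM sketch (assuming PyTorch): weight a conv layer's activations by
# its pooled gradients. The small CNN below is a toy placeholder.
import torch
import torch.nn.functional as F

def grad_cam(model, conv_layer, image, class_idx):
    acts, grads = {}, {}
    h1 = conv_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = conv_layer.register_full_backward_hook(
        lambda m, gi, go: grads.update(g=go[0]))
    logits = model(image.unsqueeze(0))          # add batch dimension
    logits[0, class_idx].backward()             # gradient of target class score
    h1.remove(); h2.remove()
    w = grads["g"].mean(dim=(2, 3), keepdim=True)   # pooled gradients per channel
    cam = F.relu((w * acts["a"]).sum(dim=1))        # weighted sum over channels
    cam = F.interpolate(cam.unsqueeze(0), size=image.shape[-2:],
                        mode="bilinear", align_corners=False)[0, 0]
    return cam / (cam.max() + 1e-8)                 # normalize to [0, 1]

# Toy model and input; in practice conv_layer is the last conv block.
cnn = torch.nn.Sequential(
    torch.nn.Conv2d(1, 8, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(8, 8, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(8, 2),
)
heatmap = grad_cam(cnn, cnn[2], torch.randn(1, 64, 64), class_idx=1)
```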
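    A common architectural pattern for multi-modal learning is late fusion, in which modality-specific branches are trained jointly and their features concatenated before a shared head. The sketch below assumes PyTorch; LateFusionNet and all dimensions are illustrative:

```python
# Late-fusion sketch (assuming PyTorch): CNN image features are concatenated
# with tabular features (e.g., clinical variables). Names/dimensions are
# illustrative.
import torch
import torch.nn as nn

class LateFusionNet(nn.Module):
    def __init__(self, n_tabular, n_classes):
        super().__init__()
        self.image_branch = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),           # -> 32 features
        )
        self.tabular_branch = nn.Sequential(nn.Linear(n_tabular, 32), nn.ReLU())
        self.head = nn.Linear(32 + 32, n_classes)            # joint classifier

    def forward(self, image, tabular):
        fused = torch.cat(
            [self.image_branch(image), self.tabular_branch(tabular)], dim=1)
        return self.head(fused)

model = LateFusionNet(n_tabular=10, n_classes=2)
logits = model(torch.randn(4, 1, 64, 64), torch.randn(4, 10))  # toy batch
```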
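    The training objective at the core of diffusion models is compact enough to sketch. Assuming PyTorch, the code below noises a clean batch at a random timestep using a linear schedule and trains a network to predict the injected noise; TinyEpsNet is a deliberately trivial placeholder for the time-conditioned U-Net used in practice:

```python
# Minimal DDPM-style training objective (assuming PyTorch). TinyEpsNet is a
# trivial placeholder for the time-conditioned U-Net used in real systems.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal retention

def diffusion_loss(eps_model, x0):
    """One training step: predict the noise injected at a random timestep."""
    t = torch.randint(0, T, (x0.size(0),))
    a = alpha_bar[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(x0)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps   # closed-form forward noising
    return F.mse_loss(eps_model(x_t, t), eps)

class TinyEpsNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv2d(1, 1, 3, padding=1)
    def forward(self, x, t):
        return self.net(x)   # ignores t for brevity; real models condition on it

loss = diffusion_loss(TinyEpsNet(), torch.randn(8, 1, 32, 32))
loss.backward()
```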
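    Lightweight CNNs for low-resource deployment commonly replace standard convolutions with depthwise separable convolutions, the building block of the MobileNet family. The sketch below (assuming PyTorch; the block is a generic illustration, not a specific published model) shows the pattern and the resulting parameter reduction:

```python
# Depthwise separable convolution block (assuming PyTorch), the MobileNet-style
# substitute for a standard convolution in low-resource models.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=c_in).
        self.depthwise = nn.Conv2d(c_in, c_in, 3, stride=stride,
                                   padding=1, groups=c_in, bias=False)
        # Pointwise: 1x1 convolution mixes channels.
        self.pointwise = nn.Conv2d(c_in, c_out, 1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

block = DepthwiseSeparableConv(64, 128)
standard = nn.Conv2d(64, 128, 3, padding=1, bias=False)
print(sum(p.numel() for p in block.parameters()))     # 9,024 parameters
print(sum(p.numel() for p in standard.parameters()))  # 73,728 parameters
```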

9. Conclusions

CNNs have demonstrated remarkable potential in medical image analysis, enabling advancements in disease detection, segmentation, and multi-modal imaging across various medical domains. Their ability to extract hierarchical features from medical images has led to improved diagnostic accuracy, automated workflows, and enhanced clinical decision-making. Despite these successes, several challenges persist, particularly in handling rare diseases, ensuring interpretability, and addressing biases in model training. Moreover, privacy concerns and the need for real-world adaptability highlight the necessity for novel learning paradigms that balance performance, security, and ethical considerations.
To address these challenges, recent research has shifted towards innovative approaches such as self-supervised learning, federated learning, and explainable AI. The integration of CNNs with synthetic data generation methods, including diffusion models, has emerged as a promising solution to overcome data scarcity while preserving patient privacy. Additionally, multi-modal learning frameworks that incorporate imaging, genomics, and wearable sensor data offer a more holistic approach to disease modeling. These advancements, combined with efforts in low-resource AI deployment, have the potential to enhance accessibility and fairness in medical AI applications worldwide.
Future developments should focus on refining CNN-based methodologies to ensure clinical robustness, regulatory compliance, and seamless integration with healthcare systems. Collaborative efforts between researchers, clinicians, and policymakers will be essential to drive AI adoption in real-world medical environments. By advancing interpretability, privacy-preserving learning, and efficient deployment strategies, CNNs will continue to transform medical imaging, paving the way for more precise, ethical, and accessible AI-driven healthcare solutions.

Author Contributions

Conceptualization, I.D.M. and G.O.; methodology, I.D.M., M.J. and P.I.; validation, I.D.M., T.G.S., G.O., M.J. and P.I.; investigation, I.D.M., T.G.S., G.O., M.J. and P.I.; writing—original draft preparation, I.D.M., T.G.S., G.O., M.J. and P.I.; writing—review and editing, I.D.M., T.G.S., G.O., M.J. and P.I.; visualization, I.D.M., G.O., M.J. and P.I.; supervision, T.G.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AD: Alzheimer’s disease
AI: Artificial intelligence
AR: Augmented reality
AUC: Area under the curve
CAD: Coronary artery disease
CNN: Convolutional neural network
CT: Computed tomography
DL: Deep learning
DSC: Dice similarity coefficient
FNR: False negative rate
FPR: False positive rate
GDPR: General Data Protection Regulation
HIPAA: Health Insurance Portability and Accountability Act
IoU: Intersection over union
LSTM: Long short-term memory
ML: Machine learning
MRI: Magnetic resonance imaging
MSE: Mean squared error
PD: Parkinson’s disease
PET: Positron emission tomography
PSNR: Peak signal-to-noise ratio
RNN: Recurrent neural network
SSIM: Structural similarity index measure
SSL: Self-supervised learning
VR: Virtual reality
XAI: Explainable artificial intelligence

References

  1. Sultana, F.; Sufian, A.; Dutta, P. A review of object detection models based on convolutional neural network. In Intelligent Computing: Image Processing Based Applications; Springer: Singapore, 2020; pp. 1–16. [Google Scholar]
  2. Bernal, J.; Kushibar, K.; Asfaw, D.S.; Valverde, S.; Oliver, A.; Martí, R.; Lladó, X. Deep convolutional neural networks for brain image analysis on magnetic resonance imaging: A review. Artif. Intell. Med. 2019, 95, 64–81. [Google Scholar] [CrossRef] [PubMed]
  3. Sun, Y.; Xue, B.; Zhang, M.; Yen, G.G. Evolving Deep Convolutional Neural Networks for Image Classification. IEEE Trans. Evol. Comput. 2020, 24, 394–407. [Google Scholar] [CrossRef]
  4. O’Halloran, T.; Obaido, G.; Otegbade, B.; Mienye, I.D. A deep learning approach for Maize Lethal Necrosis and Maize Streak Virus disease detection. Mach. Learn. Appl. 2024, 16, 100556. [Google Scholar] [CrossRef]
  5. Meena, G.; Mohbey, K.K.; Indian, A.; Khan, M.Z.; Kumar, S. Identifying emotions from facial expressions using a deep convolutional neural network-based approach. Multimed. Tools Appl. 2023, 83, 15711–15732. [Google Scholar] [CrossRef]
  6. Russel, N.S.; Selvaraj, A. MultiScaleCrackNet: A parallel multiscale deep CNN architecture for concrete crack classification. Expert Syst. Appl. 2024, 249, 123658. [Google Scholar] [CrossRef]
  7. Agrawal, A.; Mittal, N. Using CNN for facial expression recognition: A study of the effects of kernel size and number of filters on accuracy. Vis. Comput. 2019, 36, 405–412. [Google Scholar] [CrossRef]
  8. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74. [Google Scholar] [CrossRef]
  9. Obaido, G.; Mienye, I.D.; Egbelowo, O.F.; Emmanuel, I.D.; Ogunleye, A.; Ogbuokiri, B.; Mienye, P.; Aruleba, K. Supervised machine learning in drug discovery and development: Algorithms, applications, challenges, and prospects. Mach. Learn. Appl. 2024, 17, 100576. [Google Scholar] [CrossRef]
  10. Taye, M.M. Understanding of Machine Learning with Deep Learning: Architectures, Workflow, Applications and Future Directions. Computers 2023, 12, 91. [Google Scholar] [CrossRef]
  11. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377. [Google Scholar] [CrossRef]
  12. Acharya, U.R.; Oh, S.L.; Hagiwara, Y.; Tan, J.H.; Adam, M.; Gertych, A.; Tan, R.S. A deep convolutional neural network model to classify heartbeats. Comput. Biol. Med. 2017, 89, 389–396. [Google Scholar] [CrossRef] [PubMed]
  13. Krichen, M. Convolutional Neural Networks: A Survey. Computers 2023, 12, 151. [Google Scholar] [CrossRef]
  14. Mienye, I.D.; Kenneth Ainah, P.; Emmanuel, I.D.; Esenogho, E. Sparse noise minimization in image classification using Genetic Algorithm and DenseNet. In Proceedings of the 2021 Conference on Information Communications Technology and Society (ICTAS), Durban, South Africa, 10–11 March 2021; pp. 103–108. [Google Scholar] [CrossRef]
  15. Ahmed, H.; Hamad, S.; Shedeed, H.A.; Hussein, A.S. Enhanced Deep Learning Model for Personalized Cancer Treatment. IEEE Access 2022, 10, 106050–106058. [Google Scholar] [CrossRef]
  16. Rasheed, K.; Qayyum, A.; Ghaly, M.; Al-Fuqaha, A.; Razi, A.; Qadir, J. Explainable, trustworthy, and ethical machine learning for healthcare: A survey. Comput. Biol. Med. 2022, 149, 106043. [Google Scholar] [CrossRef]
  17. Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; Van Der Laak, J.A.; Van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88. [Google Scholar] [CrossRef]
  18. Alzubaidi, M.; Zubaydi, H.D.; Bin-Salem, A.A.; Abd-Alrazaq, A.A.; Ahmed, A.; Househ, M. Role of deep learning in early detection of COVID-19: Scoping review. Comput. Methods Programs Biomed. Update 2021, 1, 100025. [Google Scholar] [CrossRef]
  19. Ferrag, M.A.; Friha, O.; Kantarci, B.; Tihanyi, N.; Cordeiro, L.; Debbah, M.; Hamouda, D.; Al-Hawawreh, M.; Choo, K.K.R. Edge Learning for 6G-Enabled Internet of Things: A Comprehensive Survey of Vulnerabilities, Datasets, and Defenses. IEEE Commun. Surv. Tutor. 2023, 25, 2654–2713. [Google Scholar] [CrossRef]
  20. Xu, Y.; Quan, R.; Xu, W.; Huang, Y.; Chen, X.; Liu, F. Advances in Medical Image Segmentation: A Comprehensive Review of Traditional, Deep Learning and Hybrid Approaches. Bioengineering 2024, 11, 1034. [Google Scholar] [CrossRef]
  21. Zhang, M.; Gu, S.; Shi, Y. The use of deep learning methods in low-dose computed tomography image reconstruction: A systematic review. Complex Intell. Syst. 2022, 8, 5545–5561. [Google Scholar] [CrossRef]
  22. Bhatia, S.; Pandey, S.K.; Kumar, A.; Alshuhail, A. Classification of electrocardiogram signals based on hybrid deep learning models. Sustainability 2022, 14, 16572. [Google Scholar] [CrossRef]
  23. Hermessi, H.; Mourali, O.; Zagrouba, E. Multimodal medical image fusion review: Theoretical background and recent advances. Signal Process. 2021, 183, 108036. [Google Scholar] [CrossRef]
  24. Ferdinand, V.A.; Nawir, V.; Henry, G.E.; Gunawan, A.A.S.; Anderies. Effect of Image Enhancement in CNN-Based Medical Image Classification: A Systematic Literature Review. In Proceedings of the 2022 5th International Conference on Information and Communications Technology (ICOIACT), Yogyakarta, Indonesia, 24–25 August 2022; pp. 87–92. [Google Scholar]
  25. Bhutto, J.A.; Tian, L.; Du, Q.; Sun, Z.; Yu, L.; Tahir, M.F. CT and MRI medical image fusion using noise-removal and contrast enhancement scheme with convolutional neural network. Entropy 2022, 24, 393. [Google Scholar] [CrossRef] [PubMed]
  26. Palanisamy, P.; Padmanabhan, A.; Ramasamy, A.; Subramaniam, S. Remote Patient Activity Monitoring System by Integrating IoT Sensors and Artificial Intelligence Techniques. Sensors 2023, 23, 5869. [Google Scholar] [CrossRef] [PubMed]
  27. Mahmood, T.; Rehman, A.; Saba, T.; Nadeem, L.; Bahaj, S.A.O. Recent Advancements and Future Prospects in Active Deep Learning for Medical Image Segmentation and Classification. IEEE Access 2023, 11, 113623–113652. [Google Scholar] [CrossRef]
  28. Yao, W.; Bai, J.; Liao, W.; Chen, Y.; Liu, M.; Xie, Y. From CNN to transformer: A review of medical image segmentation models. J. Imaging Inform. Med. 2024, 37, 1529–1547. [Google Scholar] [CrossRef]
  29. Kshatri, S.S.; Singh, D. Convolutional neural network in medical image analysis: A review. Arch. Comput. Methods Eng. 2023, 30, 2793–2810. [Google Scholar] [CrossRef]
  30. Feng, P.; Tang, Z. A survey of visual neural networks: Current trends, challenges and opportunities. Multimed. Syst. 2023, 29, 693–724. [Google Scholar] [CrossRef]
  31. Suganyadevi, S.; Seethalakshmi, V.; Balasamy, K. A review on deep learning in medical image analysis. Int. J. Multimed. Inf. Retr. 2021, 11, 19–38. [Google Scholar] [CrossRef]
  32. Taye, M.M. Theoretical understanding of convolutional neural network: Concepts, architectures, applications, future directions. Computation 2023, 11, 52. [Google Scholar] [CrossRef]
  33. Seyrek, E.C.; Uysal, M. A comparative analysis of various activation functions and optimizers in a convolutional neural network for hyperspectral image classification. Multimed. Tools Appl. 2023, 83, 53785–53816. [Google Scholar] [CrossRef]
  34. Ozdemir, C.; Dogan, Y.; Kaya, Y. A new local pooling approach for convolutional neural network: Local binary pattern. Multimed. Tools Appl. 2024, 83, 34137–34151. [Google Scholar] [CrossRef]
  35. Walter, B. Analysis of convolutional neural network image classifiers in a hierarchical max-pooling model with additional local pooling. J. Stat. Plan. Inference 2023, 224, 109–126. [Google Scholar] [CrossRef]
  36. Reus-Muns, G.; Alemdar, K.; Sanchez, S.G.; Roy, D.; Chowdhury, K.R. AirFC: Designing Fully Connected Layers for Neural Networks with Wireless Signals. In Proceedings of the Twenty-fourth International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing, Washington, DC, USA, 23–26 October 2023; pp. 71–80. [Google Scholar]
  37. Zhu, S.; Yu, C.; Hu, J. Regularizing deep neural networks for medical image analysis with augmented batch normalization. Appl. Soft Comput. 2024, 154, 111337. [Google Scholar] [CrossRef]
  38. Salehin, I.; Kang, D.K. A review on dropout regularization approaches for deep neural networks within the scholarly domain. Electronics 2023, 12, 3106. [Google Scholar] [CrossRef]
  39. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  40. Cong, S.; Zhou, Y. A review of convolutional neural network architectures and their optimizations. Artif. Intell. Rev. 2022, 56, 1905–1969. [Google Scholar] [CrossRef]
  41. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  42. Wang, Z.; Zheng, X.; Li, D.; Zhang, H.; Yang, Y.; Pan, H. A VGGNet-like approach for classifying and segmenting coal dust particles with overlapping regions. Comput. Ind. 2021, 132, 103506. [Google Scholar] [CrossRef]
  43. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
  44. Allah, A.M.G.; Sarhan, A.M.; Elshennawy, N.M. Edge U-Net: Brain tumor segmentation using MRI based on deep U-Net model with boundary information. Expert Syst. Appl. 2023, 213, 118833. [Google Scholar] [CrossRef]
  45. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv 2015, arXiv:1512.03385. [Google Scholar]
  46. Kumar, V.; Prabha, C.; Sharma, P.; Mittal, N.; Askar, S.S.; Abouhawwash, M. Unified deep learning models for enhanced lung cancer prediction with ResNet-50–101 and EfficientNet-B3 using DICOM images. BMC Med. Imaging 2024, 24, 63. [Google Scholar] [CrossRef] [PubMed]
  47. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269. [Google Scholar] [CrossRef]
  48. Huang, Z.; Zhu, X.; Ding, M.; Zhang, X. Medical image classification using a light-weighted hybrid neural network based on PCANet and DenseNet. IEEE Access 2020, 8, 24697–24712. [Google Scholar] [CrossRef]
  49. Tan, M.; Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114. [Google Scholar]
  50. Shah, H.A.; Saeed, F.; Yun, S.; Park, J.H.; Paul, A.; Kang, J.M. A robust approach for brain tumor detection in magnetic resonance images using finetuned efficientnet. IEEE Access 2022, 10, 65426–65438. [Google Scholar] [CrossRef]
  51. Khalighi, S.; Reddy, K.; Midya, A.; Pandav, K.B.; Madabhushi, A.; Abedalthagafi, M. Artificial intelligence in neuro-oncology: Advances and challenges in brain tumor diagnosis, prognosis, and precision treatment. NPJ Precis. Oncol. 2024, 8, 80. [Google Scholar] [CrossRef]
  52. Aruleba, K.; Obaido, G.; Ogbuokiri, B.; Fadaka, A.O.; Klein, A.; Adekiya, T.A.; Aruleba, R.T. Applications of computational methods in biomedical breast cancer imaging diagnostics: A review. J. Imaging 2020, 6, 105. [Google Scholar] [CrossRef]
  53. Sahu, A.; Das, P.K.; Meher, S. An efficient deep learning scheme to detect breast cancer using mammogram and ultrasound breast images. Biomed. Signal Process. Control. 2024, 87, 105377. [Google Scholar] [CrossRef]
  54. Bouzar-Benlabiod, L.; Harrar, K.; Yamoun, L.; Khodja, M.Y.; Akhloufi, M.A. A novel breast cancer detection architecture based on a CNN-CBR system for mammogram classification. Comput. Biol. Med. 2023, 163, 107133. [Google Scholar] [CrossRef]
  55. Das, T.; Nayak, D.S.K.; Kar, A.; Jena, L.; Swarnkar, T. ResNet-50: The Deep Networks for Automated Breast Cancer Classification using MR Images. In Proceedings of the 2024 International Conference on Advancements in Smart, Secure and Intelligent Computing (ASSIC), Mumbai, India, 27–28 January 2024; pp. 1–6. [Google Scholar]
  56. McKinney, S.M.; Sieniek, M.; Godbole, V.; Godwin, J.; Antropova, N.; Ashrafian, H.; Back, T.; Chesus, M.; Corrado, G.S.; Darzi, A.; et al. International evaluation of an AI system for breast cancer screening. Nature 2020, 577, 89–94. [Google Scholar] [CrossRef]
  57. Mahoro, E.; Akhloufi, M.A. Breast cancer classification on thermograms using deep CNN and transformers. Quant. InfraRed Thermogr. J. 2024, 21, 30–49. [Google Scholar] [CrossRef]
  58. UrRehman, Z.; Qiang, Y.; Wang, L.; Shi, Y.; Yang, Q.; Khattak, S.U.; Aftab, R.; Zhao, J. Effective lung nodule detection using deep CNN with dual attention mechanisms. Sci. Rep. 2024, 14, 3934. [Google Scholar] [CrossRef]
  59. Safta, W.; Shaffie, A. Advancing pulmonary nodule diagnosis by integrating Engineered and Deep features extracted from CT scans. Algorithms 2024, 17, 161. [Google Scholar] [CrossRef]
  60. Gayathiri, P.; Anushya, R.; Nihamathullah, S.; Ponraj, M.; Suguna, R.; Madhu, R. Enhancing Lung Cancer Prediction using Alexnet. In Proceedings of the 2024 International Conference on Cognitive Robotics and Intelligent Systems (ICC-ROBINS), Coimbatore, India, 17–19 April 2024; pp. 650–655. [Google Scholar]
  61. Li, J.; Liu, H.; Liu, W.; Zong, P.; Huang, K.; Li, Z.; Li, H.; Xiong, T.; Tian, G.; Li, C.; et al. Predicting gastric cancer tumor mutational burden from histopathological images using multimodal deep learning. Briefings Funct. Genom. 2024, 23, 228–238. [Google Scholar] [CrossRef] [PubMed]
  62. Khan, R.A.; Fu, M.; Burbridge, B.; Luo, Y.; Wu, F.X. A multi-modal deep neural network for multi-class liver cancer diagnosis. Neural Netw. 2023, 165, 553–561. [Google Scholar] [CrossRef] [PubMed]
  63. Raju, A.S.N.; Venkatesh, K.; Rajababu, M.; Gatla, R.K.; Eid, M.M.; Ali, E.; Titova, N.; Sharaf, A.B.A. A hybrid framework for colorectal cancer detection and U-Net segmentation using polynetDWTCADx. Sci. Rep. 2025, 15, 847. [Google Scholar] [CrossRef]
  64. Kiran, A.; Narayanasamy, N.; Ramesh, J.V.N.; Ahmad, M.W. A novel deep learning framework for accurate melanoma diagnosis integrating imaging and genomic data for improved patient outcomes. Skin Res. Technol. 2024, 30, e13770. [Google Scholar] [CrossRef]
  65. Arshad Choudhry, I.; Iqbal, S.; Alhussein, M.; Aurangzeb, K.; Qureshi, A.N.; Hussain, A. A Novel Interpretable Graph Convolutional Neural Network for Multimodal Brain Tumor Segmentation. Cogn. Comput. 2025, 17, 1–25. [Google Scholar] [CrossRef]
  66. Iqbal, M.S.; Belal Bin Heyat, M.; Parveen, S.; Ammar Bin Hayat, M.; Roshanzamir, M.; Alizadehsani, R.; Akhtar, F.; Sayeed, E.; Hussain, S.; Hussein, H.S.; et al. Progress and trends in neurological disorders research based on deep learning. Comput. Med Imaging Graph. 2024, 116, 102400. [Google Scholar] [CrossRef]
  67. Mahmood, T.; Rehman, A.; Saba, T.; Wang, Y.; Alamri, F.S. Alzheimer’s disease unveiled: Cutting-edge multi-modal neuroimaging and computational methods for enhanced diagnosis. Biomed. Signal Process. Control 2024, 97, 106721. [Google Scholar] [CrossRef]
  68. Castellano, G.; Esposito, A.; Lella, E.; Montanaro, G.; Vessio, G. Automated detection of Alzheimer’s disease: A multi-modal approach with 3D MRI and amyloid PET. Sci. Rep. 2024, 14, 5210. [Google Scholar] [CrossRef]
  69. El-Assy, A.; Amer, H.M.; Ibrahim, H.; Mohamed, M. A novel CNN architecture for accurate early detection and classification of Alzheimer’s disease using MRI data. Sci. Rep. 2024, 14, 3463. [Google Scholar] [CrossRef]
  70. Aggarwal, N.; Saini, B.; Gupta, S. A deep 1-D CNN learning approach with data augmentation for classification of Parkinson’s disease and scans without evidence of dopamine deficit (SWEDD). Biomed. Signal Process. Control 2024, 91, 106008. [Google Scholar] [CrossRef]
  71. Huang, G.H.; Lai, W.C.; Chen, T.B.; Hsu, C.C.; Chen, H.Y.; Wu, Y.C.; Yeh, L.R. Deep Convolutional Neural Networks on Multiclass Classification of Three-Dimensional Brain Images for Parkinson’s Disease Stage Prediction. J. Imaging Inform. Med. 2025, 1–17. [Google Scholar] [CrossRef] [PubMed]
  72. Frasca, M.; Torre, D.L.; Pravettoni, G.; Cutica, I. Combining convolution neural networks with long-short term memory layers to predict Parkinson’s disease progression. Int. Trans. Oper. Res. 2023, 32, 2159–2188. [Google Scholar] [CrossRef]
  73. Kaya, B.; Önal, M. A CNN transfer learning-based approach for segmentation and classification of brain stroke from noncontrast CT images. Int. J. Imaging Syst. Technol. 2023, 33, 1335–1352. [Google Scholar] [CrossRef]
  74. Tahyudin, I.; Isnanto, R.R.; Prabuwono, A.S.; Hariguna, T.; Winarto, E.; Nazwan, N.; Tikaningsih, A.; Lestari, P.; Rozak, R.A. High-Accuracy Stroke Detection System Using a CBAM-ResNet18 Deep Learning Model on Brain CT Images. J. Appl. Data Sci. 2025, 6, 788–799. [Google Scholar] [CrossRef]
  75. Li, Z.; Chen, B.; Zhu, N.; Li, W.; Liu, T.; Guo, L.; Han, J.; Zhang, T.; Yan, Z. Epileptic Seizure Detection in SEEG Signals using a Signal Embedding Temporal-Spatial-Spectral Transformer Model. IEEE Trans. Instrum. Meas. 2025, 74, 4001111. [Google Scholar] [CrossRef]
  76. Chen, J.; Wen, B.; Yi, L.; Zhou, Y.; Chen, H.; Zhang, Z. Research on Automatic Detection of Epileptic Seizures Based on the CNN-Transformer Model. In Proceedings of the 2024 7th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), Hangzhou, China, 15–17 August 2024; pp. 839–843. [Google Scholar] [CrossRef]
  77. Kode, H.; Elleithy, K.; Almazedah, L. Epileptic Seizure detection in EEG signals using Machine Learning and Deep Learning Techniques. IEEE Access 2024, 12, 80657–80668. [Google Scholar] [CrossRef]
  78. Lüscher, T.F.; Wenzl, F.A.; D’Ascenzo, F.; Friedman, P.A.; Antoniades, C. Artificial intelligence in cardiovascular medicine: Clinical applications. Eur. Heart J. 2024, 45, 4291–4304. [Google Scholar] [CrossRef]
  79. Sadr, H.; Salari, A.; Ashoobi, M.T.; Nazari, M. Cardiovascular disease diagnosis: A holistic approach using the integration of machine learning and deep learning models. Eur. J. Med. Res. 2024, 29, 455. [Google Scholar] [CrossRef]
  80. Deepika, S.; Jaisankar, N. Detecting and Classifying Myocardial Infarction in Echocardiogram Frames with an Enhanced CNN Algorithm and ECV-3D Network. IEEE Access 2024, 12, 51690–51703. [Google Scholar] [CrossRef]
  81. Rahman, T.; Al-Ruweidi, M.K.A.; Sumon, M.S.I.; Kamal, R.Y.; Chowdhury, M.E.; Yalcin, H.C. Deep Learning Technique for Congenital Heart Disease Detection using Stacking-based CNN-LSTM Models from Fetal Echocardiogram: A Pilot Study. IEEE Access 2023, 11, 110375–110390. [Google Scholar] [CrossRef]
  82. Germain, P.; Labani, A.; Vardazaryan, A.; Padoy, N.; Roy, C.; El Ghannudi, S. Segmentation-Free Estimation of Left Ventricular Ejection Fraction Using 3D CNN Is Reliable and Improves as Multiple Cardiac MRI Cine Orientations Are Combined. Biomedicines 2024, 12, 2324. [Google Scholar] [CrossRef] [PubMed]
  83. El-Taraboulsi, J.; Cabrera, C.P.; Roney, C.; Aung, N. Deep neural network architectures for cardiac image segmentation. Artif. Intell. Life Sci. 2023, 4, 100083. [Google Scholar] [CrossRef]
  84. Nie, X.; Chai, B.; Zhang, K.; Liu, C.; Li, Z.; Huang, R.; Wei, Q.; Huang, M.; Huang, W. Improved Cascade-RCNN for automatic detection of coronary artery plaque in multi-angle fusion CPR images. Biomed. Signal Process. Control 2025, 99, 106880. [Google Scholar] [CrossRef]
  85. Sadad, T.; Safran, M.; Khan, I.; Alfarhood, S.; Khan, R.; Ashraf, I. Efficient classification of ECG images using a lightweight CNN with attention module and IoT. Sensors 2023, 23, 7697. [Google Scholar] [CrossRef]
  86. Luo, W. Deep CNN for ECG-based arrhythmia classification. J. Biomed. AI 2023, 17, 320–334. [Google Scholar]
  87. Ren, H.; Jing, F.; Chen, Z.; He, S.; Zhou, J.; Liu, L.; Jing, R.; Lian, W.; Tian, J.; Zhang, Q.; et al. CheXMed: A multimodal learning algorithm for pneumonia detection in the elderly. Inf. Sci. 2024, 654, 119854. [Google Scholar] [CrossRef]
  88. Rani, R.; Gupta, S. Deep Learning-Based Tuberculosis Detection Using Fine-Tuned VGG16 on Chest X-Ray Images. In Proceedings of the 2024 3rd International Conference for Advancement in Technology (ICONAT), Goa, India, 6–8 September 2024; pp. 1–5. [Google Scholar]
  89. Prasetyo, S.Y. Automated Pulmonary Tuberculosis Detection in Chest Radiographs using Pretrained DCNN Models. In Proceedings of the 2024 International Conference on Information Management and Technology (ICIMTech), Bali, Indonesia, 28–29 August 2024; pp. 195–200. [Google Scholar]
  90. Polat, Ö.; Şalk, İ.; Doğan, Ö.T. Determination of COPD severity from chest CT images using deep transfer learning network. Multimed. Tools Appl. 2022, 81, 21903–21917. [Google Scholar] [CrossRef]
  91. Zhang, Y.; Huang, Q.; Sun, W.; Chen, F.; Lin, D.; Chen, F. Research on lung sound classification model based on dual-channel CNN-LSTM algorithm. Biomed. Signal Process. Control 2024, 94, 106257. [Google Scholar] [CrossRef]
  92. Pu, J.; Gezer, N.S.; Ren, S.; Alpaydin, A.O.; Avci, E.R.; Risbano, M.G.; Rivera-Lebron, B.; Chan, S.Y.W.; Leader, J.K. Automated detection and segmentation of pulmonary embolisms on computed tomography pulmonary angiography (CTPA) using deep learning but without manual outlining. Med. Image Anal. 2023, 89, 102882. [Google Scholar] [CrossRef]
  93. Vadhera, R.; Sharma, M. A novel hybrid loss-based Encoder–Decoder model for accurate Pulmonary Embolism segmentation. Int. J. Inf. Technol. 2025, 1–15. [Google Scholar] [CrossRef]
  94. Chunduri, V.; Hannan, S.A.; Devi, G.M.; Nomula, V.K.; Tripathi, V.; Rajest, S.S. Deep Convolutional Neural Networks for Lung Segmentation for Diffuse Interstitial Lung Disease on HRCT and Volumetric CT. In Optimizing Intelligent Systems for Cross-Industry Application; IGI Global: Hershey, PA, USA, 2024; pp. 335–350. [Google Scholar]
  95. Singh, M.; Dalmia, S.; Ranjan, R.K. Detection of diabetic retinopathy and age-related macular degeneration using DenseNet based neural networks. In Multimedia Tools and Applications; Springer: Berlin/Heidelberg, Germany, 2024; pp. 1–28. [Google Scholar]
  96. Al-Antary, M.T.; Arafa, Y. Multi-scale attention network for diabetic retinopathy classification. IEEE Access 2021, 9, 54190–54200. [Google Scholar] [CrossRef]
  97. Gayatri, K.; Biswal, B. Classification of Muti-Labeled Retinal Diseases in Retinal Fundus Images Using CNN Model ResNet18. In Proceedings of the International Conference on Data Science and Network Engineering, Agartala, India, 12–13 July 2024; pp. 163–177. [Google Scholar]
  98. Das, D.; Nayak, D.R. Gs-net: Global self-attention guided cnn for multi-stage glaucoma classification. In Proceedings of the 2023 IEEE International Conference on Image Processing (ICIP), Kuala Lumpur, Malaysia, 8–11 October 2023; pp. 3454–3458. [Google Scholar]
  99. Azizi, M.M.; Abhari, S.; Sajedi, H. Stitched vision transformer for age-related macular degeneration detection using retinal optical coherence tomography images. PLoS ONE 2024, 19, e0304943. [Google Scholar] [CrossRef] [PubMed]
  100. Li, R. VGG16-based ensemble learning for AMD classification in fundus images. J. Retin. Imaging AI 2024, 15, 210–225. [Google Scholar]
  101. Verma, S.; Singh, I.; Ray, K.; Patra, R.; Das, D. iCAT: Intelligent Cataract Detection Using Deep Neural in Smartphone. In Proceedings of the 2022 IEEE International Women in Engineering (WIE) Conference on Electrical and Computer Engineering (WIECON-ECE), Naya Raipur, India, 30–31 December 2022; pp. 126–131. [Google Scholar]
  102. Zhang, T. Attention-based CNN for cataract detection and severity grading. Ophthalmic Deep Learn. J. 2024, 19, 135–149. [Google Scholar]
  103. Lee, A.K.W.; Chan, L.K.W.; Lee, C.H.; Bohórquez, J.M.C.; Haykal, D.; Wan, J.; Yi, K.H. Artificial Intelligence Application in Diagnosing, Classifying, Localizing, Detecting and Estimation the Severity of Skin Condition in Aesthetic Medicine: A Review. Dermatol. Rev. 2025, 6, e70015. [Google Scholar] [CrossRef]
  104. Strzelecki, M.; Kociołek, M.; Strąkowska, M.; Kozłowski, M.; Grzybowski, A.; Szczypiński, P.M. Artificial intelligence in the detection of skin cancer: State of the art. Clin. Dermatol. 2024, 42, 280–295. [Google Scholar] [CrossRef]
  105. Toprak, A.N.; Aruk, I. A Hybrid Convolutional Neural Network Model for the Classification of Multi-Class Skin Cancer. Int. J. Imaging Syst. Technol. 2024, 34, e23180. [Google Scholar] [CrossRef]
  106. Armağan, S.; Gündoğan, E.; Kaya, M. Classification of Skin Lesions using Squeeze and Excitation Attention based Hybrid Model of DenseNet and EfficientNet. In Proceedings of the 2024 International Conference on Decision Aid Sciences and Applications (DASA), Manama, Bahrain, 11–12 December 2024; pp. 1–5. [Google Scholar]
  107. Aghdam, E.K.; Azad, R.; Zarvani, M.; Merhof, D. Attention swin u-net: Cross-contextual attention mechanism for skin lesion segmentation. In Proceedings of the 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), Cartagena, Colombia, 18–21 April 2023; pp. 1–5. [Google Scholar]
  108. Reddy, D.A.; Roy, S.; Kumar, S.; Tripathi, R. Enhanced U-Net segmentation with ensemble convolutional neural network for automated skin disease classification. Knowl. Inf. Syst. 2023, 65, 4111–4156. [Google Scholar] [CrossRef]
  109. Khasanah, N.; Winnarto, M.N. Application of Deep Learning with ResNet50 for Early Detection of Melanoma Skin Cancer. J. Med. Inform. Technol. 2024, 2, 16–20. [Google Scholar] [CrossRef]
  110. Xiao, C.; Zhu, A.; Xia, C.; Liu, Y.; Qiu, Z.; Wang, Q.; Ren, W.; Shan, D.; Wang, T.; Guo, L.; et al. Dual-branch multimodal fusion network for skin lesions diagnosis using clinical and ultrasound image. In Proceedings of the 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), Cartagena, Colombia, 18–21 April 2023; pp. 1–4. [Google Scholar]
  111. Pintelas, E.; Livieris, I.E.; Tampakas, V.; Pintelas, P. Feature augmentation-based CNN framework for skin-cancer diagnosis. Evol. Syst. 2025, 16, 34. [Google Scholar] [CrossRef]
  112. Scherer, J.; De Wet, D.; Youssef, Y.; Back, D.A. Digitalization in orthopedics. In The Digital Doctor; Elsevier: Amsterdam, The Netherlands, 2025; pp. 275–290. [Google Scholar]
  113. Mohammed, T.J.; Xinying, C.; Alnoor, A.; Khaw, K.W.; Albahri, A.S.; Teoh, W.L.; Chong, Z.L.; Saha, S. A Systematic Review of Artificial Intelligence in Orthopaedic Disease Detection: A Taxonomy for Analysis and Trustworthiness Evaluation. Int. J. Comput. Intell. Syst. 2024, 17, 303. [Google Scholar] [CrossRef]
  114. Tabarestani, S.S.; Aghagolzadeh, A.; Ezoji, M. Bone Fracture Detection and Localization on MURA Database Using Faster-RCNN. In Proceedings of the 2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS), Tehran, Iran, 29–30 December 2021; pp. 1–6. [Google Scholar] [CrossRef]
  115. Chen, Y.J.; Chen, D.Y.; Lan, H.C.; Huang, A.C.; Chen, Y.H.; Huang, W.N.; Chen, H.H. An optimal deep learning model for the scoring of radiographic damage in patients with ankylosing spondylitis. Ther. Adv. Musculoskelet. Dis. 2024, 16, 1759720X241285973. [Google Scholar] [CrossRef] [PubMed]
  116. Ashkani-Esfahani, S.; Yazdi, R.M.; Bhimani, R.; Kerkhoffs, G.M.; Maas, M.; DiGiovanni, C.W.; Lubberts, B.; Guss, D. Detection of ankle fractures using deep learning algorithms. Foot Ankle Surg. 2022, 28, 1259–1265. [Google Scholar] [CrossRef]
  117. Liu, L.; Chang, J.; Zhang, P.; Ma, Q.; Zhang, H.; Sun, T.; Qiao, H. A joint multi-modal learning method for early-stage knee osteoarthritis disease classification. Heliyon 2023, 9, e15461. [Google Scholar] [CrossRef]
  118. Wirth, W.; Eckstein, F.; Kemnitz, J.; Baumgartner, C.F.; Konukoglu, E.; Fuerst, D.; Chaudhari, A.S. Accuracy and longitudinal reproducibility of quantitative femorotibial cartilage measures derived from automated U-Net-based segmentation of two different MRI contrasts: Data from the osteoarthritis initiative healthy reference cohort. Magn. Reson. Mater. Phys. Biol. Med. 2021, 34, 337–354. [Google Scholar] [CrossRef]
  119. Yeh, L.R.; Zhang, Y.; Chen, J.H.; Liu, Y.L.; Wang, A.C.; Yang, J.Y.; Yeh, W.C.; Cheng, C.S.; Chen, L.K.; Su, M.Y. A deep learning-based method for the diagnosis of vertebral fractures on spine MRI: Retrospective training and validation of ResNet. Eur. Spine J. 2022, 31, 2022–2030. [Google Scholar] [CrossRef]
  120. Xing, P.; Zhang, L.; Wang, T.; Wang, L.; Xing, W.; Wang, W. A deep learning algorithm that aids visualization of femoral neck fractures and improves physician training. Injury 2024, 55, 111997. [Google Scholar] [CrossRef]
  121. Iyer, S.; Blair, A.; White, C.; Dawes, L.; Moses, D.; Sowmya, A. Vertebral Compression Fracture detection using Multiple Instance Learning and Majority Voting. In Proceedings of the 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 21–25 August 2022; pp. 4630–4636. [Google Scholar] [CrossRef]
  122. We, C. Daily Activity Recognition with Gait Segmentation and Hybrid Neural Networks. In Proceedings of the 2024 5th International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI), Nanchang, China, 27–29 September 2024; pp. 299–302. [Google Scholar] [CrossRef]
  123. Alves, V.M.; Dos Santos Cardoso, J.; Gama, J. Classification of Pulmonary Nodules in 2-[18F] FDG PET/CT Images with a 3D Convolutional Neural Network. Nucl. Med. Mol. Imaging 2024, 58, 9–24. [Google Scholar] [CrossRef]
  124. Liu, Y. Multi-modal CNN fusion for MACE prediction. AI Cardiol. 2024, 25, 210–225. [Google Scholar]
  125. Patel, V.; Bhatti, D.; Ganatra, A.; Tailor, J. EEG-based epileptic seizure state detection using deep learning. Int. J. Model. Identif. Control. 2024, 44, 57–66. [Google Scholar] [CrossRef]
  126. Junayed, M.S.; Islam, M.B.; Sadeghzadeh, A.; Rahman, S. CataractNet: An automated cataract detection system using deep learning for fundus images. IEEE Access 2021, 9, 128799–128808. [Google Scholar] [CrossRef]
  127. Marwaha, S.; Knowles, J.W.; Ashley, E.A. A guide for the diagnosis of rare and undiagnosed disease: Beyond the exome. Genome Med. 2022, 14, 23. [Google Scholar] [CrossRef] [PubMed]
  128. Duan, J.; Xiong, J.; Li, Y.; Ding, W. Deep learning based multimodal biomedical data fusion: An overview and comparative review. Inf. Fusion 2024, 112, 102536. [Google Scholar] [CrossRef]
  129. Zhou, H.; Zhou, F.; Zhao, C.; Xu, Y.; Luo, L.; Chen, H. Multimodal data integration for precision oncology: Challenges and future directions. arXiv 2024, arXiv:2406.19611. [Google Scholar]
  130. Mumuni, A.; Mumuni, F.; Gerrar, N.K. A survey of synthetic data augmentation methods in computer vision. arXiv 2024, arXiv:2403.10075. [Google Scholar]
  131. Nguyen, D.C.; Pham, Q.V.; Pathirana, P.N.; Ding, M.; Seneviratne, A.; Lin, Z.; Dobre, O.; Hwang, W.J. Federated learning for smart healthcare: A survey. ACM Comput. Surv. 2022, 55, 1–37. [Google Scholar] [CrossRef]
  132. Rane, N.; Mallick, S.; Kaya, O.; Rane, J. From challenges to implementation and acceptance: Addressing key barriers in artificial intelligence, machine learning, and deep learning. In Applied Machine Learning and Deep Learning: Architectures and Techniques; Deep Science Publishing: London, UK, 2024; pp. 153–166. [Google Scholar]
  133. Van Assen, M.; Beecy, A.; Gershon, G.; Newsome, J.; Trivedi, H.; Gichoya, J. Implications of bias in artificial intelligence: Considerations for cardiovascular imaging. Curr. Atheroscler. Rep. 2024, 26, 91–102. [Google Scholar] [CrossRef]
  134. Lahoti, P.; Beutel, A.; Chen, J.; Lee, K.; Prost, F.; Thain, N.; Wang, X.; Chi, E. Fairness without demographics through adversarially reweighted learning. Adv. Neural Inf. Process. Syst. 2020, 33, 728–740. [Google Scholar]
  135. Gichoya, J.W.; Banerjee, I.; Bhimireddy, A.R.; Burns, J.L.; Celi, L.A.; Chen, L.C.; Correa, R.; Dullerud, N.; Ghassemi, M.; Huang, S.C.; et al. AI recognition of patient race in medical imaging: A modelling study. Lancet Digit. Health 2022, 4, e406–e414. [Google Scholar] [CrossRef]
  136. Mienye, I.D.; Swart, T.G.; Obaido, G. Fairness metrics in AI healthcare applications: A review. In Proceedings of the 2024 IEEE International Conference on Information Reuse and Integration for Data Science (IRI), San Jose, CA, USA, 7–9 August 2024; pp. 284–289. [Google Scholar]
  137. Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A survey on bias and fairness in machine learning. ACM Comput. Surv. 2021, 54, 1–35. [Google Scholar] [CrossRef]
  138. Mienye, I.D.; Obaido, G.; Emmanuel, I.D.; Ajani, A.A. A survey of bias and fairness in healthcare AI. In Proceedings of the 2024 IEEE 12th International Conference on Healthcare Informatics (ICHI), Orlando, FL, USA, 3–6 June 2024; pp. 642–650. [Google Scholar]
  139. Hassan, M.; Afridi, T.H.; Marwat, S.U.; Munir, F.; Ali, S.; Naseem, H.; Zaheer, M.Z.; Ali, K.; Sultana, T.; Tanoli, Z.; et al. A Survey of the Self-Supervised Learning Mechanisms for Vision Transformers. arXiv 2024, arXiv:2408.17059. [Google Scholar]
  140. Ohri, K.; Kumar, M.; Sukheja, D. Self-supervised approach for diabetic retinopathy severity detection using vision transformer. Prog. Artif. Intell. 2024, 13, 165–183. [Google Scholar] [CrossRef]
  141. Saidi, R.; Rahmany, I.; Dhahri, S.; Moulahi, T. A Privacy-Enhanced Framework for Chest Disease Classification using Federated Learning and Blockchain. IEEE Access 2024. [Google Scholar] [CrossRef]
  142. Nazir, A.; He, J.; Zhu, N.; Anwar, M.S.; Pathan, M.S. Enhancing IoT security: A collaborative framework integrating federated learning, dense neural networks, and blockchain. Clust. Comput. 2024, 27, 8367–8392. [Google Scholar] [CrossRef]
  143. Liu, J.; Chen, C.; Li, Y.; Sun, L.; Song, Y.; Zhou, J.; Jing, B.; Dou, D. Enhancing trust and privacy in distributed networks: A comprehensive survey on blockchain-based federated learning. Knowl. Inf. Syst. 2024, 66, 4377–4403. [Google Scholar] [CrossRef]
  144. Mersha, M.; Lam, K.; Wood, J.; AlShami, A.; Kalita, J. Explainable artificial intelligence: A survey of needs, techniques, applications, and future direction. Neurocomputing 2024, 599, 128111. [Google Scholar] [CrossRef]
  145. Li, Y.; Daho, M.E.H.; Conze, P.H.; Zeghlache, R.; Le Boité, H.; Tadayoni, R.; Cochener, B.; Lamard, M.; Quellec, G. A review of deep learning-based information fusion techniques for multimodal medical image classification. Comput. Biol. Med. 2024, 177, 108635. [Google Scholar] [CrossRef]
  146. Mienye, I.D.; Swart, T.G. A hybrid deep learning approach with generative adversarial network for credit card fraud detection. Technologies 2024, 12, 186. [Google Scholar] [CrossRef]
  147. Dangi, R.R.; Sharma, A.; Vageriya, V. Transforming Healthcare in Low-Resource Settings With Artificial Intelligence: Recent Developments and Outcomes. Public Health Nurs. 2024. [Google Scholar] [CrossRef]
  148. Farlow, A.; Hoffmann, A.; Tadesse, G.A.; Mzurikwao, D.; Beyer, R.; Akogo, D.; Weicken, E.; Matika, T.; Nweje, M.I.; Wamae, W.; et al. Rethinking global digital health and AI-for-health innovation challenges. PLoS Glob. Public Health 2023, 3, e0001844. [Google Scholar] [CrossRef]
  149. Akhtar, Z.B. Artificial Intelligence (AI) and Extended Reality (XR): A Biomedical Engineering Perspective Investigation Analysis. Indones. J. Electron. Electromed. Eng. Med Inform. 2024, 6, 132–146. [Google Scholar] [CrossRef]
  150. Dabass, J.; Dabass, M.; Dabass, B.S. Revolutionizing healthcare and medical education using augmented reality, virtual reality, and extended reality. In Leveraging Metaverse and Analytics of Things (AoT) in Medical Systems; Elsevier: Amsterdam, The Netherlands, 2025; pp. 209–229. [Google Scholar]
Figure 1. General architecture of a CNN, showing the progression from raw image input to feature extraction and final output through convolutional, pooling, and fully connected layers [31]. This structure forms the foundation for many DL models, including those used in medical image analysis.
Figure 2. AlexNet architecture.
Figure 3. VGGNet architecture.
Figure 4. ResNet architecture.
Figure 5. DenseNet architecture [47].
Table 1. Summary of CNN Architectures Relevant to Medical Image Analysis.

| Architecture | Author(s) | Year | Innovation | Applications in Medical Imaging |
|---|---|---|---|---|
| AlexNet | Krizhevsky et al. [39] | 2012 | Introduced ReLU activations, dropout regularization, and GPU-based training | Disease classification (e.g., pneumonia detection from chest X-rays), organ segmentation, and anomaly detection |
| VGGNet | Simonyan and Zisserman [41] | 2014 | Systematic use of small 3×3 filters and increased depth for hierarchical feature extraction | Diabetic retinopathy classification, lung cancer detection, and retinal image analysis |
| U-Net | Ronneberger et al. [43] | 2015 | Encoder–decoder structure with skip connections for pixel-level segmentation | Tumor segmentation, organ delineation (e.g., liver in CT, brain in MRI), and lesion detection |
| ResNet | He et al. [45] | 2015 | Introduced residual connections to address vanishing gradient problems | Breast cancer detection in mammograms, lung nodule identification in CT scans, and segmentation of medical images |
| DenseNet | Huang et al. [47] | 2017 | Dense connectivity to enhance gradient flow and parameter efficiency | Retinal image analysis for diabetic retinopathy, lung lesion segmentation in CT, and disease diagnosis |
| EfficientNet | Tan and Le [49] | 2019 | Compound scaling to optimize depth, width, and resolution for resource efficiency | Skin lesion detection in dermoscopic images, portable diagnostic tools, and classification of medical conditions |
Table 2. Summary of CNN Applications in Medical Image Analysis.

| Medical Field | Task | Reference | Year | Methods | Performance |
|---|---|---|---|---|---|
| Oncology | Breast cancer detection | Sahu et al. [53] | 2024 | MobileNetV2 | Accuracy = 99.17% |
| Oncology | Breast cancer segmentation | Bouzar-Benlabiod et al. [54] | 2023 | U-Net with case-based reasoning | Accuracy = 86.7% |
| Oncology | Breast cancer detection | Das et al. [55] | 2024 | ResNet | Accuracy = 92.0% |
| Oncology | Breast cancer classification | Mahoro et al. [57] | 2024 | Hybrid CNN–Transformer | AUC = 0.980 |
| Oncology | Lung cancer detection | UrRehman et al. [58] | 2024 | Dual-attention CNN | Sensitivity = 96.5%, Specificity = 95.2% |
| Oncology | Lung cancer classification | Alves et al. [123] | 2024 | DenseNet + EfficientNet ensemble | Accuracy = 98.1%, Sensitivity = 97.8% |
| Oncology | Lung nodule detection | Safta and Shaffie [59] | 2024 | 3D CNN | Accuracy = 97.3% |
| Oncology | Lung nodule classification | Gayathiri et al. [60] | 2024 | AlexNet | Accuracy = 90.8% |
| Oncology | Tumor mutational burden prediction | Li et al. [61] | 2024 | ResNet | AUC = 0.95 |
| Oncology | Colorectal polyp detection | Liu et al. [124] | 2024 | VGG16 | Sensitivity = 94.2%, F1-score = 93.5% |
| Oncology | Colorectal cancer detection | Khan et al. [62] | 2023 | CNN | Accuracy = 96.1% |
| Oncology | Colorectal cancer detection | Raju et al. [63] | 2025 | U-Net | Accuracy = 92.3% |
| Neurology | Alzheimer’s disease detection | Mahmood et al. [67] | 2024 | Multi-modal CNN | Accuracy = 98.5% |
| Neurology | Alzheimer’s disease diagnosis | Castellano et al. [68] | 2024 | CNN | Accuracy = 91.5% |
| Neurology | Alzheimer’s disease classification | El-Assy et al. [69] | 2024 | CNN | Accuracy = 96.8%, AUC = 0.93 |
| Neurology | Stroke lesion detection | Kaya and Önal [73] | 2023 | U-Net | Precision = 95% |
| Neurology | Epileptic seizure detection | Kode et al. [77] | 2024 | 1D CNN | Accuracy = 99% |
| Neurology | Epileptic seizure state detection | Patel et al. [125] | 2024 | 1D CNN-LSTM | Accuracy = 90% |
| Neurology | Parkinson’s disease progression | Frasca et al. [72] | 2023 | CNN-LSTM | Accuracy = 96.8% |
| Neurology | Parkinson’s disease classification | Aggarwal et al. [70] | 2024 | 1D CNN | Accuracy = 98.7% |
| Neurology | Acute stroke detection | Tahyudin et al. [74] | 2025 | ResNet | AUC = 0.99 |
| Neurology | Epileptic seizure detection | Li et al. [75] | 2025 | CNN-based EEG analysis | Accuracy = 99.0% |
| Cardiology | Cardiovascular disease risk assessment | Sadr et al. [79] | 2024 | CNN-LSTM hybrid | Accuracy = 97% |
| Cardiology | Myocardial infarction detection | Deepika and Jaisankar [80] | 2024 | CNN-based echocardiogram analysis | Sensitivity = 96.8%, Specificity = 94.2% |
| Cardiology | Congenital heart disease detection | Rahman et al. [81] | 2023 | Stacking-based CNN-LSTM (fetal echocardiography) | Sensitivity = 96.8%, Specificity = 94.2% |
| Cardiology | Heart disease classification | Sadad et al. [85] | 2023 | Stacked CNN-LSTM | Accuracy = 90.5% |
| Cardiology | Left ventricle segmentation | Germain et al. [82] | 2024 | 3D CNN | DSC = 94% |
| Cardiology | Cardiac MRI segmentation | El-Taraboulsi et al. [83] | 2024 | U-Net | Accuracy = 95.3% |
| Cardiology | Coronary artery plaque detection | Nie et al. [84] | 2025 | Cascade R-CNN | Accuracy = 94.6% |
| Cardiology | Arrhythmia classification | Sadad et al. [85] | 2023 | Stacked CNN-LSTM | Accuracy = 92.7%, F1-score = 91.5% |
| Cardiology | Arrhythmia classification | Luo [86] | 2023 | LAH-CNN | F-measure = 78.8% |
| Pulmonology | Pneumonia diagnosis | Ren et al. [87] | 2024 | Multi-scale CNN on chest X-rays | Accuracy = 95% |
| Pulmonology | Tuberculosis detection | Prasetyo [89] | 2024 | VGG-16 | Accuracy = 98%, Precision = 98% |
| Pulmonology | Tuberculosis detection | Rani and Gupta [88] | 2024 | Fine-tuned VGG16 | Accuracy = 98%, Precision = 98% |
| Pulmonology | COPD severity classification | Polat et al. [90] | 2022 | Inception-V3 | Accuracy = 97.9% |
| Pulmonology | Pulmonary embolism detection | Pu et al. [92] | 2023 | CNN on CTPA | AUC = 0.97, Sensitivity = 95.3% |
| Pulmonology | Lung sound classification | Zhang et al. [91] | 2024 | Dual-channel CNN-LSTM | Accuracy = 99%, Recall = 99.1% |
| Pulmonology | Pulmonary embolism diagnosis | Vadhera and Sharma [93] | 2025 | Hybrid CNN | Accuracy = 93.2% |
| Ophthalmology | Diabetic retinopathy detection | Singh et al. [95] | 2024 | DenseNet | Accuracy = 86% |
| Ophthalmology | Diabetic retinopathy classification | Al-Antary and Arafa [96] | 2021 | Multi-scale attention CNN | Accuracy = 84.6%, Sensitivity = 91% |
| Ophthalmology | Glaucoma classification | Gayatri and Biswal [97] | 2024 | ResNet18 on fundus images | Accuracy = 94% |
| Ophthalmology | Glaucoma classification | Das and Nayak [98] | 2023 | CNN on OCT scans | Accuracy = 84.9%, AUC = 0.95 |
| Ophthalmology | AMD classification | Li [100] | 2024 | VGG16-based ensemble | Accuracy = 92.7% |
| Ophthalmology | Cataract detection | Verma et al. [101] | 2022 | MobileNetV3 | Accuracy = 98.6% |
| Ophthalmology | AMD detection | Azizi et al. [99] | 2024 | CNN-transformer | Accuracy = 94.9% |
| Ophthalmology | Cataract detection | Zhang [102] | 2024 | CNN with attention mechanisms | Accuracy = 97.8%, AUC = 0.997 |
| Ophthalmology | Cataract detection | Junayed et al. [126] | 2021 | CataractNet (deep CNN) | Accuracy = 99.1% |
| Dermatology | Melanoma detection | Toprak and Aruk [105] | 2024 | Hybrid CNN (DeepLabV3+, MobileNetV2, EfficientNetB0, DenseNet201) | Accuracy = 94.4% |
| Dermatology | Skin lesion classification | Armağan et al. [106] | 2024 | EfficientNetV2 | Accuracy = 96% |
| Dermatology | Skin lesion segmentation | Aghdam et al. [107] | 2023 | Attention Swin U-Net | DSC = 92.4% |
| Dermatology | Skin lesion segmentation | Reddy et al. [108] | 2023 | Enhanced U-Net with ensemble CNN | DSC = 98% |
| Dermatology | Multi-modal skin lesion analysis | Xiao et al. [110] | 2023 | Dual-branch CNN | Accuracy = 88.1%, AUC = 0.944 |
| Dermatology | Melanoma detection | Khasanah and Winnarto [109] | 2024 | ResNet50 and InceptionV3 | Accuracy = 87% |
| Dermatology | Skin lesion dataset augmentation | Pintelas et al. [111] | 2025 | CNN-based generative models | Accuracy = 92.9% |
| Orthopedics | Fracture zone detection | Tabarestani et al. [114] | 2021 | Faster R-CNN | Accuracy = 66.8% |
| Orthopedics | Radiographic damage scoring in ankylosing spondylitis | Chen et al. [115] | 2024 | DenseNet-121 | Accuracy = 86.5% |
| Orthopedics | Ankle fracture detection | Ashkani-Esfahani et al. [116] | 2022 | Deep CNN | Sensitivity = 98.7% |
| Orthopedics | Delineating cartilage damage | Wirth et al. [118] | 2021 | U-Net | DSC = 92% |
| Orthopedics | Knee osteoarthritis assessment | Liu et al. [117] | 2023 | XGBoost and ResNet50 | AUC = 0.90 |
| Orthopedics | Vertebral compression fracture detection | Iyer et al. [121] | 2022 | CNN-based ensemble model | Accuracy = 81%, F1-score = 80.7% |
| Orthopedics | Activity recognition for orthopedic rehabilitation | We [122] | 2024 | Hybrid CNN-LSTM | Accuracy = 97% |
| Orthopedics | Vertebral fracture classification | Yeh et al. [119] | 2022 | ResNet | Accuracy = 92% |
| Orthopedics | Femoral neck fracture classification | Xing et al. [120] | 2024 | Faster R-CNN and DenseNet-121 | Accuracy = 94.1% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
