Search Results (415)

Search Parameters:
Keywords = class imbalance problem

23 pages, 2311 KiB  
Article
Semi-Supervised Change Detection with Data Augmentation and Adaptive Thresholding for High-Resolution Remote Sensing Images
by Wuxia Zhang, Xinlong Shu, Siyuan Wu and Songtao Ding
Remote Sens. 2025, 17(2), 178; https://doi.org/10.3390/rs17020178 - 7 Jan 2025
Abstract
Change detection (CD) is an important research direction in the field of remote sensing, which aims to analyze the changes in the same area over different periods and is widely used in urban planning and environmental protection. While supervised learning methods in change detection have demonstrated substantial efficacy, they are often hindered by the rising costs associated with data annotation. Semi-supervised methods have attracted increasing interest, offering promising results with limited data labeling. These approaches typically employ strategies such as consistency regularization, pseudo-labeling, and generative adversarial networks. However, they usually face the problems of insufficient data augmentation and unbalanced quality and quantity of pseudo-labeling. To address the above problems, we propose a semi-supervised change detection method with data augmentation and adaptive threshold updating (DA-AT) for high-resolution remote sensing images. Firstly, a channel-level data augmentation (CLDA) technique is designed to enhance the strong augmentation effect and improve consistency regularization so as to address the problem of insufficient feature representation. Secondly, an adaptive threshold (AT) is proposed to dynamically adjust the threshold during the training process to balance the quality and quantity of pseudo-labeling so as to optimize the self-training process. Finally, an adaptive class weight (ACW) mechanism is proposed to alleviate the impact of the imbalance between the changed classes and the unchanged classes, which effectively enhances the learning ability of the model for the changed classes. We verify the effectiveness and robustness of the proposed method on two high-resolution remote sensing image datasets, WHU-CD and LEVIR-CD. We compare our method to five state-of-the-art change detection methods and show that it achieves better or comparable results.
(This article belongs to the Special Issue 3D Scene Reconstruction, Modeling and Analysis Using Remote Sensing)
Figure 1: Architecture diagram of DA-AT. It includes the supervised branch, the unsupervised branch, and sub-modules of the semi-supervised change detection model.
Figure 2: Data augmentation applied to remote sensing images. (a) Original image; (b) weak augmentation; (c) strong augmentation.
Figure 3: Effectiveness of channel-level data augmentation in the proposed method. (a) IoU (change); (b) F1; (c) Kappa; (d) TPR; (e) TNR.
Figure 4: Detection results of different algorithms on the WHU-CD dataset: (a) im0; (b) im1; (c) ground truth; (d) S4GAN; (e) SemiCDNet; (f) SemiCD; (g) RCL; (h) FPA; (i) ours. The red dashed area highlights the significant differences between methods.
Figure 5: Detection results of different algorithms on the LEVIR-CD dataset: (a) im0; (b) im1; (c) ground truth; (d) S4GAN; (e) SemiCDNet; (f) SemiCD; (g) RCL; (h) FPA; (i) ours. The red dashed area highlights the significant differences between methods.
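The adaptive-threshold and pseudo-label selection described in the abstract can be sketched in a few lines. This is a minimal illustration under our own assumptions — the function names, target keep-ratio, and step size are ours, not DA-AT's actual hyperparameters:

```python
import numpy as np

def select_pseudo_labels(probs, threshold):
    """Keep pixels whose max class probability exceeds the current threshold."""
    conf = probs.max(axis=-1)          # per-pixel confidence
    labels = probs.argmax(axis=-1)     # hard pseudo-label per pixel
    mask = conf >= threshold           # only confident pixels supervise training
    return labels, mask

def update_threshold(threshold, mask, target_ratio=0.5, step=0.01):
    """Raise the threshold when too many pixels pass, lower it when too few do,
    trading pseudo-label quantity against quality during training."""
    ratio = mask.mean()
    if ratio > target_ratio:
        return min(threshold + step, 0.99)
    return max(threshold - step, 0.5)

rng = np.random.default_rng(0)
probs = rng.dirichlet([1, 1], size=(8, 8))   # toy 2-class (changed/unchanged) map
labels, mask = select_pseudo_labels(probs, 0.8)
new_t = update_threshold(0.8, mask)
```

In the paper's setting the threshold update would run once per training step, so the keep-ratio tracks the target over time rather than jumping to it.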
17 pages, 6223 KiB  
Article
Adaptive Oversampling via Density Estimation for Online Imbalanced Classification
by Daeun Lee and Hyunjoong Kim
Information 2025, 16(1), 23; https://doi.org/10.3390/info16010023 - 5 Jan 2025
Abstract
Online learning is a framework for processing and learning from sequential data in real time, offering benefits such as promptness and low memory usage. However, it faces critical challenges, including concept drift, where data distributions evolve over time, and class imbalance, which significantly hinders the accurate classification of minority classes. Addressing these issues simultaneously remains a challenging research problem. This study introduces a novel algorithm that integrates adaptive weighted kernel density estimation (awKDE) and a conscious biasing mechanism to efficiently manage memory, while enhancing the classification performance. The proposed method dynamically detects the minority class and employs a biasing strategy to prioritize its representation during training. By generating synthetic minority samples using awKDE, the algorithm adaptively balances class distributions, ensuring robustness in evolving environments. Experimental evaluations across synthetic and real-world datasets demonstrated that the proposed method achieved up to a 13.3 times improvement in classification performance over established oversampling methods and up to a 1.66 times better performance over adaptive rebalancing approaches, while requiring significantly less memory. These results underscore the method’s scalability and practicality for real-time online learning applications.
Figure 1: Types of concept drift. The yellow and blue points represent the data of each class, and the straight line represents the true classification boundary.
Figure 2: Characteristics of concept drift. The yellow and blue points represent two different states.
Figure 3: Illustration of kernel density estimation using a Gaussian kernel: blue dots indicate data points, gray lines represent individual kernels, and the red line shows the estimated density.
Figure 4: Example of the balancing stage.
Figure 5: Example of the conscious bias stage.
Figure 6: Simulation results on a synthetic dataset with a 10% imbalance ratio and borderline imbalance type.
Figure 7: Performance metrics for the Gesture dataset over time.
Figure 8: Performance metrics for the Fraud dataset over time.
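The core of KDE-based oversampling is easy to sketch. The snippet below uses a plain fixed-bandwidth Gaussian KDE with Silverman's rule-of-thumb bandwidth; the paper's awKDE additionally adapts bandwidths per point and weights samples, so treat this as a simplified stand-in:

```python
import numpy as np

def kde_oversample(minority, n_new, rng=None):
    """Draw synthetic minority samples from a Gaussian KDE fitted to the
    observed minority points: pick a random observed point as the kernel
    center, then add bandwidth-scaled Gaussian noise."""
    rng = rng or np.random.default_rng()
    n, d = minority.shape
    # Silverman's rule-of-thumb bandwidth, computed per feature
    bw = minority.std(axis=0, ddof=1) * (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    centers = minority[rng.integers(0, n, size=n_new)]
    return centers + rng.normal(scale=bw, size=(n_new, d))

rng = np.random.default_rng(1)
minority = rng.normal(loc=[2.0, -1.0], scale=0.3, size=(40, 2))
synthetic = kde_oversample(minority, 100, rng)
```

Because sampling from a KDE is exactly "pick a data point, then sample its kernel", the synthetic points stay near the observed minority distribution rather than on line segments between points, which is the key difference from SMOTE-style interpolation.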
17 pages, 7209 KiB  
Article
Sorghum Spike Detection Method Based on Gold Feature Pyramid Module and Improved YOLOv8s
by Shujin Qiu, Jian Gao, Mengyao Han, Qingliang Cui, Xiangyang Yuan and Cuiqing Wu
Sensors 2025, 25(1), 104; https://doi.org/10.3390/s25010104 - 27 Dec 2024
Abstract
To address the difficult identification and detection of sorghum spikes in fields with high planting density, similar colors, and severe occlusion between spikes, which lead to low accuracy and high false-detection and missed-detection rates, this study proposes an improved sorghum spike detection method based on YOLOv8s. The method augments the information fusion capability of the YOLOv8 model’s neck module by integrating the Gold feature pyramid module. Additionally, the SPPF module is refined with the LSKA attention mechanism to heighten focus on critical features. To tackle class imbalance in sorghum detection and expedite model convergence, a loss function incorporating Focal-EIOU is employed. Consequently, the YOLOv8s-Gold-LSKA model, based on the Gold module and LSKA attention mechanism, is developed. Experimental results demonstrate that this improved method significantly enhances sorghum spike detection accuracy in natural field settings. The improved model achieved a precision of 90.72%, recall of 76.81%, mean average precision (mAP) of 85.86%, and an F1-score of 81.19%. Compared with three object detection models (YOLOv5s, SSD, and YOLOv8), the improved model delivers better detection performance. This advancement provides technical support for the rapid and accurate recognition of multiple sorghum spike targets against natural field backgrounds, thereby improving sorghum yield estimation accuracy. It also contributes to increased sorghum production and harvest, as well as the enhancement of intelligent harvesting equipment for agricultural machinery.
(This article belongs to the Special Issue Sensor and AI Technologies in Intelligent Agriculture: 2nd Edition)
Figure 1: Images of sorghum spikes taken at different angles.
Figure 2: Data enhancement.
Figure 3: Network structure of YOLOv8s.
Figure 4: Traditional neck structure.
Figure 5: Network structure of Gold.
Figure 6: Flowchart of the structure of Low-Gold and High-Gold.
Figure 7: Module structure diagram.
Figure 8: Schematic diagram of each variable.
Figure 9: Improved YOLOv8s-Gold-LSKA model.
Figure 10: Training loss curve of the improved model.
Figure 11: Visualization of YOLOv8s-Gold-LSKA model detection results.
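The paper couples a focal weighting with the EIoU box-regression loss. The classification side of that idea — standard binary focal loss, not the paper's exact Focal-EIOU formulation — can be sketched as:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: the (1 - pt)^gamma factor down-weights easy,
    well-classified examples so the rare positive class (spikes) dominates
    the gradient; alpha additionally rebalances the two classes."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    pt = np.where(y == 1, p, 1 - p)          # probability of the true class
    w = np.where(y == 1, alpha, 1 - alpha)   # class-balance weight
    return -(w * (1 - pt) ** gamma * np.log(pt)).mean()

y = np.array([1, 0, 0, 0])                  # one positive among negatives
easy = np.array([0.9, 0.1, 0.1, 0.1])       # confident and correct
hard = np.array([0.3, 0.1, 0.1, 0.1])       # misses the one positive
```

A prediction set that misses the rare positive incurs a much larger loss than a confident correct one, which is exactly the behavior that speeds convergence under class imbalance.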
19 pages, 892 KiB  
Article
Addressing Class Imbalance in Intrusion Detection: A Comprehensive Evaluation of Machine Learning Approaches
by Vaishnavi Shanmugam, Roozbeh Razavi-Far and Ehsan Hallaji
Electronics 2025, 14(1), 69; https://doi.org/10.3390/electronics14010069 - 27 Dec 2024
Abstract
The ever-growing number of cyber attacks in today’s digitally interconnected world requires highly efficient intrusion detection systems (IDSs), which accurately identify both frequent and rare network intrusions. One of the most important challenges in IDSs is the class imbalance problem of network traffic flow data, where benign traffic flow significantly outweighs attack instances. This directly affects the ability of machine learning models to identify minority class threats. This paper is intended to evaluate various machine learning algorithms under different levels of class imbalances, using resampling as a strategy for this problem. The paper will provide an experimental comparison by combining various algorithms for classification and class imbalance learning, assessing the performance through the F1-score and geometric mean (G-mean). The work will contribute to creating robust and adaptive IDS through the judicious integration of resampling with machine learning models, thus helping the domain of cybersecurity to become resilient.
(This article belongs to the Special Issue Network Security and Cryptography Applications)
Figure 1: Taxonomy of different IDS categories.
Figure 2: Categorization of AI-based IDSs [8].
Figure 3: Recorded F1-score and G-mean for all methods with a 1:10 imbalance ratio.
Figure 4: Recorded F1-score and G-mean for all methods with a 1:100 imbalance ratio.
Figure 5: Recorded F1-score and G-mean for all methods with a 1:500 imbalance ratio.
Figure 6: Recorded F1-score and G-mean for all methods with a 1:1000 imbalance ratio.
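The two metrics reported throughout this evaluation are straightforward to compute from confusion-matrix counts. A minimal sketch (variable names ours), treating the attack class as positive:

```python
import math

def f1_and_gmean(tp, fp, fn, tn):
    """F1 summarizes minority-class precision/recall; G-mean is the geometric
    mean of sensitivity and specificity, so it drops sharply when the
    classifier collapses on either class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # true-positive rate (sensitivity)
    specificity = tn / (tn + fp)     # true-negative rate
    f1 = 2 * precision * recall / (precision + recall)
    gmean = math.sqrt(recall * specificity)
    return f1, gmean

# Example at roughly a 1:20 imbalance: 50 attacks among 1000 flows
f1, g = f1_and_gmean(tp=40, fp=10, fn=10, tn=940)
```

Note why both metrics matter: a classifier that labels everything benign gets high accuracy but recall 0, driving both F1 and G-mean to zero, which is why they are preferred over accuracy for imbalanced IDS data.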
24 pages, 5471 KiB  
Article
SAG’s Overload Forecasting Using a CNN Physical Informed Approach
by Rodrigo Hermosilla, Carlos Valle, Héctor Allende, Claudio Aguilar and Erich Lucic
Appl. Sci. 2024, 14(24), 11686; https://doi.org/10.3390/app142411686 - 14 Dec 2024
Abstract
The overload problem in semi-autogenous grinding (SAG) mills is critical in the mining industry, impacting the extraction of valuable metals and overall productivity. Overloads can lead to severe operational issues, including increased wear, reduced grinding efficiency, and unscheduled shutdowns, which result in financial losses. Various strategies have been employed to address SAG mill overload, from real-time monitoring to predictive modeling and machine learning techniques. However, existing methods often lack the integration of domain-specific knowledge, particularly in handling class imbalance within operational data, leading to limitations in predictive accuracy. This paper presents a novel approach that integrates convolutional neural networks (CNNs) with physics-informed neural networks (PINNs), embedding physical laws directly into the model’s loss function. This hybrid methodology captures the complex interactions and nonlinearities inherent in SAG mill operations and leverages domain expertise to enforce physical consistency, ensuring more robust predictions. Incorporating physics-based constraints allows the model to remain sensitive to critical overload conditions while addressing the challenge of imbalanced data. Our method demonstrates a significant enhancement in prediction accuracy through extensive experiments on real-world SAG mill operational data, achieving an F1-score of 94.5%. The results confirm the importance of integrating physics-based knowledge into machine learning models, improving predictive performance, and offering a more interpretable and reliable tool for mill operators. This work sets a new benchmark in the predictive modeling of SAG mill overloads, paving the way for more advanced, physically informed predictive maintenance strategies in the mining industry.
Figure 1: Types of overloads on SAG mills.
Figure 2: SAG features involved in an overload. * Features identified but unavailable for analysis in this study.
Figure 3: Influence of L_F and L_B in a physics-informed neural network architecture. In the figure, σ represents the activation functions, ℓ_i different rules and physical equations, and b_i initialization rules or boundaries.
Figure 4: Gradation of importance across selected pairs in overload prediction. A blue-to-red gradient signifies a transition from lesser to greater significance.
Figure 5: Depiction of the temporal penalization filter applied per channel.
Figure 6: Effect of the temporal filter on a Gram matrix sample. The matrix was reduced in dimension to clarify the effect.
Figure 7: Proposed PINN-CNN architecture. Green elements represent the input and output of the network, respectively.
Figure 8: Visualization of feature importance using GradCAM++ in a specific cluster.
Figure 9: Visualization of feature importance using GradCAM++ in the PINN-CNN model (k = 9 clusters).
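The physics-informed idea — a data-fit loss plus a penalty on violations of a governing constraint — can be sketched as below. The mass-balance-style residual is our illustrative stand-in, not the paper's actual constraint set:

```python
import numpy as np

def pinn_loss(pred, target, physics_residual, lam=0.1):
    """Composite PINN-style loss: a data-fit MSE term plus a penalty on the
    residual of a governing physical constraint, weighted by lam."""
    data_term = np.mean((pred - target) ** 2)
    physics_term = np.mean(physics_residual ** 2)
    return data_term + lam * physics_term

# Toy example: predicted mill-load rate of change should equal feed minus
# discharge (an assumed, simplified mass balance for illustration only).
feed, discharge = 5.0, 3.0
pred_load_rate = np.array([2.5, 1.9])
target = np.array([2.0, 2.0])
residual = pred_load_rate - (feed - discharge)   # ~0 when the physics holds
loss = pinn_loss(pred_load_rate, target, residual)
```

Because the penalty is added to the loss rather than enforced as a hard constraint, predictions that fit noisy labels but violate the physics are discouraged, which is what keeps the model sensitive to rare overload states.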
18 pages, 54250 KiB  
Article
Surrounding Rock Squeezing Classification in Underground Engineering Using a Hybrid Paradigm of Generative Artificial Intelligence and Deep Ensemble Learning
by Shouye Cheng, Xin Yin, Feng Gao and Yucong Pan
Mathematics 2024, 12(23), 3832; https://doi.org/10.3390/math12233832 - 4 Dec 2024
Abstract
Surrounding rock squeezing is a common geological disaster in underground excavation projects (e.g., TBM tunneling and deep mining), which has adverse effects on construction safety, schedule, and property. To predict the squeezing of the surrounding rock accurately and quickly, this study proposes a hybrid machine learning paradigm that integrates generative artificial intelligence and deep ensemble learning. Specifically, conditional tabular generative adversarial network is devised to solve the problems of data shortage and class imbalance for data augmentation at the data level, and the deep random forest is built based on the augmented data for subsequent squeezing classification. A total of 139 historical squeezing cases are collected worldwide to validate the efficacy of the proposed modeling paradigm. The results reveal that this paradigm achieves a prediction accuracy of 92.86% and a macro F1-score of 0.9292. In particular, the individual F1-scores on strong squeezing and extremely strong squeezing are more than 0.9, with excellent prediction reliability for high-intensity squeezing. Finally, a comparative analysis with traditional machine learning techniques is conducted and the superiority of this paradigm is further verified. This study provides a valuable reference for surrounding rock squeezing classification under a limited data environment.
(This article belongs to the Special Issue Mathematical Modeling and Analysis in Mining Engineering)
Figure 1: Research framework of this study.
Figure 2: The components of the CTGAN.
Figure 3: The topology of the RF.
Figure 4: The topology of the DRF.
Figure 5: Visual statistical distribution: (a) burial depth; (b) excavation diameter; (c) strength-stress ratio; (d) rock mass quality index; (e) support stiffness. R1, R2, R3, R4, and R5 denote no squeezing, mild squeezing, moderate squeezing, strong squeezing, and extremely strong squeezing, respectively.
Figure 6: The proportion of various squeezing intensities.
Figure 7: Kernel density distribution for the no-squeezing samples: (a) burial depth; (b) excavation diameter; (c) strength-stress ratio; (d) rock mass quality index; (e) support stiffness.
Figure 8: Kernel density distribution for the mild-squeezing samples: (a) burial depth; (b) excavation diameter; (c) strength-stress ratio; (d) rock mass quality index; (e) support stiffness.
Figure 9: Kernel density distribution for the moderate-squeezing samples: (a) burial depth; (b) excavation diameter; (c) strength-stress ratio; (d) rock mass quality index; (e) support stiffness.
Figure 10: Kernel density distribution for the strong-squeezing samples: (a) burial depth; (b) excavation diameter; (c) strength-stress ratio; (d) rock mass quality index; (e) support stiffness.
Figure 11: Kernel density distribution for the extremely strong-squeezing samples: (a) burial depth; (b) excavation diameter; (c) strength-stress ratio; (d) rock mass quality index; (e) support stiffness.
Figure 12: Similarity evaluation for the no-squeezing samples: (a) correlation coefficient matrix of real data; (b) correlation coefficient matrix of synthetic data; (c) absolute difference between the correlation coefficient matrices of true data and synthetic data. X1, X2, X3, X4, and X5 denote burial depth, excavation diameter, strength-stress ratio, rock mass quality index, and support stiffness, respectively; the same abbreviations apply in Figures 13-16.
Figure 13: Similarity evaluation for the mild-squeezing samples: (a) correlation coefficient matrix of real data; (b) correlation coefficient matrix of synthetic data; (c) absolute difference between the correlation coefficient matrices of true data and synthetic data.
Figure 14: Similarity evaluation for the moderate-squeezing samples: (a) correlation coefficient matrix of real data; (b) correlation coefficient matrix of synthetic data; (c) absolute difference between the correlation coefficient matrices of true data and synthetic data.
Figure 15: Similarity evaluation for the strong-squeezing samples: (a) correlation coefficient matrix of real data; (b) correlation coefficient matrix of synthetic data; (c) absolute difference between the correlation coefficient matrices of true data and synthetic data.
Figure 16: Similarity evaluation for the extremely strong-squeezing samples: (a) correlation coefficient matrix of real data; (b) correlation coefficient matrix of synthetic data; (c) absolute difference between the correlation coefficient matrices of true data and synthetic data.
Figure 17: Confusion matrix of the CTGAN-DRF. R1, R2, R3, R4, and R5 denote no squeezing, mild squeezing, moderate squeezing, strong squeezing, and extremely strong squeezing, respectively.
Figure 18: Performance evaluation of the CTGAN-DRF: (a) global performance; (b) local performance. F1-score1 through F1-score5 denote the F1-score of the model on no squeezing, mild squeezing, moderate squeezing, strong squeezing, and extremely strong squeezing, respectively.
Figure 19: Confusion matrices: (a) BPNN; (b) SVM; (c) GPC; (d) NBC; (e) KNN. R1, R2, R3, R4, and R5 denote no squeezing, mild squeezing, moderate squeezing, strong squeezing, and extremely strong squeezing, respectively.
Figure 20: Performance comparison of different models.
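The data-level augmentation step can be sketched as follows. A per-class Gaussian fit stands in for the paper's CTGAN generator here, purely for illustration; the five toy features echo the paper's inputs (burial depth, excavation diameter, etc.), but the sampler itself is our simplification:

```python
import numpy as np

def augment_to_balance(X, y, rng=None):
    """Equalize class counts by sampling synthetic rows per minority class.
    A per-class Gaussian fit is used here as a crude stand-in for a
    conditional tabular GAN; each class is topped up to the majority count."""
    rng = rng or np.random.default_rng()
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    Xs, ys = [X], [y]
    for c, n in zip(classes, counts):
        need = target - n
        if need == 0:
            continue
        Xc = X[y == c]
        mu, sd = Xc.mean(axis=0), Xc.std(axis=0, ddof=1) + 1e-9
        Xs.append(rng.normal(mu, sd, size=(need, X.shape[1])))
        ys.append(np.full(need, c))
    return np.vstack(Xs), np.concatenate(ys)

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 5))                 # 5 features, e.g. burial depth etc.
y = np.array([0] * 40 + [1] * 15 + [2] * 5)  # imbalanced squeezing grades
Xb, yb = augment_to_balance(X, y, rng)
```

The correlation-matrix comparisons in Figures 12-16 are exactly the kind of check this step needs: synthetic rows are only useful if they preserve the inter-feature dependence structure, which an independent Gaussian would not, hence the paper's choice of CTGAN.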
17 pages, 2446 KiB  
Article
The Impact of the SMOTE Method on Machine Learning and Ensemble Learning Performance Results in Addressing Class Imbalance in Data Used for Predicting Total Testosterone Deficiency in Type 2 Diabetes Patients
by Mehmet Kivrak, Ugur Avci, Hakki Uzun and Cuneyt Ardic
Diagnostics 2024, 14(23), 2634; https://doi.org/10.3390/diagnostics14232634 - 22 Nov 2024
Abstract
Background and Objective: Diabetes Mellitus is a long-term, multifaceted metabolic condition that necessitates ongoing medical management. Hypogonadism is a syndrome that is a clinical and/or biochemical indicator of testosterone deficiency. Cross-sectional studies have reported that 20–80.4% of all men with Type 2 diabetes have hypogonadism, and Type 2 diabetes is related to low testosterone. This study presents an analysis of the use of ML and EL classifiers in predicting testosterone deficiency. In our study, we compared optimized traditional ML classifiers and three EL classifiers using grid search and stratified k-fold cross-validation. We used the SMOTE method for the class imbalance problem. Methods: The database contains 3397 patients assessed for testosterone deficiency. Among these patients, 1886 patients with Type 2 diabetes were included in the study. In the data preprocessing stage, outlier/excessive observation analyses were performed with LOF and missing value analyses were performed with random forest. The SMOTE is a method for generating synthetic samples of the minority class. Four basic classifiers, namely MLP, RF, ELM and LR, were used as first-level classifiers. Tree ensemble classifiers, namely ADA, XGBoost and SGB, were used as second-level classifiers. Results: After the SMOTE, while diagnostic accuracy decreased in all base classifiers except ELM, sensitivity values increased in all classifiers. Similarly, while specificity values decreased in all classifiers, the F1 score increased. The RF classifier gave more successful results on the base-training dataset. The most successful ensemble classifier on the training dataset was the ADA classifier, in both the original data and the SMOTE data. On the testing data, XGBoost was the most suitable model. XGBoost, which exhibits a balanced performance especially when the SMOTE is used, can be preferred to correct class imbalance. Conclusions: The SMOTE is used to correct the class imbalance in the original data. However, as seen in this study, when the SMOTE was applied, diagnostic accuracy decreased in some models but sensitivity increased significantly. This shows the positive effect of the SMOTE in better predicting the minority class.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
Figure 1: Testosterone target organs [7].
Figure 2: The working steps.
Figure 3: Outlier/excessive observation analyses with the local outlier factor. Observations shown in blue are within normal limits; values above the red line are outliers.
Figure 4: Illustration of class imbalance.
Figure 5: Original and preprocessed (SMOTE) data. Blue denotes normal individuals and green denotes individuals with testosterone deficiency (TD).
Figure 6: Classification diagrams for original and SMOTE data using base classifiers (training data). (A1) original data, MLP; (A2) SMOTE data, MLP; (B1) original data, RF; (B2) SMOTE data, RF; (C1) original data, LR; (C2) SMOTE data, LR; (D1) original data, ELM; (D2) SMOTE data, ELM.
Figure 7: Classification diagrams for original and SMOTE data using ensemble classifiers (training data). (A1) original data, ADA; (A2) SMOTE data, ADA; (B1) original data, XGBoost; (B2) SMOTE data, XGBoost; (C1) original data, SGB; (C2) SMOTE data, SGB.
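The SMOTE interpolation at the center of this study can be sketched in a few lines. This uses a naive O(n) neighbour search for clarity (names ours); library implementations such as imbalanced-learn's SMOTE are preferable in practice:

```python
import numpy as np

def smote(minority, n_new, k=5, rng=None):
    """Classic SMOTE: each synthetic point lies on the line segment between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours, at a random position along the segment."""
    rng = rng or np.random.default_rng()
    n = len(minority)
    out = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        a = minority[rng.integers(n)]
        d = np.linalg.norm(minority - a, axis=1)
        neighbours = np.argsort(d)[1:k + 1]          # skip the point itself
        b = minority[rng.choice(neighbours)]
        out[i] = a + rng.random() * (b - a)          # random interpolation
    return out

rng = np.random.default_rng(3)
minority = rng.uniform(0, 1, size=(20, 4))
synthetic = smote(minority, 50, rng=rng)
```

Because every synthetic sample is a convex combination of two real minority points, SMOTE densifies the minority region rather than duplicating rows, which is why sensitivity rises (and specificity can dip) after applying it, as reported in the abstract.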
14 pages, 325 KiB  
Article
Multimodal Framework for Long-Tailed Recognition
by Jian Chen, Jianyin Zhao, Jiaojiao Gu, Yufeng Qin and Hong Ji
Appl. Sci. 2024, 14(22), 10572; https://doi.org/10.3390/app142210572 - 16 Nov 2024
Abstract
Long-tailed data distribution (i.e., a small number of classes account for most of the data, while most classes have very few samples) is a common problem in image classification. In this paper, we propose a novel multimodal framework for long-tailed data recognition. In the first stage, long-tailed data are used for visual-semantic contrastive learning to obtain good features, while in the second stage, class-balanced data are used for classifier training. The proposed framework leverages the advantages of multimodal models and mitigates the problem of class imbalance in long-tailed data recognition. Experimental results demonstrate that the proposed framework achieves competitive performance on the CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist2018 datasets for image classification.
Show Figures
Figure 1
<p>An overview of the proposed framework. The dashed line above in the figure represents the image–text branch, while the dashed line below represents the image-classifier branch. The term cls-text denotes the textual database describing each category. The nums-cls graph illustrates several classes in the training set along with their corresponding sample counts, displaying a long-tailed distribution curve. The image encoder and text encoder are utilized separately to extract features from their respective modalities. Due to potential noise and interference in the textual descriptions for each category, such as ambiguous or entirely incorrect descriptions, in the second phase of training, we employ filters to retain the most representative textual features for each class. Each small square in the figure represents a feature extracted by the feature extractor for each sample, forming a vector. Different colors indicate different categories. For example, the three yellow squares represent three distinct samples, but they share the same class label.</p>
Figure 2">
Figure 2
<p>A description of the feat-filter module. The function of this module is to select the features most relevant to the category from the textual features while filtering out irrelevant or noisy ones. We named it “feat-filter” because its primary purpose is to filter out features that are unrelated to the category. Through this module, we ensure that only the most discriminative textual features are retained for the subsequent multimodal classification tasks.</p>
">
19 pages, 829 KiB  
Article
A New Image Oversampling Method Based on Influence Functions and Weights
by Jun Ye, Shoulei Lu and Jiawei Chen
Appl. Sci. 2024, 14(22), 10553; https://doi.org/10.3390/app142210553 - 15 Nov 2024
Viewed by 501
Abstract
Although imbalanced data have been studied for many years, data imbalance remains a major obstacle in the development of machine learning and artificial intelligence, and the rise of deep learning has further amplified its impact, so studying imbalanced data classification is of practical significance. We propose an image oversampling algorithm based on the influence function and sample weights. Our scheme not only synthesizes high-quality minority class samples but also preserves the original features and information of minority class images. To address the lack of visually reasonable features in SMOTE when synthesizing images, we improve the pre-training model by removing its pooling layer and fully connected layer. We then extract the important features of the image by convolution, execute a SMOTE interpolation operation on the extracted features to derive synthesized image features, and feed these features into a DCGAN network generator, which maps them into the high-dimensional image space to generate a realistic image. To verify that our scheme can synthesize high-quality images and thus improve classification accuracy, we conduct experiments on the processed CIFAR10, CIFAR100, and ImageNet-LT datasets. Full article
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
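The SMOTE step described in the abstract interpolates between a minority-class sample and one of its k nearest minority neighbours, here applied to feature vectors rather than raw pixels. The sketch below is a NumPy illustration under our own simplifications, not the authors' implementation:

```python
import numpy as np

def smote_interpolate(features, k=3, n_new=5, rng=None):
    """SMOTE-style interpolation in feature space: each synthetic
    vector lies on the segment between a minority sample and one of
    its k nearest minority-class neighbours."""
    rng = np.random.default_rng(rng)
    X = np.asarray(features, dtype=float)
    # Pairwise distances within the minority class.
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]    # k nearest neighbours
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X))
        j = nn[i, rng.integers(k)]
        lam = rng.random()               # interpolation coefficient
        out.append(X[i] + lam * (X[j] - X[i]))
    return np.stack(out)

# Four minority-class feature vectors at the corners of a unit square.
minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synth = smote_interpolate(minority, k=2, n_new=10, rng=0)
```

In the paper's pipeline the interpolated vectors would then be passed to the DCGAN generator to be decoded back into images.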
Show Figures
Figure 1
<p>Imbalance data resolution.</p>
Figure 2">
Figure 2
<p>Image feature extraction network architecture.</p>
Figure 3">
Figure 3
<p>Visualization of image features.</p>
Figure 4">
Figure 4
<p>SMOTE interpolation for synthetic feature visualization.</p>
Figure 5">
Figure 5
<p>DCGAN generator architecture.</p>
Figure 6">
Figure 6
<p>Oversampling in the CIFAR-10 image dataset.</p>
Figure 7">
Figure 7
<p>Accuracy for each category in the CIFAR-10 dataset at an imbalance ratio of 50.</p>
Figure 8">
Figure 8
<p>Accuracy rates of different schemes in the ImageNet-LT dataset.</p>
">
23 pages, 2745 KiB  
Article
Enhanced Plant Leaf Classification over a Large Number of Classes Using Machine Learning
by Ersin Elbasi, Ahmet E. Topcu, Elda Cina, Aymen I. Zreikat, Ahmed Shdefat, Chamseddine Zaki and Wiem Abdelbaki
Appl. Sci. 2024, 14(22), 10507; https://doi.org/10.3390/app142210507 - 14 Nov 2024
Viewed by 887
Abstract
In botany and agriculture, classifying leaves is a crucial process that yields vital information for biodiversity research, ecological studies, and the identification of plant species. The Cope Leaf Dataset offers a comprehensive collection of leaf images from various plant species, enabling the development and evaluation of advanced classification algorithms. This study presents a robust methodology for classifying leaf images within the Cope Leaf Dataset by enhancing the feature extraction and selection process. The Cope Leaf Dataset has 99 classes, 64 features, and 1584 records. Features are extracted based on the margin, texture, and shape of the leaves. Classifying such a large number of labels is challenging because of class imbalance, feature complexity, overfitting, and label noise. Our approach combines advanced feature selection techniques with robust preprocessing methods, including normalization, imputation, and noise reduction. By systematically integrating these techniques, we aim to reduce dimensionality, eliminate irrelevant or redundant features, and improve data quality. Increasing classification accuracy, especially when dealing with large datasets and many classes, involves a combination of data preprocessing, model selection, regularization techniques, and fine-tuning. The results indicate that the Multilayer Perceptron algorithm gives 89.48% accuracy, the Naïve Bayes Classifier gives 89.63%, Convolutional Neural Networks give 88.72%, and the Hoeffding Tree algorithm gives 89.92% for the 99-label plant leaf classification problem. Full article
(This article belongs to the Special Issue Smart Agriculture Based on Big Data and Internet of Things (IoT))
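As a rough illustration of the preprocessing chain the abstract lists (imputation, normalization, feature selection), the sketch below mean-imputes missing values, z-scores each feature, and keeps the highest-variance features. The variance criterion is our stand-in for the paper's actual selection techniques, not a description of them:

```python
import numpy as np

def preprocess(X, var_keep=0.5):
    """Hypothetical preprocessing sketch: mean-impute NaNs, z-score
    normalise, then drop the lowest-variance features."""
    X = np.asarray(X, dtype=float)
    # Imputation: replace NaNs with the column mean.
    col_mean = np.nanmean(X, axis=0)
    X = np.where(np.isnan(X), col_mean, X)
    # Normalisation: zero mean, unit variance per feature.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    # Selection: keep the top fraction of features by raw variance.
    n_keep = max(1, int(round(var_keep * X.shape[1])))
    keep = np.sort(np.argsort(X.var(axis=0))[::-1][:n_keep])
    return Xz[:, keep], keep

# Three records, three features; one missing value in column 1.
X = np.array([[1.0, 2.0, 100.0],
              [2.0, np.nan, 200.0],
              [3.0, 2.5, 300.0]])
Xk, keep = preprocess(X, var_keep=2 / 3)
```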
Show Figures
Figure 1
<p>Structure of plant leaf classification.</p>
Figure 2">
Figure 2
<p>Features selection, extraction, and classification using machine learning.</p>
Figure 3">
Figure 3
<p>Overview of Leaf Type Identification Process.</p>
Figure 4">
Figure 4
<p>Accuracy with margin features, texture features, and after-feature selection.</p>
Figure 5">
Figure 5
<p>Accuracy of plant leaf classification.</p>
Figure 6">
Figure 6
<p>MAE, RAE, RMSE, and TP rates.</p>
Figure 7">
Figure 7
<p>Samples of correctly classified leaves.</p>
Figure 8">
Figure 8
<p>Sample of incorrectly classified leaves.</p>
">
16 pages, 10190 KiB  
Article
Automated Recognition of Submerged Body-like Objects in Sonar Images Using Convolutional Neural Networks
by Yan Zun Nga, Zuhayr Rymansaib, Alfie Anthony Treloar and Alan Hunter
Remote Sens. 2024, 16(21), 4036; https://doi.org/10.3390/rs16214036 - 30 Oct 2024
Viewed by 775
Abstract
The Police Robot for Inspection and Mapping of Underwater Evidence (PRIME) is an uncrewed surface vehicle (USV) currently being developed for underwater search and recovery teams to assist in crime scene investigation. The USV maps underwater scenes using sidescan sonar (SSS). Test exercises use a clothed mannequin lying on the seafloor as a target object to evaluate system performance. A robust, automated method for detecting human body-shaped objects is required to maximise operational functionality. The use of a convolutional neural network (CNN) for automatic target recognition (ATR) is proposed. SSS image data acquired from four different locations during previous missions were used to build a dataset consisting of two classes, i.e., a binary classification problem. The target object class consisted of 166 image snippets (196 × 196 pixels) of the underwater mannequin, whereas the non-target class consisted of 13,054 examples. Due to the large class imbalance in the dataset, CNN models were trained with six different imbalance ratios. Two different pre-trained models (ResNet-50 and Xception) were compared and trained via transfer learning. This paper presents results from the CNNs and details the training methods used. Larger datasets are shown to improve CNN performance despite class imbalance, achieving average F1 scores of 97% in image classification. Average F1 scores for target vs. background classification with unseen data are only 47%, but the end result is enhanced by combining multiple weak classification results in an ensemble average. The combined output, represented as a georeferenced heatmap, accurately indicates the target object location with a high detection confidence and one false positive of low confidence. The CNN approach shows improved object detection performance when compared to the currently used ATR method. Full article
(This article belongs to the Special Issue AI-Driven Mapping Using Remote Sensing Data)
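The final step the abstract describes, combining multiple weak per-run detections into one georeferenced confidence heatmap, can be sketched as a simple accumulation over bounding boxes. The grid shape, box format, and averaging rule below are our assumptions for illustration:

```python
import numpy as np

def accumulate_heatmap(shape, detections):
    """Combine weak per-run detections into one confidence heatmap:
    each detection adds its confidence over its bounding box, and the
    result is averaged over the number of detections."""
    heat = np.zeros(shape, dtype=float)
    for (r0, r1, c0, c1), conf in detections:
        heat[r0:r1, c0:c1] += conf
    return heat / max(1, len(detections))

# Three runs: two agree on the target region, one weak false positive.
dets = [((2, 4, 2, 4), 0.9),
        ((2, 4, 2, 4), 0.8),
        ((7, 8, 0, 1), 0.3)]
heat = accumulate_heatmap((10, 10), dets)
```

Cells where several runs agree accumulate high confidence, while an isolated false positive stays weak, which is how the combined heatmap suppresses single-run errors.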
Show Figures
Figure 1
<p>PRIME USV executing a survey at Underfall Yard, Bristol Harbour, UK.</p>
Figure 2">
Figure 2
<p>Satellite map showing locations where experimental trials took place: (1) Bristol Harbour; (2) Bathampton Canal; (3) Dundas Aqueduct; (4) Minerva Bath Rowing Club.</p>
Figure 3">
Figure 3
<p>Example of a sonar image of the target object before (<b>a</b>) and after (<b>b</b>) rescaling using bicubic interpolation.</p>
Figure 4">
Figure 4
<p>Examples of generated training data: (<b>a</b>) Object class sample cropped from image containing the target object, remaining data are discarded. (<b>b</b>) Background class data generated from entire target-free image.</p>
Figure 5">
Figure 5
<p>Gallery of example training data for object class (<b>a</b>) and background class (<b>b</b>).</p>
Figure 6">
Figure 6
<p>F1 scores of Xception and ResNet-50 for (<b>a</b>) classification on test data samples (image snippets), (<b>b</b>) object detection on full-size sonar images, (<b>c</b>) object detection on full-size sonar images from an unseen dataset, with increasing imbalance ratios. Lines show median and quartiles of distributions. Networks were re-trained 10 times each.</p>
Figure 6 Cont.">
Figure 7">
Figure 7
<p>Examples of region proposals with the highest misclassification rates. (<b>a</b>) Rocks, (<b>b</b>) vegetation, and (<b>c</b>) fish were frequently misclassified as the target object. (<b>d</b>) Shows the target object misclassified as background.</p>
Figure 8">
Figure 8
<p>Examples of bounding box placement after object recognition. Left column (<b>a</b>,<b>c</b>) shows true positive detected with a high confidence rate, right column (<b>b</b>,<b>d</b>) shows false positive with moderate confidence rate and a false negative. Second row (<b>c</b>,<b>d</b>) shows corresponding heatmaps generated from the bounding box and confidence rate. Blue-white-red colour scale ranges from 0–100 and indicates confidence rate.</p>
Figure 9">
Figure 9
<p>Example of georeferenced (<b>a</b>) sonar image, with corresponding heatmaps generated using (<b>b</b>) existing ATR method and (<b>c</b>) CNN-based ATR method. Both methods incorrectly identify a school of fish as the target object. The blue-white-red colour scales in (<b>b</b>) and (<b>c</b>) range from 0.075–0.2 and 0–1, respectively.</p>
Figure 10">
Figure 10
<p>Comparison of combined heatmaps produced using (<b>a</b>) existing image processing-based anomaly detection method and (<b>b</b>) CNN-based object detection method. The blue-white-red colour scales in (<b>a</b>) and (<b>b</b>) range from 0–0.015 and 0–0.3, respectively. Overlaid arrows indicate location of (A) target, (B) rocks and (C) change in floor texture. These results correspond to run 1 of the 2021 dataset highlighted in <a href="#remotesensing-16-04036-t001" class="html-table">Table 1</a>.</p>
Figure 11">
Figure 11
<p>Combined heatmap produced using CNN-based object detection method on additional unseen dataset. Overlaid arrows indicate location of (A) target, (B) rocks shown in <a href="#remotesensing-16-04036-f010" class="html-fig">Figure 10</a>. The blue-white-red colour scale ranges from 0–0.3. This result corresponds to run 1 of the 2020 dataset highlighted in <a href="#remotesensing-16-04036-t001" class="html-table">Table 1</a>.</p>
">
21 pages, 48158 KiB  
Article
ETFT: Equiangular Tight Frame Transformer for Imbalanced Semantic Segmentation
by Seonggyun Jeong and Yong Seok Heo
Sensors 2024, 24(21), 6913; https://doi.org/10.3390/s24216913 - 28 Oct 2024
Viewed by 690
Abstract
Semantic segmentation often suffers from class imbalance, where the label ratio for each class in the dataset is not uniform. Recent studies have addressed the issue of class imbalance in semantic segmentation by leveraging the neural collapse phenomenon in conjunction with an Equiangular Tight Frame (ETF). While the use of ETF aids in enhancing the discriminability of minor classes, class correlation is another crucial factor that must be taken into account. However, managing the balance between class correlation and discrimination through neural collapse remains challenging, as these properties inherently conflict with one another. Moreover, this control is established during the training stage, resulting in a fixed classifier. There is no guarantee that this classifier will consistently perform well with different input images. To address this problem, we propose an Equiangular Tight Frame Transformer (ETFT), a transformer-based model that jointly processes the features and classifier using ETF structure, and dynamically generates the classifier as a function of the input for imbalanced semantic segmentation. Specifically, the classifier initialized with the ETF structure is jointly processed with the input patch tokens during the attention process. As a result, the transformed patch tokens, aided by the ETF structure, achieve discriminability between classes while preserving contextual correlation. The classifier, initially structured as an ETF, is adjusted to incorporate the correlation information, benefiting from the attention mechanism. Furthermore, the learned classifier is combined with the fixed ETF classifier, leveraging the advantages of both. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art methods for imbalanced semantic segmentation on both the ADE20K and Cityscapes datasets. Full article
(This article belongs to the Section Intelligent Sensors)
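The fixed classifier referred to in the abstract is a simplex Equiangular Tight Frame: K unit-norm class vectors whose pairwise cosine is -1/(K-1) for every pair. The standard construction is sketched below in NumPy; it is an illustration of the ETF structure itself, not the authors' code:

```python
import numpy as np

def etf_classifier(d, K, seed=0):
    """Simplex Equiangular Tight Frame: K unit-norm class vectors in
    R^d (d >= K) with identical pairwise angle, cosine -1/(K-1)."""
    rng = np.random.default_rng(seed)
    # Random d x K matrix with orthonormal columns (reduced QR).
    U, _ = np.linalg.qr(rng.standard_normal((d, K)))
    M = np.sqrt(K / (K - 1)) * U @ (np.eye(K) - np.ones((K, K)) / K)
    return M  # columns are the fixed classifier weight vectors

M = etf_classifier(d=16, K=5)
G = M.T @ M  # Gram matrix: 1 on the diagonal, -1/(K-1) elsewhere
```

Because the Gram matrix is fixed by construction, every pair of class vectors is maximally and equally separated, which is what makes the ETF attractive for minor classes; the ETFT then lets attention adjust these vectors per input.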
Show Figures
Figure 1
<p>Comparison of classifiers in segmentation networks. (<b>a</b>) Linear classifier [<a href="#B20-sensors-24-06913" class="html-bibr">20</a>]. (<b>b</b>) Mask transformer [<a href="#B22-sensors-24-06913" class="html-bibr">22</a>]. (<b>c</b>) Proposed ETFT. The lock icon in (<b>c</b>) indicates a fixed ETF structure.</p>
Figure 2">
Figure 2
<p>The process of neural collapse in classification networks during training. At the end of training, neural collapse occurs, where both the classifiers (weights of the last layer) and class means of features converge to an Equiangular Tight Frame (ETF) structure. Specifically, they converge to have the same norm, and the angle between any two vectors (except for identical ones) is the same for all pairs.</p>
Figure 3">
Figure 3
<p>Our proposed module, ETF Transformer (ETFT), is composed of several transformer blocks. The input to the first transformer block is generated by concatenating the feature vectors from the decoder with a fixed ETF matrix. The final classifier is obtained by concatenating the learned classifier with the fixed ETF classifier and passing them through fully connected layers.</p>
Figure 4">
Figure 4
<p>Qualitative comparison of Swin UperNet-T [<a href="#B23-sensors-24-06913" class="html-bibr">23</a>], SeMask-T FPN [<a href="#B26-sensors-24-06913" class="html-bibr">26</a>], and FeedFormer-B2 [<a href="#B25-sensors-24-06913" class="html-bibr">25</a>], without and with our proposed ETFT, on the ADE20K validation set.</p>
Figure 4 Cont.">
Figure 5">
Figure 5
<p>Qualitative comparison of Swin UperNet-T [<a href="#B23-sensors-24-06913" class="html-bibr">23</a>], SeMask-T FPN [<a href="#B26-sensors-24-06913" class="html-bibr">26</a>], and FeedFormer-B2 [<a href="#B25-sensors-24-06913" class="html-bibr">25</a>], without and with our ETFT, on the Cityscapes validation set. The green box represents a magnified view of the corresponding red box.</p>
Figure 6">
Figure 6
<p>Qualitative comparison of Ceco [<a href="#B20-sensors-24-06913" class="html-bibr">20</a>], GPaCo [<a href="#B21-sensors-24-06913" class="html-bibr">21</a>] and ETFT on ADE20K validation set.</p>
Figure 7">
Figure 7
<p>Qualitative comparison of M1 (Segmenter [<a href="#B22-sensors-24-06913" class="html-bibr">22</a>]) and M4 (ETFT) on ADE20K validation set.</p>
Figure A1">
Figure A1
<p>Detailed architecture of a transformer block in ETFT.</p>
">
14 pages, 928 KiB  
Article
Online Action Detection Incorporating an Additional Action Classifier
by Min-Hang Hsu, Chen-Chien Hsu, Yin-Tien Wang, Shao-Kang Huang and Yi-Hsing Chien
Electronics 2024, 13(20), 4110; https://doi.org/10.3390/electronics13204110 - 18 Oct 2024
Viewed by 701
Abstract
Most online action detection methods focus on solving a (K + 1) classification problem, where the additional category represents the ‘background’ class. However, training on the ‘background’ class and managing data imbalance are common challenges in online action detection. To address these issues, we propose a framework for online action detection by incorporating an additional pathway between the feature extractor and online action detection model. Specifically, we present one configuration that retains feature distinctions for fusion with the final decision from the Long Short-Term Transformer (LSTR), enhancing its performance in the (K + 1) classification. Experimental results show that the proposed method achieves an accuracy of 71.2% in mean Average Precision (mAP) on the Thumos14 dataset, outperforming the 69.5% achieved by the original LSTR method. Full article
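The extra pathway's decision can be fused with the LSTR output by a simple late fusion of (K + 1)-way scores. The mixing rule and weight below are our assumptions for illustration, not the paper's exact fusion scheme:

```python
import numpy as np

def fuse_scores(p_lstr, p_aux, alpha=0.5):
    """Late-fusion sketch: blend the LSTR (K+1)-way probabilities
    with those of an auxiliary action classifier, then renormalise.
    alpha is an assumed mixing weight, not taken from the paper."""
    p = alpha * np.asarray(p_lstr, dtype=float) \
        + (1 - alpha) * np.asarray(p_aux, dtype=float)
    return p / p.sum(axis=-1, keepdims=True)

# K = 2 actions plus the background class (index 0 here is action A).
p = fuse_scores([0.6, 0.3, 0.1], [0.2, 0.7, 0.1], alpha=0.5)
```

Here the auxiliary classifier flips the decision from class 0 to class 1, which is the kind of correction an additional pathway can contribute.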
Show Figures
Figure 1
<p>Distribution of frames per category in the Thumos14 dataset, where category 0 represents the background class.</p>
Figure 2">
Figure 2
<p>The proposed framework for online action detection with an additional action classifier.</p>
">
17 pages, 1592 KiB  
Article
An Enhanced Tree Ensemble for Classification in the Presence of Extreme Class Imbalance
by Samir K. Safi and Sheema Gul
Mathematics 2024, 12(20), 3243; https://doi.org/10.3390/math12203243 - 16 Oct 2024
Viewed by 954
Abstract
Researchers using machine learning methods for classification can face challenges due to class imbalance, where a certain class is underrepresented. Over- or under-sampling of minority or majority class observations, or solely relying on model selection for ensemble methods, may prove ineffective when the class imbalance ratio is extremely high. To address this issue, this paper proposes a method called enhanced tree ensemble (ETE), based on generating synthetic data for minority class observations in conjunction with tree selection based on their performance on the training data. The proposed method first generates minority class instances to balance the training data and then uses the idea of tree selection by leveraging out-of-bag (ETEOOB) and sub-sample (ETESS) observations, respectively. The efficacy of the proposed method is assessed using twenty benchmark problems for binary classification with moderate to extreme class imbalance, comparing it against other well-known methods such as optimal tree ensemble (OTE), SMOTE random forest (RFSMOTE), oversampling random forest (RFOS), under-sampling random forest (RFUS), k-nearest neighbor (k-NN), support vector machine (SVM), tree, and artificial neural network (ANN). Performance metrics such as classification error rate and precision are used for evaluation purposes. The analyses of the study revealed that the proposed method, based on data balancing and model selection, yielded better results than the other methods. Full article
(This article belongs to the Special Issue Advances in Statistical Methods with Applications)
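The tree-selection idea behind ETEOOB (rank the trees by their individual performance on out-of-bag observations and keep only the best) can be sketched as follows; the kept fraction and the use of accuracy as the ranking metric are our assumptions for illustration:

```python
import numpy as np

def select_trees(oob_accuracy, keep_frac=0.5):
    """Tree-selection sketch: rank trees by their individual
    out-of-bag accuracy and keep the best fraction of them."""
    oob_accuracy = np.asarray(oob_accuracy, dtype=float)
    n_keep = max(1, int(round(keep_frac * len(oob_accuracy))))
    order = np.argsort(oob_accuracy)[::-1]   # best trees first
    return np.sort(order[:n_keep])           # indices of kept trees

# Per-tree OOB accuracies for a six-tree ensemble.
acc = [0.61, 0.84, 0.55, 0.90, 0.72, 0.48]
kept = select_trees(acc, keep_frac=0.5)
```

The final ensemble would then aggregate predictions only from the kept trees, discarding those that performed poorly on their out-of-bag samples.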
Show Figures
Figure 1
<p>Flow chart of the proposed methods.</p>
Figure 2">
Figure 2
<p>Box plots comparing <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">E</mi> <mi mathvariant="normal">T</mi> <mi mathvariant="normal">E</mi> </mrow> <mrow> <mi mathvariant="normal">O</mi> <mi mathvariant="normal">O</mi> <mi mathvariant="normal">B</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">E</mi> <mi mathvariant="normal">T</mi> <mi mathvariant="normal">E</mi> </mrow> <mrow> <mi mathvariant="normal">S</mi> <mi mathvariant="normal">S</mi> </mrow> </msub> </mrow> </semantics></math> to other state-of-the-art methods, displaying the classification error rate for a range of datasets using 70% training and 30% testing.</p>
Figure 3">
Figure 3
<p>Box plots comparing <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">E</mi> <mi mathvariant="normal">T</mi> <mi mathvariant="normal">E</mi> </mrow> <mrow> <mi mathvariant="normal">O</mi> <mi mathvariant="normal">O</mi> <mi mathvariant="normal">B</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">E</mi> <mi mathvariant="normal">T</mi> <mi mathvariant="normal">E</mi> </mrow> <mrow> <mi mathvariant="normal">S</mi> <mi mathvariant="normal">S</mi> </mrow> </msub> </mrow> </semantics></math> to other state-of-the-art methods, displaying the precision for a range of datasets using 70% training and 30% testing.</p>
Figure 4">
Figure 4
<p>A multi-line plot examines the impact of the proposed method, i.e., <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">E</mi> <mi mathvariant="normal">T</mi> <mi mathvariant="normal">E</mi> </mrow> <mrow> <mi mathvariant="normal">O</mi> <mi mathvariant="normal">O</mi> <mi mathvariant="normal">B</mi> </mrow> </msub> </mrow> </semantics></math>, varying the number of trees (H) in the ensemble on the error rate.</p>
Figure 5">
Figure 5
<p>A multi-line plot examines the impact of the proposed method, i.e., <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">E</mi> <mi mathvariant="normal">T</mi> <mi mathvariant="normal">E</mi> </mrow> <mrow> <mi mathvariant="normal">S</mi> <mi mathvariant="normal">S</mi> </mrow> </msub> </mrow> </semantics></math>, varying the number of trees (H) in the ensemble on the error rate.</p>
Figure 6">
Figure 6
<p>Bar plots display the proposed method, i.e., <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">E</mi> <mi mathvariant="normal">T</mi> <mi mathvariant="normal">E</mi> </mrow> <mrow> <mi mathvariant="normal">O</mi> <mi mathvariant="normal">O</mi> <mi mathvariant="normal">B</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">E</mi> <mi mathvariant="normal">T</mi> <mi mathvariant="normal">E</mi> </mrow> <mrow> <mi mathvariant="normal">S</mi> <mi mathvariant="normal">S</mi> </mrow> </msub> </mrow> </semantics></math>, classification error rate and precision on both <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">S</mi> <mi mathvariant="normal">I</mi> <mi mathvariant="normal">D</mi> <mi mathvariant="normal">S</mi> </mrow> <mrow> <mi mathvariant="normal">s</mi> <mi mathvariant="normal">i</mi> <mi mathvariant="normal">m</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mrow> <mi mathvariant="normal">S</mi> <mi mathvariant="normal">B</mi> <mi mathvariant="normal">D</mi> <mi mathvariant="normal">S</mi> </mrow> <mrow> <mi mathvariant="normal">s</mi> <mi mathvariant="normal">i</mi> <mi mathvariant="normal">m</mi> </mrow> </msub> </mrow> </semantics></math>, including comparisons with other state-of-the-art techniques.</p>
">
19 pages, 3076 KiB  
Article
Three-Stage Recursive Learning Technique for Face Mask Detection on Imbalanced Datasets
by Chi-Yi Tsai, Wei-Hsuan Shih and Humaira Nisar
Mathematics 2024, 12(19), 3104; https://doi.org/10.3390/math12193104 - 4 Oct 2024
Viewed by 1036
Abstract
In response to the COVID-19 pandemic, governments worldwide have implemented mandatory face mask regulations in crowded public spaces, making the development of automatic face mask detection systems critical. To achieve robust face mask detection performance, a high-quality and comprehensive face mask dataset is required. However, due to the difficulty in obtaining face samples with masks in the real-world, public face mask datasets are often imbalanced, leading to the data imbalance problem in model training and negatively impacting detection performance. To address this problem, this paper proposes a novel recursive model-training technique designed to improve detection accuracy on imbalanced datasets. The proposed method recursively splits and merges the dataset based on the attribute characteristics of different classes, enabling more balanced and effective model training. Our approach demonstrates that the carefully designed splitting and merging of datasets can significantly enhance model-training performance. This method was evaluated using two imbalanced datasets. The experimental results show that the proposed recursive learning technique achieves a percentage increase (PI) of 84.5% in mean average precision (mAP@0.5) on the Kaggle dataset and of 186.3% on the Eden dataset compared to traditional supervised learning. Additionally, when combined with existing oversampling techniques, the PI on the Kaggle dataset further increases to 88.9%, highlighting the potential of the proposed method for improving detection accuracy in highly imbalanced datasets. Full article
(This article belongs to the Special Issue Advances in Algorithm Design and Machine Learning)
Show Figures
Figure 1
<p>Three conditions of face mask-wearing: (<b>a</b>) correct mask-wearing, (<b>b</b>) no mask-wearing, and (<b>c</b>) incorrect mask-wearing.</p>
Figure 2">
Figure 2
<p>Comparison of (<b>a</b>) the traditional supervised learning and (<b>b</b>) the proposed recursive learning method. The proposed recursive learning method incorporates dataset manipulation processing into the model-training process to train the model recursively.</p>
Figure 3">
Figure 3
<p>Concept of the proposed dataset split-and-merge processing for recursive learning.</p>
Figure 4">
Figure 4
<p>Illustration of the proposed three-stage recursive learning method combined with dataset split-and-merge processing.</p>
Figure 5">
Figure 5
<p>Flowchart of the proposed recursive learning method.</p>
Figure 6">
Figure 6
<p>Illustration of the distances C and D between the ground truth A and the predicted B bounding boxes.</p>
Figure 7">
Figure 7
<p>Experimental results of (<b>a</b>) the supervised learning and (<b>b</b>) the proposed Over-R-S1S2S3 learning method on the Kaggle test set, along with (<b>c</b>) and (<b>d</b>), the corresponding zoom-in results.</p>
Figure 8">
Figure 8
<p>Experimental results of (<b>a</b>) the supervised learning and (<b>b</b>) the proposed Over-R-S1S2S3 learning method on the Kaggle test set, along with (<b>c</b>) and (<b>d</b>), the corresponding zoom-in results.</p>
">