Search Results (544)

Search Parameters:
Keywords = Shapley values

19 pages, 4910 KiB  
Article
A Novel SHAP-GAN Network for Interpretable Ovarian Cancer Diagnosis
by Jingxun Cai, Zne-Jung Lee, Zhihxian Lin and Ming-Ren Yang
Mathematics 2025, 13(5), 882; https://doi.org/10.3390/math13050882 - 6 Mar 2025
Viewed by 132
Abstract
Ovarian cancer stands out as one of the most formidable adversaries in women’s health, largely due to its typically subtle and nonspecific early symptoms, which pose significant challenges to early detection and diagnosis. Although existing diagnostic methods, such as biomarker testing and imaging, can help with early diagnosis to some extent, these methods still have limitations in sensitivity and accuracy, often leading to misdiagnosis or missed diagnosis. Ovarian cancer’s high heterogeneity and complexity increase diagnostic challenges, especially in disease progression prediction and patient classification. Machine learning (ML) has outperformed traditional methods in cancer detection by processing large datasets to identify patterns missed by conventional techniques. However, existing AI models still struggle with accuracy in handling imbalanced and high-dimensional data, and their “black-box” nature limits clinical interpretability. To address these issues, this study proposes SHAP-GAN, an innovative diagnostic model for ovarian cancer that integrates Shapley Additive exPlanations (SHAP) with Generative Adversarial Networks (GANs). The SHAP module quantifies each biomarker’s contribution to the diagnosis, while the GAN component optimizes medical data generation. This approach tackles three key challenges in medical diagnosis: data scarcity, model interpretability, and diagnostic accuracy. Results show that SHAP-GAN outperforms traditional methods in sensitivity, accuracy, and interpretability, particularly with high-dimensional and imbalanced ovarian cancer datasets. The top three influential features identified are PRR11, CIAO1, and SMPD3, which exhibit wide SHAP value distributions, highlighting their significant impact on model predictions. 
The SHAP-GAN network achieved an accuracy of 99.34% on the ovarian cancer dataset, outperforming the baseline algorithms: Support Vector Machines (SVM) at 72.78%, Logistic Regression (LR) at 86.09%, and XGBoost at 96.69%. These results highlight the performance of SHAP-GAN on high-dimensional and imbalanced datasets. Furthermore, SHAP-GAN eases the analysis of intricate genetic data, helping medical professionals tailor personalized treatment strategies for individual patients.
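The abstract says the SHAP module quantifies each biomarker's contribution to a prediction. As a rough illustration of what such a contribution means, the sketch below uses the simplest known case, a linear model with independent features, where the SHAP value of feature i is w_i * (x_i - mean_i). The feature names, weights, and inputs are hypothetical and are not taken from the paper; SHAP-GAN itself is a neural network and would need an approximate explainer instead.

```python
def linear_shap(weights, x, background_mean):
    """SHAP values for a linear model, assuming independent features."""
    return {name: weights[name] * (x[name] - background_mean[name]) for name in weights}


def linear_model(weights, bias, x):
    """Plain linear predictor f(x) = w . x + b."""
    return bias + sum(weights[name] * x[name] for name in weights)


# Hypothetical weights and inputs, invented for illustration only.
weights = {"biomarker_1": 2.0, "biomarker_2": -1.0, "biomarker_3": 0.5}
bias = 0.25
sample = {"biomarker_1": 1.5, "biomarker_2": 0.5, "biomarker_3": 2.0}
mean = {"biomarker_1": 1.0, "biomarker_2": 1.0, "biomarker_3": 1.0}

contributions = linear_shap(weights, sample, mean)

# Additivity: the contributions sum to f(sample) - f(mean).
gap = linear_model(weights, bias, sample) - linear_model(weights, bias, mean)
print(contributions, sum(contributions.values()), gap)
```

The additivity check at the end is the property that lets per-feature contributions be read as a decomposition of a single prediction.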
Figures:
Figure 1. The distribution of ovarian cancer data.
Figure 2. Pearson correlation heatmap of features in the ovarian cancer dataset.
Figure 3. Principal component analysis (PCA) distribution plot for ovarian cancer data.
Figure 4. The basic architecture of the ACGAN.
Figure 5. The flowchart of the proposed method.
Figure 6. The architecture of the SHAP-GAN network.
Figure 7. Distribution of ovarian cancer samples after data augmentation.
Figure 8. The relationship between model performance and the number of selected features.
Figure 9. Shapley values of selected features.
Figure 10. The confusion matrix for SVM model performance.
Figure 11. The confusion matrix for LR model performance.
Figure 12. The confusion matrix for XGBoost model performance.
Figure 13. The confusion matrix for the proposed SHAP-GAN network performance.
Figure 14. The ROC curve for the proposed SHAP-GAN network performance.
17 pages, 954 KiB  
Article
Leveraging Explainable Artificial Intelligence in Solar Photovoltaic Mappings: Model Explanations and Feature Selection
by Eduardo Gomes, Augusto Esteves, Hugo Morais and Lucas Pereira
Energies 2025, 18(5), 1282; https://doi.org/10.3390/en18051282 - 5 Mar 2025
Viewed by 196
Abstract
This work explores the effectiveness of explainable artificial intelligence in mapping solar photovoltaic power outputs based on weather data, focusing on short-term mappings. We analyzed the impact values provided by the Shapley additive explanation method when applied to two algorithms designed for tabular data (XGBoost and TabNet) and conducted a comprehensive evaluation of the overall model and across seasons. Our findings revealed that the impact of the selected features remained relatively consistent across seasons throughout the year. Additionally, we propose a feature selection methodology that uses the explanation values to produce more efficient models, reducing data requirements while maintaining performance within a threshold of the original model. The effectiveness of the proposed methodology was demonstrated through its application to a residential dataset in Madeira, Portugal, augmented with weather data sourced from SolCast.
(This article belongs to the Topic Smart Energy Systems, 2nd Edition)
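The feature selection idea described in the abstract (drop features guided by explanation values while staying within a performance threshold of the original model) can be sketched as a simple backward elimination. This is a minimal sketch under stated assumptions, not the authors' procedure: the feature names, importance scores, `toy_error` function, and the 5% tolerance are all hypothetical, and in practice `evaluate` would retrain and score a real model.

```python
def select_features(importance, evaluate, tolerance=0.05):
    """Drop the least important features while the error stays within
    (1 + tolerance) times the error of the full feature set."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    baseline = evaluate(ranked)
    selected = list(ranked)
    while len(selected) > 1:
        candidate = selected[:-1]  # remove the currently least important feature
        if evaluate(candidate) > baseline * (1 + tolerance):
            break
        selected = candidate
    return selected


# Hypothetical mean |SHAP| importances, invented for illustration.
importance = {"irradiance": 0.60, "temperature": 0.25, "humidity": 0.02, "wind_speed": 0.01}


def toy_error(features):
    """Stand-in for retraining and scoring: error grows with the importance removed."""
    return 1.0 + sum(score for name, score in importance.items() if name not in features)


kept = select_features(importance, toy_error)
print(kept)  # ['irradiance', 'temperature']
```

With these invented numbers, removing the two weakest features raises the toy error by 3%, which is inside the tolerance, while removing a third would exceed it.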
Figures:
Figure 1. Proposed methodology for explaining PV production mappings using SHAP values.
Figure 2. Proposed methodology for feature selection using SHAP values.
Figure 3. Examples of domain and exogenous features for a period of 24 h.
Figure 4. XGBoost and TabNet overall SHAP impact values. Each point represents an individual training example, with its color indicating the magnitude of a specific feature’s value. The horizontal position of each point reflects the impact of that feature on the model’s output.
Figure 5. Model performances on a summer day for the testing set (4 August 2020).
30 pages, 2514 KiB  
Article
FedCon: Scalable and Efficient Federated Learning via Contribution-Based Aggregation
by Wenyu Gao, Gaochao Xu and Xianqiu Meng
Electronics 2025, 14(5), 1024; https://doi.org/10.3390/electronics14051024 - 4 Mar 2025
Viewed by 219
Abstract
With the increasing application of federated learning to medical and image data, class distribution imbalances and Non-IID heterogeneity across clients have become critical factors affecting the generalization ability of global models. In the medical domain, data silos are particularly pronounced, leading to significant differences in data distributions across hospitals, which in turn hinder global model training. To address these challenges, this paper proposes FedCon, a federated learning method that dynamically adjusts aggregation weights while accurately evaluating client contributions. Specifically, FedCon initializes aggregation weights based on client data volume and class distribution and employs Monte Carlo sampling to simplify the computation of Shapley values. It then further optimizes the aggregation weights by considering the historical contributions of clients and the similarity between clients and the global model. This approach significantly enhances the generalization ability and update stability of the global model. Experimental results demonstrate that, compared to existing methods, FedCon achieved superior generalization performance on public datasets and significantly accelerated the convergence of the global model.
(This article belongs to the Special Issue Empowering IoT with AI: AIoT for Smart and Autonomous Systems)
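Exact Shapley values require evaluating every coalition of clients, which grows exponentially with the number of clients; Monte Carlo sampling over random orderings is a common way to approximate them. The sketch below shows generic permutation sampling and is not FedCon's actual procedure, which the abstract does not detail. The client names, weights, and `toy_utility` function are hypothetical stand-ins for "utility of a model aggregated from this subset of clients".

```python
import random


def monte_carlo_shapley(players, value, num_permutations=2000, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled orderings of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(num_permutations):
        rng.shuffle(order)
        coalition = set()
        previous = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = value(frozenset(coalition))
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / num_permutations for p, total in totals.items()}


# Hypothetical per-client utilities, invented for illustration.
WEIGHTS = {"client_a": 0.5, "client_b": 0.3, "client_c": 0.2}


def toy_utility(coalition):
    """Additive utility plus a small synergy when clients a and b are both present."""
    base = sum(WEIGHTS[c] for c in coalition)
    synergy = 0.1 if {"client_a", "client_b"} <= coalition else 0.0
    return base + synergy


estimates = monte_carlo_shapley(list(WEIGHTS), toy_utility)
print(estimates)
```

For this toy game the exact values are 0.55, 0.35, and 0.20, so the estimates for clients a and b should land close to those, and the estimates always sum to the utility of the full coalition.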
Figures:
Figure 1. The FedCon framework: (A) the calculation of client data quality in the first round using discrepancies between global and local data distributions to determine initialization weights; (B) the dynamic adjustment of aggregation weights based on precise client contribution computations (using Shapley values and similarity metrics) in each round, improving the model convergence stability.
Figure 2. The above chart is the heatmap of data partitioning for CIFAR10-NIID-1.
Figure 3. The above chart is the heatmap of data partitioning for CIFAR10-NIID-2.
Figure 4. Convergence analysis of different methods on CIFAR10 and CIFAR100 datasets under independent and identically distributed (Homo) settings. (a) shows the convergence analysis of CIFAR10 under Homo settings, and (b) shows the convergence analysis of CIFAR100 under Homo settings.
Figure 5. Convergence analysis of different methods on the CIFAR10 dataset under two Non-IID data partitioning strategies, NIID-1 and NIID-2. The figure above demonstrates that the FedCon method achieved a faster convergence and outperformed the other methods in terms of performance. (a) shows the convergence analysis of CIFAR10 under NIID-1 settings, and (b) shows the convergence analysis of CIFAR10 under NIID-2 settings.
Figure 6. Convergence analysis of various methods on the CIFAR100 dataset under two Non-IID data partitioning strategies, NIID-1 and NIID-2. (a) shows the convergence analysis of CIFAR100 under NIID-1 settings, and (b) shows the convergence analysis of CIFAR100 under NIID-2 settings.
Figure 7. Convergence analysis of different methods on the HAR and HAM10000 datasets under the NIID-1 partitioning strategy (since these two datasets have different numbers of classes, only the NIID-1 partitioning strategy could be applied). (a) shows the convergence analysis of HAR under NIID-1 settings, and (b) shows the convergence analysis of HAM10000 under NIID-1 settings.
Figure 8. Convergence analysis of different methods on the OrganAMNIST dataset under the NIID-1 and NIID-2 Non-IID data partitioning strategies. (a) shows the convergence analysis of OrganAMNIST under NIID-1 settings, and (b) shows the convergence analysis of OrganAMNIST under NIID-2 settings.
Figure 9. Convergence analysis of different methods on the OrganCMNIST dataset under the NIID-1 and NIID-2 Non-IID data partitioning strategies. (a) shows the convergence analysis of OrganCMNIST under NIID-1 settings, and (b) shows the convergence analysis of OrganCMNIST under NIID-2 settings.
Figure 10. Convergence analysis of different methods on the OrganSMNIST dataset under the NIID-1 and NIID-2 Non-IID data partitioning strategies. (a) shows the convergence analysis of OrganSMNIST under NIID-1 settings, and (b) shows the convergence analysis of OrganSMNIST under NIID-2 settings.
Figure 11. The figure above presents the RMSE of the eight baseline methods and the FedCon method across the different datasets and partitioning strategies. Notably, the performance of FedCon was particularly remarkable on the HAM10000 dataset. On other datasets, FedCon also demonstrated superior performance, highlighting its advanced capabilities.
Figure 12. Comparison of communication time, round time, and accuracy for NIID-1 setting. (a) shows communication time and accuracy on NIID-1 setting. (b) shows round time and accuracy on NIID-1 setting.
Figure 13. Comparison of communication time, round time, and accuracy for NIID-2 setting. (a) shows communication time and accuracy on NIID-2 setting. (b) shows round time and accuracy on NIID-2 setting.
Figure 14. The above figure shows the hyperparameter analysis under CIFAR10 with the NIID-1 data partitioning.
Figure 15. The above figure shows the hyperparameter analysis under CIFAR10 with the NIID-2 data partitioning.
Figure 16. Violin plot comparing FedCon without A, FedCon without B, FedCon, and other baseline methods.
23 pages, 7882 KiB  
Article
Deep-Neural-Networks-Based Data-Driven Methods for Characterizing the Mechanical Behavior of Hydroxyl-Terminated Polyether Propellants
by Ruohan Han, Xiaolong Fu, Bei Qu, La Shi and Yuhang Liu
Polymers 2025, 17(5), 660; https://doi.org/10.3390/polym17050660 - 28 Feb 2025
Viewed by 208
Abstract
Hydroxyl-terminated polyether (HTPE) propellants are attractive in the weapons materials and equipment industry for their insensitive properties. Storage, combustion, and explosion of solid propellants are affected by their mechanical properties, so accurate mechanical modeling is vital. In this study, deep neural networks are applied to model composite solid-propellant mechanical behavior for the first time. A data-driven framework incorporating a novel training–testing splitting strategy is proposed. Feedforward Neural Networks (FFNNs), Kolmogorov–Arnold Networks (KANs), and Long Short-Term Memory (LSTM) networks are built, and the model framework and parameters are optimized using a Bayesian optimization algorithm. The results show that the LSTM model predicts the stress–strain curve of HTPE propellant with an RMSE of 0.053 MPa, an improvement of 62.7% and 48.5% over the FFNNs and the KANs, respectively. The R² values of the LSTM model for the testing set exceed 0.99; the model effectively captures the effects of tensile rate and temperature changes on tensile strength and accurately predicts the yield point and the slope change of the stress–strain curve. Analysis with the interpretable Shapley Additive Explanations (SHAP) method indicates that fine-grained ammonium perchlorate (AP) can increase tensile strength and that plasticizers can increase elongation at break; this method provides an effective approach for HTPE propellant formulation.
(This article belongs to the Section Polymer Composites and Nanocomposites)
Figures:
Figure 1. (a) HTPE propellant preparation process. (b) Schematic diagram of HTPE propellant uniaxial tensile experiment and specimen (unit: mm).
Figure 2. Boxplot of the selected input numerical features.
Figure 3. Heatmap of the SCC of each pair in the original 13 features. Red and blue colors represent positive and negative correlations, respectively.
Figure 4. Frameworks of (a) Feedforward artificial neural networks and (b) Kolmogorov–Arnold networks.
Figure 5. Three-dimensional architecture of LSTM deep neural network.
Figure 6. Diagram of the DNN modeling framework for predicting the mechanical behavior of HTPE propellant.
Figure 7. Stress–strain curves of formulation 1 under different (a) low and (b) high tensile rates. Stress–strain curves of formulation 2 under different (c) low and (d) high tensile rates.
Figure 8. Parallel coordinate plot of hyperparameter optimization of (a) MLP, (b) KAN, and (c) LSTM.
Figure 9. Neural network model training in process predicted stress–strain curves with the number of model iterations.
Figure 10. Stress–strain curves of (a) HTPE/AP/Al, (b) HTPE/RDX/AP/Al, and (c) HTPE/HMX/AP/Al propellants predicted by FFNN, KAN, and LSTM models.
Figure 11. Comparison of the errors of FFNN, KAN, and LSTM on the prediction results of the (a) training set and (b) test set.
Figure 12. Results of using LSTM model to predict stress–strain curves of HTPE/AP/Al and HTPE/HMX/AP/Al propellants at different tensile rates.
Figure 13. Results of using LSTM model to predict stress–strain curves for HTPE/RDX/AP/Al propellants at different test temperatures and tensile rates.
Figure 14. Results of using LSTM models to predict stress–strain curves for HTPE/HATO/AP/Al propellants at different test temperatures for different content of coarse-particle-sized AP.
Figure 15. Global interpretations analysis by SHAP values for the input features of tensile strength.
Figure 16. Global interpretations analysis by SHAP values for the input features of elongation at break.
Figure 17. Individual interpretations for the maximum tensile strength of HTPE/AP/Al propellant of formulation 1 at 100 mm/min.
14 pages, 947 KiB  
Article
The Nonsense of Bitcoin in Portfolio Analysis
by Haim Shalit
J. Risk Financial Manag. 2025, 18(3), 125; https://doi.org/10.3390/jrfm18030125 - 28 Feb 2025
Viewed by 141
Abstract
The paper demonstrates the nonsense of using Bitcoin in financial investments. Using mean-variance financial analysis, stochastic dominance, CVaR, and Shapley value theory as analytical statistical models, I show that Bitcoin performs poorly when compared against other traded assets. The conclusion is reached by analyzing freely available daily market data for the period 2018–2023.
(This article belongs to the Section Financial Markets)
Figures:
Figure 1. Efficient frontier, stocks, and Bitcoin.
Figure 2. The Lorenz curve.
Figure 3. Lorenz curves of all assets for 2018–2023 daily returns.
Figure 4. Shapley values vs means of frontier portfolio assets for 2018–2023 daily returns.
25 pages, 2080 KiB  
Article
Biform Game Approach to Strategy Optimization of Autonomous Vehicle Lane Changes on Highway Ramps
by Xiaorong Wang, Yinzhen Li, Changxi Ma and Shurui Cao
Appl. Sci. 2025, 15(5), 2568; https://doi.org/10.3390/app15052568 - 27 Feb 2025
Viewed by 230
Abstract
Traditional non-cooperative and cooperative game methods have limitations in solving the traffic problems of autonomous or assisted driving vehicles using vehicle-to-everything communication. In this paper, the biform game method is introduced to optimize the lane-changing behavior of autonomous or assisted driving vehicles in highway on-ramp areas based on vehicle-to-everything communication. Considering the lane-changing and speed adjustment needs of autonomous vehicles in high-speed scenarios, a forced lane-changing framework was constructed, the speed gain allocation was determined based on the target vehicle’s lane-changing time, and a speed increase was regarded as a benefit. Using the constructed biform game model, conflicting and cooperative vehicles were studied. A strategy combination is first constructed in the non-cooperative stage, and then the cooperative game stage begins. The Shapley value is used to derive the allocation of each participant in the cooperative game stage, which serves as the payoff in the non-cooperative stage, and the pure-strategy Nash equilibrium solution is then calculated. The interaction with other vehicles in the lane-change process is based on maximizing the benefit to all the vehicles participating in the lane change, and the optimal speed solution of the biform game model when changing lanes is obtained. Numerical examples were used to verify the validity and feasibility of the model and to broaden the application range of the biform game method. In future research, this method will be applied to more complex traffic models, such as driving models in emergency situations and research from the perspective of road infrastructure designers, providing new ideas and directions for optimization strategies for autonomous vehicle lane changes in the Internet of Vehicles.
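For a small cooperative game such as one with three vehicles, the Shapley value can be computed exactly by averaging each player's marginal contribution over all orderings. The sketch below illustrates only that allocation step, not the paper's biform model; the vehicle labels and the coalition speed gains (in km/h) are hypothetical numbers chosen for illustration.

```python
from itertools import permutations
from math import factorial


def exact_shapley(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


# Hypothetical total speed gain (km/h) achievable by each coalition of vehicles.
GAINS = {
    frozenset(): 0.0,
    frozenset({"front"}): 0.0,
    frozenset({"target"}): 0.0,
    frozenset({"rear"}): 0.0,
    frozenset({"front", "target"}): 4.0,
    frozenset({"target", "rear"}): 2.0,
    frozenset({"front", "rear"}): 0.0,
    frozenset({"front", "target", "rear"}): 9.0,
}

shares = exact_shapley(["front", "target", "rear"], GAINS.__getitem__)
print(shares)  # {'front': 3.0, 'target': 4.0, 'rear': 2.0}
```

The shares add up to the gain of the full coalition (9.0), and the target vehicle receives the largest share because every profitable coalition in this toy game includes it.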
Figures:
Figure 1. Automatic lane-change process of a vehicle.
Figure 2. Optimal stable speed change trend when adjusting the probability of different speed changes of three vehicles. (a) When the forced lane-changing probability of the target vehicle is fixed, the speeds of the three vehicles show an upward trend; (b) when the deceleration probability of the rear vehicle is fixed, the speeds of the three vehicles remain unchanged; (c) when the acceleration probability of the front vehicle is fixed, the speeds of the three vehicles show an upward trend.
Figure 3. Trends in optimal stable speed changes of three vehicles vs. probability of speed changes of front and rear vehicles (Red is the ahead car speed, green is the lane-changing car speed, and blue is the rear car speed).
Figure 4. Trends in optimal stable speed changes of three vehicles when the speed probability of the following vehicle remains unchanged (Red is the ahead car speed, green is the lane-changing car speed, and blue is the rear car speed).
15 pages, 2871 KiB  
Article
The Power of Machine Learning Methods and PSO in Air Quality Prediction
by Emine Cengil
Appl. Sci. 2025, 15(5), 2546; https://doi.org/10.3390/app15052546 - 27 Feb 2025
Viewed by 233
Abstract
Monitoring and forecasting air quality is essential for public health and environmental management. Such monitoring details the air’s cleanliness, pollution levels, and any related health risks that may concern the general public. This research investigates how effective machine learning techniques and particle swarm optimization (PSO) are in predicting air quality. An array of machine learning algorithms, including XGBoost, support vector regression (SVR), linear regression, and random forest, was selected. The models were trained and evaluated on an open-access dataset containing the responses of an on-site gas multisensor device in an Italian city. The models were shown to accurately model and predict environmental factors affecting air quality (e.g., air temperature, humidity, indium oxide, tin oxide, NOx, NO2, etc.) using real-world air quality data. The experiments were then repeated with the machine learning methods optimized by PSO, a metaheuristic optimization method widely used in feature selection and feature extraction. The metrics MAE, MSE, RMSE, and R2, commonly used to evaluate regression algorithms, were used to assess the models’ performance. PSO-based support vector regression performed best, with MAE, MSE, RMSE, and R2 values of 0.071, 0.015, 0.122, and 0.999, respectively. In addition, Shapley additive explanations (SHAP) analysis was performed to show which features influenced the PSO-based SVR model and to what extent. The results show that the proposed model successfully predicts air quality.
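The four metrics named in the abstract have standard definitions, and RMSE is simply the square root of MSE (the reported 0.122 is consistent with the square root of 0.015). The sketch below computes them from scratch on a tiny made-up example; the data values are hypothetical and unrelated to the paper's dataset.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    return {"mae": mae, "mse": mse, "rmse": sqrt(mse), "r2": 1.0 - ss_res / ss_tot}


# Hypothetical observations and predictions, invented for illustration.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(metrics)  # {'mae': 0.25, 'mse': 0.25, 'rmse': 0.5, 'r2': 0.8}
```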
Figures:
Figure 1. Boxplot showing the variables in the dataset and the starting value distributions for each variable.
Figure 2. The implementation stages of the proposed method.
Figure 3. The value ranges for each feature after preprocessing.
Figure 4. Predictive performance of machine learning + PSO models.
Figure 5. Actual prediction comparison of PSO-optimized models.
Figure 6. SHAP values for air quality prediction. Distribution and importance order of variables affecting SVR + PSO model output (the top value is the most effective).
16 pages, 3853 KiB  
Article
Comprehensive SHAP Values and Single-Cell Sequencing Technology Reveal Key Cell Clusters in Bovine Skeletal Muscle
by Yaqiang Guo, Fengying Ma, Peipei Li, Lili Guo, Zaixia Liu, Chenxi Huo, Caixia Shi, Lin Zhu, Mingjuan Gu, Risu Na and Wenguang Zhang
Int. J. Mol. Sci. 2025, 26(5), 2054; https://doi.org/10.3390/ijms26052054 - 26 Feb 2025
Viewed by 108
Abstract
The skeletal muscle of cattle is the main component of their muscular system, responsible for support and movement. However, the relative importance of the different cell populations within it remains largely unknown. First, we trained 15 bovine skeletal muscle models and selected the best-performing model as the initial model. Based on the SHAP (Shapley Additive exPlanations) analysis of this initial model, we obtained the SHAP values of 476 important genes. Using the contributions of these 476 genes, we reconstructed a 476-gene SHAP value matrix and, relying solely on the interactions among these 476 genes, successfully mapped the single-cell atlas of bovine skeletal muscle. After retraining the model and further interpretation, we found that Myofiber cells are the most representative cell type in bovine skeletal muscle, followed by neutrophils. By determining the key genes of each cell type through SHAP values, we analyzed the correlations among key genes and between cells for Myofiber cells, revealing the critical role these genes play in muscle growth and development. Further, using protein language models, we performed cross-species comparisons between cattle and pigs, deepening our understanding of Myofiber cells as key cells in skeletal muscle and exploring the common regulatory mechanisms of muscle development across species.
(This article belongs to the Section Molecular Genetics and Genomics)
Show Figures

Figure 1

Figure 1
<p>Initial data analysis. (<b>A</b>) Statistical evaluation of the number of 14 cell types in <span class="html-italic">bovine</span> skeletal muscle. The abbreviated cells in the figure are as follows: FAP (fibro/adipogenic progenitor); VEndoC (venular endothelial cell); MuC (mural cell); MC (myogenic cell); LEndoc (lymphatic endothelial cell); Mo/Ma (monocyte/macrophage). (<b>B</b>) Quantitative analysis of gene expression levels, categorizing and averaging the expression in 10% incremental gradients. (<b>C</b>) Original matrix-based expression atlas of <span class="html-italic">bovine</span> skeletal muscle data.</p>
Full article ">Figure 2
<p>Expression data fitting and model training. (<b>A</b>) Schematic depicting the process of data fitting, beginning with the under-sampling of the cell type with the least quantities and progressing to the oversampling of the cell type with the greatest quantities. (<b>B</b>) The top left illustrates the accuracy assessment of 15 model trainings, alongside the total number of cells input into the model, while the top right, bottom left, and bottom right, respectively, contrast the precision, F1 score, and recall of the unfitted original data against those of the initial best model. (<b>C</b>) Under the condition of the best performance of the initial best model, the training set comprises single-cell maps of skeletal muscle, neutrophils, and CD8 T cells. (<b>D</b>) Similarly, under the condition of the best performance of the initial best model, the test set includes single-cell maps of skeletal muscle, neutrophils, and CD8 T cells. (<b>E</b>) Following the SHAP interpretation of the initial best model, the top 20 genes by SHAP value are identified. (<b>F</b>) Correlation analysis of the top 20 genes, with * indicating gene-to-gene correlations exceeding 0.8.</p>
Full article ">Figure 3
<p>The scientific nature of the training model for 476 co-expressed and specifically expressed genes in 14 types of cells. (<b>A</b>) Post SHAP interpretation of the initial optimal model, analysis of the top 500 SHAP-valued genes for co-expression, and specific expression across each cell type. (<b>B</b>) Assessment of accuracy following the training of models using matrices composed of 180 co-expressed genes, 296 cell-specific expressed genes, 476 genes from both categories, and 1161 unique genes on the optimal test set of the initial best model. (<b>C</b>) Single-cell map of skeletal muscle for 180 genes. (<b>D</b>) Single-cell map of skeletal muscle for 296 genes. (<b>E</b>) Single-cell map of skeletal muscle for 476 genes. (<b>F</b>) Single-cell map of skeletal muscle for 1161 genes. (<b>G</b>) Analysis of cellular composition for 476 genes.</p>
Figure 4
<p>Final mapping of the single-cell profiles of <span class="html-italic">bovine</span> skeletal muscle and the identification of key genes and cell types. (<b>A</b>) Evaluation of the accuracy of the training model after reconstructing the SHAP matrix with 476 genes. (<b>B</b>) Ranking of the importance of 14 cell types in <span class="html-italic">bovine</span> skeletal muscle. (<b>C</b>) Final determined map of single-cell profiles in <span class="html-italic">bovine</span> skeletal muscle. (<b>D</b>) Ranking of the top 20 key genes for the most critical cell type, Myofiber, in skeletal muscle. (<b>E</b>) Correlation analysis between the top 20 key genes of Myofiber and the 14 cell types. (<b>F</b>) Expression levels of the top 20 key genes across the 14 cell types. (<b>G</b>) Biological process analysis of the top 20 key genes of Myofiber, with gene names on the left and pathway names in the middle.</p>
Figure 5
<p>Extraction of intracellular and intercellular information for selected cell types. (<b>A</b>) Correlation analysis between genes and intercellular interactions for the top 20 key genes in neutrophils, myofibers, and FAPs. (<b>B</b>) Biological processes associated with the top 10 key genes across the three cell types. (<b>C</b>) Co-expression and specific expression patterns of the top 20 key genes among the 14 cell types. (<b>D</b>) Correlation between the 15 specifically expressed genes across the three cell types, with * indicating correlations greater than 0.8. (<b>E</b>) Correlation analysis between these 15 specifically expressed genes and cellular interactions. (<b>F</b>) Violin plots of the expression levels of these 15 specifically expressed genes in each cell type. (<b>G</b>) Biological processes related to these 15 specifically expressed genes. (<b>H</b>) Tissue expression analysis of these 15 specifically expressed genes.</p>
Figure 6
<p>A comparative analysis of skeletal muscle across species and within Macrogene contexts is presented. (<b>A</b>) The left panel delineates the principal component analysis spectra of skeletal muscle across species, while the right panel illustrates the PCA spectra for cell types. (<b>B</b>) The left panel exhibits the UMAP profiles of skeletal muscle cells across species, with the right panel detailing the UMAP profiles for cell types. (<b>C</b>) The heatmap depicts the expression profiles of Macrogene markers for myofibers. (<b>D</b>) The cell component analysis of the top 20 genes with the highest weight under the Macrogene context of myofibers is presented. (<b>E</b>) A correlation analysis between Macrogene markers and cell types is conducted, where the horizontal axis categorizes cell types by color: red for <span class="html-italic">bovine</span>-specific, blue for <span class="html-italic">porcine</span>-specific, turquoise for common cell types across both species, and green for unidentified cell types. (<b>F</b>) The cumulative weight statistics of the top 20 genes under each cell type’s Macrogene context are summarized.</p>
23 pages, 6343 KiB  
Article
Multi-Feature Extraction and Explainable Machine Learning for Lamb-Wave-Based Damage Localization in Laminated Composites
by Jaehyun Jung, Muhammad Muzammil Azad and Heung Soo Kim
Mathematics 2025, 13(5), 769; https://doi.org/10.3390/math13050769 - 26 Feb 2025
Viewed by 159
Abstract
Laminated composites offer exceptional weight savings, which makes them well suited to advanced applications in the aerospace, automobile, civil, and marine industries. However, the orthotropic nature of laminated composites means that they possess several damage modes that can lead to catastrophic failure. Therefore, machine learning-based Structural Health Monitoring (SHM) techniques have been used for damage detection. While Lamb waves have shown significant potential in the SHM of laminated composites, most of these techniques rely on imaging-based methods and are limited to damage detection. This study therefore aims to localize damage in laminated composites without imaging methods, thus improving the computational efficiency of the proposed approach. Moreover, machine learning models are generally black boxes that offer no transparency into the reasons for their decisions. Thus, this study also proposes the use of Shapley Additive Explanations (SHAP) to identify the features most important for localizing damage in laminated composites. The proposed approach is validated by experimentally simulating damage at nine different locations on a composite laminate. Multi-feature extraction is carried out by first applying the Hilbert transform on the envelope signal followed by statistical feature analysis. This study compares raw signal features, Hilbert transform features, and multi-feature extraction from the Hilbert transform to demonstrate the effectiveness of the proposed approach. The results demonstrate the effectiveness of an explainable K-Nearest Neighbor (KNN) model in locating the damage, with an R² value of 0.96, a Mean Square Error (MSE) value of 10.29, and a Mean Absolute Error (MAE) value of 0.5.
(This article belongs to the Section E2: Control Theory and Mechanics)
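The three error metrics quoted in the abstract above have standard definitions. As a minimal sketch, they could be computed as follows; the function names and the coordinate values are illustrative assumptions and are not taken from the paper.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: the average of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical true and predicted damage coordinates (illustrative values only).
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]

print(mse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

Because MSE squares the residuals, it is expressed in squared units and weights large errors more heavily than MAE, so the two values are not directly comparable.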
Figure 1
<p>The proposed multi-feature extraction of the Hilbert transform framework for damage localization.</p>
Figure 2
<p>Composite sheet fabrication, (<b>a</b>) a schematic of the symmetric cross-ply design of the composite layup, (<b>b</b>) the curing cycle utilized in the composite fabrication process, and (<b>c</b>) the resulting composite sheet.</p>
Figure 3
<p>The specifics of the experimental setup for the damage simulator, including (<b>a</b>) all experimental paths, (<b>b</b>) the location of damage in laminated composites, and (<b>c</b>) the location path of each PZT sensor.</p>
Figure 4
<p>Structure and recursive splitting of a DT regression model.</p>
Figure 5
<p>KNN regression method showing how proximity to neighbors predicts target values.</p>
Figure 6
<p>RF regression model demonstrating how random Decision Trees combine predictions through bagging.</p>
Figure 7
<p>SVR model showing the regression curve fitting process using support vectors and optimization with the ε-insensitive loss function.</p>
Figure 8
<p>Result comparison of five machine learning models using multi-feature extraction from raw signal (<b>a</b>) MSE, (<b>b</b>) MAE, and (<b>c</b>) <math display="inline"><semantics> <mrow> <msup> <mrow> <mi>R</mi> </mrow> <mrow> <mn>2</mn> </mrow> </msup> </mrow> </semantics></math>.</p>
Figure 9
<p>Result comparison of five machine learning models using the Hilbert transform (<b>a</b>) MSE, (<b>b</b>) MAE, and (<b>c</b>) <math display="inline"><semantics> <mrow> <msup> <mrow> <mi>R</mi> </mrow> <mrow> <mn>2</mn> </mrow> </msup> </mrow> </semantics></math>.</p>
Figure 10
<p>Result comparison of five machine learning models using multi-feature extraction from the Hilbert transform signal (<b>a</b>) MSE, (<b>b</b>) MAE, and (<b>c</b>) <math display="inline"><semantics> <mrow> <msup> <mrow> <mi>R</mi> </mrow> <mrow> <mn>2</mn> </mrow> </msup> </mrow> </semantics></math>.</p>
Figure 11
<p>Localization results in terms of the true and predicted coordinates using the KNN model.</p>
Figure 12
<p>SHAP feature importance analysis for damage localization using statistical features in the KNN model.</p>
Figure 13
<p>Feature importance analysis using SHAP for damage localization with selected statistical features in the KNN model.</p>
26 pages, 5578 KiB  
Article
Predicting Harmful Algal Blooms Using Explainable Deep Learning Models: A Comparative Study
by Bekir Zahit Demiray, Omer Mermer, Özlem Baydaroğlu and Ibrahim Demir
Water 2025, 17(5), 676; https://doi.org/10.3390/w17050676 - 26 Feb 2025
Viewed by 295
Abstract
Harmful algal blooms (HABs) have emerged as a significant environmental challenge, impacting aquatic ecosystems, drinking water supply systems, and human health due to the combined effects of human activities and climate change. This study investigates the performance of deep learning models, particularly the Transformer model, as there are limited studies exploring its effectiveness in HAB prediction. The chlorophyll-a (Chl-a) concentration, a commonly used indicator of phytoplankton biomass and a proxy for HAB occurrences, is used as the target variable. We consider multiple influencing parameters (physical, chemical, and biological water quality monitoring data from multiple stations in western Lake Erie) and employ SHapley Additive exPlanations (SHAP) values as an explainable artificial intelligence (XAI) tool to identify key input features affecting HABs. Our findings highlight the superiority of deep learning models, especially the Transformer, in capturing the complex dynamics of water quality parameters and providing actionable insights for ecological management. The SHAP analysis identifies Particulate Organic Carbon, Particulate Organic Nitrogen, and total phosphorus as critical factors influencing HAB predictions. This study contributes to the development of advanced predictive models for HABs, aiding in early detection and proactive management strategies.
(This article belongs to the Special Issue Aquatic Ecosystems: Biodiversity and Conservation)
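For readers unfamiliar with the quantity behind SHAP, the following is a minimal sketch of exact Shapley values computed by enumerating feature coalitions. It assumes that "absent" features are replaced by a fixed baseline value, which is a common simplification rather than the procedure of any listed paper, and the toy model, `shapley_values`, and all numbers are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input x; absent features take baseline values."""
    n = len(x)

    def value(coalition):
        # Features in the coalition keep their real value; the rest use the baseline.
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(features):
    """Hypothetical model with one additive term and one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved. Enumeration costs grow exponentially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.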
Figure 1
<p>Location and description of western Lake Erie water quality monitoring stations managed by NOAA’s Great Lakes Environmental Research Laboratory (original source available at NOAA GLERL’s web page: <a href="https://www.glerl.noaa.gov/res/HABs_and_Hypoxia/rtMonSQL.php" target="_blank">https://www.glerl.noaa.gov/res/HABs_and_Hypoxia/rtMonSQL.php</a>, accessed on 23 February 2025).</p>
Figure 2
<p>Box plot of chlorophyll-a concentrations in Lake Erie from 2013 to 2020: (<b>a</b>) yearly scale, (<b>b</b>) monthly scale, and (<b>c</b>) station scale.</p>
Figure 3
<p>Scatter plots of observed chlorophyll-a concentration (<span class="html-italic">x</span>-axis) versus predicted Chl-a concentration (<span class="html-italic">y</span>-axis) for deep learning models (LSTM, GRU, and Transformer) using different data subsets (training, test, and all data).</p>
Figure 4
<p>HAB prediction performance of tested models in peak months.</p>
Figure 5
<p>Feature importance analysis for Transformer model.</p>
Figure 6
<p>SHAP values of features and their impact on predictions for Transformer model.</p>
Figure A1
<p>Feature importance analysis for GRU model.</p>
Figure A2
<p>SHAP values of features and their impact on predictions for GRU model.</p>
Figure A3
<p>Feature importance analysis for LSTM model.</p>
Figure A4
<p>SHAP values of features and their impact on predictions for LSTM model.</p>
15 pages, 1815 KiB  
Article
Predicting Red Blood Cell Transfusion in Elective Cardiac Surgery: A Machine Learning Approach
by Beatriz Lau, Daniel Ramos, Vera Afreixo, Luís M. Silva, Ana Helena Tavares, Miguel Martins Felgueiras, Diana Castro Paupério and João Firmino-Machado
Math. Comput. Appl. 2025, 30(2), 22; https://doi.org/10.3390/mca30020022 - 24 Feb 2025
Viewed by 265
Abstract
The benefits of Patient Blood Management can vary depending on a patient’s risk profile for requiring a blood transfusion. The objective of this study is to develop and analyse machine learning models that can identify patients at risk of requiring red blood cell transfusion. This retrospective cohort study was conducted at a tertiary northern Portuguese hospital between 2018 and 2023. Two machine learning algorithms, extreme gradient boosting and neural networks, were employed due to their efficiency in handling complex feature interactions. SHapley Additive exPlanations (SHAP) values were analysed to assess the contribution of each feature to the predictions generated by the models. The neural network achieved an accuracy of 0.735 and an area under the receiver operating characteristic curve of 0.798 (95% CI 0.747 to 0.849). The extreme gradient boosting model achieved an accuracy of 0.700 and an area under the receiver operating characteristic curve of 0.762 (95% CI 0.707 to 0.817). The analysis of SHAP values revealed that the most important variable was the preoperative haemoglobin level, which can be optimised through the Patient Blood Management approach. These machine learning models demonstrate the potential to improve the accuracy of transfusion prediction at hospital admission, despite the absence of key variables such as surgeon identity and anaemia diagnosis.
(This article belongs to the Special Issue Feature Papers in Mathematical and Computational Applications 2025)
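The area under the ROC curve reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the labels, scores, and the function name `roc_auc` are made up for illustration and are unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical transfusion labels (1 = transfused) and predicted probabilities.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.3, 0.6, 0.7, 0.8, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise loop is quadratic in the sample size; it is kept here because it mirrors the definition directly, whereas rank-based formulas give the same value more efficiently.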
Figure 1
<p>Schematic representation of a neural network with three hidden layers.</p>
Figure 2
<p>Performance comparison of the models (<b>a</b>): ROC curves; (<b>b</b>): precision–recall curves.</p>
Figure 3
<p>Prediction probabilities of the XGBoost and Neural Network models in the test set.</p>
Figure 4
<p>Feature importance (using SHAP values) for transfusion of at least one RBC unit. (<b>a</b>): Neural Network model; (<b>b</b>): XGBoost model.</p>
Figure A1
<p>Missing data. (<b>a</b>): Pre-PBM dataset. (<b>b</b>): Post-PBM dataset.</p>
27 pages, 17615 KiB  
Article
Multiscale Feature Modeling and Interpretability Analysis of the SHAP Method for Predicting the Lifespan of Landslide Dams
by Zhengze Huang, Yuqi Bai, Hengyu Liu and Yun Lin
Appl. Sci. 2025, 15(5), 2305; https://doi.org/10.3390/app15052305 - 21 Feb 2025
Viewed by 237
Abstract
Landslide dams, formed by natural disasters or human activities, pose significant challenges for lifespan prediction, which is crucial for effective water conservancy management and disaster prevention. This study proposes a hybrid CNN–Transformer model, optimized using the Improved Black-Winged Kite Algorithm (IBKA), to improve the accuracy of landslide dam lifespan prediction. The model integrates CNN’s local feature extraction with Transformer’s global dependency modeling, effectively capturing the nonlinear dynamics of key parameters affecting landslide dam lifespan. The IBKA tunes the model parameters, which enhances the model’s adaptability and generalization, especially when dealing with small-sample datasets. Experiments utilizing multi-source heterogeneous datasets compare the proposed model with traditional machine learning and deep-learning approaches, including LightGBM, MLP, SVR, CNN–Transformer, and BKA–CNN–Transformer. The results show that the IBKA–CNN–Transformer achieves R² values of 0.99 on training data and 0.98 on testing data, surpassing the baseline methods. Moreover, SHapley Additive exPlanations analysis quantifies the influence of critical features such as dam length, reservoir capacity, and upstream catchment area on lifespan prediction, improving model interpretability. This approach not only provides scientific insights for risk assessment and decision making in landslide dam management but also demonstrates the potential of deep learning and optimization algorithms in broader geological disaster management applications.
(This article belongs to the Section Civil Engineering)
Figure 1
<p>CNN–Transformer model framework.</p>
Figure 2
<p>Schematic diagram of the BKA optimization algorithm.</p>
Figure 3
<p>Intelligent prediction model frame diagram.</p>
Figure 4
<p>Violin diagrams of various variables.</p>
Figure 5
<p>Heat map of correlation.</p>
Figure 6
<p>Fitted plot of predicted results.</p>
Figure 7
<p>Radar chart of training set prediction results.</p>
Figure 8
<p>Radar chart of test set prediction results.</p>
Figure 9
<p>SHAP analysis of various influencing factors.</p>
Figure 10
<p>Importance of influencing factors on longevity of landslide dam.</p>
19 pages, 1707 KiB  
Article
Automated Anomaly Detection and Causal Analysis for Civil Aviation Using QAR Data
by Xin Dang, Congcong Hua and Chuitian Rong
Appl. Sci. 2025, 15(5), 2250; https://doi.org/10.3390/app15052250 - 20 Feb 2025
Viewed by 534
Abstract
Flight Operations Quality Assurance (FOQA) is an internationally recognized approach to ensuring the safety of civil aircraft flights based on Quick Access Recorder (QAR) data. The traditional approach to anomaly detection in civil aviation is to detect over-limit values of the monitoring parameters for each monitoring event, based on the standards issued by civil aviation authorities. Each detection routine usually covers only one monitoring event. Furthermore, causal analysis of the detected anomaly events relies on the expertise of the relevant personnel. To improve the efficiency of FOQA, this paper proposes an automated anomaly detection and causal analysis method called MAD-XFP. Given the industry-specific characteristics of QAR data and the requirements of FOQA, feature engineering and hyper-parameter optimization techniques are used to enhance the machine learning model. The proposed method can monitor multiple events in one routine and provide a causal analysis. In the causal analysis process, the SHapley Additive exPlanations (SHAP) method is applied to produce an analysis report for the detected anomalies. Experimental evaluations are conducted on real civil aviation datasets. The experimental results show that the proposed method can efficiently and automatically detect different abnormal events in the approach phase with high precision and produce a preliminary causal analysis.
Figure 1
<p>Down-sampling results of pitch angle parameters.</p>
Figure 2
<p>Distribution of different anomalies in QAR data during approach phase.</p>
Figure 3
<p>Distribution of samples using data balance techniques.</p>
Figure 4
<p>Overview of <math display="inline"><semantics> <mrow> <mi mathvariant="sans-serif">MAD</mi> <mtext>-</mtext> <mi mathvariant="sans-serif">XFP</mi> </mrow> </semantics></math>.</p>
Figure 5
<p>Distribution of feature importance of <math display="inline"><semantics> <mrow> <mi mathvariant="sans-serif">MAD</mi> <mtext>-</mtext> <mi mathvariant="sans-serif">XFP</mi> </mrow> </semantics></math> (top 20).</p>
Figure 6
<p>Confusion matrix of the model: (<b>a</b>) unbalanced; (<b>b</b>) balanced.</p>
Figure 7
<p>Overall performance evaluation.</p>
Figure 8
<p>Results of sensitivity analysis.</p>
Figure 9
<p>SHAP Interpretation Chart. (<b>a</b>) SHAP interpretation chart of <math display="inline"><semantics> <mrow> <mi mathvariant="sans-serif">MAD</mi> <mtext>-</mtext> <mi mathvariant="sans-serif">XFP</mi> </mrow> </semantics></math>; (<b>b</b>) SHAP interpretation chart of anomaly label 1 (removed IVV, RALTC, and their combinations).</p>
Figure 10
<p>Example of anomaly detection and causal analysis. (<b>a</b>) high speed during approach phase; (<b>b</b>) causal analysis of high speed during approach.</p>
Figure 11
<p>Multi-type anomaly detection.</p>
Figure 12
<p>Features ranking for detected anomaly events. (<b>a</b>) ILS Heading Deviation; (<b>b</b>) Large Decline Rate; (<b>c</b>) ILS Glide Slope Deviation.</p>
28 pages, 12327 KiB  
Article
Global Dynamic Landslide Susceptibility Modeling Based on ResNet18: Revealing Large-Scale Landslide Hazard Evolution Trends in China
by Hui Jiang, Mingtao Ding, Liangzhi Li and Wubiao Huang
Appl. Sci. 2025, 15(4), 2038; https://doi.org/10.3390/app15042038 - 15 Feb 2025
Viewed by 342
Abstract
Large-scale and long-term landslide susceptibility assessments are crucial for revealing the patterns of landslide risk variation and for guiding the formulation of disaster prevention and mitigation policies at the national level. This study establishes a global dynamic landslide susceptibility model and applies a multi-dimensional analysis strategy to study the development trend of large-scale landslide susceptibility in China. First, a global landslide dataset consisting of 8023 large-scale landslide events triggered by rainfall and earthquakes between 2001 and 2020 was constructed based on the GEE (Google Earth Engine) platform. Secondly, a global dynamic landslide susceptibility model was developed using the ResNet18 (18-layer residual neural network) DL (deep learning) framework, incorporating both dynamic and static LCFs (landslide conditioning factors). The model was utilized to generate sequential large-scale landslide susceptibility maps for China from 2001 to 2022. Finally, the MK (Mann–Kendall) test was used to investigate the change trends in the large-scale landslide susceptibility of China. The results of the study are as follows. (1) The ResNet18 model outperformed SVMs (support vector machines) and CNNs (convolutional neural networks), with an AUC value of 0.9362. (2) SHAP (Shapley Additive Explanations) analyses revealed that precipitation was an important factor in the occurrence of landslides in China. In addition, profile curvature, NDVI, and distance to faults are thought to have a significant impact on landslide susceptibility. (3) The large-scale landslide susceptibility trends in China are complex and varied. Particular emphasis should be placed on Southwest China, including Chongqing, Guizhou, and Sichuan, which exhibits high landslide susceptibility and notable upward trends; attention should also be given to Northwest China, including Shaanxi and Shanxi, which has high susceptibility but decreasing trends. These results provide valuable insights for disaster prevention and mitigation in China.
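The Mann–Kendall test mentioned in the abstract checks for a monotonic trend in a time series without assuming a particular distribution. A minimal sketch of the standard statistic follows; it omits the variance correction for tied values, and the yearly susceptibility series and the function name `mann_kendall` are hypothetical, not data or code from the study.

```python
from math import erf, sqrt


def mann_kendall(series):
    """Mann-Kendall trend test without tie correction. Returns (S, Z, two-sided p)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    cdf = 0.5 * (1.0 + erf(abs(z) / sqrt(2.0)))
    p = 2.0 * (1.0 - cdf)
    return s, z, p


# Hypothetical yearly mean susceptibility for one region (illustrative values only).
susceptibility = [0.31, 0.33, 0.32, 0.36, 0.38, 0.37, 0.41, 0.43, 0.42, 0.46]
s, z, p = mann_kendall(susceptibility)
print(s, z, p)
```

A positive Z with a small p-value points to an upward trend and a negative Z to a downward one; applied pixel by pixel, the same test would yield a trend map of the kind described for the susceptibility maps.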
Figure 1
<p>Geomorphological zoning in China.</p>
Figure 2
<p>Global landslide distribution.</p>
Figure 3
<p>Dynamic landslide susceptibility evaluation process.</p>
Figure 4
<p>The network architecture. (<b>a</b>) The CNN network architecture, (<b>b</b>) the residual unit, and (<b>c</b>) the ResNet18 network architecture.</p>
Figure 5
<p>The <span class="html-italic">PCC</span> calculation results of condition factors. (Areas in the red box represent correlations between static factors, areas in the yellow box represent correlations between dynamic factors, and other areas represent correlations between dynamic and static factors.)</p>
Figure 6
<p>The <span class="html-italic">MI</span> calculation results of condition factors.</p>
Figure 7
<p>Spatial distribution of precipitation, NDVI, and land cover types (<b>a</b>,<b>c</b>,<b>e</b>) and their change-trend tests (<b>b</b>,<b>d</b>,<b>f</b>) during 2001–2022.</p>
Figure 8
<p>ROC curve and accuracy evaluation of test data.</p>
Figure 9
<p>Percentage area distribution of landslide susceptibility zones from 2001 to 2022.</p>
Figure 10
<p>Spatial distribution of LSM (<b>a</b>) and spatial change-trend test of landslide susceptibility (<b>b</b>) from 2001 to 2022.</p>
Figure 11
<p>Temporal variation of landslide susceptibility in typical regions of China, 2001–2022 (the red solid line is the linear regression; the blue circles are the predicted landslide probabilities in the corresponding years).</p>
Figure 12
<p>Ranking of feature importance based on SHAP method (<b>a</b>) and summary plot (<b>b</b>).</p>
Figure 13
<p>Test and comparison of local datasets across various models: (<b>a</b>) SVM, (<b>b</b>) CNN, and (<b>c</b>) ResNet18.</p>
Figure 14
<p>Comparison analysis of landslide susceptibility maps from previous studies. (<b>a</b>) Landslide hazard map developed by Liu [<a href="#B29-applsci-15-02038" class="html-bibr">29</a>]. (<b>b</b>) LSM prepared by Wang [<a href="#B31-applsci-15-02038" class="html-bibr">31</a>]. (<b>c</b>) LSM for 2017 based on the ResNet18 from this study. (<b>d</b>) LSM for 2020 based on the ResNet18 from this study.</p>
30 pages, 11000 KiB  
Article
From Data to Insights: Modeling Urban Land Surface Temperature Using Geospatial Analysis and Interpretable Machine Learning
by Nhat-Duc Hoang, Van-Duc Tran and Thanh-Canh Huynh
Sensors 2025, 25(4), 1169; https://doi.org/10.3390/s25041169 - 14 Feb 2025
Viewed by 354
Abstract
This study introduces an innovative machine learning method to model the spatial variation of land surface temperature (LST) with a focus on the urban center of Da Nang, Vietnam. Light Gradient Boosting Machine (LightGBM), support vector machine, random forest, and Deep Neural Network are employed to establish functional relationships between urban LST and its influencing factors. The machine learning approaches are trained and validated using remote sensing data from 2014, 2019, and 2024. Various explanatory variables representing topographical and spatial characteristics, as well as urban landscapes, are used. Experimental results show that LightGBM outperforms the other benchmark methods, attaining R² values of 0.85, 0.92, and 0.91 for the years 2014, 2019, and 2024, respectively. In addition, Shapley Additive Explanations are utilized to clarify the impact of the factors affecting LST. The analysis outcomes indicate that while the importance of these variables changes over time, urban density and greenspace density consistently emerge as the most influential factors. The findings of this work can support a deeper understanding of urban heat stress dynamics and facilitate urban planning.
Figure 1
<p>The study area.</p>
Figure 2
<p>LST in the study area: (<b>a</b>) 2014, (<b>b</b>) 2019, and (<b>c</b>) 2024.</p>
Figure 3
<p>Topographic features: (<b>a</b>) elevation, (<b>b</b>) slope, (<b>c</b>) aspect, and (<b>d</b>) TPI.</p>
Figure 4
<p>Spatial features: (<b>a</b>) distance to coastlines and (<b>b</b>) distance to rivers.</p>
Figure 5
<p>Spectral indices: (<b>a</b>) NDVI, (<b>b</b>) NDBI, (<b>c</b>) ANDWI, and (<b>d</b>) NDBSI.</p>
Figure 6
<p>LightGBM prediction model.</p>
Figure 7
<p>The proposed framework: (<b>a</b>) density maps and (<b>b</b>) LST modeling.</p>
Figure 8
<p>Maps of land covers: (<b>a</b>) 2014, (<b>b</b>) 2019, and (<b>c</b>) 2024.</p>
Figure 9
<p>Maps of built-up and greenspace density.</p>
Figure 10
<p>Correlations between the independent variables and LST.</p>
Figure 11
<p>LightGBM prediction results: (<b>a</b>) LST in 2014, (<b>b</b>) LST in 2019, and (<b>c</b>) LST in 2024.</p>
Figure 12
<p>Prediction results of benchmark models.</p>
Figure 13
<p>SHAP impact plots: (<b>a</b>) 2014, (<b>b</b>) 2019, and (<b>c</b>) 2024.</p>
Figure 14
<p>Proportions of land cover in each year: (<b>a</b>) 2014, (<b>b</b>) 2019, and (<b>c</b>) 2024.</p>