Search Results (498)

Search Parameters:
Keywords = Shapley values

22 pages, 4809 KiB  
Article
Interpretable Combinatorial Machine Learning-Based Shale Fracability Evaluation Methods
by Di Wang, Dingyu Jiao, Zihang Zhang, Runze Zhou, Weize Guo and Huai Su
Energies 2025, 18(1), 186; https://doi.org/10.3390/en18010186 (registering DOI) - 4 Jan 2025
Viewed by 275
Abstract
Shale gas, as an important unconventional hydrocarbon resource, has attracted much attention due to its great potential and the need for energy diversification. However, shale gas reservoirs with low permeability and low porosity pose challenges for extraction, making shale fracability evaluation crucial. Conventional methods are limited because they cannot comprehensively account for non-linear factors or quantify the effect of each factor. In this paper, an interpretable combinatorial machine learning shale fracability evaluation method is proposed, which combines XGBoost and Bayesian optimization techniques to mine the non-linear relationship between the influencing factors and fracability, and to achieve more accurate fracability evaluations with a lower error rate (maximum MAPE not more than 20%). SHAP (SHapley Additive exPlanations) value analyses were used to quantitatively assess the factor impacts, provide the feature importance ranking, and visualise the contribution trend through summary and dependence plots. Analyses of seven scenarios showed that ‘Vertical—Min Horizontal’ and ‘Vertical Stress’ had the greatest impact. This approach improves the accuracy and interpretability of the assessment and supports shale gas exploration and development by clarifying the role of each factor.
(This article belongs to the Section H: Geo-Energy)
Figures
Figure 1: Feature importance ranking chart for scenario 1.
Figure 2: Summary diagram of scenario 1.
Figure 3: Vertical—Min Horizontal dependence diagram for scenario 1.
Figure 4: Feature importance ranking chart for scenario 2.
Figure 5: Summary diagram of scenario 2.
Figure 6: Vertical—Min Horizontal dependence diagram for scenario 2.
Figure 7: Feature importance ranking chart for scenario 3.
Figure 8: Summary diagram of scenario 3.
Figure 9: Vertical—Min Horizontal dependence diagram for scenario 3.
Figure 10: Vertical Stress dependence diagram for scenario 3.
Figure 11: Feature importance ranking chart for scenario 4.
Figure 12: Summary diagram of scenario 4.
Figure 13: Vertical Stress dependence diagram for scenario 4.
Figure 14: Feature importance ranking chart for scenario 5.
Figure 15: Summary diagram of scenario 5.
Figure 16: Vertical Stress dependence diagram for scenario 5.
Figure 17: Dependence diagram for the ‘Modulus of elasticity’ indicator in scenario 5.
Figure 18: Feature importance ranking chart for scenario 6.
Figure 19: Summary diagram of scenario 6.
Figure 20: Vertical Stress dependence diagram for scenario 6.
Figure 21: Feature importance ranking chart for scenario 7.
Figure 22: Summary diagram of scenario 7.
Figure 23: Vertical—Min Horizontal dependence diagram for scenario 7.
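The SHAP analysis mentioned in the abstract above rests on Shapley values, which average a feature's marginal contribution over all feature coalitions. The following is a minimal sketch of that definition by brute-force enumeration, not the authors' pipeline: `toy_model`, `instance`, and `baseline` are made-up stand-ins, and replacing "absent" features by a fixed baseline is one common convention assumed here for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition of features."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                members = frozenset(coalition)
                # Marginal contribution of feature i to this coalition.
                gain = value_fn(members | {i}) - value_fn(members)
                phi[i] += weight * gain
    return phi


# Hypothetical toy model with an interaction term (not from the paper).
def toy_model(x):
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[1]


instance = [3.0, 4.0]  # point being explained (made-up values)
baseline = [1.0, 2.0]  # reference input standing in for "feature absent"


def value_fn(coalition):
    # Features in the coalition take the instance value, the rest the baseline.
    x = [instance[j] if j in coalition else baseline[j]
         for j in range(len(instance))]
    return toy_model(x)


phi = shapley_values(value_fn, len(instance))
print(phi)
```

The attributions sum to the difference between the model output at `instance` and at `baseline`, which is the additivity property that SHAP summary and dependence plots rely on. Enumeration costs grow exponentially with the number of features, so practical tools use model-specific shortcuts or sampling instead.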
15 pages, 714 KiB  
Article
Machine Learning Approaches for Predicting Maize Biomass Yield: Leveraging Feature Engineering and Comprehensive Data Integration
by Maryam Abbasi, Paulo Váz, José Silva and Pedro Martins
Sustainability 2025, 17(1), 256; https://doi.org/10.3390/su17010256 - 2 Jan 2025
Viewed by 280
Abstract
The efficient prediction of corn biomass yield is critical for optimizing crop production and addressing global challenges in sustainable agriculture and renewable energy. This study employs advanced machine learning techniques, including Gradient Boosting Machines (GBMs), Random Forests (RFs), Support Vector Machines (SVMs), and Artificial Neural Networks (ANNs), integrated with comprehensive environmental, soil, and crop management data from key agricultural regions in the United States. A novel framework combines feature engineering, such as the creation of a Soil Fertility Index (SFI) and Growing Degree Days (GDDs), and the incorporation of interaction terms to address complex non-linear relationships between input variables and biomass yield. We conduct extensive sensitivity analysis and employ SHAP (SHapley Additive exPlanations) values to enhance model interpretability, identifying SFI, GDDs, and cumulative rainfall as the most influential features driving yield outcomes. Our findings highlight significant synergies among these variables, emphasizing their critical role in rural environmental governance and precision agriculture. Furthermore, an ensemble approach combining GBMs, RFs, and ANNs outperformed individual models, achieving an RMSE of 0.80 t/ha and R² of 0.89. These results underscore the potential of hybrid modeling for real-world applications in sustainable farming practices. Addressing the concerns of passive farmer participation, we propose targeted incentives, education, and institutional support mechanisms to enhance stakeholder collaboration in rural environmental governance. While the models assume rational decision-making, the inclusion of cultural and political factors warrants further investigation to improve the robustness of the framework. Additionally, a map of the study region and improved visualizations of feature importance enhance the clarity and relevance of our findings.
This research contributes to the growing body of knowledge on predictive modeling in agriculture, combining theoretical rigor with practical insights to support policymakers and stakeholders in optimizing resource use and addressing environmental challenges. By improving the interpretability and applicability of machine learning models, this study provides actionable strategies for enhancing crop yield predictions and advancing rural environmental governance.
Figures
Figure 1: Learning curves for GBMs and RFs models. GBMs model achieves lower validation errors with smaller training sets, showcasing its learning efficiency and predictive power compared to RFs model.
Figure 2: Top 10 features based on mean absolute SHAP values. SFI was the most critical factor, followed by GDDs and cumulative rainfall.
Figure 3: Interaction effects of SFI and GDDs on biomass yield. The synergistic effect of high SFI and GDD values is evident, leading to higher yields.
Figure 4: Partial dependence plots for key features. The plots show non-linear relationships, with diminishing returns for SFI and GDDs beyond certain thresholds.
Figure 5: Sensitivity analysis of GBMs model to key input variables. SFI is the most sensitive variable, followed by GDDs and cumulative rainfall.
Figure 6: Heatmap of feature correlations. Strong positive correlations between GDDs, cumulative rainfall, and biomass yield are observed, while planting density shows a negative correlation at higher levels.
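Growing Degree Days, one of the engineered features named in the abstract above, are usually accumulated from daily maximum and minimum temperatures. The sketch below shows one widely used capped-average form; the 10 °C base and 30 °C cap are a common maize convention assumed here, and the abstract does not say which variant or thresholds the authors used. The `season` temperatures are invented.

```python
def growing_degree_days(t_max, t_min, t_base=10.0, t_cap=30.0):
    """Daily GDD in degree-days Celsius, using capped temperatures.

    t_base and t_cap are assumed defaults, not values from the paper.
    """
    # Clamp both temperatures into [t_base, t_cap] before averaging.
    hi = min(max(t_max, t_base), t_cap)
    lo = min(max(t_min, t_base), t_cap)
    return (hi + lo) / 2.0 - t_base


def cumulative_gdd(daily_temps):
    """Sum daily GDD over an iterable of (t_max, t_min) pairs."""
    return sum(growing_degree_days(t_max, t_min) for t_max, t_min in daily_temps)


# Made-up three-day record: a mild day, a hot day, and a cold day.
season = [(28.0, 14.0), (33.0, 18.0), (12.0, 4.0)]
total = cumulative_gdd(season)
print(total)
```

Clamping keeps very hot or very cold days from contributing heat units the crop cannot use, which is one reason a derived feature like this can carry more signal than raw temperature.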
37 pages, 10558 KiB  
Article
Climate Impact on Evapotranspiration in the Yellow River Basin: Interpretable Forecasting with Advanced Time Series Models and Explainable AI
by Sheheryar Khan, Huiliang Wang, Umer Nauman, Rabia Dars, Muhammad Waseem Boota and Zening Wu
Remote Sens. 2025, 17(1), 115; https://doi.org/10.3390/rs17010115 - 1 Jan 2025
Viewed by 449
Abstract
Evapotranspiration (ET) plays a crucial role in the hydrological cycle, significantly impacting agricultural productivity and water resource management, particularly in water-scarce areas. This study explores the effects of key climate variables (temperature, precipitation, solar radiation, wind speed, and humidity) on ET from 2000 to 2020, with forecasts extended to 2030. Advanced data preprocessing techniques, including Yeo-Johnson and Box-Cox transformations, Savitzky–Golay smoothing, and outlier elimination, were applied to improve data quality. Datasets from MODIS, TRMM, GLDAS, and ERA5 were utilized to enhance model accuracy. The predictive performance of various time series forecasting models, including Prophet, SARIMA, STL + ARIMA, TBATS, ARIMAX, and ETS, was systematically evaluated. This study also introduces novel algorithms for Explainable AI (XAI) and SHAP (SHapley Additive exPlanations), enhancing the interpretability of model predictions and improving understanding of how climate variables affect ET. This comprehensive methodology not only accurately forecasts ET but also offers a transparent approach to understanding climatic effects on ET. The results indicate that Prophet and ETS models demonstrate superior prediction accuracy compared to other models. The ETS model achieved the lowest Mean Absolute Error (MAE) values of 0.60 for precipitation, 0.51 for wind speed, and 0.48 for solar radiation. Prophet excelled with the lowest Root Mean Squared Error (RMSE) values of 0.62 for solar radiation, 0.67 for wind speed, and 0.74 for precipitation. SHAP analysis indicates that temperature has the strongest impact on ET predictions, with SHAP values ranging from −1.5 to 1.0, followed by wind speed (−0.75 to 0.75) and solar radiation (−0.5 to 0.5).
(This article belongs to the Special Issue Advanced Techniques for Water-Related Remote Sensing (Second Edition))
Figures
Figure 1: Representation of the study region YRBC while the Yellow River is highlighted in blue.
Figure 2: Histogram of Box-Cox before and after transformation for precipitation and solar radiation. (a,c) Untransformed values show slight to moderate skewness and lighter tails. (b,d) Box-Cox transformed values exhibit reduced skewness and improve normality, demonstrating the transformation’s effectiveness.
Figure 3: Flow chart of novel methodology for ET forecasting and analysis: the flowchart shows data preprocessing, model selection, and performance evaluation. SHAP analysis and a Surrogate Decision-Tree model improve interpretability and reveal how climate variables affect model predictions.
Figure 4: Comparative heatmap of model performance metrics (MSE, RMSE, MAE) across climate variables (precipitation, temperature, solar radiation, wind speed, humidity) for models ETS, TBATS, Prophet, STL + ARIMA, and SARIMA. The color gradient shows error magnitude, with darker blue suggesting better model performance and red/orange indicating worse performance.
Figure 5: ARIMAX forecast for ET and the incorporation of climate variables as exogenous variables. The ARIMAX model exhibits robust predictive capabilities for both solar radiation and wind speed, with MAE values of 0.58 and 0.58, respectively, and RMSE values below 1. Conversely, the forecast for temperature displays greater deviations (MAE: 1.17, RMSE: 1.71), indicating that the relationship between ET and temperature over time is more intricate to predict.
Figure 6: SARIMA model forecast for ET, which does not include climate variables as exogenous variables. Solar radiation (MAE: 0.68, RMSE: 1.12) and wind speed exhibit comparatively low error in the SARIMA model’s forecast of ET in relation to a variety of climate factors, indicating that these variables are more predictable. Conversely, the model experiences greater difficulties with temperature predictions, as evidenced by an MAE of 1.55 and RMSE of 2.13.
Figure 7: ETS model forecast for ET and climate variables: The ETS model effectively predicts ET and climate variables, demonstrating exceptional solar radiation (MAE: 0.48, RMSE: 0.64) and wind speed (MAE: 0.51, RMSE: 0.66). Nevertheless, the model demonstrates slightly higher errors for humidity (MAE: 0.95, RMSE: 1.27) and temperature (MAE: 1.01, RMSE: 1.37), suggesting that it has moderate difficulty in capturing variations in these factors.
Figure 8: STL + ARIMA model forecast for ET and climate variables: The model accurately predicts wind speed and precipitation with comparatively low error values. However, it has poor performance when forecasting temperature, as evidenced by the extremely high MSE (89.87) and RMSE (9.48) values. Solar radiation and humidity also pose moderate challenges, as evidenced by MAE values exceeding 1.0.
Figure 9: TBATS model forecast for ET and climate variables: The TBATS model yields accurate predictions for most climate variables, with the lowest errors in wind speed and solar radiation. Although temperature estimates are generally accurate, they exhibit a higher variance (MAE: 1.01), while humidity predictions also exhibit moderate forecasting errors (MAE: 0.90).
Figure 10: Prophet model forecast for ET and climate variables: With MAE values of 0.47 and 0.55 for solar radiation and wind speed, respectively, the Prophet model performs well in terms of prediction. However, it faces greater difficulties with temperature (MAE: 0.99, RMSE: 1.30) and humidity (MAE: 1.01, RMSE: 1.25), where the forecast errors are marginally higher.
Figure 11: Overall radar performance comparison of forecasting models across climate variables: each chart shows MAE, MSE, RMSE, R, and NSE model errors. ETS, TBATS, and SARIMA have lower error values for most climatic variables, indicating improved accuracy and reliability. STL + ARIMA has larger errors, especially for temperature, indicating its difficulty anticipating volatile climate variables.
Figure 12: (a,b): Decision tree surrogate models for SARIMA and ARIMAX: both models prioritize temperature and wind speed as primary splitting variables. The accuracy of ET prediction is substantially influenced by initial splits at temperature thresholds of ≤23.46 °C and wind speed thresholds of ≤0.46 for SARIMA and ≤0.43 for ARIMAX. (c,d): Decision Tree Surrogate Models for ETS and Prophet: With initial divides at temperature ≤ 23.46 °C and considerable secondary splits depending on precipitation (ETS) and wind speed (Prophet), both models emphasize temperature and wind speed as important predictors and demonstrate their impact on ET forecasting. (e,f): Decision Tree Surrogate Models for TBATS and STL + ARIMA: temperature and humidity are the primary factors, with TBATS emphasizing wind speed and STL + ARIMA concentrating on the intricate interactions between temperature and precipitation. This demonstrates their ability to capture a variety of ET prediction patterns.
Figure 13: SHAP value analysis of feature impact on ET predictions across different models: SHAP value charts for six models (a) ARIMAX, (b) Prophet, (c) SARIMA, (d) ETS, (e) STL + ARIMA, and (f) TBATS show how climate parameters (temperature, precipitation, wind speed, humidity, solar radiation) affect ET forecasts. Blue dots indicate low feature values, whereas red points indicate high values. SHAP values, shown by the dots on the x-axis, measure each feature’s contribution to the model’s prediction for a single occurrence. Positive SHAP values improve ET predictions, while negative values decrease them. Across models, temperature and precipitation have the greatest impact, but the impacts of humidity and solar radiation vary.
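The abstract above ranks forecasting models by MAE and by RMSE, and the two metrics need not agree because RMSE squares each error and so penalises occasional large misses more heavily. The sketch below computes both from their standard definitions; the `actual` and `predicted` series are invented for illustration and are not data from the study.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up observations and forecasts (same units, same length).
actual = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 3.0, 3.0, 5.5]

mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
print(mae_value, rmse_value)
```

RMSE is never smaller than MAE on the same errors, and the gap widens as the errors become more uneven, which is why a model can lead on one metric and trail on the other.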
16 pages, 3935 KiB  
Article
Prediction of Persistent Tumor Status in Nasopharyngeal Carcinoma Post-Radiotherapy-Related Treatment: A Machine Learning Approach
by Hsien-Chun Tseng, Chao-Yu Shen, Pan-Fu Kao, Chun-Yi Chuang, Da-Yi Yan, Yi-Han Liao, Xuan-Ping Lu, Ting-Jung Sheu and Wei-Chih Shen
Cancers 2025, 17(1), 96; https://doi.org/10.3390/cancers17010096 - 31 Dec 2024
Viewed by 284
Abstract
Background/Objectives: The duration of the response to radiotherapy-related treatment is a critical prognostic indicator for patients with nasopharyngeal carcinoma (NPC). Persistent tumor status, including residual tumor presence and early recurrence, is associated with poorer survival outcomes. To address this, we developed a prediction model to identify patients at a high risk of persistent tumor status prior to initiating treatment. Methods: This retrospective study included 104 patients with NPC receiving radiotherapy-related treatment who had completed a 3-year follow-up period; 29 were classified into the persistent tumor status group and 75 into the disease-free group. Radiomic features were extracted from pretreatment positron emission tomography (PET) images and used to construct a prediction model by employing machine learning algorithms. The model’s diagnostic performance was assessed using the area under the receiver operating characteristic curve (AUC), whereas SHapley Additive exPlanations (SHAP) analysis was conducted to determine the contribution of individual features to the model. Results: The prediction model developed using the AdaBoost algorithm and validated through five-fold cross-validation achieved the highest AUC of 0.934. Its sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were 89.66%, 86.67%, 72.22%, 95.59%, and 87.5%, respectively. SHAP analysis revealed that the feature of high dependence low metabolic uptake emphasis_50 had the greatest impact on model predictions. Furthermore, patients classified as disease-free exhibited markedly higher overall survival rates compared with those with persistent tumor status. Conclusions: The proposed prediction model efficiently identified patients with NPC at a high risk of persistent tumor status by using radiomic features extracted from pretreatment PET images.
(This article belongs to the Special Issue Advances in Radiation Therapy for Head and Neck Cancer)
Figures
Figure 1: Overview of the study design illustrating the development of a machine learning model for predicting persistent tumor status in patients with NPC undergoing radiotherapy-related treatments.
Figure 2: Flowchart of patient selection.
Figure 3: Diagnostic performance of AI models.
Figure 4: Overall survival curve.
Figure 5: Mean SHAP value analysis of radiomic features.
Figure 6: SHAP dependence plot for the top 5 radiomic features contributing to the AdaBoost model. (a) High dependence low metabolic uptake level emphasis_50, (b) surface area_50, (c) maximal correlation coefficient_Diff, (d) inverse variance_Diff, and (e) interquartile range_60.
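The five diagnostic percentages in the abstract above all follow from a single 2 × 2 confusion matrix. As a worked check, the counts below are inferred from the reported group sizes (29 persistent, 75 disease-free) and percentages; the abstract does not state them, so treat `tp`, `fn`, `tn`, and `fp` as a reconstruction rather than published figures.

```python
# Confusion-matrix counts inferred from the reported percentages (assumption).
tp, fn = 26, 3    # persistent tumor status group: 29 patients
tn, fp = 65, 10   # disease-free group: 75 patients

sensitivity = tp / (tp + fn)                 # true positive rate
specificity = tn / (tn + fp)                 # true negative rate
ppv = tp / (tp + fp)                         # positive predictive value
npv = tn / (tn + fn)                         # negative predictive value
accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall fraction correct

for name, value in [("sensitivity", sensitivity), ("specificity", specificity),
                    ("PPV", ppv), ("NPV", npv), ("accuracy", accuracy)]:
    print(f"{name}: {100 * value:.2f}%")
```

With these counts the formulas reproduce the reported 89.66%, 86.67%, 72.22%, 95.59%, and 87.5%. The comparatively low PPV reflects the class imbalance: ten false positives weigh heavily against only 26 true positives.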
20 pages, 7510 KiB  
Article
Well-Production Forecasting Using Machine Learning with Feature Selection and Automatic Hyperparameter Optimization
by Ruibin Zhu, Ning Li, Yongqiang Duan, Gaofeng Li, Guohua Liu, Fengjiao Qu, Changjun Long, Xin Wang, Qinzhuo Liao and Gensheng Li
Energies 2025, 18(1), 99; https://doi.org/10.3390/en18010099 - 30 Dec 2024
Viewed by 304
Abstract
Well-production forecasting plays a crucial role in oil and gas development. Traditional methods, such as numerical simulations, require substantial computational effort, while empirical models tend to exhibit poor accuracy. To address these issues, machine learning, a widely adopted artificial intelligence approach, is employed to develop production forecasting models in order to enhance the accuracy of oil and gas well-production predictions. This research focuses on the geological, engineering, and production data of 435 fracturing wells in the North China Oilfield. First, outliers were detected, and missing values were handled using the mean imputation and nearest neighbor methods. Subsequently, Pearson correlation coefficients were utilized to eliminate linearly irrelevant features and optimize the dataset. By calculating the gray correlation degrees, maximum mutual information, feature importance, and Shapley additive explanation (SHAP) values, an in-depth analysis of various dominant factors was conducted. To further assess the importance of these factors, the entropy weight method was employed. Ultimately, 19 features that were highly correlated with the target variable were successfully screened as inputs for subsequent models. Based on the AutoGluon framework, model training was conducted using 5-fold cross-validation combined with bagging and stacking techniques. The training results show that the model achieved an R² of 0.79 on the training set, indicating good fitting ability. This study offers a promising approach for the development of oil and gas production forecasting models.
Figures
Figure 1: AutoGluon Framework Process Diagram.
Figure 2: Box distribution graph after data normalization processing.
Figure 3: Histogram of the overall distribution of quantitative data.
Figure 4: Abnormal value determination.
Figure 5: Distribution of missing values in each row of data.
Figure 6: Distribution of missing rate of feature parameter data.
Figure 7: IQR method identifies outliers in the data.
Figure 8: MAD method identifies outliers in the data.
Figure 9: Identifying outliers in data using the triple standard deviation method.
Figure 10: DBSCAN Abnormal value determination.
Figure 11: Comparison chart before and after missing value supplementation.
Figure 12: Pearson coefficient heatmap of production and construction data.
Figure 13: Combination of linear correlation above 0.4 in production and construction data.
Figure 14: Using entropy weight method for comprehensive evaluation and analysis of main control factors.
Figure 15: AutoGluon Prediction comparison chart.
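The entropy weight method named in the abstract above assigns larger weights to indicators whose values vary more across the items being compared. The sketch below follows the textbook form of the method; how the authors arranged their gray-correlation, mutual-information, feature-importance, and SHAP scores into a matrix is not stated in the abstract, so the layout and the `scores` values here are illustrative assumptions.

```python
from math import log


def entropy_weights(matrix):
    """Entropy weights for a matrix with rows = items, columns = indicators.

    All entries are assumed non-negative with a positive column sum.
    """
    n_items = len(matrix)
    n_indicators = len(matrix[0])
    diversities = []
    for j in range(n_indicators):
        column = [row[j] for row in matrix]
        total = sum(column)
        # Share of each item in this indicator; 0 * ln(0) is treated as 0.
        shares = [v / total for v in column]
        entropy = -sum(p * log(p) for p in shares if p > 0) / log(n_items)
        # Low entropy means the indicator discriminates well between items.
        diversities.append(1.0 - entropy)
    norm = sum(diversities)
    return [d / norm for d in diversities]


# Made-up scores: the first indicator is constant, the third is most uneven.
scores = [
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 8.0],
]
weights = entropy_weights(scores)
print(weights)
```

A constant indicator carries no discriminating information and receives a weight of essentially zero, while the most uneven indicator dominates; this is the behaviour that makes the method useful for combining several importance rankings.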
14 pages, 2603 KiB  
Article
Feature Engineering to Embed Process Knowledge: Analyzing the Energy Efficiency of Electric Arc Furnace Steelmaking
by Quantum Zhuo, Mansour N. Al-Harbi and Petrus C. Pistorius
Metals 2025, 15(1), 13; https://doi.org/10.3390/met15010013 - 28 Dec 2024
Viewed by 473
Abstract
The importance of electric arc furnace (EAF) steelmaking is expected to increase worldwide as parts of the industry transition to lower carbon dioxide emissions. This work analyzed one year’s operational data from an EAF plant that uses a large proportion of direct-reduced iron (DRI) in the furnace feed. The data were used to test different approaches to quantifying the effects of process conditions on specific electricity consumption (kWh per ton of crude steel). In previous work, inputs such as the proportion of DRI, fluxes, natural gas, and oxygen were linearly correlated with the specific electricity consumption. The current work has confirmed that conventional multiple linear regression (MLR) reproduces electricity consumption trends in EAF steelmaking, but many model coefficients deviated significantly from expected values and appeared unphysical. The implementation of engineered features—the slag volume and total carbon input—in an MLR model resulted in coefficients that were closer to expectations, but did not improve prediction accuracy. Further improvement was obtained by applying the engineered features to a non-linear machine-learned model (based on XGBoost), yielding both physically reasonable trends and smaller prediction errors. Trends from Shapley dependence analysis (applied to the XGBoost model) are quantitatively consistent with theoretical trends. These include the energy needed to melt slag, and the endothermic effect of carbon additions. The fitted models demonstrate the potential to diagnose poor slag foaming by showing an increase in electricity consumption with increased oxygen use. This example demonstrates that practically important steelmaking process insights inferred via a linear regression approach can be improved by applying Shapley analysis to a machine-learned model based on engineered features.
Figures
Figure 1: Distribution of compositions (mass percentages) of direct reduced iron (total number of observations: 947).
Figure 2: Contour plot of the RMSE values (in kWh/ton) obtained with test data after applying XGBoost models trained with different values of the learning rate (“eta”) and the maximum tree depth; the chosen model is the one with the lowest RMSE value (at a learning rate of 0.01 and a maximum depth of 4, for this example).
Figure 3: Scatter plot matrices of the variation of and correlation between input variables for (a) variables similar to those used by Köhle et al. [6] and (b) replacing the DRI and flux variables with the engineered features of slag volume and carbon input. Scatter plots are shown below the diagonal, variable distributions on the diagonal, and Pearson coefficients above the diagonal. See Table 2 for explanations of the variable names and their units.
Figure 4: Illustration of the energy balances used to estimate the effects of different reactions and changes on the energy demand of EAF steelmaking: (a) Heating and melting pure oxides to form liquid slag; (b) combustion of methane with cold oxygen; (c) reaction of cold carbon with FeO; (d) a 1 °C increase in liquid steel temperature at tap; (e) reaction of cold injected oxygen with liquid iron.
Figure 5: Comparison of fitted and actual electricity consumption for the three models; the shading indicates the distribution of points. The plots are for (a) multiple linear regression with original variables (RMSE = 23.1 kWh/ton); (b) multiple linear regression with engineered features (RMSE = 24.8 kWh/ton); and (c) XGBoost model with engineered features (RMSE = 18.6 kWh/ton on average).
Figure 6: Beeswarm plot showing the range of effects of different input variables on electricity consumption (see Table 2 for an explanation of the variable names). The color scale shows the correlation between the change in the input variables (“features”) and the change in electricity consumption.
Figure 7: Shapley dependence plots showing the effects of the major input variables on electricity consumption, as fitted with the XGBoost model (data points): (a) total carbon input, (b) oxygen consumption, (c) slag volume, (d) tap-to-tap time, and (e) natural gas usage. The broken lines show the relationships expected from the mass and energy balances, or, in the case of tap-to-tap time, the literature value [6].
14 pages, 3788 KiB  
Article
The Potential of SHAP and Machine Learning for Personalized Explanations of Influencing Factors in Myopic Treatment for Children
by Jun-Wei Chen, Hsin-An Chen, Tzu-Chi Liu, Tzu-En Wu and Chi-Jie Lu
Medicina 2025, 61(1), 16; https://doi.org/10.3390/medicina61010016 - 26 Dec 2024
Viewed by 355
Abstract
Background and Objectives: The rising prevalence of myopia is a significant global health concern. Atropine eye drops are commonly used to slow myopia progression in children, but their long-term use raises concern about intraocular pressure (IOP). This study uses SHapley Additive exPlanations (SHAP) [...] Read more.
Background and Objectives: The rising prevalence of myopia is a significant global health concern. Atropine eye drops are commonly used to slow myopia progression in children, but their long-term use raises concerns about intraocular pressure (IOP). This study uses SHapley Additive exPlanations (SHAP) to improve the interpretability of a machine learning (ML) model predicting end IOP, offering clinicians explainable insights for personalized patient management. Materials and Methods: This retrospective study analyzed data from 1191 individual eyes of 639 boys and 552 girls with myopia treated with atropine. The average age of the whole group was 10.6 ± 2.5 years. The spherical equivalent (SE) refractive error, expressed as degree of myopia, was 2.63 D at baseline and 3.12 D at the end. Data were collected from clinical records, including demographic information, IOP measurements, and atropine treatment details. The patients were divided into two subgroups based on a baseline IOP of 14 mmHg. ML models, including Lasso, CART, XGB, and RF, were developed to predict the end IOP value. The best-performing model was then interpreted using SHAP values. The SHAP module created a personalized and dynamic graphic to illustrate how various factors (e.g., age, sex, cumulative duration, and dosage of atropine treatment) affect the end IOP. Results: RF showed the best performance, with superior error metrics in both subgroups. The interpretation of RF with SHAP revealed that age and the recruitment duration of atropine consistently influenced IOP across subgroups, while other variables had varying effects. SHAP values also help clinicians understand how different factors contribute to the predicted IOP value in individual children. Conclusions: SHAP provides an alternative approach to understanding the factors affecting IOP in children with myopia treated with atropine. Its enhanced interpretability helps clinicians make informed decisions, improving the safety and efficacy of myopia management. This study demonstrates the potential of combining SHAP with ML models for personalized care in ophthalmology. Full article
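The additive structure behind such individual explanations (a prediction f(x) written as an expected value E[f(x)] plus one contribution per feature) can be sketched on a toy model. This is a minimal sketch, not the study's pipeline: the `model` function, its three generic features, and the background rows are all made up for illustration, and the Shapley values are computed exactly by enumerating feature subsets rather than with the SHAP library.

```python
from itertools import combinations
from math import factorial


def model(x):
    # Hypothetical stand-in for a fitted regressor (not the study's RF model).
    a, b, c = x
    return 12.0 + 0.2 * a + 1.5 * b + 0.05 * c + 0.1 * b * c


def value(subset, x, background):
    """Mean prediction when features in `subset` take x's values and the
    remaining features are drawn from each background row in turn."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shap_values(x, background):
    """Exact Shapley attribution of model(x) relative to the background mean."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for s in combinations(others, k):
                # Shapley weight for a subset of size k out of n features.
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_i = value(set(s) | {i}, x, background)
                without_i = value(set(s), x, background)
                phi[i] += w * (with_i - without_i)
    return phi


# Made-up background data and one made-up case to explain.
background = [(8, 0.1, 6), (10, 0.5, 12), (12, 1.0, 24)]
x = (11, 0.5, 18)

phi = shap_values(x, background)
expected = value(set(), x, background)  # E[f(x)] over the background rows
print("E[f(x)] =", expected)
print("contributions =", phi)
print("f(x) =", model(x), "reconstructed =", expected + sum(phi))
```

The last line checks the additivity property: the expected value plus the per-feature contributions reproduces the prediction for the explained case. Exhaustive enumeration is only practical for a handful of features.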
(This article belongs to the Special Issue Ophthalmology: New Diagnostic and Treatment Approaches)
Figure 1
<p>Data preprocessing workflow.</p>
Figure 2
<p>Modeling scheme.</p>
Figure 3
<p>SHAP summary and feature importance plot of each base IOP subgroup. (<b>a</b>) SHAP summary plot of base IOP <math display="inline"><semantics> <mrow> <mo>≤</mo> <mn>14</mn> </mrow> </semantics></math> subgroup. (<b>b</b>) SHAP feature importance plot of base IOP <math display="inline"><semantics> <mrow> <mo>≤</mo> <mn>14</mn> </mrow> </semantics></math> subgroup. (<b>c</b>) SHAP summary plot of base IOP <math display="inline"><semantics> <mrow> <mo>&gt;</mo> <mn>14</mn> </mrow> </semantics></math> subgroup. (<b>d</b>) SHAP feature importance plot of base IOP <math display="inline"><semantics> <mrow> <mo>&gt;</mo> <mn>14</mn> </mrow> </semantics></math> subgroup.</p>
Figure 4
<p>Three examples of individual case (panels (<b>a</b>–<b>c</b>)) explanations in base IOP <math display="inline"><semantics> <mrow> <mo>≤</mo> <mn>14</mn> </mrow> </semantics></math> subgroup. <math display="inline"><semantics> <mrow> <mi>f</mi> <mfenced separators="|"> <mrow> <mi>x</mi> </mrow> </mfenced> </mrow> </semantics></math>: model prediction outcome. <math display="inline"><semantics> <mrow> <mi>E</mi> <mfenced open="[" close="]" separators="|"> <mrow> <mi>f</mi> <mfenced separators="|"> <mrow> <mi>x</mi> </mrow> </mfenced> </mrow> </mfenced> </mrow> </semantics></math>: expected value.</p>
Figure 5
<p>Three examples of individual case (panels (<b>a</b>–<b>c</b>)) explanations in base IOP <math display="inline"><semantics> <mrow> <mo>&gt;</mo> <mn>14</mn> </mrow> </semantics></math> subgroup. <math display="inline"><semantics> <mrow> <mi>f</mi> <mfenced separators="|"> <mrow> <mi>x</mi> </mrow> </mfenced> </mrow> </semantics></math>: model prediction outcome. <math display="inline"><semantics> <mrow> <mi>E</mi> <mfenced open="[" close="]" separators="|"> <mrow> <mi>f</mi> <mfenced separators="|"> <mrow> <mi>x</mi> </mrow> </mfenced> </mrow> </mfenced> </mrow> </semantics></math>: expected value.</p>
25 pages, 9354 KiB  
Article
Identification of Maize Kernel Varieties Using LF-NMR Combined with Image Data: An Explainable Approach Based on Machine Learning
by Chunguang Bi, Xinhua Bi, Jinjing Liu, He Chen, Mohan Wang, Helong Yu and Shaozhong Song
Plants 2025, 14(1), 37; https://doi.org/10.3390/plants14010037 - 26 Dec 2024
Viewed by 402
Abstract
The precise identification of maize kernel varieties is essential for germplasm resource management, genetic diversity conservation, and the optimization of agricultural production. To address the need for rapid and non-destructive variety identification, this study developed a novel interpretable machine learning approach that integrates low-field nuclear magnetic resonance (LF-NMR) with morphological image features through an optimized support vector machine (SVM) framework. First, LF-NMR signals were obtained from eleven maize kernel varieties, and ten key features were extracted from the transverse relaxation decay curves. Meanwhile, five image morphological features were selected using the recursive feature elimination (RFE) algorithm. Before modeling, principal component analysis (PCA) was used to determine the distribution features of the internal components for each maize variety. Subsequently, LF-NMR features and image morphological data were integrated to construct a classification model and the SVM hyperparameters were optimized using an improved differential evolution algorithm, achieving a final classification accuracy of 96.36%, which demonstrated strong robustness and precision. The model’s interpretability was further enhanced using Shapley values, which revealed the contributions of key features such as Max Signal and Signal at Max Curvature to classification decisions. This study provides an innovative technical solution for the efficient identification of maize varieties, supports the refined management of germplasm resources, and lays a foundation for genetic improvement and agricultural applications. Full article
(This article belongs to the Section Plant Modeling)
Figure 1
<p>Maize kernel samples.</p>
Figure 2
<p>Schematic diagram of data acquisition: (<b>a</b>) image data; (<b>b</b>) LF-NMR data.</p>
Figure 3
<p>Experimental flow chart.</p>
Figure 4
<p>Heatmap of the morphological features of different types of kernels: (<b>a</b>) geometric features, (<b>b</b>) textural features, (<b>c</b>) color features.</p>
Figure 5
<p>Results using feature selection: (<b>a</b>–<b>c</b>) MI, ReliefF, RFE to select 10 features; (<b>d</b>–<b>f</b>) MI, ReliefF, RFE to select 5 features; (<b>g</b>–<b>i</b>) comparison of identification results.</p>
Figure 6
<p>Average T<sub>2</sub> relaxation time curves for eleven maize varieties: (<b>a</b>) 0–600 ms; (<b>b</b>) 0–100 ms.</p>
Figure 7
<p>Comparison of performance metrics for each fusion strategy.</p>
Figure 8
<p>Scores of 11 types of maize kernels for the first two principal components.</p>
Figure 9
<p>Model confusion matrix visualization. (<b>a</b>) DE-OAA-SVM model confusion matrix visualization. (<b>b</b>) HDE-OAA-SVM model confusion matrix visualization.</p>
Figure 10
<p>ROC curve visualization.</p>
Figure 11
<p>Performance comparison of different models.</p>
Figure 12
<p>SHAP summary plot.</p>
Figure 13
<p>Shapley explanation for JD27: (<b>a</b>) summary plot of JD27; (<b>b</b>–<b>f</b>) SHAP dependence plots for Max Signal, a_dev, Slow Ratio, Signal at Max Curvature, and v_mean.</p>
19 pages, 7445 KiB  
Article
An Interpretable Model for Salinity Inversion Assessment of the South Bank of the Yellow River Based on Optuna Hyperparameter Optimization and XGBoost
by Xia Liu, Yu Hu, Xiang Li, Ruiqi Du, Youzhen Xiang and Fucang Zhang
Agronomy 2025, 15(1), 18; https://doi.org/10.3390/agronomy15010018 - 26 Dec 2024
Viewed by 250
Abstract
Soil salinization is a serious form of land degradation that poses a severe threat to regional agricultural resource utilization and sustainable development. Using machine-learning methods for rapid, large-scale monitoring of salinized soil has become a mainstream trend. However, machine-learning model training requires many samples and hyper-parameter optimization, and the resulting models lack interpretability. To compare the performance of different machine-learning models, this study conducted a soil sampling experiment on saline soils along the south bank of the Yellow River in Dalate Banner. The experiment ran for two years (2022 and 2023) during the spring bare-soil period and collected 304 soil samples. Soil salinity was estimated from multi-source remote sensing satellite data by combining the extreme gradient boosting model (XGBoost), Optuna hyper-parameter optimization, and the SHapley Additive exPlanations (SHAP) interpretability method. Correlation analysis and continuous variable projection were employed to identify key inversion factors. The regression performance of partial least squares regression (PLSR), geographically weighted regression (GWR), long short-term memory networks (LSTM), and XGBoost was compared. The optimal model was selected to estimate soil salinity in the study area from 2019 to 2023. The results showed that the XGBoost model fitted best, with a high test-set R² (0.76) and ratio of performance to deviation (2.05), and its estimates were consistent with the measured salinity values. SHAP analysis revealed that the salinity index and topographic factors were the primary inversion factors. Notably, the same inversion factor influenced soil salinity estimates differently at different locations. Saline soils accounted for 65% and 44% of the study area in 2019 and 2023, respectively, and the overall degree of soil salinization decreased. In terms of spatial distribution, the degree of soil salinization gradually increased from south to north and was most serious on the side near the Yellow River. This study is of great significance for the quantitative estimation of salinized soil in the irrigated area on the south bank of the Yellow River, the prevention and control of soil salinization, and the sustainable development of agriculture. Full article
(This article belongs to the Section Precision and Digital Agriculture)
Figure 1
<p>Overview of the study area and distribution of sampling sites ((<b>a</b>) geographic location map; (<b>b</b>,<b>c</b>) elevation map, distribution of sampling sites).</p>
Figure 2
<p>(<b>a</b>) Descriptive statistics box plot of measured SSC (CV: coefficient of variation); (<b>b</b>) SSC distribution map of different types of saline–alkali soils.</p>
Figure 3
<p>Correlation analysis between soil salinity and inversion factors.</p>
Figure 4
<p>Performance comparison of different model training sets.</p>
Figure 5
<p>Performance comparison of different model test sets.</p>
Figure 6
<p>Soil content grading chart 2019–2023.</p>
Figure 7
<p>Area of different classes of saline land from 2019 to 2023 ((<b>a</b>) change in area; (<b>b</b>) rate of change).</p>
Figure 8
<p>Transfer matrix between areas of different types of saline soils.</p>
Figure 9
<p>(<b>a</b>) SHAP global interpretation map: feature summary map for SHAP; (<b>b</b>) heat map of SHAP-based features.</p>
Figure 10
<p>SSC inversion data-processing procedure. Table (<b>a</b>) in the figure shows the 8th data point, and table (<b>b</b>) shows the 15th.</p>
22 pages, 1084 KiB  
Article
Unsupervised Identification for 2-Additive Capacity by Principal Component Analysis and Kendall’s Correlation Coefficient in Multi-Criteria Decision-Making
by Xueting Guan, Kaihong Guo, Ran Zhang and Xiao Han
Mathematics 2025, 13(1), 23; https://doi.org/10.3390/math13010023 - 25 Dec 2024
Viewed by 231
Abstract
As Multi-Criteria Decision-Making (MCDM) problems become increasingly complex, traditional MCDM methods cannot effectively handle ambiguous, incomplete, or uncertain data. While several novel types of MCDM methods have been proposed to address this limitation, they fail to consider the potentially complex interactions among decision criteria. An effective capacity identification methodology is needed to overcome this issue. In this paper, we develop a novel unsupervised method for identifying 2-additive capacities by means of Principal Component Analysis (PCA) and Kendall’s correlation coefficient. Several results are obtained along the way. Firstly, the Shapley values of the decision criteria are derived using PCA, by combining the variance contribution rate of each Principal Component (PC) with its corresponding eigenvector. Secondly, Kendall’s correlation coefficient, computed from the decision data, is used to help identify the Shapley interaction index for each pair of criteria by unsupervised learning. An optimization model equipped with a new form of monotonicity conditions is then established to determine the optimal Shapley interaction indices. With these two kinds of indices, the desired monotone 2-additive capacity is finally identified in an objective and efficient manner. Numerical experiments demonstrate that our proposal adequately considers the importance of criteria and accurately identifies the types of Shapley interaction indices between criteria, and thus produces more convincing and logical results than other unsupervised identification methods. Full article
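For orientation, the two kinds of indices mentioned in this abstract have standard textbook definitions, sketched below. The notation is an assumption and is not taken from the article: μ denotes a capacity on the criteria set N with |N| = n, φ_i the Shapley value of criterion i, and I_ij the Shapley interaction index of the pair {i, j}. The last line is the known representation of a 2-additive capacity in terms of these indices, which is why fixing φ_i and I_ij is enough to pin the capacity down.

```latex
% Shapley value of criterion i
\phi_i = \sum_{S \subseteq N \setminus \{i\}}
  \frac{(n - |S| - 1)!\,|S|!}{n!}
  \bigl[\mu(S \cup \{i\}) - \mu(S)\bigr]

% Shapley interaction index of the pair {i, j}
I_{ij} = \sum_{S \subseteq N \setminus \{i,j\}}
  \frac{(n - |S| - 2)!\,|S|!}{(n - 1)!}
  \bigl[\mu(S \cup \{i,j\}) - \mu(S \cup \{i\}) - \mu(S \cup \{j\}) + \mu(S)\bigr]

% A 2-additive capacity expressed through both indices
\mu(S) = \sum_{i \in S} \phi_i
  - \frac{1}{2} \sum_{i \in S,\; j \in N \setminus S} I_{ij}
```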
Figure 1
<p>Overview of the identification process for 2-additive capacities.</p>
Figure 2
<p>Visualization of 2-additive capacity values by four unsupervised identification methods. References: Rowley et al. (2015) [<a href="#B24-mathematics-13-00023" class="html-bibr">24</a>], Duarte (2018) [<a href="#B25-mathematics-13-00023" class="html-bibr">25</a>], Pelegrina &amp; Duarte (2024) [<a href="#B28-mathematics-13-00023" class="html-bibr">28</a>].</p>
Figure 3
<p>Final preference rankings of the materials by four unsupervised identification methods. References: Rowley et al. (2015) [<a href="#B24-mathematics-13-00023" class="html-bibr">24</a>], Duarte (2018) [<a href="#B25-mathematics-13-00023" class="html-bibr">25</a>], Pelegrina &amp; Duarte (2024) [<a href="#B28-mathematics-13-00023" class="html-bibr">28</a>]. The numbers on top of the bars represent the ranking of each material based on its evaluation value for the respective method.</p>
18 pages, 10675 KiB  
Article
Combining Physical Hydrological Model with Explainable Machine Learning Methods to Enhance Water Balance Assessment in Glacial River Basins
by Ruibiao Yang, Jinglu Wu, Guojing Gan, Ru Guo and Hongliang Zhang
Water 2024, 16(24), 3699; https://doi.org/10.3390/w16243699 - 22 Dec 2024
Viewed by 362
Abstract
The implementation of accurate water balance assessment in glacier basins is essential for the management and sustainable development of water resources in the basins. In this study, a hybrid modeling framework was constructed to enhance runoff prediction and water balance assessment in glacier basins. An improved physical hydrological model (SEGSWAT+) was combined with a machine learning model (ML) to capture the relationship between runoff residuals and water balance components through the Shapley additive explanations (SHAP) method. Based on the enhancement of the runoff fitting results of the existing model, the runoff residuals are decomposed and used to correct the hydrological process component values, thus improving the accuracy of the water balance results. We evaluated the performance and correction results of the method using various ML methods. We analyzed the results for two consecutive periods from 1959 to 2022 for the glacial sub-basins of three tributaries of the Upper Ili River Basin in central Asia. The results show that the hybrid framework based on extreme gradient boosting (XGBoost) with an average NSE value of 0.93 has the best performance, and the bias based on the evapotranspiration component and soil water content change component is reduced by 3.2–5%, proving the effectiveness of the water balance correction. This study advances the interpretation of ML models for hydrologic assessment of areas with complex hydrodynamic characteristics. Full article
Figure 1
<p>Hybrid modeling framework.</p>
Figure 2
<p>Geographic location of the study area.</p>
Figure 3
<p>Distribution of absolute values of relative errors between hydrologic model simulated evapotranspiration (ET) and remotely sensed products.</p>
Figure 4
<p>Comparison of calibration (Cal) and validation (Val) performance between ML models. The vertical axis represents the R<sup>2</sup> and NRMSE values, both ranging from 0 to 1. Higher R<sup>2</sup> values indicate better model fit, while lower NRMSE values denote greater predictive accuracy.</p>
Figure 5
<p>Modeling results and performance assessment of total outlet runoff from a watershed. (<b>I</b>) Runoff sequence fitting results of multiple hybrid models in two consecutive periods, with black hollow circles as observations. (<b>II</b>) Performance evaluation in two consecutive periods, with (<b>a</b>,<b>c</b>) as calibration periods and (<b>b</b>,<b>d</b>) as validation periods. (<b>III</b>) Comparison of the simulation performance of different combinations of methods in four periods, where the pink line represents the NRMSE and the gray solid line indicates the correlation coefficient (r) between simulations and observations. The horizontal and vertical axes represent the ratio of the standard deviation of the observed values to the corresponding simulated values. Specifically, REF on the axes indicates that the standard deviation is zero; the simulated results are in perfect agreement with the observations.</p>
Figure 6
<p>Summary of SHAP value distribution. The red color represents the high value of the corresponding eigenvalue and the blue color represents the low value. The SHAP value greater than 0 means a positive contribution to the residual prediction, and a SHAP value less than 0 represents a negative contribution. The upper dashed line is for water-generating elements and the lower line is for water-dissipating elements.</p>
Figure 7
<p>Importance ranking of water balance elements among different models, expressed using the average of the absolute values of SHAP.</p>
Figure 8
<p>Local interpretation analysis from residual predictions from low to high force plots, with red representing pushing predictions higher and blue representing pushing predictions lower. The length of the bars represents the degree of influence contribution. f(x) is the residual prediction output from the model. (<b>a</b>) median of the first residual quartile; (<b>b</b>) median of the second residual quartile; (<b>c</b>) median of the third residual quartile; (<b>d</b>) median of the fourth residual quartile.</p>
Figure 9
<p>Relative soil change (ΔSW) and evapotranspiration (ET) errors before and after calibration. Lower values indicate smaller errors.</p>
Figure 10
<p>Box plots of standardized SHAP values for different seasons for water balance elements. The blue line represents the previous period of this study, and the red line represents the subsequent period (Spr, Sum, Aut, and Win represent spring, summer, fall, and winter seasons, respectively).</p>
22 pages, 4009 KiB  
Article
Advanced Ensemble Machine-Learning Models for Predicting Splitting Tensile Strength in Silica Fume-Modified Concrete
by Nadia Moneem Al-Abdaly, Mohammed E. Seno, Mustafa A. Thwaini, Hamza Imran, Krzysztof Adam Ostrowski and Kazimierz Furtak
Buildings 2024, 14(12), 4054; https://doi.org/10.3390/buildings14124054 - 20 Dec 2024
Viewed by 369
Abstract
The splitting tensile strength of concrete is crucial for structural integrity, as tensile stresses from load and environmental changes often lead to cracking. This study investigates the effectiveness of advanced ensemble machine-learning models, including LightGBM, GBRT, XGBoost, and AdaBoost, in accurately predicting the splitting tensile strength of silica fume-enhanced concrete. Using a robust database split into training (80%) and testing (20%) sets, we assessed model performance through R2, RMSE, and MAE metrics. Results demonstrate that GBRT and XGBoost achieved superior predictive accuracy, with R2 scores reaching 0.999 in training and high precision in testing (XGBoost: R2 = 0.965, RMSE = 0.337; GBRT: R2 = 0.955, RMSE = 0.381), surpassing both LightGBM and AdaBoost. This study highlights GBRT and XGBoost as reliable, efficient alternatives to traditional testing methods, offering substantial time and cost savings. Additionally, SHapley Additive exPlanations (SHAP) analysis was conducted to identify key input features and to elucidate their influence on splitting tensile strength, providing valuable insights into the predictive behavior of silica fume-enhanced concrete. The SHAP analysis reveals that the water-to-binder ratio and curing duration are the most critical factors influencing the splitting tensile strength of silica fume concrete. Full article
(This article belongs to the Section Building Materials, and Repair & Renovation)
Figure 1
<p>Research methodology.</p>
Figure 2
<p>Structure of Adaboost.</p>
Figure 3
<p>Structure of Gradient boosting procedure.</p>
Figure 4
<p>Grid search cv methodology [<a href="#B48-buildings-14-04054" class="html-bibr">48</a>].</p>
Figure 5
<p>Box Plot Distribution of Mix design components and the Tensile Strength of SF Concrete.</p>
Figure 6
<p>Heat map between predictors and target variables.</p>
Figure 7
<p>The visual comparison of RMSE and R<sup>2</sup> metrics across various suggested models.</p>
Figure 8
<p>Comparative scatter plots for predictive models of splitting tensile.</p>
Figure 9
<p>Error Distribution Histograms for Predicted Splitting tensile by Different Models: (<b>a</b>) Training; (<b>b</b>) Testing.</p>
Figure 10
<p>Cumulative frequency error comparison of intelligent models used in research.</p>
Figure 11
<p>Taylor Diagram Representation for Model Predictions of splitting tensile strength.</p>
Figure 12
<p>Overview of the impact of each feature on splitting tensile strength.</p>
Figure 13
<p>Negative and positive impact of the features on splitting tensile strength.</p>
29 pages, 13369 KiB  
Article
Cooperative Behavior of Prosumers in Integrated Energy Systems
by Natalia Aizenberg, Evgeny Barakhtenko and Gleb Mayorov
Mathematics 2024, 12(24), 4005; https://doi.org/10.3390/math12244005 - 20 Dec 2024
Viewed by 267
Abstract
The technical complexity of organizing energy systems’ operation has recently been compounded by the complexity of reconciling the interests of individual entities involved in interactions. This study proposes a possible solution to the problem of modeling their relationships within a large system. Our solution takes into account multiple levels of interactions, imperfect information, and conflicting interests. We present a mathematical statement of the problem of optimal interactions between the centralized system and prosumers in the integrated energy system (IES) with due consideration of the layered architecture of the IES. The paper also contributes a model for arranging the interactions between centralized and distributed energy sources for cases when IES prosumers form coalitions. The implementation of this model is based on multi-agent techniques and cooperative game theory tools. In order to arrive at a rational arrangement of the interactions of prosumers in the IES, the model implements different approaches to the allocation of the coalition’s total payoff (the Shapley value, Modiclus, PreNucleolus solution concepts). Furthermore, we propose a criterion for deciding on the “best” imputation. We contribute a multi-agent system that implements the proposed model and use a test IES setup to validate the model by simulations. The results of the simulations ensure optimal interactions between the entities involved in the energy supply process within the IES and driven by their own interests. The results also elucidate the conditions that make it feasible for prosumers to form coalitions. Full article
(This article belongs to the Special Issue Mathematical Modeling and Applications in Industrial Organization)
Figure 1
<p>Architecture of the multi-agent system.</p>
Figure 2
<p>Schematic of the test integrated energy system.</p>
Figure 3
<p>Simulation setup 2. Full cooperation: all prosumers interact with each other.</p>
Figure 4
<p>Total generation of (<b>a</b>) electricity and (<b>b</b>) thermal energy.</p>
Figure 5
<p>Payoff of prosumers involved in a coalition (relative to the case of no cooperation) depending on the solution concept.</p>
Figure 6
<p>Generalized property ω of the solutions obtained for the Shapley value, Modiclus and PreNucleolus solution concepts in the two considered cases.</p>
Figure A1
<p>Total cost of energy supply to prosumers involved in coalitions.</p>
Figure A2
<p>Total heat generated.</p>
Figure A3
<p>Electricity generated by centralized sources.</p>
Figure A4
<p>Electricity generated by prosumers.</p>
Figure A5
<p>Simulation setup 1. No cooperation: no prosumers interact with each other.</p>
Figure A6
<p>Total cost of energy supply to four prosumers involved in coalitions.</p>
Figure A7
<p>Electricity generated by prosumers.</p>
Figure A8
<p>Total heat generated.</p>
Figure A9
<p>Simulation setup 2. Full cooperation for four prosumers (the fifth one is disconnected): all prosumers interact with each other.</p>
14 pages, 1147 KiB  
Article
Collaborative Game-Theoretic Optimization of Public Transport Fare Policies: A Global Framework for Sustainable Urban Mobility
by Ekinhan Eriskin
Sustainability 2024, 16(24), 11199; https://doi.org/10.3390/su162411199 - 20 Dec 2024
Viewed by 399
Abstract
Urbanization intensifies the need for sustainable public transportation that balances financial viability, environmental sustainability, and social equity. Traditional fare-setting methods often focus narrowly on financial objectives, neglecting broader impacts. This study introduces a novel collaborative game-theoretic model integrating user sentiment analysis to optimize fare policies. By incorporating utilities of passengers, operators, and governments, and employing the Shapley value for fair benefit distribution, this model aims to maximize social welfare. The methodology frames fare optimization as a cooperative game among stakeholders, integrating passenger preferences through sentiment analysis. The social welfare function combines the utilities of all stakeholders and is maximized under operational, environmental, and financial constraints. Implemented in Python and applied to Isparta, Turkey, the model identifies an optimal fare of 19.5 TL (ranged between 14 and 26.50 TL) that maximizes social welfare, aligning closely with existing fares. Shapley value analysis distributes the benefits, assigning 221,457 (35.6%) units to passengers, 54,562 (8.7%) units to operators, and 347,433 (55.7%) units to the government, highlighting significant environmental gains for the government. Sensitivity analyses confirm the model’s robustness across varying trip volumes, suggesting its applicability to diverse urban settings. This research contributes to socially equitable and user-centric fare policies by providing a comprehensive framework aligning stakeholder interests. Policymakers can leverage this model to design fare strategies promoting sustainability, efficiency, and collaboration in public transportation systems. Full article
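How a Shapley value splits a coalition's total benefit among stakeholders can be illustrated with a small cooperative game. This is a minimal sketch under stated assumptions, not the article's model: the three player names mirror the stakeholders named in the abstract, but the characteristic function `v` (the benefit each coalition could secure) holds invented numbers, and the `shapley` helper is a hypothetical name.

```python
from itertools import permutations


def shapley(players, v):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += v[joined] - v[coalition]
            coalition = joined
    return {p: t / len(orders) for p, t in totals.items()}


players = ("passengers", "operator", "government")

# Invented coalition values, for illustration only.
v = {
    frozenset(): 0,
    frozenset({"passengers"}): 10,
    frozenset({"operator"}): 0,
    frozenset({"government"}): 20,
    frozenset({"passengers", "operator"}): 30,
    frozenset({"passengers", "government"}): 50,
    frozenset({"operator", "government"}): 30,
    frozenset(players): 90,
}

phi = shapley(players, v)
for name, share in phi.items():
    print(f"{name}: {share:.2f}")
print("total:", sum(phi.values()))  # equals the grand coalition's value
```

The shares always sum to the value of the full coalition, which is the efficiency property that makes the Shapley value usable as a benefit-allocation rule. Enumerating all join orders grows factorially, so this form is only suitable for a few players.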
Figure 1
<p>Pseudocode of the developed algorithm.</p>
Figure 2
<p>The social welfare change based on fare pricing.</p>
Figure 3
<p>The operator utility change based on fare pricing.</p>
Figure 4
<p>The government utility change based on fare pricing.</p>
Figure 5
<p>Shapley values for the coalition.</p>
19 pages, 13841 KiB  
Article
Spatial Prediction of Soil Water Content by Bayesian Optimization–Deep Forest Model with Landscape Index and Soil Texture Data
by Weihao Yang, Ruofan Zhen, Fanyue Meng, Xiaohang Yang, Miao Lu and Yingqiang Song
Agronomy 2024, 14(12), 3039; https://doi.org/10.3390/agronomy14123039 - 19 Dec 2024
Viewed by 698
Abstract
The accurate prediction of the spatial variability of soil water content (SWC) in farmland is essential for water resource management and sustainable agricultural development. However, natural factors introduce uncertainty and result in poor alignment when predicting farmland SWC, leading to low accuracy. To address this, this study introduced a novel class of indicators: landscape indices. These indices include the largest patch index (LPI), edge density (ED), aggregation index (AI), patch cohesion index (COH), contagion index (CON), landscape division index (DIV), percentage of like adjacencies (PLA), Shannon evenness index (SHEI), and Shannon diversity index (SHDI). A Bayesian optimization–deep forest (BO–DF) model was developed to leverage these indices for predicting the spatial variability of SWC. Statistical analysis revealed that landscape indices exhibited skewed distributions and weak linear correlations with SWC (r < 0.2). Despite this, incorporating landscape index variables into the BO–DF model significantly improved prediction accuracy, with R² increasing by 35.85%. This model demonstrated a robust nonlinear fitting capability for the spatial variability of SWC. Spatial mapping of SWC using the BO–DF model indicated that high-value areas were predominantly located in the eastern and southern regions of the Yellow River Delta in China. Furthermore, the SHapley additive explanation (SHAP) analysis highlighted that landscape indices were key drivers in predicting SWC. These findings underscore the potential of landscape indices as valuable variables for spatial SWC prediction, supporting regional strategies for sustainable agricultural development. Full article
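Two of the landscape indices listed in the abstract, SHDI and SHEI, have simple closed forms, so a short Python sketch may help make them concrete. It uses the standard textbook definitions (SHDI as the Shannon entropy of class proportions, SHEI as SHDI divided by the log of the number of classes present) and may differ in detail from the software the authors used. The class names and areas in `areas` are hypothetical.

```python
import math


def shannon_diversity(proportions):
    """SHDI = -sum(p_i * ln p_i) over classes with non-zero proportion."""
    return -sum(p * math.log(p) for p in proportions if p > 0)


def shannon_evenness(proportions):
    """SHEI = SHDI / ln(m), where m is the number of classes present."""
    m = sum(1 for p in proportions if p > 0)
    if m < 2:
        return 0.0  # Evenness is undefined for one class; 0 is used here by convention.
    return shannon_diversity(proportions) / math.log(m)


# Hypothetical land-cover areas in hectares (illustrative only).
areas = {"cropland": 60.0, "water": 25.0, "bare_soil": 15.0}
total_area = sum(areas.values())
proportions = [a / total_area for a in areas.values()]

shdi = shannon_diversity(proportions)
shei = shannon_evenness(proportions)

if __name__ == "__main__":
    print(f"SHDI = {shdi:.3f}, SHEI = {shei:.3f}")
```

SHEI ranges from 0 to 1 and reaches 1 only when all classes cover equal area, which is why it is often reported alongside SHDI: the latter grows with the number of classes, while the former isolates how evenly they are distributed.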
(This article belongs to the Special Issue Advanced Machine Learning in Agriculture)
Figures
Figure 1. The geographical location of the study area and the distribution of sampling points.
Figure 2. The load values of the component matrix of soil texture feature data. The pie chart shows the cumulative contribution rate of the first three principal components (PCs) to the total variance.
Figure 3. Schematic diagram of the structure of the BO–DF model for the SWC prediction. In this case, soil texture, landscape index, and SWC are used as input factors that ultimately act on SWC prediction. Each cascade level consists of two random forests (black) and two completely random forests (red) to ensure model diversity and generalization ability.
Figure 4. The matrix scatter plot illustrates the relationship between soil water content (SWC) and various auxiliary variables derived from Sentinel-2 L2A imagery. The significance of the Pearson correlation coefficient (p) is included in the plot to assess the statistical relevance of these relationships. The environmental variables presented in the plot include largest patch index (LPI), edge density (ED), aggregation index (AI), patch cohesion index (COH), contagion index (CON), landscape division index (DIV), percentage of like adjacencies (PLA), Shannon evenness index (SHEI), Shannon diversity index (SHDI), and soil texture features (PC1, PC2, PC3, PC4, PC5).
Figure 5. The iterative trend of RMSE during the BO algorithm. The horizontal axis shows the sampled values of the hyper-parameters max_depth, n_trees, n_estimators, and min_samples_split, and the vertical axis shows the sampled values of the hyper-parameters max_layers, min_samples_split, min_samples_leaf, and n_bins. The red circle is the optimal hyper-parameter range obtained after multiple iterations. The color band on the right side represents the value of RMSE.
Figure 6. The scatter trend for the prediction of SWC by the BO–DF model in (a) only soil texture variables and (b) with the addition of the landscape index.
Figure 7. Spatial mapping of the SWC in the study area.
Figure 8. The driving importance of auxiliary variables for the SWC by SHAP analysis. The closer the color is to red, the larger the feature value is, and the closer the color is to blue, the smaller the feature value is.