Article

Optimum Design of Cylindrical Walls Using Ensemble Learning Methods

1 Department of Civil Engineering, Istanbul University-Cerrahpasa, Istanbul 34320, Turkey
2 Department of Civil Engineering, Turkish-German University, Istanbul 34820, Turkey
3 Department of Civil, Geological and Mining Engineering, Polytechnique Montréal, Montréal, QC H3C 3A7, Canada
4 Department of Civil and Environmental Engineering, Temple University, Philadelphia, PA 19122, USA
5 College of IT Convergence, Gachon University, Seongnam 13120, Korea
* Authors to whom correspondence should be addressed.
Appl. Sci. 2022, 12(4), 2165; https://doi.org/10.3390/app12042165
Submission received: 20 January 2022 / Revised: 14 February 2022 / Accepted: 15 February 2022 / Published: 18 February 2022

Abstract

Minimizing the cost of a structural design is one of the major goals of structural engineers. The availability of large datasets of pre-optimized structural configurations can significantly facilitate the optimum design process. The current study uses a dataset of 7744 optimum design configurations for a cylindrical water tank, each obtained with the harmony search algorithm. The database contains unique combinations of height, radius, total cost, and material unit costs, together with the corresponding wall thickness that minimizes the total cost. It was used to train ensemble learning models, namely Random Forest, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Gradient Boosting (CatBoost). The resulting machine learning models predicted the optimum wall thickness for new data with high accuracy. Using SHapley Additive exPlanations (SHAP), the height of the cylindrical wall was found to have the greatest impact on the optimum wall thickness, followed by the radius and the ratio of concrete unit cost to steel unit cost.

1. Introduction

The cost optimization of structures is an area of structural analysis and design that has gained traction in recent years with the advent of metaheuristic optimization techniques and increasing availability of computing power. A number of past studies have optimized various civil engineering structures, such as retaining walls (Kayabekir et al. [1]), concrete-filled steel tubular columns (Cakiroglu et al. [2]), cantilever soldier piles (Arama et al. [3]), and truss structures (Bekdaş et al. [4]), utilizing advanced optimization techniques, such as the harmony search algorithm, social spider algorithm, and Jaya algorithm. Bekdaş [5] demonstrated the applicability of the harmony search algorithm to the cost optimization of cylindrical reinforced concrete water tanks with the dimensions and cross-sections as shown in Figure 1. The main focus of that study is to find the optimized wall dimensions to minimize the total cost of manufacturing the water tank.
In the study of Bekdaş [5], the harmony search algorithm was applied to optimize water tanks with three different total volumes and five different heights. The structural response of the cylindrical wall under the water pressure was computed using the superposition method from Hetenyi [6] and the ACI 318 Building Code Requirements [7]. The current study builds on that work by increasing the number of optimization cases and saving the optimum results to develop an extensive database. In addition to the geometric variables shown in Figure 1, the dimensionless variable C/S has been introduced, which is the ratio of the unit cost of concrete (C) to the unit cost of steel (S) used in the construction. The ranges of the variables used in generating the database of optimum design configurations are given in Table 1.
The current study aims at demonstrating the efficiency of ensemble machine learning techniques in predicting the optimum dimensions of a liquid tank with a cylindrical wall. The availability of large sets of data with geometric design variables and corresponding optimum cost makes it possible to predict certain dimensions of a structure without the explicit use of optimization algorithms. The current study shows that for given values of height, radius, and material unit costs, the optimum wall thickness of a cylindrical tank can be found with high accuracy.

2. Optimization and Machine Learning Methodologies

A database has been prepared that includes 7744 different combinations of three geometric parameters and one material parameter. The geometric parameters are the wall height (H), wall radius (r), and wall thickness (h). The material parameter is C/S. In calculating the C/S ratio, the concrete unit price ranges between 30 and 100 in steps of 10, whereas the steel unit price ranges between 300 and 1000 in steps of 100. In this way, 64 different combinations of the concrete and steel unit prices are included in the database. With the 11 values each of H and r given in Table 1, this yields 11 × 11 × 64 = 7744 design cases. For each combination of H, r, and C/S, the h value that minimizes the total cost has been obtained using the harmony search algorithm. In each optimization step, the structural response was calculated using the superposition method, which can be found in Bekdaş [8,9] and Bekdaş and Nigdeli [10]. The results of this optimization were used to train four different ensemble learning models, with the database split into a training set and a test set at a ratio of 70% to 30%. The models were trained as regression models in which the target variable is the wall thickness (h) that minimizes the total cost for given H, r, and C/S. The models were trained both with and without the cost of the structure as an input variable. Random Forest (RF), LightGBM, XGBoost, and CatBoost were chosen as the ensemble learning methods since these four algorithms are reported to be among the best-performing algorithms in the recent research literature [11,12,13,14,15]. The harmony search procedure used in generating the database is elaborated in the following section.
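The size of the database can be checked by enumerating the input grid. The following is a minimal sketch that assumes the variable ranges stated above and in Table 1; the names `heights`, `radii`, `concrete_costs`, `steel_costs`, and `design_cases` are illustrative and do not come from the study. The optimum wall thickness h is not computed here, since it requires the harmony search optimization.

```python
from itertools import product

# Variable ranges as stated in the text and in Table 1 (inclusive bounds).
heights = range(5, 16)                 # H = 5, 6, ..., 15 m
radii = range(5, 16)                   # r = 5, 6, ..., 15 m
concrete_costs = range(30, 101, 10)    # C = 30, 40, ..., 100
steel_costs = range(300, 1001, 100)    # S = 300, 400, ..., 1000

# One record per design case; h would be filled in by the optimizer.
design_cases = [
    {"H": H, "r": r, "C": C, "S": S, "C/S": C / S}
    for H, r, C, S in product(heights, radii, concrete_costs, steel_costs)
]

print(len(design_cases))  # 11 * 11 * 8 * 8 = 7744
```

Note that different price pairs can share the same C/S value (for example 30/300 and 100/1000), so the number of distinct C/S ratios is smaller than the 64 price combinations.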

2.1. Harmony Search Procedure

The harmony search algorithm was developed by Geem et al. [16] and has been applied to numerous engineering problems. Cross-sectional optimization of plate girders (Cakiroglu et al. [17]), truss-sizing optimization (Degertekin et al. [18]), optimum design of active tuned mass dampers (Kayabekir et al. [19,20]), axially symmetric cylindrical walls (Bekdaş [21]) and additive manufacturing (Toklu et al. [22]) are just a few of the many areas where the harmony search algorithm has been applied. Harmony search is a population-based metaheuristic algorithm that starts with an initial population of optimum solution candidates. In the current study, each solution candidate is a vector that consists of the wall thickness, height, radius, and C/S ratio of a water tank. This randomly generated initial population is called the initial harmony memory (HM) matrix, and each optimum solution candidate in HM is called a harmony vector (HV). The size of the population is called the harmony memory size (HMS). In the next step, the corresponding cost and internal forces are calculated for each harmony vector. At the core of the algorithm, harmony search iterations generate new harmony vectors, and those that perform better than existing harmony vectors replace them. The harmony search iteration step is described in Equation (1).
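$$
k = \operatorname{int}(\mathrm{rand} \cdot \mathrm{HMS}), \qquad
x_{i,\mathrm{new}} =
\begin{cases}
x_{i,\min} + \mathrm{rand} \cdot \left( x_{i,\max} - x_{i,\min} \right), & \text{if } \mathrm{HMCR} > \mathrm{rand} \\
x_{i,k} + \mathrm{rand} \cdot \mathrm{PAR} \cdot \left( x_{i,\max} - x_{i,\min} \right), & \text{if } \mathrm{HMCR} \le \mathrm{rand}
\end{cases}
\tag{1}
$$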
In Equation (1), HMCR and PAR are the harmony memory consideration rate and the pitch adjustment rate, defined as HMCR = 0.5 (1 − iter/maxiter) and PAR = 0.05 (1 − iter/maxiter), where iter is the current iteration number and maxiter is the maximum number of iterations. After every harmony search iteration, the newly generated harmony vectors are compared to the existing vectors in the HM matrix. If a new vector performs better in terms of cost than some of the existing harmony vectors, it replaces the worst-performing harmony vector. The whole process is illustrated with a flow chart in Figure 2.
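The iteration described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: `toy_cost` is a hypothetical stand-in for the structural cost function, which is not given here, and the clipping of new values to the variable bounds, the single draw of k per new vector, and the default `hms` and `max_iter` values are assumptions. The update rule follows Equation (1) and the HMCR and PAR definitions as read from the text.

```python
import random

def toy_cost(v):
    # Hypothetical objective used only for illustration (not the tank cost model).
    return (v[0] - 0.3) ** 2 + (v[1] - 10.0) ** 2 / 100.0

def harmony_search(cost, bounds, hms=10, max_iter=500, seed=0):
    rng = random.Random(seed)
    # Initial harmony memory: hms random vectors within the bounds.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    costs = [cost(v) for v in memory]
    for it in range(max_iter):
        hmcr = 0.5 * (1 - it / max_iter)    # harmony memory consideration rate
        par = 0.05 * (1 - it / max_iter)    # pitch adjustment rate
        k = int(rng.random() * hms)         # index of a stored harmony vector
        new = []
        for i, (lo, hi) in enumerate(bounds):
            if hmcr > rng.random():
                x = lo + rng.random() * (hi - lo)                # random value in range
            else:
                x = memory[k][i] + rng.random() * par * (hi - lo)  # perturb stored value
            new.append(min(max(x, lo), hi))  # assumption: clip to the bounds
        new_cost = cost(new)
        worst = max(range(hms), key=costs.__getitem__)
        if new_cost < costs[worst]:
            # The new vector replaces the worst-performing harmony vector.
            memory[worst], costs[worst] = new, new_cost
    best = min(range(hms), key=costs.__getitem__)
    return memory[best], costs[best]

best, best_cost = harmony_search(toy_cost, [(0.1, 1.0), (5.0, 15.0)], seed=1)
print(best, best_cost)
```

In the study, the cost evaluation would also involve the structural response and design constraints, which this sketch leaves out.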

2.2. Ensemble Learning Methods

Once the dataset with the optimum dimensions has been prepared, the next step is to split the dataset into a training and a test set. In this study, the training and test sets have been generated from 70% and 30% of the entire dataset, respectively. Afterwards, the training set is forwarded into an ensemble learning model (Random Forest, LightGBM, XGBoost or CatBoost), where the 10-fold cross-validation procedure is applied using the training set. In this process, the entire training set is split into ten subsets of equal size such that nine of these subsets are used to train the ensemble learning model, and the remaining subset is used for measuring the model performance. This procedure results in 10 different models from which the best performer is used to predict the optimum wall thickness on the test set. In addition to the model parameters trained using the data of the training set, ensemble learning models further depend on hyperparameters which can be tuned using grid search. Table 2 shows the hyperparameters used for each ensemble learning model.
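The data partitioning described above can be illustrated with index arithmetic alone. The sketch below assumes a shuffled 70/30 split followed by ten folds on the training indices; the function names `split_train_test` and `k_fold`, the seed, and the interleaved fold assignment are illustrative choices, not details taken from the study.

```python
import random

def split_train_test(n_samples, train_fraction=0.7, seed=0):
    # Shuffle all sample indices, then cut them into training and test parts.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]

def k_fold(indices, k=10):
    # Partition the indices into k folds; each fold serves once for validation
    # while the remaining k - 1 folds are used for training.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield training, validation

train_idx, test_idx = split_train_test(7744)
splits = list(k_fold(train_idx, k=10))
print(len(train_idx), len(test_idx), len(splits))
```

With 7744 samples this gives 5420 training and 2324 test samples, and each of the ten folds holds 542 training samples. Libraries such as scikit-learn provide equivalent utilities (`train_test_split`, `KFold`), which would normally be preferred in practice.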
Once cross-validation and hyperparameter tuning are complete, the performance of the predictive models on the test set is measured using accuracy metrics such as the coefficient of determination (R2), root-mean-square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). Finally, the impact of the different design variables on the model predictions can be explained using the SHAP algorithm. Figure 3 summarizes the entire machine learning workflow.

3. Results and Discussion

Table 3, Figure 4 and Figure 5 show the performance of the four ensemble learning models trained with the structural cost included in the training set. In Figure 4, the horizontal axis shows the h values that were predicted by the ML algorithm, and the vertical axis shows the h values that were obtained through the harmony search optimization. Figure 5 shows a visual representation of the accuracies and errors associated with each ML model.
Since, in practice, the optimum cost may not be readily available, predictive models for the optimum wall thickness were also generated using only the geometric variables H and r and the material variable C/S as the design variables. The comparison of the h values predicted using the ensemble learning algorithms with the h values obtained through the harmony search optimization can be seen in Figure 6 for the case of three design variables. The accuracy values corresponding to the design with three variables are listed in Table 4 and visually represented in Figure 7. The comparison between Table 3 and Table 4 shows that the models exhibited relatively similar performance, with the models trained without the cost variable performing slightly better. In the case of four input variables, the CatBoost model has the best performance on the test set with an R2 value of 0.9999, while reducing the number of input variables raised the test R2 values of all models to this same level. Furthermore, the CatBoost model also has the lowest RMSE, MAE and MAPE values in the case of four input variables. In the case of three input variables, the Random Forest model has by far the smallest errors on the test set: according to Table 4, its RMSE is roughly 5 to 15 times smaller than those of the other models, and its MAE and MAPE are about two orders of magnitude smaller. The LightGBM, XGBoost, and CatBoost models had RMSE, MAE and MAPE values close to each other, with XGBoost having the largest values in the case of three input variables. However, it should be noted that all of the models achieved R2 values close to 1, which indicates high model accuracy. This can also be seen in Figure 8, where the actual and predicted wall thickness values are plotted for the Random Forest model with three input variables.

Interpretation of the ML Models

The SHAP algorithm and the Shapley values of input variables are commonly used to interpret machine learning models and to quantify the contribution of a variable to the model output. The Shapley values of a variable indicate the effect of that variable on the predicted outcome of a machine learning model. These values are calculated using Equation (2), where $\phi_i$ are the Shapley values, M is the number of input variables, x is a vector of input variables, x′ is a simplified input vector that is related to x with a mapping function, and f is an explanatory model. Further details of the SHAP algorithm can be found in Lundberg and Lee [23], Mangalathu et al. [24], and Somala et al. [25].
$$
\phi_i(f, x) = \sum_{z' \subseteq x'} \frac{|z'|! \, (M - |z'| - 1)!}{M!} \left[ f_x(z') - f_x(z' \setminus i) \right] \tag{2}
$$
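To make the weighting in Equation (2) concrete, the following sketch computes exact Shapley values for a three-variable toy function by enumerating all coalitions. It uses the standard coalition form of the Shapley value, in which the sum runs over subsets that exclude variable i. Several parts are assumptions made for illustration only: `toy_model` is a hypothetical function and not one of the trained models, and "missing" variables are replaced by fixed baseline values, which is a simplification of how the SHAP library estimates conditional expectations.

```python
from itertools import combinations
from math import factorial

def toy_model(H, r, cs):
    # Hypothetical stand-in for a trained regressor (illustration only).
    return 0.02 * H + 0.01 * r + 0.001 * H * r - 0.1 * cs

def shapley_values(model, x, baseline):
    M = len(x)
    phi = []
    for i in range(M):
        others = [j for j in range(M) if j != i]
        total = 0.0
        for size in range(M):
            # Weight |S|! (M - |S| - 1)! / M! for coalitions S of this size.
            weight = factorial(size) * factorial(M - size - 1) / factorial(M)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(M)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(M)]
                total += weight * (model(*with_i) - model(*without_i))
        phi.append(total)
    return phi

x = (12.0, 8.0, 0.10)          # H, r, C/S of the sample being explained
baseline = (10.0, 10.0, 0.15)  # assumed reference values for "missing" variables
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The values sum to the difference between the model output at `x` and at `baseline`, which is the additivity property that SHAP plots rely on.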
Figure 9, Figure 10, Figure 11 and Figure 12 show the Shapley values of all three input variables in this study for different ensemble machine learning models. In these plots, every data sample is represented by a dot, and the color of a dot shows whether a variable has a high or low value in its range. In these plots, a positive SHAP value indicates a positive effect on the model outcome, whereas a negative SHAP value indicates a negative effect on the output. The order of the variables along the vertical axis indicates their impact on the model outcome such that H has the greatest impact, followed by r and C/S. It follows that high values of H and r lead to high values of optimum wall thickness. On the other hand, C/S only has a minor effect compared to the other two variables since C/S is associated with an order of magnitude smaller SHAP values.
The feature dependence plots in Figure 13 were obtained by plotting the SHAP value of each input variable against its actual value for the four ensemble learning models. In each of these plots, the colors of the dots represent the values of the most strongly dependent variable. A common feature of all plots in Figure 13 is that increasing the values of H and r has an increasing effect on the predicted output in all models. This indicates a positive correlation between H and r and the corresponding optimum wall thickness. On the other hand, increasing the C/S value leads to reduced SHAP values associated with this variable up to the point where C/S is equal to about 0.15. For greater values of C/S, the SHAP value stays constant. The variation in the SHAP value associated with C/S is affected by the model used. In all models, r was the variable most strongly dependent on C/S. Furthermore, in all models, the variable most dependent on H was r, and the variable most dependent on r was H. The feature dependence plots of all models show that at H = 10 m and r = 10 m, the SHAP value changes its sign. For H < 10 m, the r value and the SHAP value are inversely proportional, whereas the opposite is true for H > 10 m. A similar relationship can be observed in the dependence plots of r.

4. Conclusions

The availability of large sets of data and the effective implementation of machine learning algorithms open new possibilities in the process of structural design. The current study combines metaheuristic optimization techniques and ensemble learning methods to design structures with optimum cost. The main conclusions of the study can be listed as follows:
  • Using a database of 7744 sample points, ensemble learning models could be trained that predict the optimum wall thickness of a reinforced concrete water tank with R2 values above 0.999 on the test set. The best predictive model accuracy was obtained with the Random Forest and CatBoost models for the datasets with three and four input variables, respectively.
  • The height, radius, and concrete unit cost to steel unit cost ratio can be used for the prediction of the optimum wall thickness for the optimum construction cost.
  • The height (H) of the water tank was the most important variable that affects the optimum wall thickness, followed by the radius (r) and the unit cost ratio (C/S).
  • Increases in the values of H and r also have an increasing effect on the optimum wall thickness, whereas the opposite effect could be observed for the C/S ratio up to a certain point beyond which further increases in C/S do not have a significant effect on the SHAP value.
Future research could address other water tank geometries and the effect of different material types on the structural cost. Furthermore, the database for training the machine learning models can be extended using finite element analysis results.

Author Contributions

Methodology, G.B.; formal analysis (coding), G.B., K.I. and C.C.; writing—original draft preparation, G.B. and C.C.; writing—review and editing, S.K. and Z.W.G.; visualization, C.C.; supervision, G.B. and Z.W.G.; funding acquisition, Z.W.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

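Table A1. Metrics of Model Accuracy.
Root mean square error (RMSE): $\mathrm{RMSE} = \sqrt{\dfrac{\sum_{i=1}^{n} \left( y_i - \tilde{y}_i \right)^2}{n}}$
Coefficient of determination (R2): $R^2 = \left( \dfrac{n \sum_{i=1}^{n} y_i \tilde{y}_i - \sum_{i=1}^{n} y_i \sum_{i=1}^{n} \tilde{y}_i}{\sqrt{n \sum_{i=1}^{n} y_i^2 - \left( \sum_{i=1}^{n} y_i \right)^2} \, \sqrt{n \sum_{i=1}^{n} \tilde{y}_i^2 - \left( \sum_{i=1}^{n} \tilde{y}_i \right)^2}} \right)^2$
Mean absolute percentage error (MAPE): $\mathrm{MAPE} = \dfrac{1}{n} \sum_{i=1}^{n} \dfrac{\left| y_i - \tilde{y}_i \right|}{\max\left( \epsilon, \left| y_i \right| \right)}$
Mean absolute error (MAE): $\mathrm{MAE} = \dfrac{\sum_{i=1}^{n} \left| y_i - \tilde{y}_i \right|}{n}$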
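The four metrics in Table A1 can be written out directly, which makes it easy to check them on small inputs. The sketch below assumes that y holds the reference values (here, the harmony search results) and that the second argument holds the model predictions; the function names and the default value of `eps` are illustrative assumptions. R2 is computed as in Table A1, that is, as the squared correlation coefficient between the two series.

```python
from math import sqrt

def rmse(y, y_pred):
    return sqrt(sum((a - b) ** 2 for a, b in zip(y, y_pred)) / len(y))

def mae(y, y_pred):
    return sum(abs(a - b) for a, b in zip(y, y_pred)) / len(y)

def mape(y, y_pred, eps=1e-12):
    # eps guards against division by zero; its value here is an assumption.
    return sum(abs(a - b) / max(eps, abs(a)) for a, b in zip(y, y_pred)) / len(y)

def r2(y, y_pred):
    n = len(y)
    sum_y, sum_p = sum(y), sum(y_pred)
    numerator = n * sum(a * b for a, b in zip(y, y_pred)) - sum_y * sum_p
    denominator = sqrt(n * sum(a * a for a in y) - sum_y ** 2) * sqrt(
        n * sum(b * b for b in y_pred) - sum_p ** 2
    )
    return (numerator / denominator) ** 2

y_true = [1.0, 2.0, 3.0]
y_hat = [1.0, 2.0, 4.0]
print(rmse(y_true, y_hat), mae(y_true, y_hat), mape(y_true, y_hat), r2(y_true, y_hat))
```

For this small example the errors come from the third sample only, so MAE is 1/3, MAPE is 1/9, and R2 evaluates to 27/28.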

References

  1. Yücel, M.; Kayabekir, A.E.; Bekdaş, G.; Nigdeli, S.M.; Kim, S.; Geem, Z.W. Adaptive-Hybrid Harmony Search Algorithm for Multi-Constrained Optimum Eco-Design of Reinforced Concrete Retaining Walls. Sustainability 2021, 13, 1639.
  2. Cakiroglu, C.; Islam, K.; Bekdaş, G.; Billah, M. CO2 Emission and Cost Optimization of Concrete-Filled Steel Tubular (CFST) Columns Using Metaheuristic Algorithms. Sustainability 2021, 13, 8092.
  3. Kayabekir, A.E.; Arama, Z.A.; Bekdaş, G.; Nigdeli, S.M.; Geem, Z.W. Eco-Friendly Design of Reinforced Concrete Retaining Walls: Multi-objective Optimization with Harmony Search Applications. Sustainability 2020, 12, 6087.
  4. Bekdaş, G.; Yucel, M.; Nigdeli, S. Evaluation of Metaheuristic-Based Methods for Optimization of Truss Structures via Various Algorithms and Lèvy Flight Modification. Buildings 2021, 11, 49.
  5. Bekdaş, G. Optimum design of axially symmetric cylindrical reinforced concrete walls. Struct. Eng. Mech. 2014, 51, 361–375.
  6. Hetenyi, M. Beams on Elastic Foundation; University of Michigan Press: Ann Arbor, MI, USA, 1967.
  7. ACI 318M-05. Building Code Requirements for Structural Concrete and Commentary; American Concrete Institute: Farmington Hills, MI, USA, 2005.
  8. Bekdaş, G. Harmony Search Algorithm Approach for Optimum Design of Post-Tensioned Axially Symmetric Cylindrical Reinforced Concrete Walls. J. Optim. Theory Appl. 2014, 164, 342–358.
  9. Bekdaş, G. New improved metaheuristic approaches for optimum design of posttensioned axially symmetric cylindrical reinforced concrete walls. Struct. Des. Tall Spec. Build. 2018, 27, e1461.
  10. Bekdaş, G.; Nigdeli, S.M. Optimum Reduction of Flexural Effect of Axially Symmetric Cylindrical Walls with Post-tensioning Forces. KSCE J. Civ. Eng. 2017, 22, 2425–2432.
  11. Degtyarev, V.; Naser, M. Boosting machines for predicting shear strength of CFS channels with staggered web perforations. Structures 2021, 34, 3391–3403.
  12. Lee, S.; Vo, T.P.; Thai, H.-T.; Lee, J.; Patel, V. Strength prediction of concrete-filled steel tubular columns using Categorical Gradient Boosting algorithm. Eng. Struct. 2021, 238, 112109.
  13. Shahriar, S.; Kayes, I.; Hasan, K.; Hasan, M.; Islam, R.; Awang, N.; Hamzah, Z.; Rak, A.; Salam, M. Potential of ARIMA-ANN, ARIMA-SVM, DT and CatBoost for Atmospheric PM2.5 Forecasting in Bangladesh. Atmosphere 2021, 12, 100.
  14. Naser, M.Z. An engineer’s guide to eXplainable Artificial Intelligence and Interpretable Machine Learning: Navigating causality, forced goodness, and the false perception of inference. Autom. Constr. 2021, 129, 103821.
  15. Mangalathu, S.; Jang, H.; Hwang, S.H.; Jeon, J.S. Data-driven machine-learning-based seismic failure mode identification of reinforced concrete shear walls. Eng. Struct. 2020, 208, 110331.
  16. Geem, Z.W.; Kim, J.H.; Loganathan, G.V. A new heuristic optimization algorithm: Harmony search. Simulation 2001, 76, 60–68.
  17. Cakiroglu, C.; Bekdaş, G.; Kim, S.; Geem, Z.W. Optimisation of Shear and Lateral–Torsional Buckling of Steel Plate Girders Using Meta-Heuristic Algorithms. Appl. Sci. 2020, 10, 3639.
  18. Degertekin, S.; Minooei, M.; Santoro, L.; Trentadue, B.; Lamberti, L. Large-Scale Truss-Sizing Optimization with Enhanced Hybrid HS Algorithm. Appl. Sci. 2021, 11, 3270.
  19. Kayabekir, A.E.; Bekdaş, G.; Nigdeli, S.M.; Geem, Z.W. Optimum Design of PID Controlled Active Tuned Mass Damper via Modified Harmony Search. Appl. Sci. 2020, 10, 2976.
  20. Kayabekir, A.E.; Nigdeli, S.M.; Bekdaş, G. A hybrid metaheuristic method for optimization of active tuned mass dampers. Comput. Civ. Infrastruct. Eng. 2021.
  21. Bekdaş, G. Optimum design of post-tensioned axially symmetric cylindrical walls using novel hybrid metaheuristic methods. Struct. Des. Tall Spec. Build. 2018, 28, e1550.
  22. Toklu, Y.C.; Bekdaş, G.; Geem, Z.W. Harmony Search Optimization of Nozzle Movement for Additive Manufacturing of Concrete Structures and Concrete Elements. Appl. Sci. 2020, 10, 4413.
  23. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 4768–4777.
  24. Mangalathu, S.; Hwang, S.H.; Jeon, J.S. Failure mode and effects analysis of RC members based on machine-learning-based SHapley Additive exPlanations (SHAP) approach. Eng. Struct. 2020, 219, 110927.
  25. Somala, S.N.; Chanda, S.; Karthikeyan, K.; Mangalathu, S. Explainable Machine learning on New Zealand strong motion for PGV and PGA. Structures 2021, 34, 4977–4985.
Figure 1. Vertical (left) and horizontal (right) cross-sections of a water tank.
Figure 2. Flow chart of the harmony search algorithm.
Figure 3. Machine learning workflow.
Figure 4. Comparison of the optimized and predicted h values (4 design variables). (a) LightGBM, (b) Random Forest, (c) XGBoost, (d) CatBoost.
Figure 5. Model accuracies with 4 design variables. (a) LightGBM, (b) Random Forest, (c) XGBoost, (d) CatBoost.
Figure 6. Comparison of the optimized and predicted h values (3 design variables). (a) LightGBM, (b) Random Forest, (c) XGBoost, (d) CatBoost.
Figure 7. Model accuracies with 3 design variables. (a) LightGBM, (b) Random Forest, (c) XGBoost, (d) CatBoost.
Figure 8. Comparison of the actual and predicted (Random Forest) wall thickness values.
Figure 9. Shapley values of the LightGBM model.
Figure 10. Shapley values of the XGBoost model.
Figure 11. Shapley values of the CatBoost model.
Figure 12. Shapley values of the Random Forest model.
Figure 13. Feature dependence plots for XGBoost, LightGBM, Random Forest and CatBoost. (a) XGBoost dependence plot for r, (b) XGBoost dependence plot for H, (c) XGBoost dependence plot for C/S, (d) LightGBM dependence plot for r, (e) LightGBM dependence plot for H, (f) LightGBM dependence plot for C/S, (g) RF dependence plot for r, (h) RF dependence plot for H, (i) RF dependence plot for C/S, (j) CatBoost dependence plot for r, (k) CatBoost dependence plot for H, (l) CatBoost dependence plot for C/S.
Table 1. Design variable ranges.
Variable | Upper Bound | Lower Bound | Increment
H [m] | 15 | 5 | 1
r [m] | 15 | 5 | 1
C [USD] | 100 | 30 | 10
S [USD] | 1000 | 300 | 100
Table 2. Hyperparameters for the Ensemble Learning Models.
Model | Parameter | Value
Random Forest | Number of estimators | 100
Random Forest | Minimum samples for split | 2
Random Forest | Minimum samples of leaf node | 1
XGBoost | Number of estimators | 100
XGBoost | Learning rate | 0.3
XGBoost | Subsample ratio of the training instances | 1
XGBoost | Maximum depth of a tree | 6
LightGBM | Number of estimators | 100
LightGBM | Maximum number of decision leaves | 31
LightGBM | Maximum depth of a tree | −1 (no limit)
LightGBM | Learning rate | 0.1
CatBoost | Number of iterations | 1000
CatBoost | Learning rate | 0.05
CatBoost | Depth | 6
CatBoost | Bootstrap type | MVS
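For readers who want to reproduce a comparable setup, the entries of Table 2 can be collected as keyword arguments. The mapping below is a sketch based on the commonly used parameter names of the scikit-learn, XGBoost, LightGBM and CatBoost Python interfaces; the study does not state which interface or argument names were used, so the dictionary name `HYPERPARAMETERS` and the key names are assumptions.

```python
# Values are taken from Table 2; the keyword names are assumed to follow the
# Python APIs of the respective libraries.
HYPERPARAMETERS = {
    "RandomForest": {
        "n_estimators": 100,
        "min_samples_split": 2,
        "min_samples_leaf": 1,
    },
    "XGBoost": {
        "n_estimators": 100,
        "learning_rate": 0.3,
        "subsample": 1,
        "max_depth": 6,
    },
    "LightGBM": {
        "n_estimators": 100,
        "num_leaves": 31,
        "max_depth": -1,  # -1 means no depth limit
        "learning_rate": 0.1,
    },
    "CatBoost": {
        "iterations": 1000,
        "learning_rate": 0.05,
        "depth": 6,
        "bootstrap_type": "MVS",
    },
}

for model_name, params in HYPERPARAMETERS.items():
    print(model_name, params)
```

With the libraries installed, a model could then be created along the lines of `RandomForestRegressor(**HYPERPARAMETERS["RandomForest"])`.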
Table 3. Performances of the ensemble learning algorithms with the structural cost included in the training set. (See Appendix A for R2, RMSE, MAE, and MAPE).
Model | R2 | RMSE | MAE | MAPE
Train (LightGBM) | 0.9996 | 0.0037 | 0.0025 | 0.0064
Test (LightGBM) | 0.9993 | 0.0046 | 0.0031 | 0.0077
Train (RF) | 0.9999 | 0.0015 | 0.0005 | 0.0012
Test (RF) | 0.9996 | 0.0036 | 0.0012 | 0.0030
Train (XGBoost) | 0.9999 | 0.0019 | 0.0012 | 0.0032
Test (XGBoost) | 0.9994 | 0.0043 | 0.0026 | 0.0063
Train (CatBoost) | 0.9999 | 0.0011 | 0.0008 | 0.0022
Test (CatBoost) | 0.9999 | 0.0014 | 0.0010 | 0.0027
Table 4. Performances of the ensemble learning algorithms trained without the cost variable.
Model | R2 | RMSE | MAE | MAPE
Train (LightGBM) | 0.9999 | 1.86 × 10^−4 | 1.39 × 10^−4 | 3.86 × 10^−4
Test (LightGBM) | 0.9999 | 1.98 × 10^−4 | 1.50 × 10^−4 | 4.09 × 10^−4
Train (RF) | 0.9999 | 1.60 × 10^−5 | 6.40 × 10^−7 | 2.10 × 10^−6
Test (RF) | 0.9999 | 2.90 × 10^−5 | 1.20 × 10^−6 | 4.10 × 10^−6
Train (XGBoost) | 0.9999 | 4.00 × 10^−4 | 2.49 × 10^−4 | 7.76 × 10^−4
Test (XGBoost) | 0.9999 | 4.50 × 10^−4 | 2.74 × 10^−4 | 8.51 × 10^−4
Train (CatBoost) | 0.9999 | 1.10 × 10^−4 | 8.10 × 10^−5 | 2.35 × 10^−4
Test (CatBoost) | 0.9999 | 1.40 × 10^−4 | 9.45 × 10^−5 | 2.75 × 10^−4
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Bekdaş, G.; Cakiroglu, C.; Islam, K.; Kim, S.; Geem, Z.W. Optimum Design of Cylindrical Walls Using Ensemble Learning Methods. Appl. Sci. 2022, 12, 2165. https://doi.org/10.3390/app12042165
