Search Results (25)

Search Parameters:
Keywords = TPOT

22 pages, 11145 KiB  
Article
Regional Soil Moisture Estimation Leveraging Multi-Source Data Fusion and Automated Machine Learning
by Shenglin Li, Pengyuan Zhu, Ni Song, Caixia Li and Jinglei Wang
Remote Sens. 2025, 17(5), 837; https://doi.org/10.3390/rs17050837 - 27 Feb 2025
Viewed by 175
Abstract
Soil moisture (SM) monitoring in farmland at a regional scale is crucial for precision irrigation management and ensuring food security. However, existing methods for SM estimation encounter significant challenges related to accuracy, generalizability, and automation. This study proposes an integrated data fusion method to systematically assess the potential of three automated machine learning (AutoML) frameworks, namely the tree-based pipeline optimization tool (TPOT), AutoGluon, and H2O AutoML, in retrieving SM. To evaluate the impact of input variables on estimation accuracy, six input scenarios were designed: multispectral data (MS), thermal infrared data (TIR), MS combined with TIR, MS with auxiliary data, TIR with auxiliary data, and a comprehensive combination of MS, TIR, and auxiliary data. The research was conducted in a winter wheat cultivation area within the People’s Victory Canal Irrigation Area, focusing on the 0–40 cm soil layer. The results revealed that the scenario incorporating all data types (MS + TIR + auxiliary) achieved the highest retrieval accuracy. Under this scenario, all three AutoML frameworks demonstrated optimal performance. AutoGluon demonstrated superior performance in most scenarios, particularly excelling in the MS + TIR + auxiliary data scenario. It achieved the highest retrieval accuracy with a Pearson correlation coefficient (R) value of 0.822, root mean square error (RMSE) of 0.038 cm³/cm³, and relative root mean square error (RRMSE) of 16.46%. This study underscores the critical role of input data types and fusion strategies in enhancing SM estimation accuracy and highlights the significant advantages of AutoML frameworks for regional-scale SM retrieval. The findings offer a robust technical foundation and theoretical guidance for advancing precision irrigation management and efficient SM monitoring.
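The three accuracy metrics named in this abstract (R, RMSE, RRMSE) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, and RRMSE is assumed here to be RMSE divided by the mean of the observations, expressed as a percentage, which is a common definition but is not confirmed by the abstract.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def rrmse_percent(observed, predicted):
    """Relative RMSE in percent (assumed definition: RMSE / mean(observed) * 100)."""
    mean_o = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_o


# Made-up soil moisture values in cm³/cm³, for illustration only.
sm_observed = [0.20, 0.25, 0.30]
sm_predicted = [0.21, 0.26, 0.31]
```

Under this assumed definition, RRMSE expresses the error relative to the typical magnitude of the observed values, which makes it easier to compare across sites with different moisture ranges.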
Figures
Figure 1. Study area and distribution of sampling sites. Triangular markers indicate the locations of soil moisture (SM) sampling points. (a) Location of Henan Province, China; (b) Digital Elevation Model (DEM) of Henan Province and the location of the study area in Henan Province; (c) DEM of the study area; (d) Land cover classification map of the study area and the locations of sampling points.
Figure 2. Flowchart showing overall methodology for soil moisture (SM) estimation.
Figure 3. Statistical distribution of the full dataset, training set, and testing set.
Figure 4. Statistical indicators of soil moisture estimation accuracy under six input scenarios, including R, RMSE, and RRMSE.
Figure 5. Box plot illustrating the error distribution of the three AutoML algorithms under different scenarios.
Figure 6. Scatter plot of the prediction results from three AutoML algorithms using SC6 (MS + TIR + auxiliary) as the input variables.
Figure 7. Spatial and temporal distribution maps of soil moisture (SM).
Figure 8. Distribution maps of soil moisture (SM) estimation using AutoGluon, TPOT, and H2O AutoML for 21 March 2015 and 3 April 2015. The first column represents 21 March 2015, and the second column represents 3 April 2015.
30 pages, 4440 KiB  
Article
Simplatab: An Automated Machine Learning Framework for Radiomics-Based Bi-Parametric MRI Detection of Clinically Significant Prostate Cancer
by Dimitrios I. Zaridis, Vasileios C. Pezoulas, Eugenia Mylona, Charalampos N. Kalantzopoulos, Nikolaos S. Tachos, Nikos Tsiknakis, George K. Matsopoulos, Daniele Regge, Nikolaos Papanikolaou, Manolis Tsiknakis, Kostas Marias and Dimitrios I. Fotiadis
Bioengineering 2025, 12(3), 242; https://doi.org/10.3390/bioengineering12030242 - 26 Feb 2025
Viewed by 307
Abstract
Background: Prostate cancer (PCa) diagnosis using MRI is often challenged by lesion variability. Methods: This study introduces Simplatab, an open-source automated machine learning (AutoML) framework designed for, but not limited to, automating the entire machine learning pipeline to facilitate the detection of clinically significant prostate cancer (csPCa) using radiomics features. Unlike existing AutoML tools such as Auto-WEKA, Auto-Sklearn, ML-Plan, ATM, Google AutoML, and TPOT, Simplatab offers a comprehensive, user-friendly framework that integrates data bias detection, feature selection, model training with hyperparameter optimization, explainable AI (XAI) analysis, and post-training model vulnerability detection. Simplatab requires no coding expertise, provides detailed performance reports, and includes robust data bias detection, making it particularly suitable for clinical applications. Results: Evaluated on a large pan-European cohort of 4816 patients from 12 clinical centers, Simplatab supports multiple machine learning algorithms. The most notable features that differentiate Simplatab include ease of use, a user interface accessible to those with no coding experience, comprehensive reporting, XAI integration, and thorough bias assessment, all provided in a human-understandable format. Conclusions: Our findings indicate that Simplatab can significantly enhance the usability, accountability, and explainability of machine learning in clinical settings, thereby increasing trust and accessibility for AI non-experts.
Figures
Figure 1. Schematic representation of Simplatab AutoML framework.
Figure 2. (A) Desktop app, (B) introduction page, (C) introduction page for individuals with vision impairment, and (D) the parameter selection from the front-end.
Figure 3. Bias assessment using nine metrics with respect to different MR vendors (Siemens, Philips, General Electric, and Toshiba) and target class (csPCa) for the retrospective and the prospective sets.
Figure 4. AUC-ROC (left) and precision–recall curves (right) for the prospective dataset.
Figure 5. Heatmap plot with the SHAP values for each feature ordered by importance, correlated with the XGBoost outcome, for the external dataset.
Figure 6. Feature importance in the XGBoost model, for the external dataset.
Figure A1. Data bias detection by client age group.
Figure A2. Heatmap with the SHAP values (left) and importance of each feature for model decision (right) of the XGBoost model.
Figure A3. Data bias detection by customer gender (male/female).
Figure A4. Heatmap with the SHAP values (left) and the importance of each feature for model decision (right) for the XGBoost model.
21 pages, 7635 KiB  
Article
Developing an Hourly Water Level Prediction Model for Small- and Medium-Sized Agricultural Reservoirs Using AutoML: Case Study of Baekhak Reservoir, South Korea
by Jeongho Han and Joo Hyun Bae
Agriculture 2025, 15(1), 71; https://doi.org/10.3390/agriculture15010071 - 30 Dec 2024
Viewed by 698
Abstract
This study focuses on developing an hourly water level prediction model for small- and medium-sized agricultural reservoirs using the Tree-based Pipeline Optimization Tool (TPOT), an automated machine learning (AutoML) technique. The study area is the Baekhak Reservoir in South Korea, and various precipitation-related and reservoir water storage data were collected. Using these collected data, we compared widely used individual machine learning and deep learning models with the pipeline models generated by TPOT. The comparison showed that pipeline models, which included various preprocessing and ensemble techniques, exhibited higher predictive accuracy than individual machine learning and even deep learning models. The optimal pipeline model was evaluated for its performance in predicting water levels during an extreme rainfall event, demonstrating its effectiveness for hourly water level prediction. However, issues such as the overprediction of peak water levels and delays in predicting sudden water level changes were observed, likely due to inaccuracies in the ultra-short-term forecast precipitation data and the lack of information on reservoir operations (e.g., gate openings and drainage plans for agriculture). This study highlights the potential of AutoML techniques for use in hydrological modeling, and demonstrates their contribution to more efficient water management and flood prevention strategies in agricultural reservoirs.
Figures
Figure 1. (a) Distribution of agricultural reservoirs managed by the Korea Rural Community Corporation in South Korea and (b) location of Baekhak Reservoir, Baekhak Disaster Prevention Weather Station, and the central points of high-resolution gridded precipitation data used for model development.
Figure 2. Schematic illustration of the process for determining the best-performing model by comparing pipeline models generated by TPOT with individually trained machine learning models.
Figure 3. Methods for splitting training and test data for each dataset according to the cross-validation method. (a) Long-term datasets (1 January 2010–22 July 2024, n = 25,999). (b) Long-term datasets (1 January 2010–22 July 2024, n = 117,414).
Figure 4. Time series variations of reservoir water levels and precipitation in each dataset. The black dotted lines in each plot represent the full water level (50 m). (a) Short-term datasets (30 June 2021–22 July 2024). (b) Short-term datasets (30 June 2021–22 July 2024).
Figure 5. Scatter plots showing the relationship between the 1 h-ahead water level (m) and each feature on the short-term dataset. In each plot, the y-axis represents the 1 h-ahead water level and the x-axis corresponds to the specific feature indicated in the plot title.
Figure 6. Comparison graph of 15-fold cross-validation results. (a) Short-term datasets (30 June 2021–22 July 2024). (b) Long-term datasets (1 January 2010–22 July 2024).
Figure 7. Temporal variation of predicted and observed reservoir water level during the testing period. The blue dotted lines denote the full water level of 50 m of the Baekhak Reservoir. In the legend in each graph, ‘Pecip.’ denotes precipitation, and ‘W.L.’ denotes the water level. ‘Forecast Precip.’ denotes the ultra-short-term forecast precipitation. Green circles highlight where distinct temporal misalignments occurred; these increased as the lead time increased.
Figure 8. Comparison of predicted and observed water levels (lead time = 3 h) based on different input features.
14 pages, 8478 KiB  
Article
Estimating Rainfall Erosivity in North Korea Using Automated Machine Learning: Insights into Regional Soil Erosion Risks
by Jeongho Han and Seoro Lee
Land 2024, 13(12), 2038; https://doi.org/10.3390/land13122038 - 28 Nov 2024
Viewed by 571
Abstract
Soil erosion due to rainfall is a critical environmental issue in North Korea, exacerbated by deforestation and climate change. This study aims to estimate rainfall erosivity (RE) in North Korea using automated machine learning (AutoML), with a particular focus on regional soil erosion risks. North Korean data were sourced from the European Centre for Medium-Range Weather Forecasts (ECMWF) ReAnalysis 5 dataset, while South Korean data were obtained from the Korea Meteorological Administration. Data from 50 stations in South Korea (2013–2019) and 27 stations in North Korea (1980–2020) were used. The GradientBoostingRegressor (GBR) model, optimized using the Tree-based Pipeline Optimization Tool (TPOT), was trained on South Korean data. The model’s performance was evaluated using metrics such as the root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²), achieving high predictive accuracy across eight stations in South Korea. Using the optimized model, RE in North Korea was estimated, and the spatial distribution of RE was analyzed using Kriging interpolation. Results reveal significant regional variability, with the southern and western areas displaying the highest erosivity. These findings provide valuable insights into soil erosion management and the development of sustainable agricultural and environmental strategies in North Korea.
(This article belongs to the Section Land, Soil and Water)
Figures
Figure 1. The geographical extent of the study area, covering North Korea and its proximity to South Korea, along with the locations of meteorological stations used in this study.
Figure 2. Annual precipitation and average rainfall intensity of North Korea with their trends during the study period.
Figure 3. Changes in cross-validation accuracy over generations during the TPOT model pipeline optimization.
Figure 4. Scatter plot of estimated monthly RE from the best TPOT model and monthly RE at eight test stations in South Korea (the solid line is the 1:1 line and the dashed lines are the regression lines).
Figure 5. Bar chart showing the correlation (r) values between input features and RE for each weather station used for testing.
Figure 6. Spatial distribution of average annual RE, average annual rainfall, and CV across North Korea.
17 pages, 4057 KiB  
Article
A Comparative Analysis of Automated Machine Learning Tools: A Use Case for Autism Spectrum Disorder Detection
by Rana Tuqeer Abbas, Kashif Sultan, Muhammad Sheraz and Teong Chee Chuah
Information 2024, 15(10), 625; https://doi.org/10.3390/info15100625 - 11 Oct 2024
Cited by 1 | Viewed by 1186
Abstract
Automated Machine Learning (AutoML) enhances productivity and efficiency by automating the entire process of machine learning model development, from data preprocessing to model deployment. These tools are accessible to users with varying levels of expertise and enable efficient, scalable, and accurate classification across different applications. This paper evaluates two popular AutoML tools, the Tree-Based Pipeline Optimization Tool (TPOT) version 0.10.2 and Konstanz Information Miner (KNIME) version 5.2.5, comparing their performance in a classification task. Specifically, this work analyzes autism spectrum disorder (ASD) detection in toddlers as a use case. The dataset for ASD detection was collected from various rehabilitation centers in Pakistan. TPOT and KNIME were applied to the ASD dataset, with TPOT achieving an accuracy of 85.23% and KNIME achieving 83.89%. Evaluation metrics such as precision, recall, and F1-score validated the reliability of the models. After selecting the best models with optimal accuracy, the most important features for ASD detection were identified using these AutoML tools. The tools optimized the feature selection process and significantly reduced diagnosis time. This study demonstrates the potential of AutoML tools and feature selection techniques to improve early ASD detection and outcomes for affected children and their families.
(This article belongs to the Special Issue Real-World Applications of Machine Learning Techniques)
Figures
Figure 1. Workflow of the proposed system using AutoML tools.
Figure 2. Distribution of autistic vs. non-autistic children in the dataset.
Figure 3. Machine learning pipeline automated by TPOT.
Figure 4. KNIME’s workflow and functionality.
Figure 5. Classification report of the model generated by TPOT.
Figure 6. Confusion matrix of the model generated by TPOT.
Figure 7. ROC curve for the model performance optimized using TPOT.
Figure 8. Important features identified by the TPOT model.
Figure 9. KNIME’s AutoML workflow for model optimization.
Figure 10. AutoML’s summary view in KNIME, showing the best model.
Figure 11. Confusion matrix of the model generated by KNIME.
Figure 12. ROC curve for model performance in KNIME.
Figure 13. Random Forest workflow in KNIME for identifying important features.
Figure 14. Important features identified by KNIME model.
31 pages, 1004 KiB  
Article
Daily Streamflow Forecasting Using AutoML and Remote-Sensing-Estimated Rainfall Datasets in the Amazon Biomes
by Matteo Bodini
Signals 2024, 5(4), 659-689; https://doi.org/10.3390/signals5040037 - 10 Oct 2024
Viewed by 1450
Abstract
Reliable streamflow forecasting is crucial for several tasks related to water-resource management, including planning reservoir operations, power generation via Hydroelectric Power Plants (HPPs), and flood mitigation, thus resulting in relevant social implications. The present study is focused on the application of Automated Machine-Learning (AutoML) models to forecast daily streamflow in the area of the upper Teles Pires River basin, located in the region of the Amazon biomes. The latter area is characterized by extensive water-resource utilization, mostly for power generation through HPPs, and it has a limited hydrological data-monitoring network. Five different AutoML models were employed to forecast the streamflow daily, i.e., auto-sklearn, Tree-based Pipeline Optimization Tool (TPOT), H2O AutoML, AutoKeras, and MLBox. The AutoML input features were set as the time-lagged streamflow and average rainfall data sourced from four rain gauge stations and one streamflow gauge station. To overcome the lack of training data, in addition to the previous features, products estimated via remote sensing were leveraged as training data, including PERSIANN, PERSIANN-CCS, PERSIANN-CDR, and PDIR-Now. The selected AutoML models proved their effectiveness in forecasting the streamflow in the considered basin. In particular, the reliability of streamflow predictions was high both in the case when training data came from rain and streamflow gauge stations and when training data were collected by the four previously mentioned estimated remote-sensing products. Moreover, the selected AutoML models showed promising results in forecasting the streamflow up to a three-day horizon, relying on the two available kinds of input features. As a final result, the present research underscores the potential of employing AutoML models for reliable streamflow forecasting, which can significantly advance water-resource planning and management within the studied geographical area.
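The idea of using time-lagged streamflow and average rainfall as input features, with a forecast horizon of up to three days, can be illustrated with a small Python sketch. The function name, the window length `n_lags`, and the exact feature layout are assumptions made for illustration; the abstract does not specify how the lagged inputs were arranged.

```python
def make_lagged_samples(streamflow, rainfall, n_lags, horizon):
    """Build (features, target) pairs from two aligned daily series.

    features: the last `n_lags` streamflow values followed by the last
              `n_lags` rainfall values, up to and including day t.
    target:   the streamflow `horizon` days after day t.
    This layout is a hypothetical choice, not the paper's confirmed setup.
    """
    if len(streamflow) != len(rainfall):
        raise ValueError("series must have the same length")
    samples = []
    for t in range(n_lags - 1, len(streamflow) - horizon):
        window = slice(t - n_lags + 1, t + 1)
        features = list(streamflow[window]) + list(rainfall[window])
        samples.append((features, streamflow[t + horizon]))
    return samples


# Made-up values, for illustration only.
daily_flow = [1.0, 2.0, 3.0, 4.0, 5.0]
daily_rain = [10.0, 20.0, 30.0, 40.0, 50.0]
one_day_ahead = make_lagged_samples(daily_flow, daily_rain, n_lags=2, horizon=1)
```

Each resulting pair could then be passed to any regression model or AutoML framework; increasing `horizon` to 2 or 3 gives the longer lead times mentioned in the abstract.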
(This article belongs to the Special Issue Rainfall Estimation Using Signals)
Figures
Figure 1. Map representation of the upper Teles Pires River basin. On the left side, a map of the state of Brazil delineates the boundaries of all its 27 federal states. In particular, the Teles Pires River basin extends across the states of Mato Grosso and Pará. On the right side, both the entire basin and the upper basin of the Teles Pires River are represented, where the latter one is reported with latitude and longitude coordinates. The figure reported by Oliveira et al. [30] under the terms of the Creative Commons Attribution License CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/, accessed on 1 September 2024).
Figure 2. Characterization map of the upper Teles Pires River basin reporting a digital elevation model. The reported altitudes range from a minimum altitude of 272 m to a maximum of 895 m and are color-coded according to the reported legend on the left side. The entire drainage network, fluviometric, pluviometric, and meteorological stations are reported on the map. Figure adapted from Oliveira et al. [30] under the terms of the Creative Commons Attribution License CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/, accessed on 1 September 2024).
Figure 3. The Figure reports the average streamflow $\bar{Q}_d$ (in m³ s⁻¹) recorded at the Teles Pires Fluviometric Station identified with code 017210000 (Latitude: −12.67° and Longitude: −55.79°) in Table 1 over the period from January 1985 to November 2023. The data shows significant seasonal fluctuations over months. Indeed, as reported in Section 3.1, peak rainfall usually occurs from October to April (rainy season), resulting in higher streamflow during such months, while the lowest precipitation period is from May to September (dry season), thus leading to lower induced streamflow (refer to Section 1 to deepen the relationship between rainfall and river flows).
Figure 4. The top four subplots in the Figure report the rainfall data collected from the four rain gauge stations listed in Table 1. Each subplot shows the total rainfall over time for the respective station on the y-axis. The final subplot displays the computed average daily rainfall, $\bar{P}_d$, in red, calculated using the Thiessen polygon method. The data spans from January 1985 to November 2023, with the x-axis representing the measurement years for all the subplots. Additional details from Table 1, such as station names, types, and geographical coordinates, were also reported.
Figure 5. The four subplots in the Figure report the average rainfall data $\bar{P}_d$ computed from the four remote-sensing products, listed in Table 2. Each subplot shows the average rainfall over time for the respective remote-sensing product on the y-axis. The time-spans for the computed averages are reported in the title of each subplot, with the x-axis representing the measurement years.
Figure 6. The figure reports the forecasting performance gained by AutoML models for each selected metric, over all the employed features. The figure reports the test set forecasting performance through five separate subplots, each corresponding to a different performance metric, previously described in Section 3.4. Results are displayed by providing summary statistics in each subplot through box plots, color-coded with different colors, each representing a different time-lag (refer to the legends).
Figure 7. The figure reports the forecasting performance gained for each metric when each selected feature was used as input over all the employed AutoML models. The figure reports the test set forecasting performance through five separate subplots, each corresponding to a different performance metric, previously described in Section 3.4. Results are displayed by providing summary statistics in each subplot through box plots, color-coded with different colors, each representing a different time-lag (refer to the legends).
Figure 8. Average streamflow observed data and predictions obtained from the top performing AutoML model for each input feature. In the figure, AutoML models were evaluated on the data contained in the respective test set kept for each feature set (refer to Table 3) and for each selected time-lag $l$ ($1 \le l \le 3$). The latter data were unseen by the trained AutoML models for all the selected input features. The reported top performing AutoML models and respective input features according to the $\mathrm{AutoML}_{\mathrm{score}}^{\mathrm{T}}$ metric were H2O AutoML for Thiessen, auto-sklearn for PERSIANN, H2O AutoML for PERSIANN-CCS, AutoKeras for PERSIANN-CDR, and auto-sklearn for PDIR-Now. Observed average streamflow data were reported in red color, while time-series data predicted from AutoML models were reported with blue dashed lines. Horizontal thick black lines define the boundaries outside which predictions were not computed (since data were contained in the training set or were not available for the considered feature set).
16 pages, 1777 KiB  
Article
Metabolomics Biomarker Discovery to Optimize Hepatocellular Carcinoma Diagnosis: Methodology Integrating AutoML and Explainable Artificial Intelligence
by Fatma Hilal Yagin, Radwa El Shawi, Abdulmohsen Algarni, Cemil Colak, Fahaid Al-Hashem and Luca Paolo Ardigò
Diagnostics 2024, 14(18), 2049; https://doi.org/10.3390/diagnostics14182049 - 15 Sep 2024
Cited by 1 | Viewed by 1471
Abstract
Background: This study aims to assess the efficacy of combining automated machine learning (AutoML) and explainable artificial intelligence (XAI) in identifying metabolomic biomarkers that can differentiate between hepatocellular carcinoma (HCC) and liver cirrhosis in patients with hepatitis C virus (HCV) infection. Methods: We investigated publicly accessible data encompassing HCC patients and cirrhotic controls. The TPOT tool, which is an AutoML tool, was used to optimize the preparation of features and data, as well as to select the most suitable machine learning model. The TreeSHAP approach, which is a type of XAI, was used to interpret the model by assessing each metabolite’s individual contribution to the categorization process. Results: TPOT had superior performance in distinguishing between HCC and cirrhosis compared to the other AutoML approaches (AutoSKlearn and H2O AutoML) and to traditional machine learning models such as random forest, support vector machine, and k-nearest neighbor. The TPOT technique attained an AUC value of 0.81, showcasing superior accuracy, sensitivity, and specificity in comparison to the other models. Key metabolites, including L-valine, glycine, and DL-isoleucine, were identified as essential by TPOT and subsequently verified by TreeSHAP analysis. TreeSHAP provided a comprehensive explanation of the contribution of these metabolites to the model’s predictions, thereby increasing the interpretability and dependability of the results. This thorough assessment highlights the strength and reliability of the AutoML framework in the development of clinical biomarkers. Conclusions: This study shows that AutoML and XAI can be used together to create metabolomic biomarkers that are specific to HCC. The exceptional performance of TPOT in comparison to traditional models highlights its capacity to identify biomarkers. Furthermore, TreeSHAP boosted model transparency by highlighting the relevance of certain metabolites. This comprehensive method has the potential to enhance the identification of biomarkers and generate precise, easily understandable, AI-driven solutions for diagnosing HCC.
Figures
Figure 1. A diagram of the proposed method in the current research.
Figure 2. Nemenyi Test (α = 0.05) comparing the AUC of testing data for AutoML techniques and traditional machine learning techniques.
Figure 3. Feature importance ranking based on SHAP values.
Figure 4. SHAP waterfall plot for a representative true positive sample.
Figure 5. SHAP waterfall plot for a representative true negative sample.
Figure 6. Partial dependence plot of L-valine 1 showing its SHAP value and interaction with 2,3-butanediol 2.
20 pages, 1896 KiB  
Article
Prediction of Endocrine-Disrupting Chemicals Related to Estrogen, Androgen, and Thyroid Hormone (EAT) Modalities Using Transcriptomics Data and Machine Learning
by Guillaume Ollitrault, Marco Marzo, Alessandra Roncaglioni, Emilio Benfenati, Enrico Mombelli and Olivier Taboureau
Toxics 2024, 12(8), 541; https://doi.org/10.3390/toxics12080541 - 26 Jul 2024
Viewed by 1828
Abstract
Endocrine-disrupting chemicals (EDCs) are chemicals that can interfere with homeostatic processes. They are a major concern for public health, and they can cause adverse long-term effects such as cancer, intellectual impairment, obesity, diabetes, and male infertility. The endocrine system is a complex system, with the estrogen (E), androgen (A), and thyroid hormone (T) modes of action being of major importance. In this context, the availability of in silico models for the rapid detection of hazardous chemicals is an effective contribution to toxicological assessments. We developed Qualitative Gene expression Activity Relationship (QGexAR) models to predict the propensities of chemically induced disruption of EAT modalities. We gathered gene expression profiles from the LINCS database measured in two cell lines, i.e., MCF7 (breast cancer) and A549 (adenocarcinomic human alveolar basal epithelial). We optimized our prediction protocol by testing different feature selection methods and classification algorithms, including CATBoost, XGBoost, Random Forest, SVM, Logistic regression, AutoKeras, TPOT, and deep learning models. For each EAT endpoint, the final prediction was a consensus of the best models obtained for each cell line. With the available data, we were able to develop predictive models for estrogen receptor binding, androgen receptor binding, and thyroid hormone receptor antagonistic effects, with a consensus balanced accuracy on a validation set ranging from 0.725 to 0.840. The importance of each predictive feature was further assessed to identify known genes and suggest new genes potentially involved in the mechanisms of action of EAT perturbation.
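The abstract above reports balanced accuracy rather than plain accuracy. The following minimal Python sketch shows how the two can differ on an imbalanced validation set. The labels and predictions are hypothetical illustration data, not values from the study, and the function is a generic definition rather than the authors' code.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = active, 0 = inactive).

    Assumes both classes occur in y_true; otherwise a division by zero is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical validation set: 2 active and 8 inactive chemicals.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.6875
```

Because each class contributes equally, missing one of the two actives lowers the balanced accuracy much more than the plain accuracy.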
(This article belongs to the Collection Predictive Toxicology)
Figure 1. Protocol for the development and evaluation of predictive models using transcriptomics data. Endocrine disruptors activity related to ER, AR and TR were retrieved from CERAPP [19], CoMPARA [20], and Gadaleta et al. [35] respectively. Transcriptomics data were retrieved from L1000 [36].
Figure 2. UMAP of the transcriptomics profiles for each chemical in each dataset for the two cell lines. Each dot corresponds to a chemical. Blue dots correspond to chemicals in the MCF7 dataset and red to A549. Only landmark genes were considered. (a) Profile for the ER binding dataset. (b) Profile for the AR binding dataset. (c) Profile for the TR antagonist dataset.
Figure 3. UMAP of the transcriptomics profiles using the landmark genes for each chemical in each dataset for the two cell lines. In orange are the active chemicals, and in blue are the inactive chemicals, considering the specific endpoint. (a) Profile for the ER binding dataset. (b) Profile for the AR binding dataset. (c) Profile for the TR antagonist dataset.
Figure 4. Mean BA from cross-validation (BA-CV) with standard deviation (BA-CV sd) after 10 iterations of five-fold cross-validation on the training set for all the cell lines and all the models for all the endpoint datasets using the landmark features selected by the multiSURF algorithm.
Figure 5. Predictive performance of the top-performing models retained for each cell line across the three endpoints on the validation set was determined by varying the cosine similarity thresholds from 0 (not similar) to 1 (similar). The x-axis denotes the cosine similarity threshold utilized for inclusion of the transcriptomics profile of the validation set in the performance calculation. Cosine similarity is determined by averaging the similarity scores of the three most similar transcriptomic profiles, utilizing landmark and best-inferred genes. The solid lines indicate BA, whereas the bars represent the coverage of the validation set.
15 pages, 1560 KiB  
Article
ML-Based Detection of DDoS Attacks Using Evolutionary Algorithms Optimization
by Fauzia Talpur, Imtiaz Ali Korejo, Aftab Ahmed Chandio, Ali Ghulam and Mir. Sajjad Hussain Talpur
Sensors 2024, 24(5), 1672; https://doi.org/10.3390/s24051672 - 5 Mar 2024
Cited by 7 | Viewed by 3778
Abstract
The escalating reliance of modern society on information and communication technology has rendered it vulnerable to an array of cyber-attacks, with distributed denial-of-service (DDoS) attacks emerging as one of the most prevalent threats. This paper delves into the intricacies of DDoS attacks, which exploit compromised machines numbering in the thousands to disrupt data services and online commercial platforms, resulting in significant downtime and financial losses. Recognizing the gravity of this issue, various detection techniques have been explored, yet the quantity and prior detection of DDoS attacks has seen a decline in recent methods. This research introduces an innovative approach by integrating evolutionary optimization algorithms and machine learning techniques. Specifically, the study proposes XGB-GA Optimization, RF-GA Optimization, and SVM-GA Optimization methods, employing Evolutionary Algorithms (EAs) Optimization with Tree-based Pipeline Optimization Tool (TPOT)-Genetic Programming. Datasets pertaining to DDoS attacks were utilized to train machine learning models based on XGB, RF, and SVM algorithms, and 10-fold cross-validation was employed. The models were further optimized using EAs, achieving remarkable accuracy scores: 99.99% with the XGB-GA method, 99.50% with RF-GA, and 99.99% with SVM-GA. Furthermore, the study employed TPOT to identify the optimal algorithm for constructing a machine learning model, with the genetic algorithm pinpointing XGB-GA as the most effective choice. This research significantly advances the field of DDoS attack detection by presenting a robust and accurate methodology, thereby enhancing the cybersecurity landscape and fortifying digital infrastructures against these pervasive threats.
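As a rough illustration of the evolutionary optimization idea mentioned in the abstract above, the following Python sketch runs a toy genetic algorithm with selection, crossover, mutation, and elitism. It is not TPOT's genetic programming and not the authors' XGB-GA method. The `fitness` function is a hypothetical stand-in for a cross-validated model score, and the two-value genome stands in for a pair of hyperparameters.

```python
import random


def fitness(genome):
    """Toy objective standing in for a cross-validated score; it peaks at (3, 7)."""
    x, y = genome
    return -((x - 3) ** 2 + (y - 7) ** 2)


def evolve(generations=30, population_size=20, seed=0):
    """Return the best genome found and the best fitness seen at each generation."""
    rng = random.Random(seed)
    population = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(population_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: population_size // 2]  # truncation selection
        children = [population[0]]  # elitism: the best genome survives unchanged
        while len(children) < population_size:
            a, b = rng.sample(parents, 2)
            child = (a[0], b[1])  # crossover: first gene from a, second from b
            if rng.random() < 0.3:  # mutation: small Gaussian perturbation
                child = (child[0] + rng.gauss(0, 0.5), child[1] + rng.gauss(0, 0.5))
            children.append(child)
        population = children
    return max(population, key=fitness), history


best, history = evolve()
print(best, fitness(best))
```

Because the best genome is always carried over, the recorded best fitness never decreases from one generation to the next. In a real setting, each fitness evaluation would train and cross-validate a model, which is what makes such searches expensive.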
(This article belongs to the Section Intelligent Sensors)
Figure 1. TPOT: Searching Pipelines Optimization with Genetic Algorithms.
Figure 2. The components in a typical pipeline that are being examined by a Data Scientist are highlighted in yellow on both the right and left sides. The highlighted portion in the middle is an indication of the search for the optimal ML pipeline which is performed by TPOT.
Figure 3. The framework of the proposed DDoS diagnosis procedure.
Figure 4. ROC(AUC) performance of a classification model.
18 pages, 1554 KiB  
Article
Towards Cleaner Ports: Predictive Modeling of Sulfur Dioxide Shipping Emissions in Maritime Facilities Using Machine Learning
by Carlos D. Paternina-Arboleda, Dayana Agudelo-Castañeda, Stefan Voß and Shubhendu Das
Sustainability 2023, 15(16), 12171; https://doi.org/10.3390/su151612171 - 9 Aug 2023
Cited by 9 | Viewed by 2295
Abstract
Maritime ports play a pivotal role in fostering the growth of domestic and international trade and economies. As ports continue to expand in size and capacity, the impact of their operations on air quality and climate change becomes increasingly significant. While nearby regions may experience economic benefits, there are significant concerns regarding the emission of atmospheric pollutants, which have adverse effects on both human health and climate change. Predictive modeling of port emissions can serve as a valuable tool in identifying areas of concern, evaluating the effectiveness of emission reduction strategies, and promoting sustainable development within ports. The primary objective of this research is to utilize machine learning frameworks to estimate the emissions of SO2 from ships during various port activities, including hoteling, maneuvering, and cruising. By employing these models, we aim to gain insights into the emission patterns and explore strategies to mitigate their impact. Through our analysis, we have identified the most effective models for estimating SO2 emissions. The AutoML TPOT framework emerged as the top-performing model, followed by Non-Linear Regression with interaction effects. Linear Regression exhibited the lowest performance among the models evaluated. By employing these advanced machine learning techniques, we aim to contribute to the body of knowledge surrounding port emissions and foster sustainable practices within the maritime industry.
(This article belongs to the Special Issue Sustainability in Logistics and Supply Chain Management)
Figure 1. Data after transformation (example); Methods and Procedures.
Figure 2. Preview of dataset after label encoding.
Figure 3. Methodology for predicting emission inventories. Adapted from [34].
15 pages, 342 KiB  
Article
The Imbalanced Classification of Fraudulent Bank Transactions Using Machine Learning
by Alexey Ruchay, Elena Feldman, Dmitriy Cherbadzhi and Alexander Sokolov
Mathematics 2023, 11(13), 2862; https://doi.org/10.3390/math11132862 - 26 Jun 2023
Cited by 4 | Viewed by 4335
Abstract
This article studies the development of a reliable AI model to detect fraudulent bank transactions, including money laundering, and illegal activities with goods and services. The proposed machine learning model uses the CreditCardFraud dataset and utilizes multiple algorithms with different parameters. The results are evaluated using Accuracy, Precision, Recall, F1 score, and IBA. We have increased the reliability of the imbalanced classification of fraudulent credit card transactions in comparison to the best known results by applying the Tomek links resampling algorithm to the imbalanced CreditCardFraud dataset. The reliability of the results, using the proposed model based on the TPOT and RandomForest algorithms, has been confirmed by using 10-fold cross-validation. On this dataset, the accuracy of the proposed model in detecting fraudulent bank transactions reaches 99.99%.
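The Tomek links resampling named in the abstract above can be sketched in plain Python. A Tomek link is a pair of samples from different classes that are each other's nearest neighbor, and a common undersampling variant drops the majority-class member of each pair. The brute-force search, the function names, and the six 2-D points below are illustrative assumptions, not the authors' implementation or data.

```python
import math


def nearest_neighbor(points, i):
    """Index of the point closest to points[i] (Euclidean distance), excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )


def tomek_links(points, labels):
    """Pairs (i, j) with i < j that are mutual nearest neighbors with different labels."""
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


def remove_majority_in_links(points, labels, majority_label):
    """Undersample by dropping the majority-class member of every Tomek link."""
    drop = set()
    for i, j in tomek_links(points, labels):
        drop.add(i if labels[i] == majority_label else j)
    kept = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in kept], [labels[k] for k in kept]


# Hypothetical data: label 0 = legitimate (majority), label 1 = fraudulent (minority).
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0), (5.2, 5.0)]
labels = [0, 0, 0, 1, 0, 0]

print(tomek_links(points, labels))  # [(2, 3)]
new_points, new_labels = remove_majority_in_links(points, labels, majority_label=0)
print(new_labels)                   # [0, 0, 1, 0, 0]
```

The brute-force neighbor search is quadratic in the number of samples, so a dataset of realistic size would need a spatial index or a library implementation.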
(This article belongs to the Special Issue Mathematics and Financial Economics)
Figure 1. A flowchart of the proposed model.
Figure 2. The data normalization reliability on a test dataset using metrics accuracy and IBA.
Figure 3. The results of classification method reliability on the test dataset; Precision, Recall, and F1 Score are specified for transaction class “fraudulent”.
Figure 4. The results of classification methods’ reliability [18,19,20,21,22,23,24,25,26,27,28] on a test dataset using metrics Accuracy, Precision, Recall, F1 Score, and IBA. Metrics’ precision, recall, and F1 Score are specified for transaction class “fraudulent”. We mark the best results in bold.
Figure 5. The 10-fold cross-validation of the Accuracy score and the IBA metric for RandomForestClassifier, LGBMClassifier, XGBClassifier, CatBoostClassifier, and TPOT. Mean: average of various metrics; SD (×10⁻⁴): standard deviation of various metrics.
13 pages, 1629 KiB  
Article
Predicting Hepatotoxicity Associated with Low-Dose Methotrexate Using Machine Learning
by Qiaozhi Hu, Hualing Wang and Ting Xu
J. Clin. Med. 2023, 12(4), 1599; https://doi.org/10.3390/jcm12041599 - 17 Feb 2023
Cited by 4 | Viewed by 2860
Abstract
An accurate prediction of the hepatotoxicity associated with low-dose methotrexate can provide evidence for a reasonable treatment choice. This study aimed to develop a machine learning-based prediction model to predict hepatotoxicity associated with low-dose methotrexate and explore the associated risk factors. Eligible patients with immune system disorders who received low-dose methotrexate at West China Hospital between 1 January 2018 and 31 December 2019 were enrolled. A retrospective review of the included patients was conducted. Risk factors were selected from multiple patient characteristics, including demographics, admissions, and treatments. Eight algorithms, including eXtreme Gradient Boosting (XGBoost), AdaBoost, CatBoost, Gradient Boosting Decision Tree (GBDT), Light Gradient Boosting Machine (LightGBM), Tree-based Pipeline Optimization Tool (TPOT), Random Forest (RF), and Artificial Neural Network (ANN), were used to establish the prediction model. A total of 782 patients were included, and hepatotoxicity was detected in 35.68% (279/782) of the patients. The Random Forest model with the best predictive capacity was chosen to establish the prediction model (receiver operating characteristic curve 0.97, accuracy 64.33%, precision 50.00%, recall 32.14%, and F1 39.13%). Among the 15 risk factors, body mass index had the highest importance score (0.237), followed by age (0.198), the number of drugs (0.151), and the number of comorbidities (0.144). These factors demonstrated their importance in predicting hepatotoxicity associated with low-dose methotrexate. Using machine learning, this novel study established a predictive model for low-dose methotrexate-related hepatotoxicity. The model can improve medication safety in patients taking methotrexate in clinical practice.
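The F1 value quoted in the abstract above can be checked against the reported precision and recall, since F1 is their harmonic mean. The short Python sketch below does only that arithmetic, starting from the rounded percentages in the abstract. It does not reconstruct the underlying confusion matrix.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision 50.00% and recall 32.14%, as reported in the abstract.
f1 = f1_score(0.5000, 0.3214)
print(round(100 * f1, 2))  # 39.13
```

The result matches the reported F1 of 39.13%, so the three figures are mutually consistent.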
Figure 1. The flow chart illustrating patient selection.
Figure 2. Visual presentation of eight machine learning models (A) the precision-recall curve, and (B) the receiver operating characteristic (ROC) curve.
Figure 3. Importance score ranking for risk factors.
Figure 4. SHAP values of the important risk factors.
21 pages, 6273 KiB  
Article
Sea Surface Salinity Inversion Model for Changjiang Estuary and Adjoining Sea Area with SMAP and MODIS Data Based on Machine Learning and Preliminary Application
by Xiaoyu Zhang, Mingfei Wu, Wencong Han, Lei Bi, Yongheng Shang and Yingchun Yang
Remote Sens. 2022, 14(21), 5358; https://doi.org/10.3390/rs14215358 - 26 Oct 2022
Cited by 1 | Viewed by 2186
Abstract
Sea surface salinity (SSS) is one of the most important basic parameters for studying oceanographic processes and is of great significance in identifying oceanic currents. However, for a long time, salinity observation in estuarine and coastal waters has not been well resolved due to technological limitations. In this study, SSS inversion models for the Changjiang Estuary and the adjacent sea waters were established based on machine learning methods, using SMAP (Soil Moisture Active and Passive) salinity data combined with specific bands and band ratios of MODIS (Moderate Resolution Imaging Spectroradiometer). The performance of three machine learning methods (Random Forest, Particle Swarm Optimization Support Vector Regression (PSO-SVR), and Automatic Machine Learning (TPOT)) is compared, with accuracy verified against in-situ measured SSS. Random Forest is proven to be effective for the SSS inversion in the flood season, whereas TPOT performs the best for the dry season. The machine learning-based models effectively solve the problem of the insufficient time span of SSS observation from salinity satellites. At the same time, an empirical algorithm was established for the SSS inversion for the sea areas with low salinity (<30 psu), where the machine learning-based model fails with great errors. The average deviation of the complex SSS inversion models is −0.86 psu, validated with Copernicus Global Ocean Reanalysis Data. The long-term series SSS dataset of March and August from 2003 to 2020 was then constructed to observe the salinity distribution characteristics of the flood season and the dry season, respectively. It is indicated that the distribution pattern of the Changjiang Diluted Water (CDW) can be divided into three categories: a northeast-oriented expansion pattern, a multi-direction isotropic expansion pattern, and a turn pattern in which the CDW changes direction, namely the northeast-southeast expansion pattern. The pattern of CDW expansion is indicated to be the comprehensive effect of the interaction of different currents. In addition, it is noteworthy that the CDW shows increasing expansion with decreasing SSS in the front plume, especially in the flood season. This study not only gives a feasible solution for effective SSS observation, but also provides a dataset of basic oceanographic parameters for studying the coastal biogeochemical processes, evolution of land-sea interaction, and changing trend of material and energy transport by the CDW in the west Pacific boundary.
(This article belongs to the Special Issue Progresses in Agro-Geoinformatics)
Figure 1. Sketch map of Geographical location, main currents and the stations collecting the in-situ data of the study area.
Figure 2. The technology road mapping used in this study.
Figure 3. The correlation coefficient between MODIS\Rrs, band ratios and measured SSS.
Figure 4. Feature importance of random forest output.
Figure 5. Accuracy validation of Random Forest ((a): August; (b): March), PSO-SVR ((c): August; (d): March), TPOT ((e): August; (f): March). Both horizontal and vertical coordinate units are psu.
Figure 6. Comparison of in-situ SSS and estimated SSS. The broken lines of August (left) and April (right) were shown in the chart.
Figure 7. Comparisons between the inversion SSS and monthly SMAP\SSS product ((a,c) are the inversion SSS for August and March 2020; (b,d) are SMAP\SSS product in August and March 2020).
Figure 8. Accuracy evaluation of inversion SSS by empirical model for the near shore sea waters.
Figure 9. The process of SSS inversion based on the complex models.
Figure 10. The inversion results in August 2020 based on complex models (a) and with machine learning based model only (b).
Figure 11. The expansion type of the Yangtze River’s diluent water ((a) is the northeast expansion type in 2005, (b,e) are the northeast expansion type in 2004 and 2017, (c) is the northeast expansion type in 2020, (d,f) are the multidirectional expansion type in 2014 and 2015 type).
Figure 12. Spatial variation trend of SSS in March and August in the waters adjacent to the Yangtze Estuary from 2003 to 2020 ((a) is March, (b) is August).
Figure A1. SSS inversion results for August 2003–2020.
Figure A2. SSS inversion results for March 2003–2020.
21 pages, 3234 KiB  
Article
Automated Classification of Atherosclerotic Radiomics Features in Coronary Computed Tomography Angiography (CCTA)
by Mardhiyati Mohd Yunus, Ahmad Khairuddin Mohamed Yusof, Muhd Zaidi Ab Rahman, Xue Jing Koh, Akmal Sabarudin, Puteri N. E. Nohuddin, Kwan Hoong Ng, Mohd Mustafa Awang Kechik and Muhammad Khalis Abdul Karim
Diagnostics 2022, 12(7), 1660; https://doi.org/10.3390/diagnostics12071660 - 8 Jul 2022
Cited by 7 | Viewed by 3398
Abstract
Radiomics is the process of extracting useful quantitative features of high-dimensional data that allows for automated disease classification, including atherosclerotic disease. Hence, this study aimed to quantify and extract the radiomic features from Coronary Computed Tomography Angiography (CCTA) images and to evaluate the performance of an automated machine learning (AutoML) model in classifying the atherosclerotic plaques. In total, 202 patients who underwent CCTA examination at Institut Jantung Negara (IJN) between September 2020 and May 2021 were selected as they met the inclusion criteria. Three primary coronary arteries were segmented on axial sectional images, yielding a total of 606 volumes of interest (VOIs). Subsequently, the first-order, second-order, and shape-order radiomic characteristics were extracted for each VOI. Model 1, Model 2, Model 3, and Model 4 were constructed using the AutoML-based Tree-based Pipeline Optimization Tool (TPOT). The heatmap confusion matrix, recall (sensitivity), precision (PPV), F1 score, accuracy, receiver operating characteristic (ROC), and area under the curve (AUC) were analysed. Notably, Model 1 with the first-order features showed superior performance in classifying the normal coronary arteries (F1 score: 0.88; Inverse F1 score: 0.94), as well as in classifying the calcified (F1 score: 0.78; Inverse F1 score: 0.91) and mixed plaques (F1 score: 0.76; Inverse F1 score: 0.86). Moreover, Model 2, consisting of second-order features, proved useful, specifically in classifying the non-calcified plaques (F1 score: 0.63; Inverse F1 score: 0.92), which are a key point for prediction of cardiac events. Nevertheless, Model 3, comprising the shape-based features, did not contribute to the classification of atherosclerotic plaques. Overall, TPOT showed promising capabilities in terms of finding the best pipeline and tailoring the model using CCTA-based radiomic datasets.
(This article belongs to the Section Medical Imaging and Theranostics)
Figure 1. Tree-based pipeline from TPOT. Reprinted with permission from Ref. [37]. 2019, Oxford University Press.
Figure 2. Overall flow of patient selection.
Figure 3. Overall research workflow.
Figure 4. (a) Before segmentation of proximal LAD and (b) after segmentation of non-calcified lesion on proximal LAD using semi-automated (growth from seed) type of segmentation which was colored into yellow colour.
Figure 5. LIFEx software is used to perform semi-automated segmentation on RCA, LAD, and LCX. (a1) Mid RCA with a mixed calcified atherosclerotic plaque seen. (a2) The mixed calcified plaque was enclosed by the VOI placement (pink colour) on the mid RCA. (b1) Proximal LAD with a non-calcified atherosclerotic plaque was seen. (b2) The non-calcified plaque was surrounded by the VOI placement (yellow colour) on the proximal LAD. (c1) Proximal LCX with a calcified atherosclerotic plaque was observed. (c2) The calcified atherosclerotic plaque was surrounded by the VOI placement (blue colour) on the proximal LCX.
Figure 6. Pipeline search by TPOT. Initially, raw data was split into input and output variables.
Figure 7. Heatmap confusion matrix for (a) Model 1, (b) Model 2, (c) Model 3 and (d) Model 4. Each column of the matrix represents the occurrence in a predicted class, whereas each row represents the occurrence in an actual class.
Figure 8. ROC curve for (a) Model 1, (b) Model 2, (c) Model 3 and (d) Model 4.
16 pages, 1830 KiB  
Article
Analysis of Heart Rate Variability and Game Performance in Normal and Cognitively Impaired Elderly Subjects Using Serious Games
by Chun-Ju Hou, Yen-Ting Chen, Mycel A. Capilayan, Min-Wei Huang and Ji-Jer Huang
Appl. Sci. 2022, 12(9), 4164; https://doi.org/10.3390/app12094164 - 20 Apr 2022
Cited by 5 | Viewed by 2468
Abstract
Cognitive decline is one of the primary concerns in the elderly population. Serious games have been used for different purposes related to elderly care, such as physical therapy, cognitive training and mood management. There has been scientific evidence regarding the relationship between cognition and the autonomic nervous system (ANS) through heart rate variability (HRV). This paper explores the changes in the ANS among elderly people of normal and impaired cognition through measured HRV. Forty-eight subjects were classified into two groups: normal cognition (NC) (n = 24) and mild cognitive impairment (MCI) (n = 24). The subjects went through the following experiment flow: rest for 3 min (Rest 1), play a cognitive aptitude game (Game 1), rest for another 3 min (Rest 2), then play two reaction-time games (Game 2&3). Ten HRV features were extracted from measured electrocardiography (ECG) signals. Based on statistical analysis, there was no significant difference in HRV between the two groups, but the experiment sessions did have a significant effect. There was no significant interaction between sessions and cognitive status. This implies that HRV does not differ significantly between the two groups, and that both groups experience similar changes in HRV regardless of their cognitive status. Based on the game performance, there was a significant difference between the two groups of elderly people. Tree-based pipeline optimization tool (TPOT) was used for generating a machine learning pipeline for classification. Classification accuracy of 68.75% was achieved using HRV features, but higher accuracies of 83.33% and 81.20% were achieved using game performance or both HRV and game performance features, respectively. These results show that HRV has the potential to be used for detection of mild cognitive impairment, but game performance can yield better accuracy. Thus, serious games have the potential to be used for assessing cognitive decline among the elderly.
(This article belongs to the Special Issue The Applications of Machine Learning in Biomedical Science)
Figure 1. Experiment flow.
Figure 2. Cognitive Game based on Nostalgia Theory.
Figure 3. Whack-a-Mole Game.
Figure 4. Hit-the-Ball Game. The subject has to distinguish between the two ball types and press the correct button when the football reaches the target area.
Figure 5. Traditional Machine Learning Pipeline.