Search Results (1,292)

Search Parameters:
Keywords = black-box

19 pages, 870 KiB  
Article
Reversible Adversarial Examples with Minimalist Evolution for Recognition Control in Computer Vision
by Shilong Yang, Lu Leng, Ching-Chun Chang and Chin-Chen Chang
Appl. Sci. 2025, 15(3), 1142; https://doi.org/10.3390/app15031142 - 23 Jan 2025
Abstract
As artificial intelligence increasingly automates the recognition and analysis of visual content, it poses significant risks to privacy, security, and autonomy. Computer vision systems can surveil and exploit data without consent. With these concerns in mind, we introduce a novel method to control whether images can be recognized by computer vision systems using reversible adversarial examples. These examples are generated to evade unauthorized recognition, allowing only systems with permission to restore the original image by removing the adversarial perturbation with zero-bit error. A key challenge with prior methods is their reliance on merely restoring the examples to a state in which they can be correctly recognized by the model; the restored images are not fully consistent with the originals, and excessive auxiliary information is required to achieve reversibility. To achieve zero-bit-error restoration, we utilize the differential evolution algorithm to optimize adversarial perturbations while minimizing distortion. Additionally, we introduce a dual-color space detection mechanism to localize perturbations, eliminating the need for extra auxiliary information. Ultimately, when combined with reversible data hiding, adversarial attacks can achieve reversibility. Experimental results demonstrate that the PSNR and SSIM between the images restored by the method and the original images are +∞ and 1, respectively. The PSNR and SSIM between the reversible adversarial examples and the original images are 48.32 dB and 0.9986, respectively. Compared to state-of-the-art methods, ours maintains high visual fidelity at a comparable attack success rate. Full article
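The core move described above, searching for an adversarial perturbation with differential evolution while penalizing distortion, can be sketched generically. Everything below (the toy linear "classifier", the fitness weighting, the per-pixel budget) is an illustrative assumption, not the authors' implementation.

# Illustrative sketch: differential evolution searches for a small perturbation
# that flips a classifier's decision while penalizing distortion. The linear
# "model" and fitness weighting are hypothetical stand-ins, not the paper's.
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
x = rng.random(64)                      # flattened toy "image"
w = rng.standard_normal(64)             # dummy linear classifier weights

def model_score(img):
    return float(w @ img)               # >0 means the original class here

def fitness(delta, lam=1.0):
    adv = np.clip(x + delta, 0.0, 1.0)
    # drive the score below zero (misclassification) with minimal L2 distortion
    return model_score(adv) + lam * np.linalg.norm(delta)

bounds = [(-0.1, 0.1)] * x.size         # perturbation budget per pixel
res = differential_evolution(fitness, bounds, maxiter=50, popsize=15, seed=0)
adv = np.clip(x + res.x, 0.0, 1.0)
print("decision before:", model_score(x) > 0, "| after:", model_score(adv) > 0)

In the paper, the fitness would instead query the attacked vision model, and the dual-color space detection plus reversible data hiding would carry the information needed for exact, zero-bit-error restoration.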
23 pages, 2620 KiB  
Article
AGTM Optimization Technique for Multi-Model Fractional-Order Controls of Spherical Tanks
by Sabavath Jayaram, Cristiano Maria Verrelli and Nithya Venkatesan
Mathematics 2025, 13(3), 351; https://doi.org/10.3390/math13030351 - 22 Jan 2025
Viewed by 410
Abstract
Spherical tanks are widely utilized in process industries due to their substantial storage capacity. The inherent challenges of these industries necessitate highly efficient controllers to manage various process parameters, especially given their nonlinear behavior. This paper proposes the Approximate Generalized Time Moments (AGTM) optimization technique for designing the parameters of multi-model fractional-order controllers that regulate the output (liquid level) of a real-time nonlinear spherical tank. System identification for the different regions of the nonlinear process is conducted using a black-box model, which is determined to be nonlinear and approximated as a First Order Plus Dead Time (FOPDT) system over each region. Both model identification and controller design are performed in simulation and in real time using a National Instruments NI DAQmx 6211 Data Acquisition (DAQ) card (NI SYSTEMS INDIA PVT. LTD., Bangalore, Karnataka, India) and MATLAB/SIMULINK software (MATLAB R2021a). The performance of the overall algorithm is evaluated through simulation and experimental testing, with several setpoints and load changes, and is compared to that of other algorithms tuned within the same framework. While traditional approaches, such as integer-order controllers or linear approximations, often struggle to provide consistent performance across the operating range of spherical tanks, it is shown how the combination of multi-model fractional-order controller design, the AGTM optimization method, and a GA for expansion-point selection and index minimization benefits the control of this difficult-to-control nonlinear process. Full article
(This article belongs to the Special Issue Fractional Calculus and Mathematical Applications, 2nd Edition)
Show Figures

Figure 1: Open-loop input–output response curve with an S-shaped characteristic.
Figure 2: Block diagram of a general closed-loop system featuring the fractional-order controller.
Figure 3: Schematic representation of the AGTM optimization methodology (y is the liquid level, r is its reference value).
Figure 4: (a) Flowchart of the AGTM optimization algorithm. (b) Flowchart of the GA optimization algorithm.
Figure 5: Servo responses of all controllers over different regions.
Figure 6: Regulatory responses of all controllers under load variations over different regions.
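The FOPDT approximation identified over each region in the abstract above has a closed-form step response, y(t) = K(1 − e^(−(t−L)/τ)) for t ≥ L. A minimal sketch with made-up gain, time constant, and dead time (not the paper's identified values):

# FOPDT (First Order Plus Dead Time) step response; K, tau, L are illustrative.
import numpy as np

K, tau, L = 2.0, 15.0, 3.0           # gain, time constant [s], dead time [s]
t = np.linspace(0, 100, 1001)
y = np.where(t >= L, K * (1.0 - np.exp(-(t - L) / tau)), 0.0)

# sanity check: 63.2% of the final value should be reached near t = L + tau
i63 = int(np.argmax(y >= 0.632 * K))
print(f"t(63.2%) ≈ {t[i63]:.1f} s (expected ≈ {L + tau:.1f} s)")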
17 pages, 8641 KiB  
Article
Image-Based Tactile Deformation Simulation and Pose Estimation for Robot Skill Learning
by Chenfeng Fu, Longnan Li, Yuan Gao, Weiwei Wan, Kensuke Harada, Zhenyu Lu and Chenguang Yang
Appl. Sci. 2025, 15(3), 1099; https://doi.org/10.3390/app15031099 - 22 Jan 2025
Viewed by 336
Abstract
The TacTip is a cost-effective, 3D-printed optical tactile sensor commonly used in deep learning and reinforcement learning for robotic manipulation. However, its specialized structure, which combines soft materials of varying hardnesses, makes it challenging to simulate the distribution of numerous printed markers on pins. This paper aims to create an interpretable, AI-applicable simulation of the deformation of TacTip under varying pressures and interactions with different objects, addressing the black-box nature of learning and simulation in haptic manipulation. The research focuses on simulating the TacTip sensor’s shape using a fully tunable, chain-based mathematical model, refined through comparisons with real-world measurements. We integrated the WRS system with our theoretical model to evaluate its effectiveness in object pose estimation. The results demonstrated that the prediction accuracy for all markers across a variety of contact scenarios exceeded 92%. Full article
(This article belongs to the Special Issue Recent Advances in Autonomous Systems and Robotics, 2nd Edition)
Show Figures

Figure 1: Dobot Magician manipulator and customized dual-TacTip end effector setup.
Figure 2: Three-dimensional-printed triangular prism part.
Figure 3: Contact feedback with pillow part.
Figure 4: Calculation of chain point position in cross-section.
Figure 5: Initial chain shape and displaced chain shape.
Figure 6: Composition of each chain.
Figure 7: Locating the contact surface boundary.
Figure 8: Unwrapping of TacTip skin and object surface for accurate boundary location.
Figure 9: Designated intersection point distribution pattern.
Figure 10: Distribution pattern of Transverse and Vertical Chains.
Figure 11: Dual-arm assembly in WRS.
Figure 12: Tube point cloud generation.
Figure 13: Simulated TacTip shape with line contact.
Figure 14: Simulated TacTip shape with flat surface contact.
Figure 15: Simulated TacTip shape with curved surface contact.
Figure 16: Simulated displacement pattern of pins in the Designated Crossing Point model (contacted area in green and "vector areas" in blue).
Figure 17: Simulated displacement pattern of pins in the Transverse and Vertical Chains model (contacted area in green, "vector areas" in blue, and "parallel areas" in yellow).
Figure 18: The displacement pattern of pins in real-world feedback (contacted area in green, "vector areas" in blue, and "parallel areas" in yellow).
Figure 19: Planned grasp poses on tube with Franka gripper.
Figure 20: Configuration of Kawasaki RS007L manipulator (Kawasaki Heavy Industries, Akashi, Hyogo, Japan) with Franka gripper (Franka Robotics, München, Germany).
Figure 21: Continuous displacement (lower right) and depth (upper right) feedback with the gripper holding the tube in Pose A.
Figure 22: Continuous displacement (lower right) and depth (upper right) feedback with the gripper holding the tube in Pose B.
24 pages, 7108 KiB  
Article
Explainable AI Using On-Board Diagnostics Data for Urban Buses Maintenance Management: A Study Case
by Bernardo Tormos, Benjamín Pla, Ramón Sánchez-Márquez and Jose Luis Carballo
Information 2025, 16(2), 74; https://doi.org/10.3390/info16020074 - 21 Jan 2025
Viewed by 325
Abstract
Industry 4.0, leveraging tools like AI and the massive generation of data, is driving a paradigm shift in maintenance management. Specifically, in the realm of Artificial Intelligence (AI), traditionally "black box" models are now being unveiled through explainable AI techniques, which provide insights into model decision-making processes. This study addresses the underutilization of these techniques, alongside On-Board Diagnostics data, by urban bus fleet maintenance teams when tackling key issues affecting vehicle reliability and maintenance needs. In the context of urban bus fleets, diesel particulate filter regeneration processes frequently operate under suboptimal conditions, accelerating engine oil degradation and increasing maintenance costs. Due to limited documentation on the filter's control system, the maintenance team faces obstacles in proposing solutions based on a comprehensive understanding of the system's behavior and control logic. The objective of this study is to analyze and predict the various states of the diesel particulate filter regeneration process using Machine Learning and explainable artificial intelligence techniques. The insights obtained aim to give the maintenance team a deeper understanding of the filter's control logic, enabling them to develop proposals grounded in a comprehensive understanding of the system. This study employs a combination of traditional Machine Learning models, including XGBoost, LightGBM, Random Forest, and Support Vector Machine. The target variable, representing three possible regeneration states, was transformed using a one-vs-rest approach, resulting in three binary classification tasks in which each target state was classified against all other states. Additionally, explainable AI techniques such as Shapley Additive Explanations, Partial Dependence Plots, and Individual Conditional Expectation were applied to interpret and visualize the conditions influencing each regeneration state. The results successfully associate two states with specific operating conditions and establish operational thresholds for key variables, offering practical guidelines for optimizing the regeneration process. Full article
(This article belongs to the Special Issue Machine Learning and Artificial Intelligence with Applications)
Show Figures

Figure 1: Overview of the After-treatment System Used in the Studied Bus.
Figure 2: Correlation analysis for the temperatures in the after-treatment system.
Figure 3: Confusion Matrix for State 4.
Figure 4: Confusion Matrix for State 8.
Figure 5: Confusion Matrix for State 32.
Figure 6: Feature Contribution Analysis Using Beeswarm Plot (State 4).
Figure 7: Feature Contribution Analysis Using Beeswarm Plot (State 8).
Figure 8: SHAP Dependence Plot for NOx inlet in the SCR with DPF Pressure Delta (State 4).
Figure 9: SHAP Dependence Plot for NOx inlet in the SCR with DPF Pressure Delta (State 8).
Figure 10: Feature Contribution Analysis Using Beeswarm Plot (State 32).
Figure 11: SHAP Dependence Plot for NOx Delta in the SCR with Duration (State 32).
Figure 12: SHAP Dependence Plot for Outlet Temperature in the DOC with NOx Delta in the SCR (State 32).
Figure 13: SHAP Dependence Plot for Speed with Engine Speed (State 8).
Figure 14: SHAP Dependence Plot for Engine Speed with DPF Pressure Delta (State 8).
Figure 15: PDP and ICE Plot for Outlet Temperature in the DOC (State 4).
Figure 16: PDP and ICE Plot for DPF Pressure Delta (State 8).
Figure 17: PDP and ICE Plot for Outlet Temperature in the DOC (State 8).
Figure 18: PDP and ICE Plot for DPF Backpressure inlet (State 8).
Figure 19: Bivariate PDP for Speed and Outlet Temperature in the DOC (State 4).
Figure 20: Bivariate PDP for Speed and Outlet Temperature in the DOC (State 8).
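The one-vs-rest decomposition plus SHAP workflow the abstract describes can be outlined compactly. The synthetic features and state labels below are placeholders for the (non-public) bus OBD data; only the overall pattern reflects the paper.

# One-vs-rest classification of three regeneration states with SHAP attributions.
import numpy as np
import shap
import xgboost as xgb

rng = np.random.default_rng(42)
X = rng.random((500, 5))                   # stand-ins: temperatures, pressures, speed
states = rng.choice([4, 8, 32], size=500)  # the three DPF regeneration states

for target in (4, 8, 32):
    y = (states == target).astype(int)     # binarize: this state vs. all others
    clf = xgb.XGBClassifier(n_estimators=100, max_depth=3).fit(X, y)
    sv = shap.TreeExplainer(clf).shap_values(X)   # (n_samples, n_features)
    print(f"state {target}: mean |SHAP| per feature =", np.abs(sv).mean(axis=0).round(3))
    # beeswarm / dependence plots and PDP/ICE curves would follow, as in the paper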
15 pages, 5662 KiB  
Article
A Facile Electrode Modification Approach Based on Metal-Free Carbonaceous Carbon Black/Carbon Nanofibers for Electrochemical Sensing of Bisphenol A in Food
by Jin Wang, Zhen Yang, Shuanghuan Gu, Mingfei Pan and Longhua Xu
Foods 2025, 14(2), 314; https://doi.org/10.3390/foods14020314 - 18 Jan 2025
Viewed by 472
Abstract
Bisphenol A (BPA) is a typical environmental estrogen that is distributed worldwide and has the potential to pose a hazard to the ecological environment and human health. The development of an efficient and sensitive sensing strategy for the monitoring of BPA residues is of paramount importance. A novel electrochemical sensor based on a carbon black and carbon nanofiber composite (CB/f-CNF)-assisted signal amplification has been successfully constructed for the amperometric detection of BPA in foods. Herein, the hybrid CB/f-CNF was prepared using a simple one-step ultrasonication method and exhibited good electron transfer capability and excellent catalytic properties, attributable to the large surface area of carbon black and the strong enhancement of conductivity and porosity by the carbon nanofibers, which promote a faster electron transfer process on the electrode surface. Under the optimized conditions, the proposed CB/f-CNF/GCE sensor exhibited a wide linear response range (0.4–50.0 × 10⁻⁶ mol/L) with a low limit of detection of 5.9 × 10⁻⁸ mol/L for BPA quantification. Recovery tests were conducted on canned peaches and boxed milk, yielding satisfactory recoveries of 86.0–102.6%. Furthermore, the developed method was employed for the rapid and sensitive detection of BPA in canned meat and packaged milk, demonstrating accuracy comparable to the HPLC method. This work presents an efficient signal amplification strategy through the utilization of carbon/carbon nanocomposite sensitization technology. Full article
Show Figures

Figure 1: Schematic illustration of the construction of the CB/f-CNF/GCE sensor. Note: the black, red, blue, purple, yellow, green, and cyan lines are the DPV curves of CB/f-CNF/GCE in bisphenol A solutions of 0.4 μM, 1 μM, 2 μM, 6 μM, 10 μM, 20 μM, and 50 μM, respectively.
Figure 2: SEM images of CNF (A), f-CNF (B), CB (C), and CB/f-CNF (D). XRD patterns for the as-synthesized CB, f-CNF, and CB/f-CNF (E); Raman spectra of carbon black, f-CNF, and the CB/f-CNF composite (F).
Figure 3: The CV (A) and EIS (B) responses of GCE, CB/GCE, f-CNF/GCE, and CB/f-CNF/GCE in a [Fe(CN)₆]³⁻/⁴⁻ redox probe solution. CV responses of GCE (C) and CB/f-CNF/GCE (D) at different scan rates, ranging from 10 to 100 mV s⁻¹, in a 2.0 mmol·L⁻¹ [Fe(CN)₆]³⁻/⁴⁻ solution. Note: the lines from top to bottom are the CV curves of GCE and CB/f-CNF/GCE at sweep speeds of 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100 mV s⁻¹, respectively.
Figure 4: (A) CV curves of bare GCE, CB/GCE, f-CNF/GCE, and CB/f-CNF/GCE in BR containing 50 μmol L⁻¹ BPA, and CV curves of CB/f-CNF/GCE in a BPA-free BR solution (blank). (B) CV curves of CB/f-CNF/GCE in BR containing 20 μmol L⁻¹ BPA at various scan rates: 20, 40, 60, 80, and 100 mV s⁻¹. (C) The linear relationship of the BPA oxidation peak currents versus the scan rates. (D) The relationship between the BPA oxidation peak potentials and the natural logarithm of the scan rates. Note: the black, red, blue, green, and purple lines are the CV curves of CB/f-CNF/GCE in BR containing 20 μmol L⁻¹ BPA at scan rates of 20, 40, 60, 80, and 100 mV s⁻¹, respectively.
Figure 5: (A) DPV curves of CB/f-CNF/GCE for different concentrations of BPA. (B) The linear relationship between the peak current and the concentration of BPA, along with anti-interference performance (C) and repeatability experiments (D) of CB/f-CNF/GCE. Note: in Figure 5A, the DPV curves of CB/f-CNF/GCE in bisphenol A solutions of 0.4 μM, 1 μM, 2 μM, 6 μM, 10 μM, 20 μM, and 50 μM are shown as black, red, blue, purple, yellow, green, and cyan lines, respectively.
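The reported detection limit follows the standard calibration-curve relation LOD = 3·σ_blank / slope. A sketch with synthetic currents (the concentration levels mirror the DPV series quoted in the figure notes; the slope and blank noise are invented):

# Linear calibration fit and limit-of-detection estimate (synthetic data).
import numpy as np

conc = np.array([0.4, 1, 2, 6, 10, 20, 50]) * 1e-6       # mol/L
rng = np.random.default_rng(1)
current = 0.8e6 * conc + rng.normal(0, 0.02, conc.size)   # toy response in µA

slope, intercept = np.polyfit(conc, current, 1)           # calibration line
sd_blank = 0.015                                          # assumed SD of blank signal (µA)
lod = 3 * sd_blank / slope
print(f"slope ≈ {slope:.3g} µA·L/mol, LOD ≈ {lod:.2g} mol/L")  # same order as reported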
45 pages, 801 KiB  
Review
Artificial Intelligence and Neuroscience: Transformative Synergies in Brain Research and Clinical Applications
by Razvan Onciul, Catalina-Ioana Tataru, Adrian Vasile Dumitru, Carla Crivoi, Matei Serban, Razvan-Adrian Covache-Busuioc, Mugurel Petrinel Radoi and Corneliu Toader
J. Clin. Med. 2025, 14(2), 550; https://doi.org/10.3390/jcm14020550 - 16 Jan 2025
Viewed by 3567
Abstract
The convergence of Artificial Intelligence (AI) and neuroscience is redefining our understanding of the brain, unlocking new possibilities in research, diagnosis, and therapy. This review explores how AI’s cutting-edge algorithms—ranging from deep learning to neuromorphic computing—are revolutionizing neuroscience by enabling the analysis of complex neural datasets, from neuroimaging and electrophysiology to genomic profiling. These advancements are transforming the early detection of neurological disorders, enhancing brain–computer interfaces, and driving personalized medicine, paving the way for more precise and adaptive treatments. Beyond applications, neuroscience itself has inspired AI innovations, with neural architectures and brain-like processes shaping advances in learning algorithms and explainable models. This bidirectional exchange has fueled breakthroughs such as dynamic connectivity mapping, real-time neural decoding, and closed-loop brain–computer systems that adaptively respond to neural states. However, challenges persist, including issues of data integration, ethical considerations, and the “black-box” nature of many AI systems, underscoring the need for transparent, equitable, and interdisciplinary approaches. By synthesizing the latest breakthroughs and identifying future opportunities, this review charts a path forward for the integration of AI and neuroscience. From harnessing multimodal data to enabling cognitive augmentation, the fusion of these fields is not just transforming brain science, it is reimagining human potential. This partnership promises a future where the mysteries of the brain are unlocked, offering unprecedented advancements in healthcare, technology, and beyond. Full article
(This article belongs to the Section Clinical Neurology)
Show Figures

Figure 1: Workflow demonstrating how molecular feature outputs are processed through machine learning algorithms, culminating in predictions based on trained models.
12 pages, 3271 KiB  
Review
Explainable AI in Digestive Healthcare and Gastrointestinal Endoscopy
by Miguel Mascarenhas, Francisco Mendes, Miguel Martins, Tiago Ribeiro, João Afonso, Pedro Cardoso, João Ferreira, João Fonseca and Guilherme Macedo
J. Clin. Med. 2025, 14(2), 549; https://doi.org/10.3390/jcm14020549 - 16 Jan 2025
Viewed by 322
Abstract
An important impediment to the incorporation of artificial intelligence-based tools into healthcare is their association with so-called black box medicine, a concept arising due to their complexity and the difficulties in understanding how they reach a decision. This situation may compromise the clinician’s trust in these tools, should any errors occur, and the inability to explain how decisions are reached may affect their relationship with patients. Explainable AI (XAI) aims to overcome this limitation by facilitating a better understanding of how AI models reach their conclusions for users, thereby enhancing trust in the decisions reached. This review first defined the concepts underlying XAI, establishing the tools available and how they can benefit digestive healthcare. Examples of the application of XAI in digestive healthcare were provided, and potential future uses were proposed. In addition, aspects of the regulatory frameworks that must be established and the ethical concerns that must be borne in mind during the development of these tools were discussed. Finally, we considered the challenges that this technology faces to ensure that optimal benefits are reaped, highlighting the need for more research into the use of XAI in this field. Full article
(This article belongs to the Section General Surgery)
Show Figures

Figure 1: Examples of explainable AI techniques currently available and applicable in digestive medicine to enhance the adoption of AI tools. XAI—explainable AI; AI—artificial intelligence; CBR—case-based reasoning; LIME—Local Interpretable Model-Agnostic Explanations; SHAP—Shapley additive explanations.
Figure 2: Examples of generated heatmaps for different types of lesions in different locations in Capsule Endoscopy. Each prediction is associated with a degree of certainty expressed as a percentage, while the generated heatmap identifies the area responsible for the prediction. The lesions are numbered as follows: 1—P1U—P1 (ulcer lesion by Saurin classification); 2—P1PE (erosion by Saurin classification); 3—PV (vascular lesion). P2V—P2 (vascular lesion); 4—PP/REST (pleomorphic lesion).
Figure 3: Real-time heatmap generation for lesion location and biopsy guidance in high-resolution anoscopy and digital single-operator cholangioscopy.
Figure 4: Challenges for implementation of trustworthy explainable artificial intelligence mechanisms in clinical practice.
Figure 5: XAI application in digestive medicine. AI—artificial intelligence; LIME—Local Interpretable Model-Agnostic Explanations; SHAP—Shapley additive explanations.
26 pages, 8715 KiB  
Article
Interpretable Deep Learning for Pneumonia Detection Using Chest X-Ray Images
by Jovito Colin and Nico Surantha
Information 2025, 16(1), 53; https://doi.org/10.3390/info16010053 - 15 Jan 2025
Viewed by 312
Abstract
Pneumonia remains a global health issue, creating the need for accurate detection methods for effective treatment. Deep learning models like ResNet50 show promise in detecting pneumonia from chest X-rays; however, their black-box nature limits their transparency to below what is needed for clinical trust. This study aims to improve model interpretability by comparing four interpretability techniques, namely Layer-wise Relevance Propagation (LRP), Adversarial Training, Class Activation Maps (CAMs), and the Spatial Attention Mechanism, and determining which best fits the model, enhancing its transparency with minimal impact on its performance. Each technique was evaluated for its impact on accuracy, sensitivity, specificity, AUC-ROC, Mean Relevance Score (MRS), and a calculated trade-off score that balances interpretability and performance. The results indicate that LRP was the most effective in enhancing interpretability, achieving high scores across all metrics without sacrificing diagnostic accuracy. The model achieved 0.91 accuracy and 0.85 interpretability (MRS), demonstrating its potential for clinical integration. In contrast, Adversarial Training, CAMs, and the Spatial Attention Mechanism showed trade-offs between interpretability and performance, each highlighting unique image features but with some impact on specificity and accuracy. Full article
Show Figures

Figure 1: Experimental design.
Figure 2: Illustrative examples of chest X-rays in patients with pneumonia (arrows indicate clinical annotations of pneumonia-positive areas).
Figure 3: Model development flow chart.
Figure 4: Layer-wise Relevance Propagation (LRP) Integrated ResNet50 architecture [24].
Figure 5: Adversarial Training Integrated ResNet50 architecture [25].
Figure 6: Class Activation Map (CAM) Integrated ResNet50 architecture [26].
Figure 7: Attention Mechanism Integrated ResNet50 architecture [27].
Figure 8: Preview of initial chest X-ray dataset.
Figure 9: Pre-Trained ResNet50 pneumonia detection accuracy and model loss results visualization across 50 epochs.
Figure 10: Epsilon LRP Integrated ResNet50 pneumonia detection accuracy, model loss, and accuracy vs. interpretability visualization across 50 epochs.
Figure 11: Epsilon LRP heatmap for pneumonia detection.
Figure 12: Adversarial Training Integrated ResNet50 pneumonia detection accuracy and model loss visualization across 50 epochs.
Figure 13: Adversarial chest X-ray images (pneumonia highlighted).
Figure 14: Grad-CAM Integrated ResNet50 pneumonia detection accuracy, model loss, and accuracy vs. interpretability visualization across 50 epochs.
Figure 15: Grad-CAM heatmap visualization of chest X-ray images for pneumonia detection.
Figure 16: SAM Integrated ResNet50 pneumonia detection accuracy, model loss, accuracy vs. interpretability, and trade-off score visualization across 50 epochs.
Figure 17: SAM map visualization on chest X-ray images for pneumonia detection.
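Of the four techniques compared, the CAM family is the most compact to show in code. Below is a generic Grad-CAM pass over a ResNet50, with random weights and a random tensor standing in for the trained pneumonia model and a chest X-ray; it illustrates the mechanism, not the study's exact configuration.

# Generic Grad-CAM: weight the last conv block's activations by the
# spatially averaged gradients of the top class, then ReLU and upsample.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

model = resnet50(weights=None).eval()   # untrained stand-in for the tuned model
feats, grads = {}, {}

def fwd_hook(module, inputs, output):
    feats["a"] = output                                  # last conv activations
    output.register_hook(lambda g: grads.update(g=g))    # capture their gradient

model.layer4.register_forward_hook(fwd_hook)

x = torch.randn(1, 3, 224, 224, requires_grad=True)      # placeholder "X-ray"
logits = model(x)
logits[0, logits.argmax()].backward()                    # backprop the top class

w = grads["g"].mean(dim=(2, 3), keepdim=True)            # per-channel weights
cam = F.relu((w * feats["a"]).sum(dim=1))                # weighted channel sum
cam = F.interpolate(cam[None], size=(224, 224), mode="bilinear")[0, 0]
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8) # normalize to [0, 1]
print("heatmap shape:", tuple(cam.shape))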
22 pages, 2141 KiB  
Article
Macronutrient-Based Predictive Modelling of Bioconversion Efficiency in Black Soldier Fly Larvae (Hermetia illucens) Through Artificial Substrates
by Laurens Broeckx, Lotte Frooninckx, Siebe Berrens, Sarah Goossens, Carmen ter Heide, Ann Wuyts, Mariève Dallaire-Lamontagne and Sabine Van Miert
Insects 2025, 16(1), 77; https://doi.org/10.3390/insects16010077 - 14 Jan 2025
Viewed by 567
Abstract
This study explores the optimisation of rearing substrates for black soldier fly larvae (BSFL). First, the ideal dry matter content of substrates was determined, comparing the standard 30% dry matter (DM) with substrates hydrated to their maximum water holding capacity (WHC). Substrates at maximal WHC yielded significantly higher larval survival rates (p = 0.0006). Consequently, the WHC approach was adopted for further experiments. Using these hydrated artificial substrates, fractional factorial designs based on central composite and Box–Behnken designs were employed to assess the impact of macronutrient composition on bioconversion efficiency. The results demonstrated significant main, interaction, and quadratic effects on bioconversion efficiency. Validation with real-life substrates of varied protein content, including indigestible feather meal, affirmed the predictive model’s accuracy after accounting for protein source digestibility. This research underscores the importance of optimal hydration and macronutrient composition in enhancing BSFL growth and bioconversion efficiency. Full article
(This article belongs to the Section Insect Physiology, Reproduction and Development)
Show Figures

Figure 1: Central Composite design (left) and Box–Behnken design (right). (Figure adapted from Breig and Luti [32]).
Figure 2: Composition of substrate mixtures used for the validation experiment. The top part displays the percentage of each side-stream used in the mix (based on dry matter), the middle part displays the macronutrient composition of the substrate (based on dry matter), and the bottom part displays the dry matter content (%).
Figure 3: Boxplot of larval survival rates after feeding on substrates brought to 30% dry matter content (left) and to maximal water holding capacity (right).
Figure 4: Bar plot displaying larval survival rates in the experimental set-up. The lowest point shows 80% survival (n = 146).
Figure 5: Correlation between predicted and observed bioconversion efficiency (%) with the bioconversion prediction based on raw macronutrient composition (left) and with the prediction of bioconversion efficiency corrected for feather meal protein (right).
Figure 6: Surface plot describing interaction effects between substrate fat and protein contents on bioconversion efficiency.
Figure 7: Surface plot describing interaction effects between substrate carbohydrate and protein contents on bioconversion efficiency.
Figure 8: Surface plot describing interaction effects between substrate fat and carbohydrate contents on bioconversion efficiency.
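The second-order response-surface model behind these designs is an ordinary quadratic regression. A sketch with synthetic macronutrient fractions and a synthetic response (the paper's actual design points and coefficients are not reproduced here):

# Quadratic response-surface fit: main, interaction, and squared terms.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
X = rng.uniform(0, 1, (40, 3))     # toy protein, fat, carbohydrate fractions
y = 20 + 30*X[:, 0] - 15*X[:, 0]**2 + 10*X[:, 0]*X[:, 2] + rng.normal(0, 1, 40)

quad = PolynomialFeatures(degree=2, include_bias=False)
model = LinearRegression().fit(quad.fit_transform(X), y)
names = quad.get_feature_names_out(["protein", "fat", "carb"])
print(dict(zip(names, model.coef_.round(2))))   # recovered surface coefficients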
19 pages, 2028 KiB  
Article
Biologically Inspired Spatial–Temporal Perceiving Strategies for Spiking Neural Network
by Yu Zheng, Jingfeng Xue, Jing Liu and Yanjun Zhang
Biomimetics 2025, 10(1), 48; https://doi.org/10.3390/biomimetics10010048 - 14 Jan 2025
Viewed by 451
Abstract
A future unmanned system needs the ability to perceive, decide, and control in an open dynamic environment. Fulfilling this requirement calls for a method with a universal environmental perception ability. Moreover, this perceptual process needs to be interpretable and understandable, so that future interactions between unmanned systems and humans can be unimpeded. However, current mainstream DNN (deep neural network)-based AI (artificial intelligence) is a ‘black box’: we cannot interpret or understand how these AIs make their decisions. An SNN (spiking neural network), which is more similar to a biological brain than a DNN, has the potential to implement interpretable or understandable AI. In this work, we propose a neuron group-based structural learning method for an SNN to better capture spatial and temporal information from the external environment, and propose a time-slicing scheme to better interpret the spatial and temporal information of responses generated by an SNN. Results show that our method indeed helps to enhance the environment perception ability of the SNN and possesses a certain degree of robustness, enhancing the potential to build interpretable or understandable AI in the future. Full article
(This article belongs to the Special Issue Biologically Inspired Vision and Image Processing 2024)
Show Figures

Figure 1: Requirement for future unmanned system.
Figure 2: The structure of our 2-layer and 3-layer SNN.
Figure 3: The pattern of an image labeled for digit ‘0’ with a 2-map slicing scheme.
Figure 4: The pattern of an image labeled for digit ‘1’ with a 2-map slicing scheme.
Figure 5: The pattern of an image labeled for digit ‘7’ with a 4-map slicing scheme.
Figure 6: The pattern of an image labeled for digit ‘7’ with a 3-map slicing scheme.
Figure 7: The pattern of an image labeled for digit ‘7’ with a 2-map slicing scheme.
Figure 8: Analogizing a sample as a signal.
Figure 9: Accuracy improvement for more than two slices.
Figure 10: The effect of the number of neuron pairs on the accuracy difference between the 2-layer and 3-layer networks.
Figure 11: The effect of neuron pairs with different firing frequencies in the process of memory formation on accuracy for MNIST.
Figure 12: Analysis of excitation amount of different slices.
Figure 13: The effect of neuron pairs with different firing frequencies in the process of memory formation on accuracy for MNIST-C.
Figure 14: The effectiveness against noise condition.
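For readers unfamiliar with SNNs, the basic computational unit is easy to sketch. This is a generic leaky integrate-and-fire neuron with textbook parameters; it is not the paper's neuron-group structure or time-slicing scheme.

# Minimal leaky integrate-and-fire neuron: integrate input, fire at threshold, reset.
import numpy as np

dt, steps = 1.0, 100                      # ms per step, number of steps
tau_m, v_thresh, v_reset = 10.0, 1.0, 0.0
I = 0.15 * np.ones(steps)                 # constant input current (arbitrary units)

v, spikes = 0.0, []
for k in range(steps):
    v += (dt / tau_m) * (-v + I[k] * tau_m)   # leaky integration toward I*tau_m
    if v >= v_thresh:
        spikes.append(k * dt)                 # record spike time
        v = v_reset                           # reset after firing
print(f"{len(spikes)} spikes; first at t = {spikes[0]:.0f} ms")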
32 pages, 3661 KiB  
Systematic Review
Explainable AI in Diagnostic Radiology for Neurological Disorders: A Systematic Review, and What Doctors Think About It
by Yasir Hafeez, Khuhed Memon, Maged S. AL-Quraishi, Norashikin Yahya, Sami Elferik and Syed Saad Azhar Ali
Diagnostics 2025, 15(2), 168; https://doi.org/10.3390/diagnostics15020168 - 13 Jan 2025
Viewed by 597
Abstract
Background: Artificial intelligence (AI) has recently made unprecedented contributions in every walk of life, but it has not been able to work its way into diagnostic medicine and standard clinical practice yet. Although data scientists, researchers, and medical experts have been working in the direction of designing and developing computer aided diagnosis (CAD) tools to serve as assistants to doctors, their large-scale adoption and integration into the healthcare system still seems far-fetched. Diagnostic radiology is no exception. Imaging techniques like magnetic resonance imaging (MRI), computed tomography (CT), and positron emission tomography (PET) scans have been widely and very effectively employed by radiologists and neurologists for the differential diagnoses of neurological disorders for decades, yet no AI-powered systems to analyze such scans have been incorporated into the standard operating procedures of healthcare systems. Why? It is absolutely understandable that in diagnostic medicine, precious human lives are on the line, and hence there is no room even for the tiniest of mistakes. Nevertheless, with the advent of explainable artificial intelligence (XAI), the old-school black boxes of deep learning (DL) systems have been unraveled. Would XAI be the turning point for medical experts to finally embrace AI in diagnostic radiology? This review is a humble endeavor to find the answers to these questions. Methods: In this review, we present the journey and contributions of AI in developing systems to recognize, preprocess, and analyze brain MRI scans for differential diagnoses of various neurological disorders, with special emphasis on CAD systems embedded with explainability. A comprehensive review of the literature from 2017 to 2024 was conducted using host databases. We also present medical domain experts’ opinions and summarize the challenges up ahead that need to be addressed in order to fully exploit the tremendous potential of XAI in its application to medical diagnostics and serve humanity. Results: Forty-seven studies were summarized and tabulated with information about the XAI technology and datasets employed, along with performance accuracies. The strengths and weaknesses of the studies have also been discussed. In addition, the opinions of seven medical experts from around the world have been presented to guide engineers and data scientists in developing such CAD tools. Conclusions: Current CAD research was observed to be focused on the enhancement of the performance accuracies of the DL regimens, with less attention being paid to the authenticity and usefulness of explanations. A shortage of ground truth data for explainability was also observed. Visual explanation methods were found to dominate; however, they might not be enough, and more thorough and human professor-like explanations would be required to build the trust of healthcare professionals. Special attention to these factors along with the legal, ethical, safety, and security issues can bridge the current gap between XAI and routine clinical practice. Full article
Show Figures

Figure 1: Accuracy/interpretability trade-off.
Figure 2: Block diagram of the journey from the old-school black box models to XAI.
Figure 3: Brain MRI with tumor taken from the Kaggle dataset [35], correctly classified by MobileNetV2, which was trained with 154 images with a tumor (class 1) and 97 images without a tumor (class 2) for this demonstration. The figure shows (a) the raw MRI, and heatmaps highlighting regions responsible for classification generated by (b) Grad-CAM, (c) LIME, and (d) OSA using MATLAB R2024b. A ‘jet’ color scheme is used to highlight the image based on the influence of different regions leading to this inference (Tumor) by the DL model; the ‘jet’ color map has deep blue as the lowest value and deep red as the highest, as shown at the top of the figure. Notice the inaccuracies of the heatmaps in (b–d), highlighting irrelevant regions, as shown in (b), and missing critical tumorous regions, as shown in (d). This is primarily due to the primitive nature of the dataset employed to train the DL regimen used here for demonstration purposes and can be improved further in practical scenarios. An ideal heatmap (generated manually) is shown in (e), where only the tumor region appears as “most significant” (red), and all other pixels appear as “least significant” (blue) for this brain tumor classification example.
Figure 4: PRISMA study selection diagram: out of the 92 articles identified, 81 were included in the qualitative analysis.
Figure 5: Statistics: (a) article sources—IEEE dominant; (b) neurological disorders—diagnosis for AD and brain tumors dominant since they are the most widely researched disorders in CAD research; (c) XAI techniques—Grad-CAM, LIME, and SHAP dominant; (d) datasets—ADNI (for AD and its sub-types) dominant alongside Kaggle. The percentages have been computed from a total of 81 articles.
Figure 6: XAI-powered CAD research discussed in this study.
Figure 7: Details of the medical experts that participated in the survey on XAI-powered CAD tools for differential diagnosis of neurological disorders from brain MRI.
27 pages, 4856 KiB  
Article
A Study on the Differences in Optimized Inputs of Various Data-Driven Methods for Battery Capacity Prediction
by Kuo Xin, Fu Jia, Byoungik Choi and Geesoo Lee
Batteries 2025, 11(1), 26; https://doi.org/10.3390/batteries11010026 - 13 Jan 2025
Viewed by 410
Abstract
As lithium-ion batteries become increasingly popular worldwide, accurately determining their capacity is crucial for the various devices that rely on them. Numerous data-driven methods have been applied to evaluate battery-related parameters, and in these methods input features play a critical role. Researchers often use the same input features to compare the performance of various neural network models. However, because most models are regarded as black-box models, different methods may depend on specific features to different degrees, given the inherent differences in their internal structures, and the corresponding optimal inputs of different neural network models should therefore also differ. Comparing the differences in optimized input features for different neural networks is thus essential. This paper extracts 11 types of lithium battery-related health features, and experiments are conducted on two traditional machine learning networks and three advanced deep learning networks across three aspects of input variation. The experiments systematically evaluate how changes in health feature type, dimension, and data volume affect the performance of different methods, and identify the optimal input for each method. The results demonstrate that each network has its own optimal input with respect to health feature type, dimension, and data volume. Moreover, to obtain more accurate predictions, different networks place different requirements on the input data. Therefore, when using different types of neural networks for battery capacity prediction, it is very important to determine the type, dimension, and number of input health features according to the structure, category, and actual application requirements of the network; different inputs lead to large differences in results. The degree of optimization of the mean absolute error (MAE) can be improved by 10–50%, and other indicators can also be optimized to varying degrees, so targeted optimization of the network is very important. Full article
(This article belongs to the Section Battery Modelling, Simulation, Management and Application)
Show Figures

Figure 1: Dataset cycle curve.
Figure 2: First category features.
Figure 3: Second category features.
Figure 4: Third category features.
Figure 5: The Pearson correlation heatmap.
Figure 6: Principle of PSO.
Figure 7: Schematic of LSTM.
Figure 8: Deep learning network structure. (a) CNN-LSTM-Attention; (b) CNN-GRU-Attention; (c) CNN-BILSTM-Attention.
Figure 9: Correlation comparison diagram. (a) Pearson and Spearman coefficient grouped bar chart; (b) Pearson and Spearman coefficients; (c) Pearson and Spearman correlation coefficients radar chart.
Figure 10: B5 results comparison chart. (a) MAE and MSE trends line chart; (b) percentage change in MAE bar chart; (c) average MAE and MSE radar chart.
Figure 11: PSO-BP and SVM dimensional experiment for B5. (a) PSO-BP results for different input sizes; (b) SVM results for different input sizes.
Figure 12: Results comparison scatter plot for different networks. (a) B5 result; (b) B18 result.
Figure 13: CNN-LSTM performance metrics for different input sizes. (a) B5 result; (b) B18 result.
Figure 14: CNN-LSTM B6 Category 2 input experiment result. (a) Test set fitting plot; (b) error graph.
Figure 15: CNN-biLSTM performance metrics for different input sizes. (a) B5 result; (b) B18 result.
Figure 16: CNN-biLSTM 5D input experiment result. (a) Test set fitting plot; (b) error graph.
Figure 17: Input data volume comparison results.
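The feature-screening step, correlating each candidate health feature with capacity via Pearson and Spearman coefficients, looks roughly as follows; three synthetic columns stand in for the 11 extracted features.

# Rank candidate health features by linear (Pearson) and monotonic (Spearman)
# correlation with capacity; all columns below are synthetic.
import numpy as np
import pandas as pd
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(3)
n = 150
capacity = np.linspace(2.0, 1.4, n) + rng.normal(0, 0.01, n)    # fading capacity
df = pd.DataFrame({
    "cc_charge_time": 1800 * capacity + rng.normal(0, 20, n),   # strongly coupled
    "avg_temperature": rng.normal(30, 1, n),                    # mostly noise
    "ic_peak_height": 5 * capacity**2 + rng.normal(0, 0.05, n)  # nonlinear link
})

for col in df:
    p, _ = pearsonr(df[col], capacity)
    s, _ = spearmanr(df[col], capacity)
    print(f"{col:16s} Pearson = {p:+.3f}  Spearman = {s:+.3f}")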
23 pages, 9783 KiB  
Article
Optimizing Performance of a Solar Flat Plate Collector for Sustainable Operation Using Box–Behnken Design (BBD)
by Ramesh Chitharaj, Hariprasad Perumal, Mohammed Almeshaal and P. Manoj Kumar
Sustainability 2025, 17(2), 461; https://doi.org/10.3390/su17020461 - 9 Jan 2025
Viewed by 489
Abstract
This study investigated the performance optimization of nickel-cobalt (Ni-Co)-coated absorber panels in solar flat plate collectors (SFPCs) using response surface methodology for sustainable operation and optimized performance. Ni-Co coatings, applied through an electroplating process, represent a novel approach by offering superior thermal conductivity, durability, and environmental benefits compared to conventional black chrome coatings, addressing critical concerns related to ecological impact and long-term reliability. Experiments were conducted to evaluate the thermal efficiency of Ni-Co-coated panels with and without reflectors under varying flow rates, collector angles, and reflector angles. The thermal efficiency was calculated based on the inlet and outlet water temperatures, solar radiation intensity, and panel area. The results showed that the SFPC achieved average efficiencies of 50.9% without reflectors and 59.0% with reflectors, demonstrating the effectiveness of the coatings in enhancing solar energy absorption and heat transfer. A validated quadratic regression model (R2 = 0.9941) predicted efficiency based on the process variables, revealing significant individual and interaction effects. Optimization using the Box–Behnken design identified the optimal parameter settings for maximum efficiency: a flow rate of 1.32 L/min, collector angle of 46.91°, and reflector angle of 42.34°, yielding a predicted efficiency of 79.2%. These findings highlight the potential of Ni-Co coatings and reflectors for enhancing SFPC performance and provide valuable insights into the sustainable operation of solar thermal systems. Furthermore, the introduction of Ni-Co coatings offers a sustainable alternative to black chrome, reducing environmental risks while enhancing efficiency, thereby contributing to the advancement of renewable energy technologies. Full article
Show Figures

Figure 1: SFPC with Ni-Co-coated absorber panel.
Figure 2: Development of regression model.
Figure 3: Box–Behnken design (BBD).
Figure 4: Performance of Ni-Co-coated absorber panel.
Figure 5: Performance of Ni-Co-coated absorber panel with reflectors.
Figure 6: Normal plot of residuals for Ni-Co panel with reflector.
Figure 7: Predicted and actual responses for Ni-Co panel with reflector.
Figure 8: Response surfaces of interactions for nickel-cobalt-coated panel. (a) Flow rate vs. collector angle, (b) flow rate vs. reflector angle, (c) collector angle vs. reflector angle.
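Given a fitted quadratic efficiency model, the reported optimum (flow 1.32 L/min, collector angle 46.91°, reflector angle 42.34°) comes from maximizing that surface within the design bounds. A sketch with invented coefficients in place of the paper's regression:

# Maximize a toy quadratic efficiency surface over bounded process variables.
import numpy as np
from scipy.optimize import minimize

def neg_efficiency(v):
    f, c, r = v                     # flow [L/min], collector and reflector angles [deg]
    eta = (20 + 30*f - 10*f**2 + 1.0*c - 0.011*c**2
           + 0.8*r - 0.009*r**2 + 0.001*c*r)     # invented coefficients
    return -eta

res = minimize(neg_efficiency, x0=[1.0, 45.0, 40.0],
               bounds=[(0.5, 2.0), (30, 60), (30, 60)])
print("optimal [flow, collector, reflector]:", res.x.round(2),
      "| predicted efficiency:", round(-res.fun, 1))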
16 pages, 1743 KiB  
Article
RLVS: A Reinforcement Learning-Based Sparse Adversarial Attack Method for Black-Box Video Recognition
by Jianxin Song, Dan Yu, Hongfei Teng and Yongle Chen
Electronics 2025, 14(2), 245; https://doi.org/10.3390/electronics14020245 - 8 Jan 2025
Viewed by 511
Abstract
To address the challenges of black-box video adversarial attacks, such as excessive query times and suboptimal attack performance due to the lack of result feedback during the attack process, we propose a reinforcement learning-based sparse adversarial attack method called RLVS. This approach leverages reinforcement learning to identify key frames for efficient gradient estimation, significantly reducing the number of queries. First, a self-attention network is integrated into the agent policy network to enable more precise selection of key frames. Second, designed reward functions allow the agent to continuously adapt to the sparse key frames by querying the black-box threat model and receiving feedback on attack outcomes. Lastly, gradient estimation is applied solely to the selected key frames, estimating only the gradient sign rather than the full gradient, further enhancing attack efficiency. We conducted experiments on two video recognition models using three popular action datasets. The experimental results demonstrate that our method outperforms other black-box video attack methods in terms of attack efficiency and effectiveness, achieving higher fooling rates with fewer queries and minimal perturbations. Full article
Show Figures

Figure 1: Framework of RLVS.
Figure 2: Framework of the policy network.
Figure 3: Adjusted results for different reward weights.
Figure 4: An example of the adversarial video produced with RLVS.
Figure 5: Comparison results of RLVS with only BiRNN and random agent.
Figure 6: The convergence of RLVS.
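The query-saving core of the approach, estimating only the gradient sign and only on the agent-selected key frames, can be sketched with a NES-style two-sided estimator; the toy loss and hard-coded frame indices below stand in for querying a real black-box video model and for the RL agent's selections.

# Sign-only gradient estimation restricted to key frames (NES-style sketch).
import numpy as np

rng = np.random.default_rng(0)
video = rng.random((16, 8, 8))                  # 16 frames of a toy 8x8 "video"
key_frames = [2, 7, 11]                         # pretend the agent chose these

def blackbox_loss(v):                           # placeholder for a model query
    return float(np.sum(v[key_frames] ** 2))

def estimate_sign_grad(v, n_queries=20, sigma=1e-3):
    g = np.zeros_like(v)
    for _ in range(n_queries):                  # antithetic sampling, key frames only
        u = np.zeros_like(v)
        u[key_frames] = rng.standard_normal(u[key_frames].shape)
        g += (blackbox_loss(v + sigma * u) - blackbox_loss(v - sigma * u)) * u
    return np.sign(g)                           # keep only the sign, as in RLVS

adv = video - 0.01 * estimate_sign_grad(video)  # one sign-gradient descent step
print("loss before:", round(blackbox_loss(video), 3),
      "| after:", round(blackbox_loss(adv), 3))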
26 pages, 1303 KiB  
Article
On Explainability of Reinforcement Learning-Based Machine Learning Agents Trained with Proximal Policy Optimization That Utilizes Visual Sensor Data
by Tomasz Hachaj and Marcin Piekarczyk
Appl. Sci. 2025, 15(2), 538; https://doi.org/10.3390/app15020538 - 8 Jan 2025
Viewed by 421
Abstract
In this paper, we address the explainability of reinforcement learning-based machine learning agents trained with Proximal Policy Optimization (PPO) that utilize visual sensor data. We propose an algorithm that allows an effective and intuitive approximation of the PPO-trained neural network (NN). We conduct several experiments to confirm our method’s effectiveness. Our proposed method works well for scenarios where semantic clustering of the scene is possible. The approach rests on the solid theoretical foundation of Gradient-weighted Class Activation Mapping (GradCAM) and a Classification and Regression Tree with additional proxy geometry heuristics. It excels at explanation in a virtual simulation system based on a relatively low-resolution video feed. Depending on the convolutional feature extractor of the PPO-trained neural network, our method approximates the black-box model with an accuracy of 0.945 to 0.968. The proposed method has important application aspects. Through its use, it is possible to estimate the causes of specific decisions made by the neural network due to the current state of the observed environment. This estimation makes it possible to determine whether the network makes decisions as expected (decision-making is related to the model’s observation of objects belonging to different semantic classes in the environment) and to detect unexpected, seemingly chaotic behavior that might be, for example, the result of data bias, bad design of the reward function, or insufficient generalization abilities of the model. We publish all source codes so our experiments can be reproduced. Full article
(This article belongs to the Special Issue Research on Machine Learning in Computer Vision)
Show Figures

Figure 1

Figure 1
<p>Illustration of agent–environment interaction in reinforcement learning.</p>
Full article ">Figure 2
<p>This figure presents two example agent brains trained with PPO consisting of convolutional image feature extractor and classification layers constructed by two 16-neuron layers. The names of the individual network elements are derived from the notation according to the ONNX data exchange standard (Open Neural Network Exchange, <a href="https://onnx.ai/onnx/" target="_blank">https://onnx.ai/onnx/</a> [accessed on 9 October 2024]). Gemm is matrix multiplication, Mul is element-wise multiplication, and Slice produces a slice of the input tensor along axes. Visualization was performed using Onnx-modifier software <a href="https://github.com/ZhangGe6/onnx-modifier" target="_blank">https://github.com/ZhangGe6/onnx-modifier</a> [accessed on 9 October 2024]. (<b>a</b>) Agent’s NN with “Simple” convolutional feature extractor. (<b>b</b>) Agent’s NN with “Nature” convolutional feature extractor [<a href="#B35-applsci-15-00538" class="html-bibr">35</a>].</p>
Figure 3: The proxy geometry used in the proposed method.
Figure 4: Each row presents a triplet: the input image, its semantic segmentation, and a visualization of the agent’s position (white capsule shape) relative to the blue platform. (a) Input image. (b) Semantic segmentation of the input image. (c) Bird’s-eye view of the scene. (d) Input image. (e) Semantic segmentation of the input image. (f) Bird’s-eye view of the scene.
Figure 5: A block diagram summarizing the proposed methodology. First, during simulation, the agent’s observations and actions are gathered. Then, GradCAM and the proxy geometry (see Algorithm 1) are used to generate features (see Algorithm 2). These features and the agent’s actions are used to build a decision-tree (CART) approximation of the agent’s neural network brain. A minimal sketch of the GradCAM feature step follows this figure list.
Figure 6: Cumulative reward over PPO training steps (episodes).
Figure 7: Example GradCAM results for agent NNs differing in image sensor resolution. All agents use the “Simple” convolutional feature embedder. (a) Semantic clustering of the input image (h). The first row presents the color-coded GradCAM map generated as the NN’s response to input image (h); the second row shows the same map superimposed on the input signal. The darker the region, the smaller the value on the map; bright areas correspond to a value of 1 on the GradCAM map.
Figure 8: Example GradCAM results for agent NNs differing in image sensor resolution. All agents use the “Simple” convolutional feature embedder. (a) Semantic clustering of the input image (h). The first row presents the color-coded GradCAM map generated as the NN’s response to input image (h); the second row shows the same map superimposed on the input signal. The darker the region, the smaller the value on the map; bright areas correspond to a value of 1 on the GradCAM map.
Figure 9: CART explanation generated by the proposed GradCAM-based method for the agent’s NN with a (64×64) input image signal, the “Simple” convolutional feature embedder, and the (α = 1, t = 0.3) parameters of Algorithm 2. This tree explains the rules of forward–backward motion. The tree size is limited to a maximal depth of 4. Class instances among the features are shown as color-coded bars; the black arrow under the horizontal axis indicates the splitting threshold.
Figure 10: CART explanation generated by the proposed GradCAM-based method for the agent’s NN with a (64×64) input image signal, the “Simple” convolutional feature embedder, and the (α = 1, t = 0.3) parameters of Algorithm 2. This tree explains the rules of left–right motion. The tree size is limited to a maximal depth of 4. Class instances among the features are shown as color-coded bars; the black arrow under the horizontal axis indicates the splitting threshold.
Figure 11: CART explanation generated by the proposed GradCAM-based method for the agent’s NN with a (64×64) input image signal, the “Simple” convolutional feature embedder, and the (α = 1, t = 0.3) parameters of Algorithm 2. This tree explains the rules of jump motion. Class instances among the features are shown as color-coded bars; the black arrow under the horizontal axis indicates the splitting threshold.
Figure 12: Decision process (predictions) of the CART tree generated by the proposed GradCAM-based method for the agent’s NN with a (64×64) input image signal, the “Simple” convolutional feature embedder, and (α = 1, t = 0.3), on images from Figure 4. The orange arrow under the horizontal axis indicates the actual feature value. The figure explains the decision-making process for a single agent camera reading and contains a single tree path from the tree shown in Figure 9. (a) Prediction for the input image from Figure 4a. (b) Prediction for the input image from Figure 4d.
Figure 13: Decision process (predictions) of the CART tree generated by the proposed GradCAM-based method for the agent’s NN with a (64×64) input image signal, the “Simple” convolutional feature embedder, and (α = 1, t = 0.3), on images from Figure 4. The orange arrow under the horizontal axis indicates the actual feature value. The figure explains the decision-making process for a single agent camera reading and contains a single tree path from the tree shown in Figure 10. (a) Prediction for the input image from Figure 4a. (b) Prediction for the input image from Figure 4d.
Figure 14: Decision process (predictions) of the CART tree generated by the proposed GradCAM-based method for the agent’s NN with a (64×64) input image signal, the “Simple” convolutional feature embedder, and (α = 1, t = 0.3), on images from Figure 4. The orange arrow under the horizontal axis indicates the actual feature value. The figure explains the decision-making process for a single agent camera reading and contains a single tree path from the tree shown in Figure 11. (a) Prediction for the input image from Figure 4a. (b) Prediction for the input image from Figure 4d.
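As a companion to the pipeline in Figure 5, the sketch below shows the standard GradCAM recipe in PyTorch together with a thresholded per-class feature measurement suggested by the (α = 1, t = 0.3) setting in the captions above. It is an illustration under stated assumptions: the hooks, the mask format, and the thresholding interpretation are stand-ins, not the paper’s exact Algorithms 1 and 2.

import torch
import torch.nn.functional as F

def grad_cam(model, conv_layer, image, action_index):
    # GradCAM map for one action logit, resized to the input image size.
    acts, grads = [], []
    h1 = conv_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = conv_layer.register_full_backward_hook(
        lambda m, gin, gout: grads.append(gout[0]))
    try:
        logits = model(image)              # image: (1, C, H, W)
        model.zero_grad()
        logits[0, action_index].backward()
    finally:
        h1.remove()
        h2.remove()
    a, g = acts[0], grads[0]               # both (1, K, h, w)
    weights = g.mean(dim=(2, 3), keepdim=True)   # channel-wise gradient mean
    cam = F.relu((weights * a).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    return (cam / (cam.max() + 1e-8))[0, 0]      # normalized to [0, 1]

def class_features(cam, class_masks, t=0.3):
    # Share of the high-activation (cam > t) area covered by each class.
    hot = cam > t
    total = hot.sum().clamp(min=1).item()
    return {name: (hot & mask).sum().item() / total
            for name, mask in class_masks.items()}

Here class_masks would come from the simulator’s semantic segmentation (one boolean mask per class, cf. Figure 4b); the resulting per-class shares are exactly the kind of features a CART surrogate can split on.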