Search Results (1,302)

Search Parameters:
Keywords = deep auto-encoder

14 pages, 743 KiB  
Article
AD-VAE: Adversarial Disentangling Variational Autoencoder
by Adson Silva and Ricardo Farias
Sensors 2025, 25(5), 1574; https://doi.org/10.3390/s25051574 - 4 Mar 2025
Abstract
Face recognition (FR) is a less intrusive biometrics technology with various applications, such as security, surveillance, and access control systems. FR remains challenging, especially when there is only a single image per person as a gallery dataset and when dealing with variations like pose, illumination, and occlusion. Deep learning techniques have shown promising results in recent years using VAE and GAN, with approaches such as patch-VAE, VAE-GAN for 3D Indoor Scene Synthesis, and hybrid VAE-GAN models. However, in Single Sample Per Person Face Recognition (SSPP FR), the challenge of learning robust and discriminative features that preserve the subject’s identity persists. To address these issues, we propose a novel framework called AD-VAE, specifically for SSPP FR, using a combination of variational autoencoder (VAE) and Generative Adversarial Network (GAN) techniques. The proposed AD-VAE framework is designed to learn how to build representative identity-preserving prototypes from both controlled and wild datasets, effectively handling variations like pose, illumination, and occlusion. The method uses four networks: an encoder and decoder similar to VAE, a generator that receives the encoder output plus noise to generate an identity-preserving prototype, and a discriminator that operates as a multi-task network. AD-VAE outperforms all tested state-of-the-art face recognition techniques, demonstrating its robustness. The proposed framework achieves superior results on four controlled benchmark datasets—AR, E-YaleB, CAS-PEAL, and FERET—with recognition rates of 84.9%, 94.6%, 94.5%, and 96.0%, respectively, and achieves remarkable performance on the uncontrolled LFW dataset, with a recognition rate of 99.6%. The AD-VAE framework shows promising potential for future research and real-world applications. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)
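To make the encoder/generator interplay described in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch of the reparameterization step and of a generator that consumes the latent code concatenated with noise. Layer sizes, names, and the fully connected layout are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): VAE-style encoder + prototype generator.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_dim=4096, latent_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, latent_dim)      # mean of N(mu, sigma)
        self.logvar = nn.Linear(512, latent_dim)  # log-variance of N(mu, sigma)

    def forward(self, x):
        h = self.backbone(x)
        return self.mu(h), self.logvar(h)

def reparameterize(mu, logvar):
    # c ~ N(mu, sigma), drawn with the reparameterization trick
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

class Generator(nn.Module):
    """Maps [latent code c, noise z] to an identity-preserving prototype."""
    def __init__(self, latent_dim=128, noise_dim=64, out_dim=4096):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim + noise_dim, 512), nn.ReLU(),
                                 nn.Linear(512, out_dim), nn.Tanh())

    def forward(self, c, z):
        return self.net(torch.cat([c, z], dim=1))

x = torch.randn(8, 4096)                 # a toy batch of flattened face images
mu, logvar = Encoder()(x)
c = reparameterize(mu, logvar)
z = torch.randn(8, 64)                   # noise vector z ~ N(0, 1)
prototype = Generator()(c, z)            # generated prototype of x
print(prototype.shape)                   # torch.Size([8, 4096])
```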
Figure 1. The first part of the proposed AD-VAE, which works as a variational adversarial autoencoder. x denotes the image data from X, and x^dec denotes the decoder reconstruction of x. The encoder Enc takes the image x as input and produces two outputs, the mean (μ) and the log-variance (σ), which define the parameters of a normal distribution N(μ, σ). From the distribution N(μ, σ), we draw a latent vector c ~ N(μ, σ) that serves as input to the decoder Dec, which outputs the reconstruction x^dec.

Figure 2. The second part of the proposed AD-VAE architecture, where x denotes an image from the SSPP data X, x^rp denotes the real prototype of image x, and x̂ is the generated prototype of image x. The pre-trained (first-part) encoder Enc generates the mean μ and variance σ of x. From the distribution N(μ, σ), we draw a latent vector c ~ N(μ, σ) that is concatenated with a noise vector z ~ N(0, 1) to serve as the input to the generator Gen, which outputs the prototype x̂ of x. The discriminator D (1) determines the id and variation of x; (2) determines the id, variation, and whether x̂ is real or fake; and (3) determines whether x^rp is real or fake.

Figure 3. The prototypes generated by AD-VAE: (a) the sample image with variations, (b) the generated prototype of image (a), and (c) the real prototype of image (a). On the right side, the name of the dataset and the variation are indicated.
37 pages, 34201 KiB  
Article
Measuring the Level of Aflatoxin Infection in Pistachio Nuts by Applying Machine Learning Techniques to Hyperspectral Images
by Lizzie Williams, Pancham Shukla, Akbar Sheikh-Akbari, Sina Mahroughi and Iosif Mporas
Sensors 2025, 25(5), 1548; https://doi.org/10.3390/s25051548 - 2 Mar 2025
Viewed by 148
Abstract
This paper investigates the use of machine learning techniques on hyperspectral images of pistachios to detect and classify different levels of aflatoxin contamination. Aflatoxins are toxic compounds produced by moulds, posing health risks to consumers. Current detection methods are invasive and contribute to food waste. This paper explores the feasibility of a non-invasive method using hyperspectral imaging and machine learning to classify aflatoxin levels accurately, potentially reducing waste and enhancing food safety. Hyperspectral imaging with machine learning has shown promise in food quality control. The paper evaluates models including Dimensionality Reduction with K-Means Clustering, Residual Networks (ResNets), Variational Autoencoders (VAEs), and Deep Convolutional Generative Adversarial Networks (DCGANs). Using a dataset from Leeds Beckett University with 300 hyperspectral images, covering three aflatoxin levels (<8 ppn, >160 ppn, and >300 ppn), key wavelengths were identified to indicate contamination presence. Dimensionality Reduction with K-Means achieved 84.38% accuracy, while a ResNet model using the 866.21 nm wavelength reached 96.67%. VAE and DCGAN models, though promising, were constrained by dataset size. The findings highlight the potential for machine learning-based hyperspectral imaging in pistachio quality control, and future research should focus on expanding datasets and refining models for industry application. Full article
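As a hedged illustration of the "Dimensionality Reduction with K-Means Clustering" baseline mentioned in the abstract, the sketch below reduces a toy hyperspectral cube with PCA and clusters the pixels. The cube shape, component count, and cluster count are placeholders, and the paper's 'selectbands' band-selection step is not reproduced.

```python
# Illustrative only: PCA-based dimensionality reduction followed by K-Means on pixels.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
cube = rng.random((64, 64, 462))           # toy hyperspectral image: H x W x bands

pixels = cube.reshape(-1, cube.shape[-1])  # one spectrum per pixel
reduced = PCA(n_components=10).fit_transform(pixels)

labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(reduced)
cluster_map = labels.reshape(cube.shape[:2])

# Fraction of pixels assigned to one cluster, as a simple contamination indicator
print((cluster_map == 6).mean())
```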
Figure 1. Classification system implemented in this research for each of the classifiers developed. For ResNet, VAE, and DCGAN, the hyperspectral images are broken down into individual wavelengths, but for K-Means Clustering, the images have their dimensionality reduced.

Figure 2. Point, line, and area scanning representation.

Figure 3. The camera setup used to gather the pistachio image dataset for this paper. The camera uses the line scanning technique.

Figure 4. Residual Network Block implemented in the Residual Network Architecture.

Figure 5. Representation of the 453.8 nm wavelength at the 3 different aflatoxin levels: (a) Less Than 8 μg/kg, (b) Greater Than 160 μg/kg, and (c) Greater Than 300 μg/kg.

Figure 6. RGB representation of hyperspectral images at the 3 different aflatoxin levels: (a) Less Than 8 μg/kg, (b) Greater Than 160 μg/kg, and (c) Greater Than 300 μg/kg.

Figure 7. Three different bands of the same hyperspectral image of pistachios from the Greater Than 300 μg/kg level: (a) Band 151 (534 nm), (b) Band 241 (628 nm), and (c) Band 461 (1003 nm).

Figure 8. Histogram of the five most informative bands for each Less Than 8 ppn hyperspectral image using the ‘selectbands’ function as described above. There are five peaks at [1, 11], [141, 151], [211, 221], [361, 371], and [451, 462].

Figure 9. Histogram of the five most informative bands for each Greater Than 160 ppn hyperspectral image using the ‘selectbands’ function as described above. There are five areas which contain the majority of the selected bands: [1, 11], [21, 51], [211, 251], [341, 371], and [451, 462].

Figure 10. Histogram of the five most informative bands for each Greater Than 300 ppn hyperspectral image using the ‘selectbands’ function as described above. There are four peaks on the histogram, at [1, 11], [151, 171], [361, 371], and [451, 462].

Figure 11. Pairs of the most informative bands for the hyperspectral images from each level: Less Than 8 ppn (blue circles), Greater Than 160 ppn (orange circles), and Greater Than 300 ppn (green circles). At the Greater Than 160 ppn level, there are two clear, dense clusters, centred approximately around the points (355, 45) and (350, 225). The Less Than 8 ppn and Greater Than 300 ppn distributions of points are much more comparable; their main clusters both appear to be centred approximately at the point (365, 150), although the standard deviation of the points for both levels is much greater than for the Greater Than 160 ppn level. Ignoring outliers, the 1st Principal Band ranges from band 300 to 425, and the 2nd Principal Band ranges from band 25 to 241.

Figure 12. K-Means clustering results following dimensionality reduction: (a) for a less than 8 ppn hyperspectral image with 2 clusters; (b) for a greater than 300 ppn hyperspectral image with 2 clusters; (c) for a less than 8 ppn hyperspectral image with 4 clusters; and (d) for a greater than 300 ppn hyperspectral image with 4 clusters.

Figure 13. K-Means clustering results following dimensionality reduction with various clusters: (a) less than 8 ppn hyperspectral image with 13 clusters; (b) greater than 300 ppn hyperspectral image with 13 clusters; (c) less than 8 ppn hyperspectral image with 15 clusters; and (d) greater than 300 ppn hyperspectral image with 15 clusters.

Figure 14. K-Means clustering results following dimensionality reduction with various clusters: (a) less than 8 ppn hyperspectral image with 10 clusters; (b) greater than 300 ppn hyperspectral image with 10 clusters.

Figure 15. Distribution of colours within each K-Means clustering of the reduced hyperspectral images after dimensionality reduction, using 10 clusters.

Figure 16. Percentage of colour 6 pixels compared to the overall pixel count for each K-Means clustered image.

Figure 17. Fraction of colour 6 pixels compared to the overall pixel count for the extended K-Means clustered images.

Figure 18. Implemented ResNet Block with ReLU activation functions.

Figure 19. Training and validation losses for each of the ResNet classifiers trained on the dataset of individual wavelength images from bands 11, 151, 241, 361, and 461. The classifier trained on the dataset for band 361 on average had the lowest training and validation loss, which also fluctuated less than the loss of the other ResNet classifiers. The graph shows losses from 5 epochs onwards so that the losses are not saturated by the steep learning curve.

Figure 20. Implemented VAE architecture.

Figure 21. Loss functions for the dataset of band 151 images using β = 0.1 to β = 1 for 100 epochs. The training losses for the Total Loss and Reconstruction Loss continue to decrease, while the test losses for both increase after approximately 25 epochs, indicating that the model is overfitting. The KL Divergence for both the training and test datasets appears to increase slowly and then, at approximately 60 epochs, begins decreasing. The Total and Reconstruction Loss curves appear similar to one another, but the KL Divergence loss curves do not.

Figure 22. Reconstructed images for the training dataset of band 151 images using β = 0.1 to β = 1 for 100 epochs. The images are reconstructions of images from each of the 3 aflatoxin levels.

Figure 23. Reconstructed images for the test dataset of band 151 images using β = 0.1 to β = 1 for 100 epochs. These images are all reconstructions of pistachios without shells from the Greater Than 160 ppn dataset.

Figure 24. Generated image samples for both the training and test datasets after training the β-VAE using an incremental β from 0.1 to 1 over 100 epochs on the input dataset of band 151: (a) generated images for the training dataset; (b) generated images for the test dataset.

Figure 25. Two-dimensional t-SNE representation of the latent space for the training data after training the β-VAE using an incremental β from 0.1 to 1 over 100 epochs on the input dataset of band 151.

Figure 26. Reconstructed images for the training and test datasets of band 361 images using β = 1 to β = 10 for 200 epochs: (a) reconstructed images for the training dataset, representing each of the 3 aflatoxin levels; (b) reconstructed images for the test dataset, representing the Greater Than 160 ppn level.

Figure 27. Generated image samples for both the training and test datasets after training the β-VAE using an incremental β from 1 to 10 over 200 epochs on the input dataset of band 361: (a) generated images for the training dataset; (b) generated images for the test dataset.

Figure 28. Loss functions for the dataset of band 361 images using β = 1 to β = 10 for 200 epochs with batch size 100. The training and test losses for the Total Loss and Reconstruction Loss decrease. The KL Divergence for both the training and test datasets appears to be increasing at a decreasing rate, indicating that it may begin to decrease and converge to a loss of 0.

Figure 29. Two-dimensional t-SNE representation of the latent space for the training data after training the β-VAE using an incremental β from 1 to 10 over 200 epochs on the input dataset of band 361.

Figure 30. Implemented DCGAN architecture.

Figure 31. Training losses for both DCGAN models implemented: (a) discriminator and generator training loss for a DCGAN model trained for 100 epochs using the image dataset of band 11; the discriminator loss gradually decreases while the generator loss gradually increases. (b) Discriminator and generator training loss for a DCGAN model trained for 10 epochs using a dataset of each wavelength image from 8 hyperspectral images from each of the Less Than 8 ppn and Greater Than 300 ppn levels; the discriminator loss decreases while the generator loss increases.
27 pages, 4959 KiB  
Article
Deep Learning Autoencoders for Fast Fourier Transform-Based Clustering and Temporal Damage Evolution in Acoustic Emission Data from Composite Materials
by Serafeim Moustakidis, Konstantinos Stergiou, Matthew Gee, Sanaz Roshanmanesh, Farzad Hayati, Patrik Karlsson and Mayorkinos Papaelias
Infrastructures 2025, 10(3), 51; https://doi.org/10.3390/infrastructures10030051 - 2 Mar 2025
Viewed by 343
Abstract
Structural health monitoring (SHM) in fiber-reinforced polymer (FRP) composites is essential to ensure safety and reliability during service, particularly in critical industries such as aerospace and wind energy. Traditional methods of analyzing Acoustic Emission (AE) signals in the time domain often fail to accurately detect subtle or early-stage damage, limiting their effectiveness. The present study introduces a novel approach that integrates frequency-domain analysis using the fast Fourier transform (FFT) with deep learning techniques for more accurate and proactive damage detection. AE signals are first transformed into the frequency domain, where significant frequency components are extracted and used as inputs to an autoencoder network. The autoencoder model reduces the dimensionality of the data while preserving essential features, enabling unsupervised clustering to identify distinct damage states. Temporal damage evolution is modeled using Markov chain analysis to provide insights into how damage progresses over time. The proposed method achieves a reconstruction error of 0.0017 and a high R-squared value of 0.95, indicating the autoencoder’s effectiveness in learning compact representations while minimizing information loss. Clustering results, with a silhouette score of 0.37, demonstrate well-separated clusters that correspond to different damage stages. Markov chain analysis captures the transitions between damage states, providing a predictive framework for assessing damage progression. These findings highlight the potential of the proposed approach for early damage detection and predictive maintenance, which significantly improves the effectiveness of AE-based SHM systems in reducing downtime and extending component lifespan. Full article
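The chain described above (frequency-domain features, clustering of AE events into damage states, and a Markov transition matrix over the resulting state sequence) can be sketched as follows on synthetic data. The autoencoder the study uses for dimensionality reduction is omitted here for brevity, and all signal lengths and cluster counts are assumptions.

```python
# Hedged sketch: FFT magnitude features -> K-Means clusters -> Markov transition matrix.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
segments = rng.standard_normal((200, 1024))       # toy AE signal segments (events)

# Frequency-domain features: magnitude of the one-sided FFT per segment
features = np.abs(np.fft.rfft(segments, axis=1))

# Cluster events into candidate damage states (2 states, as in the study)
states = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# Estimate the Markov transition matrix from the time-ordered state sequence
T = np.zeros((2, 2))
for a, b in zip(states[:-1], states[1:]):
    T[a, b] += 1
T = T / T.sum(axis=1, keepdims=True)
print(T)   # row i = probabilities of moving from state i to each state
```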
Figure 1. High-level architecture of the proposed methodology.

Figure 2. Sensor positions for the tensile test (left) and 3-point bend (right).

Figure 3. Raw acoustic data captured from a sample strip in the 3-point bending machine: (a) the whole duration of the experiment; (b) a zoomed-in visualization of a single AE event.

Figure 4. Preprocessing pipeline for AE data, including noise removal, DC offset correction, segmentation, and frequency-domain transformation using FFT.

Figure 5. Comparative literature values for the frequency ranges of damage in CFRPs. Data adapted from [25,27,28,29], illustrating variations in reported frequency bands for matrix cracking, delamination, debonding, fiber breakage, and fiber pull-out.

Figure 6. Examples of peak frequency assessment of a tensile-tested CFRP sample coupon: (a) time vs. peak frequency and (b) magnitude vs. peak frequency.

Figure 7. Time- and frequency-domain AE signals: (a–c) data with a detected AE event; (d) data without a detected AE event. The DC offset has been removed from the time-domain signals.

Figure 8. Thresholding on the frequency domain (mean FFT) for detection of events (example from the lab experiment).

Figure 9. Bayesian optimization convergence.

Figure 10. Examples of reconstructed and original FFT signals.

Figure 11. Analysis of the mean frequency content of two clusters over time. The top plot shows the evolution of the mean frequency content for Cluster 1 and Cluster 2 with accumulated mean frequency curves. The bottom plots display the normalized amplitude of the mean frequency signals for Cluster 1 (left) and Cluster 2 (right), highlighting distinct peaks at specific frequencies within each cluster (results for the lab experiment).

Figure 12. State transition diagram (Markov chain) representing the probability of transitioning between two clusters. The diagram shows the self-transition probabilities for Cluster 1 and Cluster 2, as well as the transition probabilities between the two clusters. The width of the arrows is proportional to the transition probabilities, with values indicated along each transition path.
14 pages, 494 KiB  
Article
Denoising-Autoencoder-Aided Euclidean Distance Matrix Reconstruction for Connectivity-Based Localization: A Low-Rank Perspective
by Woong-Hee Lee, Mustafa Ozger, Ursula Challita and Taewon Song
Appl. Sci. 2025, 15(5), 2656; https://doi.org/10.3390/app15052656 - 1 Mar 2025
Viewed by 262
Abstract
In contrast to conventional localization methods, connectivity-based localization is a promising approach that leverages wireless links among network nodes. Here, the Euclidean distance matrix (EDM) plays a pivotal role in implementing the multidimensional scaling technique for the localization of wireless nodes based on pairwise distance measurements. This is based on the representation of complex datasets in lower-dimensional spaces, resulting from the mathematical property of an EDM being a low-rank matrix. However, EDM data are inevitably susceptible to contamination due to errors such as measurement imperfections, channel dynamics, and clock asynchronization. Motivated by the low-rank property of the EDM, we introduce a new pre-processor for connectivity-based localization, namely denoising-autoencoder-aided EDM reconstruction (DAE-EDMR). The proposed method is based on optimizing the neural network by inputting and outputting vectors of the eigenvalues of the noisy EDM and the original EDM, respectively. The optimized NN denoises the contaminated EDM, leading to an exceptional performance in connectivity-based localization. Additionally, we introduce a relaxed version of DAE-EDMR, i.e., truncated DAE-EDMR (T-DAE-EDMR), which remains operational regardless of variations in the number of nodes between the training and test phases in NN operations. The proposed algorithms show a superior performance in both EDM denoising and localization accuracy. Moreover, the method of T-DAE-EDMR notably requires a minimal number of training datasets compared to that in conventional approaches such as deep learning algorithms. Overall, our proposed algorithms reduce the required training dataset’s size by approximately one-tenth while achieving more than twice the effectiveness in EDM denoising, as demonstrated through our experiments. Full article
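A short sketch of the objects involved may help: constructing a Euclidean distance matrix from node positions, contaminating it, and extracting the eigenvalue vectors that would form the input/output pairs of the denoising autoencoder. Only the EDM construction follows standard definitions; the noise model and scales are assumptions.

```python
# Hedged sketch: EDM construction, contamination, and eigenvalue extraction.
import numpy as np

rng = np.random.default_rng(2)
nodes = rng.uniform(0, 100, size=(12, 2))           # 12 wireless nodes in a 100 m square

# Squared-distance EDM (rank at most dimension + 2, hence "low rank")
gram = nodes @ nodes.T
sq_norms = np.diag(gram)
edm = sq_norms[:, None] + sq_norms[None, :] - 2 * gram

# Contaminated EDM, e.g. from ranging errors (noise model is an assumption)
noisy_edm = edm + rng.normal(scale=25.0, size=edm.shape)
noisy_edm = (noisy_edm + noisy_edm.T) / 2           # keep it symmetric

# Eigenvalue vectors: the denoising autoencoder would map noisy -> clean spectra
clean_eigs = np.linalg.eigvalsh(edm)
noisy_eigs = np.linalg.eigvalsh(noisy_edm)
print(clean_eigs[-4:])   # only a few eigenvalues dominate, reflecting the low rank
```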
Figure 1. An illustration of an example of the proposed DAE-EDMR. (Additionally, truncated DAE-EDMR (T-DAE-EDMR) is presented as a relaxed version of DAE-EDMR. This additional work makes the optimized NN model more efficient by using the dominant k + 2 eigenvalues as the input data. Details can be found in Section 2.3.)

Figure 2. Performance comparison according to the number of training datasets: (a) NMSE between ground-truth and denoised EDMs and (b) localization error in meters.

Figure 3. Performance comparison according to the NLoS probability: (a) NMSE between ground-truth and denoised EDMs and (b) localization error in meters.

Figure 4. Performance comparison according to the utilized matrices: (a) NMSE between ground-truth and denoised EDMs and (b) localization error in meters.
22 pages, 873 KiB  
Article
EEG-Based Music Emotion Prediction Using Supervised Feature Extraction for MIDI Generation
by Oscar Gomez-Morales, Hernan Perez-Nastar, Andrés Marino Álvarez-Meza, Héctor Torres-Cardona and Germán Castellanos-Dominguez
Sensors 2025, 25(5), 1471; https://doi.org/10.3390/s25051471 - 27 Feb 2025
Viewed by 222
Abstract
Advancements in music emotion prediction are driving AI-driven algorithmic composition, enabling the generation of complex melodies. However, bridging neural and auditory domains remains challenging due to the semantic gap between brain-derived low-level features and high-level musical concepts, making alignment computationally demanding. This study proposes a deep learning framework for generating MIDI sequences aligned with labeled emotion predictions through supervised feature extraction from neural and auditory domains. EEGNet is employed to process neural data, while an autoencoder-based piano algorithm handles auditory data. To address modality heterogeneity, Centered Kernel Alignment is incorporated to enhance the separation of emotional states. Furthermore, regression between feature domains is applied to reduce intra-subject variability in extracted Electroencephalography (EEG) patterns, followed by the clustering of latent auditory representations into denser partitions to improve MIDI reconstruction quality. Using musical metrics, evaluation on real-world data shows that the proposed approach improves emotion classification (namely, between arousal and valence) and the system’s ability to produce MIDI sequences that better preserve temporal alignment, tonal consistency, and structural integrity. Subject-specific analysis reveals that subjects with stronger imagery paradigms produced higher-quality MIDI outputs, as their neural patterns aligned more closely with the training data. In contrast, subjects with weaker performance exhibited auditory data that were less consistent. Full article
(This article belongs to the Special Issue Advances in ECG/EEG Monitoring)
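Centered Kernel Alignment, used above to relate the neural and auditory feature spaces, has a simple linear form. The sketch below computes linear CKA between two toy feature matrices; choosing the linear rather than a kernelized variant is an assumption.

```python
# Linear CKA between two feature matrices X (n x p) and Y (n x q); toy data only.
import numpy as np

def linear_cka(X, Y):
    X = X - X.mean(axis=0)                 # center features
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, 'fro') ** 2
    norm_x = np.linalg.norm(X.T @ X, 'fro')
    norm_y = np.linalg.norm(Y.T @ Y, 'fro')
    return hsic / (norm_x * norm_y)

rng = np.random.default_rng(3)
eeg_feats = rng.standard_normal((120, 64))          # e.g., EEGNet embeddings
midi_feats = eeg_feats @ rng.standard_normal((64, 32)) + 0.1 * rng.standard_normal((120, 32))
print(linear_cka(eeg_feats, midi_feats))            # close to 1 for strongly aligned spaces
```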
Figure 1. Proposed deep learning framework for EEG-based emotion prediction using supervised feature extraction for MIDI generation. Stages: (i) segment-wise preprocessing; (ii) supervised deep feature extraction for emotion classification; and (iii) affective-based MIDI prediction and feature alignment.

Figure 2. Visualization of emotion labels and MIDI feature representations. (a) Emotion labels set by Subject #1, where the x-axis represents arousal and the y-axis represents valence. (b) Two-dimensional t-SNE projection (n_components = 2, perplexity = 5) of the piano-roll arrays, illustrating clustering of MIDI features based on emotion labels. Colors indicate the class of each audio stimulus. (c) Embedding space obtained from the bottleneck representation of the piano-roll autoencoder trained with the CKA loss. Dots correspond to training data, while crosses (×) represent test data. Of note, the axes are resized to provide better visual perception of the plotted values.

Figure 3. Comparison between the original MIDI (top) and the reconstructed MIDI (bottom). Of note, the axes are resized to provide better visual perception of the plotted values.

Figure 4. Probability density functions (PDFs) of model characteristics for all subjects in fold 1. Each subfigure corresponds to a specific feature extracted from the MIDI data: (a) Feature 0 (e.g., pitch range); (b) Feature 1 (e.g., total used pitch); (c) Feature 2 (e.g., average IOI); (d) Feature 3 (e.g., pitch-class histogram).

Figure 5. Violin plots comparing metrics for the best- and worst-performing subjects.
18 pages, 8946 KiB  
Article
Estimation of Nitrogen Content in Hevea Rubber Leaves Based on Hyperspectral Data Deep Feature Fusion
by Wenfeng Hu, Longfei Zhang, Zhouyang Chen, Xiaochuan Luo and Cheng Qian
Sustainability 2025, 17(5), 2072; https://doi.org/10.3390/su17052072 - 27 Feb 2025
Viewed by 179
Abstract
Leaf nitrogen content is a critical quantitative indicator for the growth of rubber trees, and accurately determining this content holds significant value for agricultural management and precision fertilization. This study introduces a novel feature extraction framework—SFS-CAE—that integrates the Sequential Feature Selection (SFS) method with Convolutional Autoencoder (CAE) technology to enhance the accuracy of nitrogen content estimation. Initially, the SFS algorithm was employed to select spectral bands from hyperspectral data collected from rubber tree leaves, thereby extracting feature information pertinent to nitrogen content. Subsequently, a CAE was utilized to further explore deep features within the dataset. Ultimately, the selected feature subset was concatenated with deep features to create a comprehensive input feature set, which was then analyzed using partial least squares regression (PLSR) for nitrogen content regression estimation. To validate the effectiveness of the proposed methodology, comparisons were made against commonly used competitive adaptive reweighted sampling (CARS), successive projection algorithm (SPA), and uninformative variable elimination (UVE) feature selection algorithms. The results indicate that SFS-CAE outperforms traditional SFS methods on the test set; notably, CARS-CAE achieved optimal performance with a coefficient of determination (R2) of 0.9064 and a root mean square error (RMSE) of 0.1405. This approach not only effectively integrates deep features derived from hyperspectral data but also optimizes both band selection and feature extraction processes, offering an innovative solution for the efficient estimation of nitrogen content in rubber tree leaves. Full article
(This article belongs to the Section Sustainable Agriculture)
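The band-selection-plus-regression portion of the pipeline can be sketched with scikit-learn's sequential feature selector and PLS regression. The deep features from the convolutional autoencoder are not reproduced here, and all data shapes and component counts are placeholders.

```python
# Hedged sketch: sequential band selection + PLSR, without the CAE deep features.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(4)
spectra = rng.random((120, 200))                    # 120 leaf samples x 200 bands (toy)
nitrogen = spectra[:, 10] * 2.0 + rng.normal(scale=0.05, size=120)

# Use a 1-component PLS during selection so small candidate subsets stay valid
selector = SequentialFeatureSelector(PLSRegression(n_components=1),
                                     n_features_to_select=20, direction='forward')
selector.fit(spectra, nitrogen)

selected = spectra[:, selector.get_support()]       # selected band subset
# In the paper, CAE deep features would be concatenated with `selected` here.
pls = PLSRegression(n_components=5)
pls.fit(selected, nitrogen)
print(pls.score(selected, nitrogen))                # R^2 on the (toy) training data
```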
Figure 1. Flow chart of the estimation of nitrogen content in rubber tree leaves based on hyperspectral deep feature fusion. The process consists of four main stages: data acquisition, data preprocessing, feature engineering, and modeling analysis and evaluation. HSI stands for hyperspectral image; ROI stands for region of interest.

Figure 2. Research area bitmap.

Figure 3. Three-dimensional hyperspectral images of leaves, images of different bands, and original average spectral curves.

Figure 4. IFA anomaly detection results. The numbers in the figure are the sample numbers.

Figure 5. Statistical histogram of nitrogen content.

Figure 6. Pearson correlation coefficient heat map.

Figure 7. SFS-CAE.

Figure 8. Spectral curves after different pretreatment methods: (a) SG smoothing, (b) SNV, (c) WAVE, (d) FOD.

Figure 9. Results of 1000 CARS runs: (a) the frequency with which each band occurs; (b) boxplots of the corresponding indicators for the training and test sets.

Figure 10. CAE training results: (a) loss curves of the training set and validation set during CAE training; (b) one of the original curves in the validation set and its reconstruction.

Figure 11. Effect of the number of deep features on PLSR. The red vertical line marks 110 deep features; an R² of 0.9064 corresponds to the modeling accuracy obtained by combining the deep features with the 42 original features.

Figure 12. Scatter plots of modeling results using different feature selection methods.

Figure 13. Extraction results of sensitive spectral bands using various algorithms. The average spectrum is the average spectrum of all samples: (a) Raw, (b) SPA, (c) UVE, (d) CARS1000.
17 pages, 1774 KiB  
Article
Training a Minesweeper Agent Using a Convolutional Neural Network
by Wenbo Wang and Chengyou Lei
Appl. Sci. 2025, 15(5), 2490; https://doi.org/10.3390/app15052490 - 25 Feb 2025
Viewed by 294
Abstract
The Minesweeper game is modeled as a sequential decision-making task, for which a neural network architecture, state encoding, and reward function were herein designed. Both a Deep Q-Network (DQN) and supervised learning methods were successfully applied to optimize the training of the game. The experiments were conducted on the AutoDL platform using an NVIDIA RTX 3090 GPU for efficient computation. The results showed that in a 6 × 6 grid with four mines, the DQN model achieved an average win rate of 93.3% (standard deviation: 0.77%), while the supervised learning method achieved 91.2% (standard deviation: 0.9%), both outperforming human players and baseline algorithms and demonstrating high intelligence. The mechanisms of the two methods in the Minesweeper task were analyzed, with the reasons for the faster training speed and more stable performance of supervised learning explained from the perspectives of means–ends analysis and feedback control. Although there is room for improvement in sample efficiency and training stability in the DQN model, its greater generalization ability makes it highly promising for application in more complex decision-making tasks. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
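As a hedged illustration of the dual-channel state encoding referred to above, the sketch below encodes a small board as a "revealed counts" channel plus a "revealed mask" channel. The board size and cell values are invented for the example and are not the paper's exact encoding.

```python
# Illustrative dual-channel encoding of a Minesweeper board state (toy example).
import numpy as np

H, W = 6, 6
revealed = np.zeros((H, W), dtype=bool)       # which cells have been opened
numbers = np.zeros((H, W), dtype=np.float32)  # adjacent-mine counts for opened cells

revealed[2, 2] = True
numbers[2, 2] = 3.0                           # e.g., the opened cell touches 3 mines

# Channel 0: normalized counts where revealed (0 elsewhere); channel 1: revealed mask.
state = np.stack([numbers / 8.0 * revealed, revealed.astype(np.float32)])
print(state.shape)                            # (2, 6, 6) -> input to the CNN agent
```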
Figure 1. Minesweeper game interface on the Windows XP system (beginner level, 9 × 9 grid, 10 mines).

Figure 2. Illustration of the single-channel encoding representation.

Figure 3. Illustration of the full encoding representation.

Figure 4. Illustration of the dual-channel encoding representation.

Figure 5. Convolutional neural network model of the Minesweeper game agent.

Figure 6. Schematic of the Minesweeper reward function (the red box represents the currently clicked cell).

Figure 7. Training curves of the two methods (smoothing parameter = 0.6).
25 pages, 1516 KiB  
Article
Deep Learning Approach for Automatic Heartbeat Classification
by Roger de T. Guerra, Cristina K. Yamaguchi, Stefano F. Stefenon, Leandro dos S. Coelho and Viviana C. Mariani
Sensors 2025, 25(5), 1400; https://doi.org/10.3390/s25051400 - 25 Feb 2025
Viewed by 194
Abstract
Arrhythmia is an irregularity in the rhythm of the heartbeat, and detecting it is a primary means of identifying cardiac abnormalities. The electrocardiogram (ECG) identifies arrhythmias and is one of the methods used to diagnose cardiac issues. Traditional arrhythmia detection methods are time-consuming, error-prone, and often subjective, making it difficult for doctors to discern between distinct patterns of arrhythmia. To interpret ECG signals, this study presents a multi-class classifier and an autoencoder with long short-term memory (LSTM) network layers for extracting signal properties, evaluated on a dataset from the Massachusetts Institute of Technology and Boston’s Beth Israel Hospital (MIT-BIH). The proposed model achieved an accuracy of 98.57% on the arrhythmia dataset and 97.59% on the supraventricular dataset. In contrast to other deep learning models, the proposed model avoids the vanishing-gradient problem in classification tasks. Full article
(This article belongs to the Section Biomedical Sensors)
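A compact sketch of an LSTM autoencoder of the kind described above is given below in PyTorch. The layer sizes, beat length, and the way the decoder is seeded from the latent vector are assumptions for illustration, not the study's architecture.

```python
# Hedged sketch: sequence-to-sequence LSTM autoencoder for ECG beats (toy shapes).
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    def __init__(self, n_features=1, hidden=64, latent=16):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.to_latent = nn.Linear(hidden, latent)
        self.from_latent = nn.Linear(latent, hidden)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):                       # x: (batch, seq_len, n_features)
        _, (h, _) = self.encoder(x)
        z = self.to_latent(h[-1])               # compressed beat representation
        seed = self.from_latent(z).unsqueeze(1).repeat(1, x.size(1), 1)
        dec, _ = self.decoder(seed)
        return self.out(dec), z                 # reconstruction and latent features

beats = torch.randn(32, 187, 1)                 # toy batch of fixed-length ECG beats
recon, latent = LSTMAutoencoder()(beats)
loss = nn.functional.mse_loss(recon, beats)     # latent features would feed the classifier
print(recon.shape, latent.shape, loss.item())
```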
Figure 1. Diagram for identifying and adding papers to the literature review of this study.

Figure 2. Diagram of the proposed algorithm.

Figure 3. Diagram of the adaptive filter used.

Figure 4. Example of an original signal without filtering.

Figure 5. Example of a signal after filtering.

Figure 6. Diagram of the coding layer structure.

Figure 7. Diagram of the decoding layer structure.

Figure 8. Classification of heartbeat types.
17 pages, 4186 KiB  
Article
Anomaly-Guided Double Autoencoders for Hyperspectral Unmixing
by Hongyi Liu, Chenyang Zhang, Jianing Huang and Zhihui Wei
Remote Sens. 2025, 17(5), 800; https://doi.org/10.3390/rs17050800 - 25 Feb 2025
Viewed by 119
Abstract
Deep learning has emerged as a prevalent approach for hyperspectral unmixing. However, most existing unmixing methods employ a single network, resulting in moderate estimation errors and less meaningful endmembers and abundances. To address this limitation, this paper proposes a novel double autoencoders-based unmixing method, consisting of an endmember extraction network and an abundance estimation network. In the endmember network, to improve the spectral discrimination, a logarithm spectral angle distance (SAD), integrated with an anomaly-guided weight, is developed as the loss function. Specifically, the logarithm function is used to boost the reliability of a pixel based on its high SAD similarity to other pixels. Moreover, the anomaly-guided weight mitigates the influence of outliers. As for the abundance network, a spectral convolutional autoencoder combined with a channel attention module is employed to exploit the spectral features. Additionally, the decoder weight is shared between the two networks to reduce computational complexity. Extensive comparative experiments with state-of-the-art unmixing methods demonstrate that the proposed method achieves superior performance in both endmember extraction and abundance estimation. Full article
(This article belongs to the Special Issue Recent Advances in the Processing of Hyperspectral Images)
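One plausible reading of the anomaly-weighted logarithmic SAD loss mentioned above is sketched below; the exact weighting and placement of the logarithm in the paper may differ, so this should be read as an assumption-labeled illustration rather than the authors' formulation.

```python
# Hedged sketch: anomaly-weighted logarithmic spectral angle distance (SAD) loss.
import torch

def weighted_log_sad(x, x_hat, anomaly_score, eps=1e-8):
    # x, x_hat: (batch, bands); anomaly_score: (batch,), higher = more anomalous
    cos = torch.nn.functional.cosine_similarity(x, x_hat, dim=1).clamp(-1 + eps, 1 - eps)
    sad = torch.acos(cos)                        # spectral angle per pixel
    weight = 1.0 / (1.0 + anomaly_score)         # assumed form: down-weight outliers
    return (weight * torch.log1p(sad)).mean()    # assumed form: log compresses large angles

x = torch.rand(16, 198)                          # toy pixel spectra
x_hat = x + 0.05 * torch.randn_like(x)           # toy reconstructions
score = torch.rand(16)                           # toy anomaly scores (e.g., from a detector)
print(weighted_log_sad(x, x_hat, score))
```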
Figure 1. The architecture of the proposed ADAEU network, including the endmember extraction network (EE-Net) and the abundance estimation network (AE-Net).

Figure 2. The datasets: (a) Jasper Ridge; (b) Urban; (c) Samson.

Figure 3. Endmembers on the Jasper Ridge dataset: (a) VCA, (b) SGSNMF, (c) CNNAEU, (d) OSPAEU, (e) MTAEU, (f) ENDNET, (g) PGMSU, (h) A2SAN, (i) ADAEU.

Figure 4. Abundance maps on the Jasper Ridge dataset: (a) GT, (b) VCA, (c) SGSNMF, (d) CNNAEU, (e) OSPAEU, (f) MTAEU, (g) ENDNET, (h) PGMSU, (i) A2SAN, (j) ADAEU.

Figure 5. Endmembers on the Urban dataset: (a) VCA, (b) SGSNMF, (c) CNNAEU, (d) OSPAEU, (e) MTAEU, (f) ENDNET, (g) PGMSU, (h) A2SAN, (i) ADAEU.

Figure 6. Abundance maps on the Urban dataset: (a) GT, (b) DMAXD, (c) SGSNMF, (d) CNNAEU, (e) OSPAEU, (f) MTAEU, (g) ENDNET, (h) PGMSU, (i) A2SAN, (j) ADAEU.

Figure 7. Endmembers on the Samson dataset: (a) VCA, (b) SGSNMF, (c) CNNAEU, (d) OSPAEU, (e) MTAEU, (f) ENDNET, (g) PGMSU, (h) A2SAN, (i) ADAEU.

Figure 8. Abundance maps on the Samson dataset: (a) GT, (b) VCA, (c) SGSNMF, (d) CNNAEU, (e) OSPAEU, (f) MTAEU, (g) ENDNET, (h) PGMSU, (i) A2SAN, (j) ADAEU.
26 pages, 17412 KiB  
Article
Enhancing Maritime Safety: Estimating Collision Probabilities with Trajectory Prediction Boundaries Using Deep Learning Models
by Robertas Jurkus, Julius Venskus, Jurgita Markevičiūtė and Povilas Treigys
Sensors 2025, 25(5), 1365; https://doi.org/10.3390/s25051365 - 23 Feb 2025
Viewed by 165
Abstract
We investigate maritime accidents near Bornholm Island in the Baltic Sea, focusing on one of the most recent vessel collisions and a way to improve maritime safety as a prevention strategy. By leveraging Long Short-Term Memory autoencoders, a class of deep recurrent neural networks, this research demonstrates a unique approach to forecasting vessel trajectories and assessing collision risks. The proposed method integrates trajectory predictions with statistical techniques to construct probabilistic boundaries, including confidence intervals, prediction intervals, ellipsoidal prediction regions, and conformal prediction regions. The study introduces a collision risk score, which evaluates the likelihood of boundary overlaps as a metric for collision detection. These methods are applied to simulated test scenarios and a real-world case study involving the 2021 collision between the Scot Carrier and Karin Hoej cargo ships. The results demonstrate that CPR, a non-parametric approach, reliably forecasts collision risks with 95% confidence. The findings underscore the importance of integrating statistical uncertainty quantification with deep learning models to improve navigational decision-making and encourage a shift towards more proactive, AI/ML-enhanced maritime risk management protocols. Full article
(This article belongs to the Section Intelligent Sensors)
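The conformal prediction region (CPR) idea can be sketched in a few lines: compute nonconformity scores (here, position errors in kilometres) on a calibration set and take the finite-sample 95% quantile as the boundary radius around each forecast point. The split-conformal form and the error metric are assumptions made for illustration.

```python
# Hedged sketch: split-conformal radius for trajectory forecasts (toy numbers, km).
import numpy as np

rng = np.random.default_rng(5)
calib_errors = np.abs(rng.normal(scale=0.4, size=500))   # |predicted - actual| positions

alpha = 0.05                                             # 95% coverage target
n = len(calib_errors)
# Finite-sample conformal quantile: ceil((n + 1) * (1 - alpha)) / n
q_level = np.ceil((n + 1) * (1 - alpha)) / n
radius = np.quantile(calib_errors, min(q_level, 1.0))

print(f"CPR boundary radius: {radius:.2f} km around each predicted position")
# Two vessels would be flagged as at risk when their prediction regions overlap,
# i.e. distance(pred_A, pred_B) < radius_A + radius_B.
```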
Figure 1. Workflow diagram of vessel trajectory prediction boundaries and collision detection.

Figure 2. AIS data resampling based on time series. Grey dots represent all AIS observations, while red boundary circles indicate points selected using the k-nearest neighbour method for standardizing time steps.

Figure 3. Illustration of boundary width determination in CPR using nonconformity scores.

Figure 4. AIS study region with the highlighted evaluation area.

Figure 5. Distribution of data points for selected sequence predictions.

Figure 6. Spatio-temporal analysis of the maritime collision accident.

Figure 7. Graphs of collision risk assessment: (a) calculation of CPA and TCPA for the cargo vessels half an hour before the collision; (b) schematic representation of Imazu problem case 4 (extract from the source [52]).

Figure 8. Comparative trajectory predictions from deep learning models. Triangles in the ‘Single Best Model’ subplot indicate the vessel’s moving direction.

Figure 9. Visualisation of prediction and confidence intersection zones for bi-variate spatial data.

Figure 10. Ellipsoidal prediction regions. Numbers next to the red crosses represent the corresponding prediction time steps.

Figure 11. Graphs of accuracy and evaluation results: (a) average model error by time series of trajectory prediction (km); (b) sequences with all time steps within the calculated boundary widths (coverage probability estimation); and (c) coverage probability count across individual time steps.

Figure 12. Comparison of method boundary widths in a marine accident.

Figure A1. Cross-validation results for different models: (a) cross-validation plot for a random model; (b) cross-validation plot for the best model.
23 pages, 2239 KiB  
Article
Securing IoT Networks Against DDoS Attacks: A Hybrid Deep Learning Approach
by Noor Ul Ain, Muhammad Sardaraz, Muhammad Tahir, Mohamed W. Abo Elsoud and Abdullah Alourani
Sensors 2025, 25(5), 1346; https://doi.org/10.3390/s25051346 - 22 Feb 2025
Viewed by 228
Abstract
The Internet of Things (IoT) has revolutionized many domains. Due to the growing interconnectivity of IoT networks, several security challenges persist that need to be addressed. This research presents the application of deep learning techniques for Distributed Denial-of-Service (DDoS) attack detection in IoT networks. This study assesses the performance of various deep learning models, including Latent Autoencoders, LSTM Autoencoders, and convolutional neural networks (CNNs), for DDoS attack detection in IoT environments. Furthermore, a novel hybrid model is proposed, integrating CNNs for feature extraction, Long Short-Term Memory (LSTM) networks for temporal pattern recognition, and Autoencoders for dimensionality reduction. Experimental results on the CICIOT2023 dataset show the enhanced performance of the proposed hybrid model, which achieves a training and testing accuracy of 96.78% together with a validation accuracy of 96.60%. This demonstrates its efficiency in addressing complex attack patterns within IoT networks. Analysis of the results shows that the proposed hybrid model outperforms the others. However, this research has limitations in detecting rare attack types, and it emphasizes the importance of addressing data imbalance to further enhance DDoS attack detection capabilities in future work. Full article
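A compact sketch of the hybrid idea (CNN feature extraction, LSTM temporal modelling, an autoencoder-style bottleneck, and a classification head) is shown below in PyTorch. Layer widths, window length, and the placement of the bottleneck are assumptions rather than the paper's exact configuration.

```python
# Hedged sketch: CNN + LSTM + bottleneck hybrid for flow-window classification.
import torch
import torch.nn as nn

class HybridDDoS(nn.Module):
    def __init__(self, n_features=40, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv1d(n_features, 64, kernel_size=3, padding=1),
                                 nn.ReLU())
        self.lstm = nn.LSTM(64, 32, batch_first=True)
        self.bottleneck = nn.Sequential(nn.Linear(32, 8), nn.ReLU(),   # AE-style compression
                                        nn.Linear(8, 32), nn.ReLU())
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                    # x: (batch, window, n_features)
        h = self.cnn(x.transpose(1, 2))      # Conv1d expects (batch, channels, time)
        h, _ = self.lstm(h.transpose(1, 2))  # back to (batch, time, channels)
        h = self.bottleneck(h[:, -1])        # last time step -> compressed representation
        return self.head(h)

windows = torch.randn(64, 20, 40)            # toy batch of flow-feature windows
logits = HybridDDoS()(windows)
print(logits.shape)                          # torch.Size([64, 2])
```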
Figure 1. Workflow of the proposed model.

Figure 2. Validation and training loss of the Autoencoder.

Figure 3. Distribution of reconstruction errors.

Figure 4. A histogram illustrating the distribution of reconstruction errors.

Figure 5. Training and validation loss of the LSTM model.

Figure 6. Anomaly detection in test data for the LSTM-based Autoencoder.

Figure 7. Distribution of reconstruction error on test data.

Figure 8. Performance of the CNN model during the training and validation phases.

Figure 9. Confusion matrix of the CNN.

Figure 10. Performance of the hybrid model during the training and validation phases.

Figure 11. Confusion matrix of the hybrid model.
17 pages, 1533 KiB  
Article
Multimodal Brain Growth Patterns: Insights from Canonical Correlation Analysis and Deep Canonical Correlation Analysis with Auto-Encoder
by Ram Sapkota, Bishal Thapaliya, Bhaskar Ray, Pranav Suresh and Jingyu Liu
Information 2025, 16(3), 160; https://doi.org/10.3390/info16030160 - 20 Feb 2025
Viewed by 208
Abstract
Today’s advancements in neuroimaging have been pivotal in enhancing our understanding of brain development and function using various MRI techniques. This study utilizes images from T1-weighted imaging and diffusion-weighted imaging to identify gray matter and white matter coherent growth patterns within 2 years from 9–10-year-old participants in the Adolescent Brain Cognitive Development (ABCD) Study. The motivation behind this investigation lies in the need to comprehend the intricate processes of brain development during adolescence, a critical period characterized by significant cognitive maturation and behavioral change. While traditional methods like canonical correlation analysis (CCA) capture the linear interactions of brain regions, a deep canonical correlation analysis with an autoencoder (DCCAE) nonlinearly extracts brain patterns. The study involves a comparative analysis of changes in gray and white matter over two years, exploring their interrelation based on correlation scores, extracting significant features using both CCA and DCCAE methodologies, and finding an association between the extracted features with cognition and the Child Behavior Checklist. The results show that both CCA and DCCAE components identified similar brain regions associated with cognition and behavior, indicating that brain growth patterns over this two-year period are linear. The variance explained by CCA and DCCAE components for cognition and behavior suggests that brain growth patterns better account for cognitive maturation compared to behavioral changes. This research advances our understanding of neuroimaging analysis and provides valuable insights into the nuanced dynamics of brain development during adolescence. Full article
(This article belongs to the Section Biomedical Information and Health)
Show Figures

Figure 1. First, second, and third components of GM identified through CCA.
Figure 2. First, second, and third components of FA identified through CCA.
Figure 3. First, second, and third components of GM identified through DCCAE.
Figure 4. First, second, and third components of FA identified through DCCAE.
23 pages, 10921 KiB  
Article
A Weakly Supervised and Self-Supervised Learning Approach for Semantic Segmentation of Land Cover in Satellite Images with National Forest Inventory Data
by Daniel Moraes, Manuel L. Campagnolo and Mário Caetano
Remote Sens. 2025, 17(4), 711; https://doi.org/10.3390/rs17040711 - 19 Feb 2025
Viewed by 182
Abstract
National Forest Inventories (NFIs) provide valuable land cover (LC) information but often lack spatial continuity and an adequate update frequency. Satellite-based remote sensing offers a viable alternative, employing machine learning to extract thematic data. State-of-the-art methods such as convolutional neural networks rely on [...] Read more.
National Forest Inventories (NFIs) provide valuable land cover (LC) information but often lack spatial continuity and an adequate update frequency. Satellite-based remote sensing offers a viable alternative, employing machine learning to extract thematic data. State-of-the-art methods such as convolutional neural networks rely on fully pixel-level annotated images, which are difficult to obtain. Although reference LC datasets have been widely used to derive annotations, NFIs consist of point-based data, providing only sparse annotations. Weakly supervised and self-supervised learning approaches help address this issue by reducing dependence on fully annotated images and leveraging unlabeled data. However, their potential for large-scale LC mapping needs further investigation. This study explored the use of NFI data with deep learning and weakly supervised and self-supervised methods. Using Sentinel-2 images and the Portuguese NFI, which covers other LC types beyond forest, as sparse labels, we performed weakly supervised semantic segmentation with a convolutional neural network to create an updated and spatially continuous national LC map. Additionally, we investigated the potential of self-supervised learning by pretraining a masked autoencoder on 65,000 Sentinel-2 image chips and then fine-tuning the model with NFI-derived sparse labels. The weakly supervised baseline achieved a validation accuracy of 69.60%, surpassing Random Forest (67.90%). The self-supervised model achieved 71.29%, performing on par with the baseline using half the training data. The results demonstrated that integrating both learning approaches enabled successful countrywide LC mapping with limited training data. Full article
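The practical core of the weakly supervised setup, training a segmentation network when only a few pixels per image chip carry NFI-derived labels, can be expressed as a masked loss. The PyTorch sketch below uses ignore_index so that unlabeled pixels contribute nothing to the gradient; the class count, the sentinel value, and the labeled-window placement are assumptions for illustration, not the authors' exact training code.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 13        # placeholder for the land cover legend size
UNLABELED = -1          # sentinel for pixels without an NFI-derived label

# Sparse label map: almost every pixel is UNLABELED; only a small 3x3
# window around each photo-point carries a class index.
labels = torch.full((4, 256, 256), UNLABELED, dtype=torch.long)
labels[:, 100:103, 100:103] = 5          # e.g., one labeled window per chip

# Stand-in for the segmentation network's per-pixel class scores
logits = torch.randn(4, NUM_CLASSES, 256, 256, requires_grad=True)

# Cross-entropy evaluated only at labeled pixels; unlabeled pixels are skipped
criterion = nn.CrossEntropyLoss(ignore_index=UNLABELED)
loss = criterion(logits, labels)
loss.backward()
```

The self-supervised part of the pipeline would then amount to initializing the encoder of the segmentation model from masked-autoencoder pretraining before fine-tuning with this sparse loss.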
(This article belongs to the Section Earth Observation Data)
Show Figures

Figure 1. Study area and location of sample areas used for model training and validation.
Figure 2. Example of NFI photo-points: (a) with matching point-patch labels; (b) located at the interface between distinct land covers; and (c) with mismatching point-patch labels.
Figure 3. Illustration of distinctly labeled training data. High-resolution image (a), dense labels used in typical fully supervised methods (b) and sparse labels used in our weakly supervised approach (c). Colored and grey pixels correspond to labeled and unlabeled pixels, respectively. The labels in (c) are derived from the photo-point, seen in the center of the 3 × 3 window.
Figure 4. Network architecture of our ConvNext-V2 Atto U-Net. The figure also exhibits the ConvNext-V2 block. LN, GRN and GELU stand for Layer Normalization, Global Response Normalization and Gaussian Error Linear Unit, respectively. Conv K × K refers to a convolutional layer with a kernel size of K × K.
Figure 5. MAE architecture, illustrating the reconstruction of masked patches. Image representations learned at the encoder can be transferred and applied to different downstream tasks. Each patch corresponds to 8 × 8 pixels.
Figure 6. Overall accuracy of the baseline and self-supervised pretrained models. The values represent the average of 10 runs with a 95% confidence interval and were computed on the validation split.
Figure 7. Validation split accuracy of the three tested models with distinct training set sizes. The reported values are the average of 10 runs with a 95% confidence interval.
Figure 8. Model performance per land cover class measured by the F1-score. For other coniferous, no F1-score was reported for Random Forest, as the model did not predict any sampling units belonging to this class.
Figure 9. Example of land cover maps produced by Random Forest, ConvNext-V2 baseline and ConvNext-V2 self-supervised pretrained models.
Figure 10. Land cover map of Portugal (2023).
Figure A1. Example of 30 × 30 m windows used for training a Random Forest classifier for the homogeneity filter. Annotations as non-homogeneous or homogeneous considered not only the high-resolution images (seen in the figure) but also Sentinel-2 images.
9 pages, 4313 KiB  
Article
Power Load Forecasting System of Iron and Steel Enterprises Based on Deep Kernel–Multiple Kernel Joint Learning
by Yan Zhang, Junsheng Wang, Jie Sun, Ruiqi Sun and Dawei Qin
Processes 2025, 13(2), 584; https://doi.org/10.3390/pr13020584 - 19 Feb 2025
Viewed by 271
Abstract
The traditional power load forecasting learning method has problems such as overfitting and incomplete learning of time series information when dealing with complex nonlinear data, which affects the accuracy of short–medium term power load forecasting. A joint learning method, LSVM-MKL, was proposed based [...] Read more.
Traditional power load forecasting methods suffer from problems such as overfitting and incomplete learning of time-series information when dealing with complex nonlinear data, which affects the accuracy of short- to medium-term power load forecasting. A joint learning method, LSVM-MKL, was proposed based on the mutual reinforcement of deep kernel learning (DKL) and multiple kernel learning (MKL). The multiple kernel method was combined with the input layer and the highest encoding layer of the stacked autoencoder (SAE) network to obtain more comprehensive information. At the same time, the deep kernel was integrated into the optimization training of the Gaussian multi-kernel by means of a nonlinear product, forming a nonlinear composite kernel. Experiments on a large number of reference datasets and on actual industrial data showed that, compared with the Elman and LSTM-Seq2Seq methods, the proposed method improved prediction accuracy by 4.32%, verifying its adaptability to complex, time-varying power load forecasting processes and greatly improving the accuracy of power load forecasting. Full article
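The central construction, a nonlinear composite kernel formed as the product of a deep kernel computed on learned features and a weighted Gaussian multi-kernel, can be sketched as follows. The deep_features function is a stand-in for the trained stacked autoencoder, and the kernel widths, mixing weights, and toy load windows are placeholders rather than the paper's configuration.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics.pairwise import rbf_kernel

def gaussian_multi_kernel(X, Y, gammas=(0.1, 1.0, 10.0), betas=(0.3, 0.4, 0.3)):
    """Weighted sum of Gaussian kernels at several widths (the multi-kernel part)."""
    return sum(b * rbf_kernel(X, Y, gamma=g) for g, b in zip(gammas, betas))

def deep_features(X):
    """Placeholder for the SAE encoding; a fixed random projection with tanh
    stands in for the learned nonlinear mapping."""
    rng = np.random.default_rng(0)
    W = rng.standard_normal((X.shape[1], 8)) * 0.1
    return np.tanh(X @ W)

def composite_kernel(X, Y):
    """Nonlinear product of the deep kernel and the Gaussian multi-kernel."""
    K_deep = rbf_kernel(deep_features(X), deep_features(Y), gamma=1.0)
    return K_deep * gaussian_multi_kernel(X, Y)

# Toy load-forecasting data: windows of past load -> next-step load
rng = np.random.default_rng(1)
X_train, y_train = rng.standard_normal((200, 24)), rng.standard_normal(200)
X_test = rng.standard_normal((20, 24))

svr = SVR(kernel="precomputed")
svr.fit(composite_kernel(X_train, X_train), y_train)
y_pred = svr.predict(composite_kernel(X_test, X_train))
```

In the paper the multi-kernel weights and the deep features are optimized jointly; the fixed weights here only illustrate how the composite kernel feeds a kernel machine.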
Show Figures

Figure 1. Joint learning framework.
Figure 2. Stack autoencoder structure.
Figure 3. Classification accuracy under three kinds of multi-kernel instances. (a) LSVM-MKL1; (b) Elman1; (c) LSTM-Seq2Seq1; (d) LSVM-MKL2; (e) Elman2; and (f) LSTM-Seq2Seq2.
Figure 4. Datasets and timing databases.
Figure 5. Power load forecasting results. (a) Training values and actual values; and (b) predicted values and real values.
15 pages, 1877 KiB  
Article
GraphEPN: A Deep Learning Framework for B-Cell Epitope Prediction Leveraging Graph Neural Networks
by Feng Wang, Xiangwei Dai, Liyan Shen and Shan Chang
Appl. Sci. 2025, 15(4), 2159; https://doi.org/10.3390/app15042159 - 18 Feb 2025
Viewed by 271
Abstract
B-cell epitope prediction is crucial for advancing immunology, particularly in vaccine development and antibody-based therapies. Traditional experimental techniques are hindered by high costs, time consumption, and limited scalability, making them unsuitable for large-scale applications. Computational methods provide a promising alternative, enabling high-throughput screening [...] Read more.
B-cell epitope prediction is crucial for advancing immunology, particularly in vaccine development and antibody-based therapies. Traditional experimental techniques are hindered by high costs, time consumption, and limited scalability, making them unsuitable for large-scale applications. Computational methods provide a promising alternative, enabling high-throughput screening and accurate predictions. However, existing computational approaches often struggle to capture the complexity of protein structures and intricate residue interactions, highlighting the need for more effective models. This study presents GraphEPN, a novel B-cell epitope prediction framework combining a vector quantized variational autoencoder (VQ-VAE) with a graph transformer. The pre-trained VQ-VAE captures both discrete representations of amino acid microenvironments and continuous structural embeddings, providing a comprehensive feature set for downstream tasks. The graph transformer further processes these features to model long-range dependencies and interactions. Experimental results demonstrate that GraphEPN outperforms existing methods across multiple datasets, achieving superior prediction accuracy and robustness. This approach underscores the significant potential for applications in immunodiagnostics and vaccine development, merging advanced deep learning-based representation learning with graph-based modeling. Full article
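The vector-quantization step that turns each residue's continuous microenvironment embedding into a discrete codebook entry is the core of the VQ-VAE component. A minimal PyTorch sketch of that step is given below, using the standard straight-through gradient trick; the codebook size, embedding width, and commitment weight are assumed values, not necessarily those used in GraphEPN.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Sketch of VQ-VAE quantization: map each latent vector to its nearest
    codebook entry and pass gradients through with the straight-through trick."""
    def __init__(self, num_codes=512, dim=128, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z):
        # z: (num_residues, dim) continuous encoder outputs
        d = torch.cdist(z, self.codebook.weight)        # distances to all codes
        idx = d.argmin(dim=1)                           # discrete code index per residue
        z_q = self.codebook(idx)                        # quantized embeddings
        # Codebook and commitment losses (standard VQ-VAE terms)
        loss = ((z_q - z.detach()) ** 2).mean() + self.beta * ((z_q.detach() - z) ** 2).mean()
        # Straight-through estimator: forward uses z_q, backward flows into z
        z_q = z + (z_q - z).detach()
        return z_q, idx, loss

vq = VectorQuantizer()
z = torch.randn(300, 128)            # e.g., one embedding per residue of a protein
z_q, codes, vq_loss = vq(z)
```

The discrete indices and the quantized embeddings would then serve as node features for the downstream graph transformer described in the abstract.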
Show Figures

Figure 1. Schematic view of the GraphEPN architecture. (a) Overall framework: The 3D structure of a protein (PDB ID: 1OTU_A) is represented as a graph, where nodes correspond to amino acid residues. The VQ-VAE encodes and quantizes node features into discrete embeddings, which are passed to a graph transformer for epitope prediction. (b) VQ-VAE module: The encoder extracts latent features and maps them to the nearest codebook vectors, generating discrete representations, while the decoder reconstructs original features. (c) Graph transformer architecture: The model applies graph attention networks (GAT) to capture residue interactions, followed by residual connections and feed-forward layers.
Figure 2. Performance evaluation of the GraphEPN model. (a) ROC curves of 5-fold cross-validation for GraphEPN. (b) AUPRC curves of 5-fold cross-validation for GraphEPN. (c) Comparison of the ROC curves between GraphEPN and peer methods. (d) Comparison of AUPRC curves between GraphEPN and peer methods.
Figure 3. Visualization of epitope predictions for a test case (PDB ID: 6ad8_A, chain A) across multiple methods. (a) Reference epitopes. (b–f) Predictions by GraphEPN, BepiPred 3.0, SEPPA 3.0, SEMA 2.0, and ElliPro, respectively. In each model, correctly predicted epitope residues (true positives) are shown in green, residues incorrectly predicted as epitopes (false positives) are shown in red, and residues that should have been predicted as epitopes but were missed (false negatives) are highlighted in yellow. Silver represents non-epitope residues.
Figure 4. Visualization of GraphEPN model predictions for protein 2j88_A. (a) Predicted epitopes are highlighted on the protein’s 3D structure. Residues with high prediction scores are shown as cyan sticks, with secondary structure elements in green and unlabeled regions in silver. (b) Epitope prediction scores along the sequence, where the x-axis corresponds to the sequence positions of amino acids and the y-axis represents the predicted epitope scores. The color gradient indicates the predicted confidence, with yellow representing high-confidence predictions. The blue dashed line marks the prediction threshold.