Search Results (21,813)

Search Parameters:
Keywords = image network

21 pages, 7905 KiB  
Article
Efficient Hyperspectral Video Reconstruction via Dual-Channel DMD Encoding
by Mingming Ma, Yi Niu, Dahua Gao, Fu Li and Guangming Shi
Remote Sens. 2025, 17(2), 190; https://doi.org/10.3390/rs17020190 (registering DOI) - 8 Jan 2025
Abstract
Hyperspectral video acquisition requires a precise balance between spectral and temporal resolution, often achieved through compressive sampling using two-dimensional detectors and spectral reconstruction algorithms. However, the reliance on spatial light modulators for coding reduces optical efficiency, while complex recovery algorithms hinder real-time reconstruction. To address these challenges, we propose a digital-micromirror-device-based complementary dual-channel hyperspectral (DMD-CDH) video imaging system. This system employs a DMD for simultaneous light splitting and spatial encoding, enabling one channel to perform non-aliasing spectral sampling at lower frame rates while the other provides complementary high-rate sampling for panchromatic video. Featuring high optical throughput and efficient complementary sampling, the system ensures reliable hyperspectral video reconstruction and serves as a robust ground-based validation platform for remote sensing applications. Additionally, we introduce tailored optical error calibration and fixation techniques alongside a lightweight hyperspectral fusion network for reconstruction, achieving hyperspectral frame rates exceeding 30 fps. Compared to the existing models, this system simplifies the calibration process and provides a practical high-performance solution for real-time hyperspectral video imaging. Full article
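As an aside for readers unfamiliar with complementary DMD coding, the sketch below (NumPy, with made-up sizes; not the authors' implementation) illustrates the basic idea of splitting a scene with a binary micromirror pattern so that one channel receives the coded light and the other its complement, and the two together preserve the panchromatic signal:

    # Minimal sketch (not the authors' implementation) of complementary dual-channel
    # encoding with a binary DMD pattern: mirrors flipped "on" send light to channel 1,
    # mirrors flipped "off" send the complementary light to channel 2, so the two
    # measurements of any scene frame sum back to the unencoded intensity.
    import numpy as np

    rng = np.random.default_rng(0)
    H, W, BANDS = 64, 64, 31                 # hypothetical spatial size and band count

    scene = rng.random((H, W, BANDS))        # toy hyperspectral frame
    dmd_pattern = rng.integers(0, 2, (H, W)).astype(float)   # binary DMD code

    # Channel 1: coded measurement (collapsed along wavelength by the detector)
    coded = (scene * dmd_pattern[..., None]).sum(axis=-1)
    # Channel 2: complementary panchromatic measurement
    complementary = (scene * (1.0 - dmd_pattern)[..., None]).sum(axis=-1)

    # Complementarity check: the two channels together preserve the panchromatic signal
    full_pan = scene.sum(axis=-1)
    assert np.allclose(coded + complementary, full_pan)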
Figures 1–12: DMD-CDH system schematic and stable imaging design; DMD micromirror structure; field-of-view loss from the rotated DMD; DMD–detector diamond–square matching; pushscan-mode reconstruction and polynomial-fit calibration results; multispectral pixel distribution; lightweight hyperspectral fusion network; physical prototype (objective lens, DMD, reflectors, relay lens, Amici prism, grayscale and RGB cameras); encoded color/grayscale images and reconstructed spectral bands (500–850 nm); spectral comparison curves against ground truth; light-throughput comparison of different systems; and difference results for the "balloons-ms" image from the CAVE dataset.
20 pages, 6075 KiB  
Article
Fault Diagnosis of Rolling Bearings Based on Adaptive Denoising Residual Network
by Yiwen Chen, Xinggui Zeng and Haisheng Huang
Processes 2025, 13(1), 151; https://doi.org/10.3390/pr13010151 (registering DOI) - 8 Jan 2025
Abstract
To address the vulnerability of rolling bearings to noise interference in industrial settings, along with the problems of weak noise resilience and inadequate generalization in conventional residual network frameworks, this study introduces an adaptive denoising residual network (AD-ResNet) for diagnosing rolling bearing faults. Initially, the sensors collect the bearing vibration signals, which are then converted into two-dimensional grayscale images through the application of a continuous wavelet transform. Then, a spatial adaptive denoising network (SADNet) architecture is incorporated to comprehensively extract multi-scale information from noisy images. By exploiting the improved pyramid squeeze attention (IPSA) module, which excels in extracting representative features from channel attention vectors, this unit substitutes the standard convolutional layers present in typical residual networks. Ultimately, this model was validated through experiments using publicly available bearing datasets from CWRU and HUST. The findings suggest that with −6 dB Gaussian white noise, the average accuracy of recognition achieves a rate of 90.96%. In scenarios of fluctuating speeds accompanied by strong noise, the recognition accuracy can reach 89.54%, while the training time per cycle averages merely 3.65 s. When compared to other widely utilized fault diagnosis techniques, the approach described in this paper exhibits enhanced noise resistance and better generalization capabilities. Full article
(This article belongs to the Section Advanced Digital and Other Processes)
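For illustration, the following sketch (using PyWavelets; the sampling rate, wavelet, and scales are assumptions, not the paper's settings) shows how a one-dimensional bearing vibration segment can be converted into the kind of two-dimensional grayscale time-frequency image that AD-ResNet takes as input:

    # Illustrative sketch of turning a 1-D vibration segment into a 2-D grayscale
    # scalogram with a continuous wavelet transform.
    import numpy as np
    import pywt

    fs = 12_000                                # assumed sampling rate (Hz)
    t = np.arange(0, 0.1, 1 / fs)
    signal = np.sin(2 * np.pi * 157 * t) + 0.5 * np.random.randn(t.size)   # toy signal

    scales = np.arange(1, 129)                 # 128 scales -> 128-row scalogram
    coeffs, _ = pywt.cwt(signal, scales, "morl", sampling_period=1 / fs)

    # Normalize |coefficients| to an 8-bit grayscale image for the CNN
    scalogram = np.abs(coeffs)
    gray = (255 * (scalogram - scalogram.min()) / (np.ptp(scalogram) + 1e-12)).astype(np.uint8)
    print(gray.shape)                          # (128, number_of_samples)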
Figures 1–12: time-frequency representations of bearing fault conditions; ResNet residual block; residual spatial-adaptive block; IPSA module; optimized residual building blocks (ORBB1/ORBB2); AD-ResNet model; CWRU bearing test rig; recognition accuracy of six models under varying SNR; confusion matrices of three models under high noise; fault recognition accuracy under variable load; t-SNE visualizations of three models; and boxplots of recognition accuracy at −4 dB and −2 dB SNR.
17 pages, 399 KiB  
Article
Concatenated Attention: A Novel Method for Regulating Information Structure Based on Sensors
by Zeyu Zhang, Tianqi Chen and Yuki Todo
Appl. Sci. 2025, 15(2), 523; https://doi.org/10.3390/app15020523 (registering DOI) - 8 Jan 2025
Abstract
This paper addresses the challenges of limited training data and suboptimal environmental conditions in image processing tasks, such as underwater imaging with poor lighting and distortion. Neural networks, including Convolutional Neural Networks (CNNs) and Transformers, have advanced image analysis but remain constrained by computational demands and insufficient data. To overcome these limitations, we propose a novel split-and-concatenate method for self-attention mechanisms. By splitting Query and Key matrices into submatrices, performing cross-multiplications, and applying weighted summation, the method optimizes intermediate variables without increasing computational costs. Experiments on a real-world crack dataset demonstrate its effectiveness in improving network performance. Full article
(This article belongs to the Special Issue Application of Neural Networks in Sensors and Microwave Antennas)
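A rough PyTorch sketch of the split-and-concatenate idea described in the abstract is given below; the function name, the equal weighting, and the scaling are assumptions rather than the authors' code:

    # Q and K are split along the feature dimension, the sub-matrices are
    # cross-multiplied, and the resulting maps are combined by a weighted sum
    # before the usual softmax and product with V.
    import torch
    import torch.nn.functional as F

    def concatenated_attention(q, k, v, w=(0.5, 0.5)):
        # q, k, v: (batch, tokens, dim) with an even dim
        d = q.shape[-1]
        q1, q2 = q.split(d // 2, dim=-1)
        k1, k2 = k.split(d // 2, dim=-1)
        a1 = q1 @ k2.transpose(-2, -1)        # cross multiplication 1
        a2 = q2 @ k1.transpose(-2, -1)        # cross multiplication 2
        attn = (w[0] * a1 + w[1] * a2) / (d // 2) ** 0.5
        return F.softmax(attn, dim=-1) @ v

    x = torch.randn(2, 16, 64)
    print(concatenated_attention(x, x, x).shape)   # torch.Size([2, 16, 64])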
Figures 1–10: computational structures of the self-attention and concatenated attention (CA) methods; the network with the CA method; accuracy curves on the training and validation sets; ROC curves of the MHSA and CA methods on the Crack-Segmentation dataset; samples from the Crack dataset (with and without cracks); and ROC curves of the MHSA and CA methods on the Crack dataset.
20 pages, 1313 KiB  
Article
DeepGenMon: A Novel Framework for Monkeypox Classification Integrating Lightweight Attention-Based Deep Learning and a Genetic Algorithm
by Abdulqader M. Almars
Diagnostics 2025, 15(2), 130; https://doi.org/10.3390/diagnostics15020130 (registering DOI) - 8 Jan 2025
Abstract
Background: The rapid global spread of the monkeypox virus has led to serious issues for public health professionals. According to related studies, monkeypox and other types of skin conditions can spread through direct contact with infected animals, humans, or contaminated items. This disease can cause fever, headaches, muscle aches, and enlarged lymph nodes, followed by a rash that develops into lesions. To facilitate the early detection of monkeypox, researchers have proposed several AI-based techniques for accurately classifying and identifying the condition. However, there is still room for improvement to accurately detect and classify monkeypox cases. Furthermore, the currently proposed pre-trained deep learning models can consume extensive resources to achieve accurate detection and classification of monkeypox. Hence, these models often need significant computational power and memory. Methods: This paper proposes a novel lightweight framework called DeepGenMon to accurately classify various types of skin diseases, such as chickenpox, melasma, monkeypox, and others. This suggested framework leverages an attention-based convolutional neural network (CNN) and a genetic algorithm (GA) to enhance detection accuracy while optimizing the hyperparameters of the proposed model. It first applies the attention mechanism to highlight and assign weights to specific regions of an image that are relevant to the model’s decision-making process. Next, the CNN is employed to process the visual input and extract hierarchical features for classifying the input data into multiple classes. Finally, the CNN’s hyperparameters are adjusted using a genetic algorithm to enhance the model’s robustness and classification accuracy. Compared to the state-of-the-art (SOTA) models, DeepGenMon features a lightweight design that requires significantly lower computational resources and is easier to train with few parameters. Its effective integration of a CNN and an attention mechanism with a GA further enhances its performance, making it particularly well suited for low-resource environments. DeepGenMon is evaluated on two public datasets. The first dataset comprises 847 images of diverse skin diseases, while the second dataset contains 659 images classified into several categories. Results: The proposed model demonstrates superior performance compared to SOTA models across key evaluation metrics. On dataset 1, it achieves a precision of 0.985, recall of 0.984, F-score of 0.985, and accuracy of 0.985. Similarly, on dataset 2, the model attains a precision of 0.981, recall of 0.982, F-score of 0.982, and accuracy of 0.982. Moreover, the findings demonstrate the model’s ability to achieve an inference time of 2.9764 s on dataset 1 and 2.1753 s on dataset 2. Conclusions: These results also show DeepGenMon’s effectiveness in accurately classifying different skin conditions, highlighting its potential as a reliable and low-resource tool in clinical settings. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
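As a minimal illustration of genetic-algorithm hyperparameter tuning of the kind the abstract describes, the sketch below evolves a small population of CNN hyperparameter sets; the gene layout, value ranges, and the placeholder fitness function are assumptions:

    # Toy GA loop over hyperparameters; in practice train_and_score would train the
    # attention-based CNN with the given hyperparameters and return validation accuracy.
    import random

    random.seed(0)
    SEARCH_SPACE = {
        "lr": [1e-4, 3e-4, 1e-3, 3e-3],
        "filters": [16, 32, 64],
        "dropout": [0.1, 0.3, 0.5],
    }

    def random_individual():
        return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

    def mutate(ind):
        key = random.choice(list(SEARCH_SPACE))
        child = dict(ind)
        child[key] = random.choice(SEARCH_SPACE[key])
        return child

    def train_and_score(ind):
        return random.random()                 # placeholder fitness

    population = [random_individual() for _ in range(8)]
    for generation in range(5):
        scored = sorted(population, key=train_and_score, reverse=True)
        parents = scored[:4]                   # selection: keep the best half
        population = parents + [mutate(random.choice(parents)) for _ in range(4)]

    print("best hyperparameters:", max(population, key=train_and_score))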
Figures 1–9: the DeepGenMon architecture; class samples from dataset 1 (Chickenpox, Measles, Cowpox, Monkeypox, Smallpox) and dataset 2 (Chickenpox, Measles, Monkeypox, Normal); comparative accuracies of DeepGenMon and state-of-the-art models; training and validation curves on datasets 1 and 2; ROC curves; confusion matrix; and class-wise performance.
20 pages, 2066 KiB  
Article
Double Attention: An Optimization Method for the Self-Attention Mechanism Based on Human Attention
by Zeyu Zhang, Bin Li, Chenyang Yan, Kengo Furuichi and Yuki Todo
Biomimetics 2025, 10(1), 34; https://doi.org/10.3390/biomimetics10010034 (registering DOI) - 8 Jan 2025
Abstract
Artificial intelligence, with its remarkable adaptability, has gradually integrated into daily life. The emergence of the self-attention mechanism has propelled the Transformer architecture into diverse fields, including a role as an efficient and precise diagnostic and predictive tool in medicine. To enhance accuracy, we propose the Double-Attention (DA) method, which improves the neural network’s biomimetic performance of human attention. By incorporating matrices generated from shifted images into the self-attention mechanism, the network gains the ability to preemptively acquire information from surrounding regions. Experimental results demonstrate the superior performance of our approach across various benchmark datasets, validating its effectiveness. Furthermore, the method was applied to patient kidney datasets collected from hospitals for diabetes diagnosis, where it achieved high accuracy with significantly reduced computational demands. This advancement showcases the potential of our method in the field of biomimetics, aligning well with the goals of developing innovative bioinspired diagnostic tools. Full article
(This article belongs to the Special Issue Artificial Intelligence (AI) in Biomedical Engineering)
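The sketch below (PyTorch) follows the computation described in Figure 2's caption: Q·K2ᵀ from the shifted image is added to Q·K1ᵀ before the softmax and the product with V. The scaling and the token shift used here are assumptions for illustration:

    # Hedged sketch of Double-Attention: Q, K1, V come from the original tokens,
    # K2 from tokens of a shifted copy of the image.
    import torch
    import torch.nn.functional as F

    def double_attention(q, k1, k2, v):
        d = q.shape[-1]
        scores = (q @ k1.transpose(-2, -1) + q @ k2.transpose(-2, -1)) / d ** 0.5
        return F.softmax(scores, dim=-1) @ v

    tokens = torch.randn(2, 49, 64)                  # tokens from the original image
    shifted = torch.roll(tokens, shifts=1, dims=1)   # stand-in for shifted-image tokens
    out = double_attention(tokens, tokens, shifted, tokens)
    print(out.shape)                                  # torch.Size([2, 49, 64])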
Figures 1–9: the architecture of the DA method in the Transformer; schematic comparison of self-attention and Double-Attention (the Q·K2 result is added to the Q·K1 result before the softmax and multiplication by V); matrix generation for the DA method (Q, K1, and V from the original image, K2 from the shifted image, resembling window shifting in the Swin Transformer); training and validation accuracy on CUB-200, Oxford-IIIT Pet, Flower-102, Food-101, and CIFAR-100; and an example sample from the kidney imaging dataset.
20 pages, 6712 KiB  
Article
A Parallel Image Denoising Network Based on Nonparametric Attention and Multiscale Feature Fusion
by Jing Mao, Lianming Sun, Jie Chen and Shunyuan Yu
Sensors 2025, 25(2), 317; https://doi.org/10.3390/s25020317 - 7 Jan 2025
Abstract
Convolutional neural networks have achieved excellent results in image denoising; however, there are still some problems: (1) The majority of single-branch models cannot fully exploit the image features and often suffer from the loss of information. (2) Most deep CNNs have inadequate edge feature extraction and saturated performance problems. To solve these problems, this paper proposes a two-branch convolutional image denoising network based on nonparametric attention and multiscale feature fusion, aiming to improve the denoising performance while better recovering the image edge and texture information. Firstly, ordinary convolutional layers were used to extract shallow features of noise in the image. Then, a combination of two-branch networks with different and complementary structures was used to extract deep features from the noise information in the image, addressing the insufficient feature extraction of single-branch models. The upper branch used densely connected blocks to extract local features of the noise, while the lower branch used multiple dilated convolution residual blocks with different dilation rates to increase the receptive field and capture more contextual information, yielding the global features of the noise. This not only solved the problem of insufficient edge feature extraction but also addressed the saturation of deep CNN performance. A nonparametric attention mechanism is introduced in the two-branch feature extraction module, enabling the network to attend to and learn the key information in the feature maps and improving its learning performance. The enhanced features were then processed through the multiscale feature fusion module to obtain multiscale image feature information at different depths and thus more robust fused features. Finally, the shallow and deep features were summed using a long skip connection and passed through an ordinary convolutional layer to output a residual image. Set12, BSD68, Set5, CBSD68, and SIDD were used as test datasets, to which different intensities of Gaussian white noise were added, and the method was compared with several mainstream denoising methods. The experimental results showed that the proposed algorithm achieved better objective metrics on all test sets and outperformed the comparison algorithms. The method not only achieved a good denoising effect but also effectively retained the edge and texture information of the original image, providing a new direction for the study of deep neural networks in the field of image denoising. Full article
(This article belongs to the Section Sensing and Imaging)
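For the "nonparametric attention" mentioned above, a SimAM-style parameter-free block (which Figure 5's "simamAtten" appears to refer to) can be sketched as follows; the two-branch dense/dilated extractors and the residual output are omitted, and the regularization constant is an assumption:

    # Parameter-free attention: each activation is reweighted by an energy term computed
    # from its deviation from the channel mean; no learned parameters are involved.
    import torch
    import torch.nn as nn

    class SimAMAttention(nn.Module):
        def __init__(self, eps: float = 1e-4):
            super().__init__()
            self.eps = eps

        def forward(self, x):                        # x: (batch, channels, H, W)
            n = x.shape[2] * x.shape[3] - 1
            d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
            var = d.sum(dim=(2, 3), keepdim=True) / n
            energy = d / (4 * (var + self.eps)) + 0.5
            return x * torch.sigmoid(energy)

    feat = torch.randn(1, 64, 32, 32)
    print(SimAMAttention()(feat).shape)              # torch.Size([1, 64, 32, 32])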
Figures 1–15: the NAMFPDNet framework; PFEM, TCB, and DCRB structures; the simamAtten implementation; the multiscale feature fusion module; average PSNR differences between the proposed method and other algorithms on Set12 and BSD68; denoising results with PSNR values for the Lena image (σ = 25), a building image (σ = 50), Set5 (σ = 25), CBSD68 (σ = 50), and SIDD images against BM3D, WNNM, MLP, EPLL, DnCNN, FFDNet, and CBDNet; average PSNR of each sub-network versus the dual-branch network and versus DnCNN and FFDNet; and the direct feature fusion method.
14 pages, 7866 KiB  
Article
The First Seismic Imaging of the Holy Cross Fault in the Łysogóry Region, Poland
by Eslam Roshdy, Artur Marciniak, Rafał Szaniawski and Mariusz Majdański
Appl. Sci. 2025, 15(2), 511; https://doi.org/10.3390/app15020511 - 7 Jan 2025
Abstract
The Holy Cross Mountains represent an isolated outcrop of Palaeozoic rocks located in the Trans-European Suture Zone, which is the boundary between the Precambrian East European Craton and the Phanerozoic mobile belts of South-Western Europe. Despite extensive structural history studies, high-resolution seismic profiling has not been applied to this region until now. This research introduces near-surface seismic imaging of the Holy Cross Fault, separating two tectonic units of different stratigraphic and deformation history. In our study, we utilize a carefully designed weight drop source survey with 5 m shot and receiver spacing and 4.5 Hz geophones. The imaging technique, combining seismic reflection profiling and travel time tomography, reveals detailed fault geometries down to 400 m. Precise data processing, including static corrections and noise attenuation, significantly enhanced the signal-to-noise ratio and seismic resolution. Furthermore, the paper discusses various fault imaging techniques and their shortcomings. The data reveal a complex network of intersecting fault strands, confirming the general thrust-fault geometry of the fault system and aligning with the region’s tectonic evolution. These findings enhance understanding of the Holy Cross Mountains’ structural framework and provide valuable reference data for future studies of similar tectonic environments. Full article
(This article belongs to the Special Issue Earthquake Engineering and Seismic Risk)
Figures 1–7: tectonic and geological maps of Poland and the Holy Cross Mountains with the seismic profile crossing the Holy Cross Fault; stratigraphy of the Łysogóry Region; the seismic acquisition setup with the PEG-40 impact source and two-deployment layout; shot gathers showing first breaks, S waves, and surface waves; the first-break tomographic P-wave velocity image and detrended model; brute versus final stacks with amplitude spectra; and the final reflection seismic image with recognized faults.
18 pages, 6815 KiB  
Article
An Energy-Domain IR NUC Method Based on Unsupervised Learning
by Ting Li, Xuefeng Lai, Sheng Liao and Yucheng Xia
Remote Sens. 2025, 17(2), 187; https://doi.org/10.3390/rs17020187 - 7 Jan 2025
Abstract
To obtain accurate blackbody temperature, emissivity, and waveband measurements, an energy-domain infrared nonuniformity correction (NUC) method based on unsupervised learning is proposed. This method exploits the inherent physical correlation within the calibration dataset and sets the average predicted energy-domain value of the same blackbody temperature as the learning goal. Then, the coefficients of the model are learned without theoretical radiance labels by leveraging clustering-based unsupervised learning methodologies. Finally, several experiments are performed on a mid-wave infrared system. The results show that the trained correction network is uniform and produces stable outputs when the integration time and attenuator change within the optimal dynamic range. The maximum change in the image corrected using the proposed algorithm was 1.29%. Full article
(This article belongs to the Section Remote Sensing Image Processing)
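A hedged sketch of the learning goal described in the abstract: each detector's predicted energy-domain value is pulled toward the mean prediction of frames sharing the same blackbody temperature, so no theoretical radiance labels are required. The linear per-pixel correction model and the loss form below are assumptions for illustration:

    # Cluster-mean pseudo-label loss: frames of the same blackbody temperature should
    # map to the same energy-domain value after correction.
    import torch

    def cluster_mean_loss(pred, temperature_ids):
        # pred: (N, pixels) predicted energy values; temperature_ids: (N,) group per frame
        loss = 0.0
        for t in temperature_ids.unique():
            group = pred[temperature_ids == t]
            loss = loss + ((group - group.mean(dim=0, keepdim=True)) ** 2).mean()
        return loss / temperature_ids.unique().numel()

    gray = torch.rand(12, 1024)                       # raw gray levels for 12 calibration frames
    temps = torch.tensor([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3])
    gain = torch.ones(1024, requires_grad=True)       # per-pixel correction coefficients
    offset = torch.zeros(1024, requires_grad=True)

    pred = gray * gain + offset
    cluster_mean_loss(pred, temps).backward()         # gradients flow to gain and offset
    print(gain.grad.shape)                             # torch.Size([1024])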
Figures 1–15: the radiometric calibration diagram; radiance as a function of attenuator transmittance, integration time, and image gray level; SNR versus integration time and attenuator transmittance; the blackbody radiance clustering schematic; the unsupervised learning model; the infrared system; the method flow chart; the training loss curve; the fitting result and relative errors for the 16 operating points; images corrected by the two-point algorithm and by the proposed method, with gray histograms, under different integration times, attenuator gears, and calibrated/uncalibrated operating points; and the dynamic adjustment process.
15 pages, 7120 KiB  
Article
Identifying Tomato Growth Stages in Protected Agriculture with StyleGAN3–Synthetic Images and Vision Transformer
by Yao Huo, Yongbo Liu, Peng He, Liang Hu, Wenbo Gao and Le Gu
Agriculture 2025, 15(2), 120; https://doi.org/10.3390/agriculture15020120 - 7 Jan 2025
Abstract
In protected agriculture, accurately identifying the key growth stages of tomatoes plays a significant role in achieving efficient management and high-precision production. However, traditional approaches often face challenges like non-standardized data collection, unbalanced datasets, low recognition efficiency, and limited accuracy. This paper proposes an innovative solution combining generative adversarial networks (GANs) and deep learning techniques to address these challenges. Specifically, the StyleGAN3 model is employed to generate high-quality images of tomato growth stages, effectively augmenting the original dataset with a broader range of images. This augmented dataset is then processed using a Vision Transformer (ViT) model for intelligent recognition of tomato growth stages within a protected agricultural environment. The proposed method was tested on 2723 images, demonstrating that the generated images are nearly indistinguishable from real images. The combined training approach incorporating both generated and original images produced superior recognition results compared to training with only the original images. The validation set achieved an accuracy of 99.6%, while the test set achieved 98.39%, marking improvements of 22.85%, 3.57%, and 3.21% over AlexNet, DenseNet50, and VGG16, respectively. The average detection speed was 9.5 ms. This method provides a highly effective means of identifying tomato growth stages in protected environments and offers valuable insights for improving the efficiency and quality of protected crop production. Full article
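To illustrate the augmentation strategy, the sketch below merges real and StyleGAN3-generated image folders into one training set and prepares a pretrained ViT for the four growth stages; the folder layout and the torchvision ViT-B/16 stand-in are assumptions, not the paper's exact setup:

    # Merge real and GAN-generated images, then fine-tune a pretrained ViT classifier.
    import torch.nn as nn
    from torch.utils.data import ConcatDataset, DataLoader
    from torchvision import datasets, models, transforms

    tfm = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])
    real = datasets.ImageFolder("data/tomato_real", transform=tfm)        # hypothetical paths
    synthetic = datasets.ImageFolder("data/tomato_stylegan3", transform=tfm)
    train_loader = DataLoader(ConcatDataset([real, synthetic]), batch_size=32, shuffle=True)

    model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
    model.heads.head = nn.Linear(model.heads.head.in_features, 4)          # four growth stages
    # ... standard cross-entropy training loop over train_loader follows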
Figures 1–7: the four categories of the tomato growth period; the StyleGAN3 generator structure; the ViT model structure; original tomato images versus StyleGAN3-generated images for each growth stage; the confusion matrix of the proposed classification method; AUC comparison of ViT-Base trained on the original dataset versus the dataset augmented with generated images; and training/validation accuracy and loss for the four compared models.
16 pages, 1981 KiB  
Article
Optimizing Natural Image Quality Evaluators for Quality Measurement in CT Scan Denoising
by Rudy Gunawan, Yvonne Tran, Jinchuan Zheng, Hung Nguyen and Rifai Chai
Computers 2025, 14(1), 18; https://doi.org/10.3390/computers14010018 - 7 Jan 2025
Abstract
Evaluating the results of image denoising algorithms in Computed Tomography (CT) scans typically involves several key metrics to assess noise reduction while preserving essential details. Full Reference (FR) quality evaluators are popular for evaluating image quality in denoised CT scans, whereas there is limited information about using Blind/No Reference (NR) quality evaluators in the medical imaging area. This paper applies the Natural Image Quality Evaluator (NIQE), commonly used for photograph-like images, to CT scans and provides an extensive assessment of the optimum NIQE settings. The results were obtained using a library of good-quality images, most of which are also part of the Convolutional Neural Network (CNN) training dataset, evaluated against the testing dataset and a new dataset, and they indicate an optimum patch size and contrast levels suitable for the task. This evidence suggests that the NIQE can serve as a new option for evaluating denoised quality, whether to track improvement or to compare quality between CNN models. Full article
(This article belongs to the Special Issue Machine and Deep Learning in the Health Domain 2024)
Figures 1–8: the four CNN denoising models with their layer types and connections; scoring differences among MSE, PSNR, SSIM, and NIQE for improved versus deteriorated quality; validation RMSE of the four CNN models (DU-Net leading by a large margin); percentages of improved images and of images where DU-Net scores best on the [16 × 16] patch across contrast levels; and, on the new data, average NIQE scores, perceived improvement, and the percentage of images where DU-Net scores best using the [16 × 16] patch on the targeted contrast levels.
27 pages, 11926 KiB  
Article
Vision-Based Underwater Docking Guidance and Positioning: Enhancing Detection with YOLO-D
by Tian Ni, Can Sima, Wenzhong Zhang, Junlin Wang, Jia Guo and Lindan Zhang
J. Mar. Sci. Eng. 2025, 13(1), 102; https://doi.org/10.3390/jmse13010102 - 7 Jan 2025
Abstract
This study proposed a vision-based underwater vertical docking guidance and positioning method to address docking control challenges for human-operated vehicles (HOVs) and unmanned underwater vehicles (UUVs) under complex underwater visual conditions. A cascaded detection and positioning strategy incorporating fused active and passive markers enabled real-time detection of the relative position and pose between the UUV and docking station (DS). A novel deep learning-based network model, YOLO-D, was developed to detect docking markers in real time. YOLO-D employed the Adaptive Kernel Convolution Module (AKConv) to dynamically adjust the sample shapes and sizes and optimize the target feature detection across various scales and regions. It integrated the Context Aggregation Network (CONTAINER) to enhance small-target detection and overall image accuracy, while the bidirectional feature pyramid network (BiFPN) facilitated effective cross-scale feature fusion, improving detection precision for multi-scale and fuzzy targets. In addition, an underwater docking positioning algorithm leveraging multiple markers was implemented. Tests on an underwater docking markers dataset demonstrated that YOLO-D achieved a detection accuracy (mAP@0.5) of 94.5%, surpassing the baseline YOLOv11n with improvements of 1.5% in precision, 5% in recall, and 4.2% in mAP@0.5. Pool experiments verified the feasibility of the method, achieving a 90% success rate for single-attempt docking and recovery. The proposed approach offered an accurate and efficient solution for underwater docking guidance and target detection, which is of great significance for improving the safety of docking. Full article
(This article belongs to the Special Issue Innovations in Underwater Robotic Software Systems)
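As a generic illustration of monocular positioning from multiple detected markers (not the paper's algorithm), the sketch below recovers the relative pose with a PnP solve once the detector has located markers whose 3-D layout on the docking station is known; the marker coordinates and camera intrinsics are made-up values:

    # Pose from multiple markers via OpenCV's PnP solver.
    import cv2
    import numpy as np

    object_points = np.array([               # marker centres on the docking station (metres)
        [-0.4, -0.4, 0.0],
        [ 0.4, -0.4, 0.0],
        [ 0.4,  0.4, 0.0],
        [-0.4,  0.4, 0.0],
    ], dtype=np.float64)
    image_points = np.array([                # pixel centres of the detected markers
        [512.0, 620.0], [780.0, 615.0], [775.0, 350.0], [518.0, 356.0],
    ], dtype=np.float64)
    K = np.array([[800.0, 0.0, 640.0],
                  [0.0, 800.0, 480.0],
                  [0.0, 0.0, 1.0]])
    dist = np.zeros(5)

    ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
    print(ok, tvec.ravel())                  # relative position of the docking station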
Figures 1–16: the underwater vertical docking schematic (UUV coordinate system, camera arrangement and field of view, docking station coordinates, final docking status); the UUV and docking station; the guidance and positioning software structure; YOLOv11 and YOLO-D network structures; the AKConv structure; PANet and BiFPN structures; the monocular-vision positioning coordinate system; onshore and underwater dataset samples; training and validation loss curves; comparison curves of precision, recall, mAP@0.5, and mAP@0.5:0.95; precision–recall curves; object detection comparisons; the pool test; and underwater docking position and pose (ξ, η, ζ, ψ) curves.
13 pages, 1390 KiB  
Article
Combined Input Deep Learning Pipeline for Embryo Selection for In Vitro Fertilization Using Light Microscopic Images and Additional Features
by Krittapat Onthuam, Norrawee Charnpinyo, Kornrapee Suthicharoenpanich, Supphaset Engphaiboon, Punnarai Siricharoen, Ronnapee Chaichaowarat and Chanakarn Suebthawinkul
J. Imaging 2025, 11(1), 13; https://doi.org/10.3390/jimaging11010013 - 7 Jan 2025
Abstract
The current process of embryo selection in in vitro fertilization is based on morphological criteria; embryos are manually evaluated by embryologists under subjective assessment. In this study, a deep learning-based pipeline was developed to classify the viability of embryos using combined inputs, including microscopic images of embryos and additional features, such as patient age and developed pseudo-features, including a continuous interpretation of Istanbul grading scores by predicting the embryo stage, inner cell mass, and trophectoderm. For viability prediction, convolution-based transfer learning models were employed, multiple pretrained models were compared, and image preprocessing techniques and hyperparameter optimization via Optuna were utilized. In addition, a custom weight was trained using a self-supervised learning framework known as the Simple Framework for Contrastive Learning of Visual Representations (SimCLR), together with images generated using generative adversarial networks (GANs). The best model was developed from the EfficientNet-B0 model using preprocessed images combined with pseudo-features generated using separate EfficientNet-B0 models, and optimized by Optuna to tune the hyperparameters of the models. The designed model’s F1 score, accuracy, sensitivity, and area under curve (AUC) were 65.02%, 69.04%, 56.76%, and 66.98%, respectively. This study also showed an advantage in accuracy and a similar AUC when compared with a recent ensemble method. Full article
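A rough sketch of the combined-input idea: image features from an EfficientNet-B0 backbone are concatenated with the additional features (e.g., patient age and the predicted pseudo-features) before a small classification head. The feature sizes and head below are assumptions, not the authors' configuration:

    # Combined-input model: CNN image embedding + tabular features -> viability logit.
    import torch
    import torch.nn as nn
    from torchvision import models

    class CombinedInputModel(nn.Module):
        def __init__(self, n_extra: int = 4):
            super().__init__()
            self.backbone = models.efficientnet_b0(
                weights=models.EfficientNet_B0_Weights.IMAGENET1K_V1)
            self.backbone.classifier = nn.Identity()           # expose the 1280-d embedding
            self.head = nn.Sequential(
                nn.Linear(1280 + n_extra, 128), nn.ReLU(), nn.Linear(128, 1)
            )

        def forward(self, image, extra):
            feat = self.backbone(image)                        # (batch, 1280)
            return self.head(torch.cat([feat, extra], dim=1))  # viability logit

    model = CombinedInputModel()
    logit = model(torch.randn(2, 3, 224, 224), torch.randn(2, 4))
    print(logit.shape)                                          # torch.Size([2, 1])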
Figures 1–4: the proposed combined-input pipeline (GAN-generated images, inner cell mass and trophectoderm pseudo-features); the embryo image preprocessing diagram; the SimCLR training structure on original and GAN-generated images; and the model performance summary (confusion matrix on the test dataset and prediction score distributions for non-pregnant and pregnant classes).
19 pages, 4720 KiB  
Article
Applying MLP-Mixer and gMLP to Human Activity Recognition
by Takeru Miyoshi, Makoto Koshino and Hidetaka Nambo
Sensors 2025, 25(2), 311; https://doi.org/10.3390/s25020311 - 7 Jan 2025
Abstract
The development of deep learning has led to the proposal of various models for human activity recognition (HAR). Convolutional neural networks (CNNs), initially proposed for computer vision tasks, are examples of models applied to sensor data. Recently, high-performing models based on Transformers and multi-layer perceptrons (MLPs) have also been proposed. When applying these methods to sensor data, we often initialize hyperparameters with values optimized for image processing tasks as a starting point. We suggest that comparable accuracy could be achieved with fewer parameters for sensor data, which typically have lower dimensionality than image data. Reducing the number of parameters would decrease memory requirements and computational complexity by reducing the model size. We evaluated the performance of two MLP-based models, MLP-Mixer and gMLP, by reducing the values of hyperparameters in their MLP layers from those proposed in the respective original papers. The results of this study suggest that the performance of MLP-based models is positively correlated with the number of parameters. Furthermore, these MLP-based models demonstrate improved computational efficiency for specific HAR tasks compared to representative CNNs. Full article
(This article belongs to the Section Wearables)
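The sketch below shows, under assumed shapes, how patch embedding can be applied to windows of tri-axial sensor samples (the idea in Figure 1) with a strided 1-D convolution, producing the token sequence that an MLP-Mixer or gMLP block then mixes; shrinking hidden_dim is the kind of hyperparameter reduction the paper investigates:

    # Patch embedding for 1-D sensor windows: one patch = patch_len consecutive samples.
    import torch
    import torch.nn as nn

    class SensorPatchEmbedding(nn.Module):
        def __init__(self, channels=3, patch_len=16, hidden_dim=64):
            super().__init__()
            self.proj = nn.Conv1d(channels, hidden_dim,
                                  kernel_size=patch_len, stride=patch_len)

        def forward(self, x):                      # x: (batch, channels, time)
            return self.proj(x).transpose(1, 2)    # (batch, num_patches, hidden_dim)

    window = torch.randn(8, 3, 256)                # 8 windows of 256 tri-axial samples
    tokens = SensorPatchEmbedding()(window)
    print(tokens.shape)                            # torch.Size([8, 16, 64])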
Figures 1–18: patch embedding applied to sensor data in HAR; MLP-Mixer and gMLP architectures for sensor data processing; the number of model parameters versus accuracy for HASC, UCI HAR, and WISDM; accuracy variations across five experiments per model (with reduction rates); and confusion matrices on HASC, UCI HAR, and WISDM for VGG16 or ResNet-18, MLP-Mixer (reduction rate = 100%), and gMLP (reduction rate = 100%).
18 pages, 19074 KiB  
Article
Deep Fashion Designer: Generative Adversarial Networks for Fashion Item Generation Based on Many-to-One Image Translation
by Jaewon Jung, Hyeji Kim and Jongyoul Park
Electronics 2025, 14(2), 220; https://doi.org/10.3390/electronics14020220 - 7 Jan 2025
Abstract
Generative adversarial networks (GANs) have demonstrated remarkable performance in various fashion-related applications, including virtual try-ons, compatible clothing recommendations, fashion editing, and the generation of fashion items. Despite this progress, limited research has addressed the specific challenge of generating a compatible fashion item with an ensemble consisting of distinct categories, such as tops, bottoms, and shoes. In response to this gap, we propose a novel GANs framework, termed Deep Fashion Designer Generative Adversarial Networks (DFDGAN), designed to address this challenge. Our model accepts a series of source images representing different fashion categories as inputs and generates a compatible fashion item, potentially from a different category. The architecture of our model comprises several key components: an encoder, a mapping network, a generator, and a discriminator. Through rigorous experimentation, we benchmark our model against existing baselines, validating the effectiveness of each architectural choice. Furthermore, qualitative results indicate that our framework successfully generates fashion items compatible with the input items, thereby advancing the field of fashion item generation. Full article
(This article belongs to the Special Issue AI-Based Pervasive Application Services)
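A loose sketch of the encoder-plus-mapping-network stage in Figure 1: each source item is encoded to a latent vector, the vectors are concatenated, and a small MLP maps them to the single latent the generator consumes. Layer sizes are assumptions, not the DFDGAN architecture:

    # Many-to-one latent combination: shared encoder per source, mapping MLP to one latent.
    import torch
    import torch.nn as nn

    class EncoderMapping(nn.Module):
        def __init__(self, latent_dim=128, n_sources=4):
            super().__init__()
            self.encoder = nn.Sequential(            # shared image encoder (toy version)
                nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, latent_dim),
            )
            self.mapping = nn.Sequential(             # mapping network: 4 latents -> 1 latent
                nn.Linear(latent_dim * n_sources, 256), nn.ReLU(), nn.Linear(256, latent_dim),
            )

        def forward(self, sources):                   # sources: (batch, n_sources, 3, H, W)
            b, n = sources.shape[:2]
            latents = self.encoder(sources.flatten(0, 1)).view(b, -1)
            return self.mapping(latents)              # combined latent for the generator

    z = EncoderMapping()(torch.randn(2, 4, 3, 64, 64))
    print(z.shape)                                     # torch.Size([2, 128])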
Figures 1–6: the framework overview (four source fashion images encoded to latent vectors, a mapping network producing one combined latent, a generator conditioned on the target category, VGG19 feature extraction, and a discriminator); a Polyvore Dataset example (target plus sources 1–4); t-SNE visualizations with and without the mapping network; DFDGAN image generation results; comparison with pix2pix and CycleGAN; and outfit walking results.
18 pages, 4612 KiB  
Article
A Machine Learning Algorithm to Aid the Development of Repair Materials for Ancient Ceramics via Additive Manufacturing
by Jianhong Ye
Processes 2025, 13(1), 145; https://doi.org/10.3390/pr13010145 - 7 Jan 2025
Abstract
In ancient historical ceramics, for various reasons, some problems such as dirt and damage inevitably occur, and necessary repair work must be carried out. Throughout ceramics restoration work, the selection and use of materials are very important. Thus, it is necessary to explore the use of modern intelligent algorithms to assist the selection and application of restoration materials during the whole restoration process, in order to improve the effectiveness of ancient ceramics restoration. In this study, convolutional neural network (CNN) technology and a machine learning (ML) algorithm were applied to images of ceramics for the defect identification and repair of ancient ceramics, aided by additive manufacturing (AM). The simulation results show that the recall of this algorithm for AM ancient ceramic image recognition was improved by 19.68%. To enhance the restoration of ancient ceramics, it is necessary to expand the use of digital technology while maintaining the advantages of traditional handicrafts. Therefore, we should review experiences in the restoration of ancient ceramics, introduce digital technology according to specific needs, and enhance the advanced nature of restoration work. Full article
Figures 1–11: the experiment flow chart; the AM image restoration stage for ancient ceramics; the CNN model for ceramic image feature extraction; comparison of traditional and digital restoration characteristics; image recognition time; errors of different image inpainting algorithms on the training and test sets; MSE and RMSE of different image inpainting algorithms on the training and test sets; accuracy results of different algorithms; and comparison of recall for ancient ceramic image recognition.