
Search Results (378)

Search Parameters:
Keywords = hybrid feature fusion

22 pages, 871 KiB  
Article
The Walk of Guilt: Multimodal Deception Detection from Nonverbal Motion Behaviour
by Sharifa Alghowinem, Sabrina Caldwell, Ibrahim Radwan, Michael Wagner and Tom Gedeon
Information 2025, 16(1), 6; https://doi.org/10.3390/info16010006 - 26 Dec 2024
Viewed by 245
Abstract
Detecting deceptive behaviour for surveillance and border protection is critical for a country’s security. With the advancement of technology in relation to sensors and artificial intelligence, recognising deceptive behaviour could be performed automatically. Following the success of affective computing in emotion recognition from verbal and nonverbal cues, we aim to apply a similar concept for deception detection. Recognising deceptive behaviour has been attempted; however, only a few studies have analysed this behaviour from gait and body movement. This research involves a multimodal approach for deception detection from gait, where we fuse features extracted from body movement behaviours from a video signal, acoustic features from walking steps from an audio signal, and the dynamics of walking movement using an accelerometer sensor. Using the video recording of walking from the Whodunnit deception dataset, which contains 49 subjects performing scenarios that elicit deceptive behaviour, we conduct multimodal two-category (guilty/not guilty) subject-independent classification. The classification results obtained reached an accuracy of up to 88% through feature fusion, with an average of 60% from both single and multimodal signals. Analysing body movement using single modality showed that the visual signal had the highest performance followed by the accelerometer and acoustic signals. Several fusion techniques were explored, including early, late, and hybrid fusion, where hybrid fusion not only achieved the highest classification results, but also increased the confidence of the results. Moreover, using a systematic framework for selecting the most distinguishing features of guilty gait behaviour, we were able to interpret the performance of our models. From these baseline results, we can conclude that pattern recognition techniques could help in characterising deceptive behaviour, where future work will focus on exploring the tuning and enhancement of the results and techniques.
(This article belongs to the Special Issue Multimodal Human-Computer Interaction)
Figures:
Figure 1: Summary of the guilty behaviour detection from walking.
Figure 2: Camera positions during participant movement (blue triangle indicates angle direction upward of camera view; yellow indicates angle direction downward of camera view).
Figure 3: Sample of body joints’ localisation while walking the stairs (red lines relate to the right side of the body and the blue ones to the left side).
Figure 4: Interpretation of the selected features from each modality. (a) Top body movement features. (b) Top step acoustics features. (c) Top accelerometer sensor features.
15 pages, 1033 KiB  
Case Report
Utilization of RT-PCR and Optical Genome Mapping in Acute Promyelocytic Leukemia with Cryptic PML::RARA Rearrangement: A Case Discussion and Systemic Literature Review
by Giby V. George, Murad Elsadawi, Andrew G. Evans, Sarmad Ali, Bin Zhang and M. Anwar Iqbal
Genes 2025, 16(1), 7; https://doi.org/10.3390/genes16010007 - 25 Dec 2024
Viewed by 250
Abstract
Background: Acute promyelocytic leukemia (APL) is characterized by abnormal promyelocytes and t(15;17)(q24;q21) PML::RARA. Rarely, patients may have cryptic or variant rearrangements. All-trans retinoic acid (ATRA)/arsenic trioxide (ATO) is largely curative provided that the diagnosis is established early. Methods: We present the case of a 36-year-old male who presented with features concerning for disseminated intravascular coagulation. Although the initial diagnostic work-up, including pathology and flow cytometry evaluation, suggested a diagnosis of APL, karyotype and fluorescence in situ hybridization (FISH), using the PML/RARA dual fusion and RARA breakapart probes, were negative. We performed real-time polymerase chain reaction (RT-PCR) and optical genome mapping (OGM) to further confirm the clinicopathological findings. Results: RT-PCR revealed a cryptic PML::RARA fusion transcript. OGM further confirmed the nature and orientation of a cryptic rearrangement with an insertion of RARA into PML at intron 3 (bcr3). In light of these findings, we performed a systematic literature review to understand the prevalence, diagnosis, and prognosis of APL with cryptic PML::RARA rearrangements. Conclusions: This case, in conjunction with the results of our systematic literature review, highlights the importance of performing confirmatory testing in FISH-negative cases of suspected APL to enable prompt diagnosis and appropriate treatment.
(This article belongs to the Special Issue Clinical Cytogenetics: Current Advances and Future Perspectives)
Figures:
Figure 1: Peripheral smear findings. (A) Morphologic examination revealed scattered abnormal promyelocytes with variable cytoplasmic granules and occasional Auer rods. (B) Flow cytometry evaluation of the peripheral blood specimen shows an abnormal cell population sitting within the granulocytic gate (displayed in gray, over 80% of total analyzed cells). The lymphocytic gate is depicted in red. The abnormal population with high SSC shows co-expression of CD13, CD33, CD117, CD38, CD123, CD64 and cytoplasmic MPO, while negative for CD34, HLA-DR, and all other markers tested.
Figure 2: (A) Interphase FISH performed using the Vysis dual color dual fusion t(15;17) probe, showing normal signals for both PML (SpectrumOrange) and RARA (SpectrumGreen). (B) Retrospective FISH using the CytoCell PML/RARA dual color dual fusion probe set, which targets smaller regions, revealed a fusion in 74% of interphase cells. (C) Metaphase FISH also confirmed this cryptic rearrangement, as depicted by the arrow.
Figure 3: Optical genome mapping results. (A) Genome browser view using the BAS software 1.8.1 showed an insertion in the PML gene at breakpoints 73995446 and 74023755, marked in red. The light blue horizontal bar represents the sample’s consensus map aligned with the reference chromosome 15 map, represented as a light green horizontal bar. The insertion in chromosome 15 in the middle is denoted by the gray lines. (B) Further analysis using the VIA software 7.0 confirmed the insertion in PML at intron 3 (bcr3 region); the insertion is shown as a blue bar in the SV events track. (C) Genome browser view using the BAS software shows the missing alignment of OGM molecules on the RARA gene, highlighted by the red arrows. (D) The low coverage of OGM molecules from exons 1 and 2 and the breakpoint in intron 2 of the RARA gene confirms the missing alignment of molecules using the VIA software. (E) The possible S-isoform with type A translocation/fusion was constructed based on the available breakpoints and intron 3 involvement of the PML gene.
24 pages, 7683 KiB  
Article
Hybrid-DETR: A Differentiated Module-Based Model for Object Detection in Remote Sensing Images
by Mingji Yang, Rongyu Xu, Chunyu Yang, Haibin Wu and Aili Wang
Electronics 2024, 13(24), 5014; https://doi.org/10.3390/electronics13245014 - 20 Dec 2024
Viewed by 461
Abstract
Currently, embedded unmanned aerial vehicle (UAV) systems face significant challenges in balancing detection accuracy and computational efficiency when processing remote sensing images with complex backgrounds, small objects, and occlusions. This paper proposes the Hybrid-DETR model based on a real-time end-to-end Detection Transformer (RT-DETR), featuring a novel HybridNet backbone network that implements a differentiated hybrid structure through lightweight RepConv Cross-stage Partial Efficient Layer Aggregation Network (RCSPELAN) modules and the Heat-Transfer Cross-stage Fusion (HTCF) modules, effectively balancing feature extraction efficiency and global perception capabilities. Additionally, we introduce a Small-Object Detection Module (SODM) and an EIFI module to enhance the detection capability of small objects in complex scenarios, while employing the Focaler-Shape-IoU loss function to optimize bounding box regression. Experimental results on the VisDrone2019 dataset demonstrate that Hybrid-DETR achieves mAP50 and mAP50:95 scores of 52.2% and 33.3%, respectively, representing improvements of 5.2% and 4.3% compared to RT-DETR-R18, while reducing model parameters by 29.33%. The effectiveness and robustness of our improved method are further validated on multiple challenging datasets, including AI-TOD and HIT-UAV.
(This article belongs to the Special Issue New Insights in 2D and 3D Object Detection and Semantic Segmentation)
Figures:
Figure 1: RT-DETR network structure diagram.
Figure 2: Hybrid-DETR network structure diagram.
Figure 3: The P2 enhancement layer and PAFPN structure in Hybrid-DETR.
Figure 4: The structure of RGCSPELAN, CSPNet, and ELAN. (a) RGCSPELAN; (b) CSPNet; (c) ELAN.
Figure 5: The structure of the HTCF, bottleneck, HCOneck, and HCOblock. (a) HTCF; (b) bottleneck; (c) HCOneck; (d) HCOblock.
Figure 6: The structure of HybridNet.
Figure 7: Focaler-Shape-IoU definition diagram.
Figure 8: Visualization of detection results with different datasets.
Figure 9: Comparison of evaluation metrics between Hybrid-DETR and RT-DETR-R18.
Figure 10: Precision–recall curves on the VisDrone validation set: (a) result of RT-DETR-R18; (b) result of Hybrid-DETR.
18 pages, 3461 KiB  
Article
Dynamic Structure-Aware Modulation Network for Underwater Image Super-Resolution
by Li Wang, Ke Li, Chengang Dong, Keyong Shen and Yang Mu
Biomimetics 2024, 9(12), 774; https://doi.org/10.3390/biomimetics9120774 - 19 Dec 2024
Viewed by 384
Abstract
Image super-resolution (SR) is a formidable challenge due to the intricacies of the underwater environment such as light absorption, scattering, and color distortion. Plenty of deep learning methods have provided a substantial performance boost for SR. Nevertheless, these methods are not only computationally expensive but also often lack flexibility in adapting to severely degraded image statistics. To counteract these issues, we propose a dynamic structure-aware modulation network (DSMN) for efficient and accurate underwater SR. A Mixed Transformer incorporated a structure-aware Transformer block and multi-head Transformer block, which could comprehensively utilize local structural attributes and global features to enhance the details of underwater image restoration. Then, we devised a dynamic information modulation module (DIMM), which adaptively modulated the output of the Mixed Transformer with appropriate weights based on input statistics to highlight important information. Further, a hybrid-attention fusion module (HAFM) adopted spatial and channel interaction to aggregate more delicate features, facilitating high-quality underwater image reconstruction. Extensive experiments on benchmark datasets revealed that our proposed DSMN surpasses the most renowned SR methods regarding quantitative and qualitative metrics, along with less computational effort.
(This article belongs to the Special Issue Exploration of Computer Vision and Pattern Recognition)
Figures:
Figure 1: The architecture of our proposed DSMN, which consists of a DIMM, Mixed Transformers, and HAFMs to progressively gather features rich in detail and enhance contrast.
Figure 2: (a) Original Transformer with MSA. (b) Mixed Transformer that contains a structure-aware Transformer block and a multi-head Transformer block; MSC aggregates multiple asymmetric convolutions with different kernel sizes.
Figure 3: The architecture of the HAFM, which autonomously aggregates discriminative features in multiple dimensions.
Figure 4: Particular instances from the USR-248 and UFO-120 datasets. (a) The USR-248 dataset facilitates paired training with scale factors of ×2, ×4, and ×8. (b) The UFO-120 dataset facilitates paired training with scale factors of ×2, ×3, and ×4.
Figure 5: Visualized feature maps of different models. Enhanced by the DIMM, Mixed Transformer, and HAFM, our DSMN demonstrated a stronger response in the target area, highlighting its robust information-capture capabilities.
Figure 6: Canny edge detection of different models. DSMN is capable of detecting detailed edge and structural texture information.
Figure 7: PSNR/SSIM results achieved with the UFO-120 dataset as network depth (d) increased. When d exceeded 4, the PSNR and SSIM performance gains diminished.
Figure 8: Visual comparison of our proposed DSMN against popular works with the USR-248 dataset.
Figure 9: Comparison of model capacity and performance between our DSMN (red star) and dominant methods with USR-248 for scale factor ×2. Our DSMN effectively balanced high accuracy and model capacity.
Figure 10: Visual comparison of our proposed DSMN against popular works with the UFO-120 dataset.
Figure 11: Comparison of LAM attribution results against popular works for scale factor ×4. Our DSMN's pixels are informative and obtain the highest DI value.
17 pages, 2272 KiB  
Article
Convolutional Neural Network–Vision Transformer Architecture with Gated Control Mechanism and Multi-Scale Fusion for Enhanced Pulmonary Disease Classification
by Okpala Chibuike and Xiaopeng Yang
Diagnostics 2024, 14(24), 2790; https://doi.org/10.3390/diagnostics14242790 - 12 Dec 2024
Viewed by 587
Abstract
Background/Objectives: Vision Transformers (ViTs) and convolutional neural networks (CNNs) have demonstrated remarkable performances in image classification, especially in the domain of medical imaging analysis. However, ViTs struggle to capture high-frequency components of images, which are critical in identifying fine-grained patterns, while CNNs have difficulties in capturing long-range dependencies due to their local receptive fields, which makes it difficult to fully capture the spatial relationship across lung regions. Methods: In this paper, we proposed a hybrid architecture that integrates ViTs and CNNs within a modular component block(s) to leverage both local feature extraction and global context capture. In each component block, the CNN is used to extract the local features, which are then passed through the ViT to capture the global dependencies. We implemented a gated attention mechanism that combines the channel-, spatial-, and element-wise attention to selectively emphasize the important features, thereby enhancing overall feature representation. Furthermore, we incorporated a multi-scale fusion module (MSFM) in the proposed framework to fuse the features at different scales for more comprehensive feature representation. Results: Our proposed model achieved an accuracy of 99.50% in the classification of four pulmonary conditions. Conclusions: Through extensive experiments and ablation studies, we demonstrated the effectiveness of our approach in improving the medical image classification performance, while achieving good calibration results. This hybrid approach offers a promising framework for reliable and accurate disease diagnosis in medical imaging.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
Figures:
Figure 1: The proposed hybrid architecture.
Figure 2: Gated mechanism with attention.
Figure 3: Inception-styled multi-scale fusion module proposed in this study.
Figure 4: A confusion matrix for the proposed model.
Figure 5: Impact of different augmentation methods on original images.
Figure 6: Impact of the gated mechanism and multi-scale fusion using LIME explainability analysis.
21 pages, 3197 KiB  
Article
Infrared Aircraft Detection Algorithm Based on High-Resolution Feature-Enhanced Semantic Segmentation Network
by Gang Liu, Jiangtao Xi, Chao Ma and Huixiang Chen
Sensors 2024, 24(24), 7933; https://doi.org/10.3390/s24247933 - 11 Dec 2024
Viewed by 491
Abstract
In order to achieve infrared aircraft detection under interference conditions, this paper proposes an infrared aircraft detection algorithm based on high-resolution feature-enhanced semantic segmentation network. Firstly, the designed location attention mechanism is utilized to enhance the current-level feature map by obtaining correlation weights between pixels at different positions. Then, it is fused with the high-level feature map rich in semantic features to construct a location attention feature fusion network, thereby enhancing the representation capability of target features. Secondly, based on the idea of using dilated convolutions to expand the receptive field of feature maps, a hybrid atrous spatial pyramid pooling module is designed. By utilizing a serial structure of dilated convolutions with small dilation rates, this module addresses the issue of feature information loss when expanding the receptive field through dilated spatial pyramid pooling. It captures the contextual information of the target, further enhancing the target features. Finally, a dice loss function is introduced to calculate the overlap between the predicted results and the ground truth labels, facilitating deep excavation of foreground information for comprehensive learning of samples. This paper constructs an infrared aircraft detection algorithm based on a high-resolution feature-enhanced semantic segmentation network which combines the location attention feature fusion network, the hybrid atrous spatial pyramid pooling module, the dice loss function, and a network that maintains the resolution of feature maps. Experiments conducted on a self-built infrared dataset show that the proposed algorithm achieves a mean intersection over union (mIoU) of 92.74%, a mean pixel accuracy (mPA) of 96.34%, and a mean recall (MR) of 96.19%, all of which outperform classic segmentation algorithms such as DeepLabv3+, Segformer, HRNetv2, and DDRNet. This demonstrates that the proposed algorithm can achieve effective detection of infrared aircraft in the presence of interference.
(This article belongs to the Section Intelligent Sensors)
Figures:
Figure 1: HFSSNet architecture diagram.
Figure 2: The structure of the LAM.
Figure 3: Location attention feature fusion network.
Figure 4: Atrous spatial pyramid pooling.
Figure 5: Dilated convolution kernel.
Figure 6: Calculation diagram of the HASPP serial structure.
Figure 7: Hybrid atrous spatial pyramid pooling.
Figure 8: Infrared aircraft image with simulated interference under sky background.
Figure 9: Infrared aircraft image with simulated interference under ground background.
Figure 10: Visual effect of the location attention mechanism.
Figure 11: Comparison between HRNetv2 and HRNetv2+LAFFN segmentation results.
Figure 12: Comparison between HRNetv2 and HRNetv2+HASPP segmentation results.
Figure 13: Comparison between HRNetv2 and HRNetv2+dice loss segmentation results.
Figure 14: The segmentation results of different algorithms.
20 pages, 3968 KiB  
Article
HybridFusionNet: Deep Learning for Multi-Stage Diabetic Retinopathy Detection
by Amar Shukla, Shamik Tiwari and Anurag Jain
Technologies 2024, 12(12), 256; https://doi.org/10.3390/technologies12120256 - 11 Dec 2024
Viewed by 669
Abstract
Diabetic retinopathy (DR) is one of the most common causes of visual impairment worldwide and requires reliable automated detection methods. Numerous research efforts have developed various conventional methods for early detection of DR. Research in the field of DR remains insufficient, indicating the potential for advances in diagnosis. In this paper, a hybrid model (HybridFusionNet) that integrates vision transformer (VIT) and attention processes is presented. It improves classification in the binary (Bcl) and multi-class (Mcl) stages by utilizing deep features from the DR stages. As a result, both the SAN and VIT models improve the recognition accuracy (Acc) in both stages. The HybridFusionNet mechanism achieves a competitive improvement in both the binary and multi-class stages, with an Acc of 91% in Bcl and 99% in Mcl, respectively. This illustrates that this model is suitable for a better diagnosis of DR.
Figures:
Figure 1: Intensity of the different classes of DR.
Figure 2: Architecture of HybridFusionNet.
Figure 3: Self-attention architecture.
Figure 4: Vision transformer.
Figure 5: SAN evaluation parameters. (a) t_n and V_d curves obtained in the B_cl classification; (b) t_n and V_d curves obtained in the M_cl classification; (c) C_m evaluation for the B_cl classification; (d) C_m evaluation for the M_cl classification; (e) ROC achieved for the B_cl classification; (f) ROC achieved for the M_cl classification.
Figure 6: VIT evaluation parameters, with panels (a)–(f) as in Figure 5.
Figure 7: HybridFusionNet evaluation parameters, with panels (a)–(f) as in Figure 5.
Figure 8: Evaluation of different classes using trending models.
Figure 9: Performance analysis of trending and proposed models.
31 pages, 2960 KiB  
Review
A Survey on Deep Learning for Few-Shot PolSAR Image Classification
by Ningwei Wang, Weiqiang Jin, Haixia Bi, Chen Xu and Jinghuai Gao
Remote Sens. 2024, 16(24), 4632; https://doi.org/10.3390/rs16244632 - 11 Dec 2024
Viewed by 531
Abstract
Few-shot classification of polarimetric synthetic aperture radar (PolSAR) images is a challenging task due to the scarcity of labeled data and the complex scattering properties of PolSAR data. Traditional deep learning models often suffer from overfitting and catastrophic forgetting in such settings. Recent advancements have explored innovative approaches, including data augmentation, transfer learning, meta-learning, and multimodal fusion, to address these limitations. Data augmentation methods enhance the diversity of training samples, with advanced techniques like generative adversarial networks (GANs) generating realistic synthetic data that reflect PolSAR’s polarimetric characteristics. Transfer learning leverages pre-trained models and domain adaptation techniques to improve classification across diverse conditions with minimal labeled samples. Meta-learning enhances model adaptability by learning generalizable representations from limited data. Multimodal methods integrate complementary data sources, such as optical imagery, to enrich feature representation. This survey provides a comprehensive review of these strategies, focusing on their advantages, limitations, and potential applications in PolSAR classification. We also identify key trends, such as the increasing role of hybrid models combining multiple paradigms and the growing emphasis on explainability and domain-specific customization. By synthesizing SOTA approaches, this survey offers insights into future directions for advancing few-shot PolSAR classification.
(This article belongs to the Special Issue SAR and Multisource Remote Sensing: Challenges and Innovations)
Figures:
Graphical abstract.
Figure 1: PolSAR image classification process.
Figure 2: Knowledge graph of key concepts in few-shot PolSAR image classification. Nodes represent key concepts, methods, or techniques, while edges indicate the relationships or dependencies between them. Node colors correspond to methodological categories: orange for data augmentation-based methods, green for transfer learning-based methods, blue for meta-learning-based methods, and pink for multimodal-based methods. The graph was constructed based on literature analysis and keyword extraction, with relationships derived from established dependencies in the field.
Figure 3: Overview of the GAN-based PolSAR data augmentation pipeline, illustrating the flow from raw PolSAR input to the final loss optimization stage for both the generator and discriminator.
Figure 4: Overview of the self-supervised learning framework for PolSAR image classification.
Figure 5: Data division in meta-learning-based few-shot PolSAR classification, showing the training and testing stages with support and query sets for each task.
Figure 6: The multimodal feature extraction and fusion process. Coherency matrices and target decomposition features (Pauli, Freeman, H/A/α) are separately processed through feature extraction pipelines (blue background) to capture complementary spatial, polarimetric, and semantic information. The final loss computation (L_{i,j}) integrates these features to optimize classification performance.
35 pages, 19129 KiB  
Article
Mapping Lithology with Hybrid Attention Mechanism–Long Short-Term Memory: A Hybrid Neural Network Approach Using Remote Sensing and Geophysical Data
by Michael Appiah-Twum, Wenbo Xu and Emmanuel Daanoba Sunkari
Remote Sens. 2024, 16(23), 4613; https://doi.org/10.3390/rs16234613 - 9 Dec 2024
Viewed by 698
Abstract
Remote sensing provides an efficient roadmap in geological analysis and interpretation. However, some challenges arise when remote sensing techniques are integrated with machine learning in geological surveys. Factors including irregular spatial distribution, sample imbalance, interclass resemblances, regolith, and geochemical similarities impede geological feature diagnosis, interpretation, and identification across varied remote sensing datasets. To address these limitations, a hybrid-attention-integrated long short-term memory (LSTM) network is employed to diagnose, interpret, and identify lithological feature representations in a remote sensing-based geological analysis using multisource data fusion. The experimental design integrates varied datasets including Sentinel-2A, Landsat-9, ASTER, ALOS PALSAR DEM, and Bouguer anomaly gravity data. The proposed model incorporates a hybrid attention mechanism (HAM) comprising channel and spatial attention submodules. HAM utilizes an adaptive technique that merges global-average-pooled features with max-pooled features, enhancing the model’s accuracy in identifying lithological units. Additionally, a channel separation operation is employed to allot refined channel features into clusters based on channel attention maps along the channel dimension. The comprehensive analysis of results from comparative extensive experiments demonstrates HAM-LSTM’s state-of-the-art performance, outperforming existing attention modules and attention-based models (ViT, SE-LSTM, and CBAM-LSTM). Comparing HAM-LSTM to baseline LSTM, the HAM module’s integrated configurations equip the proposed model to better diagnose and identify lithological units, thereby increasing the accuracy by 3.69%.
Show Figures

Figure 1

Figure 1
<p>An overview of this study’s workflow: The multisource data fusion technique is employed to fuse the gravity anomaly data and remote sensing data. Channel and spatial attention mechanisms are modeled to learn the spatial and spectral information of pixels in the fused data and the resultant attention features, fed into the LSTM network for sequential iterative processing to map lithology.</p>
Full article ">Figure 2
<p>Location of study area and regional geological setting. (<b>a</b>) Administrative map of Burkina Faso; (<b>b</b>) administrative map of Bougouriba and Ioba Provinces within which the study area is located; (<b>c</b>) geological overview of Burkina Faso (modified from [<a href="#B44-remotesensing-16-04613" class="html-bibr">44</a>]) indicating the study area; (<b>d</b>) color composite image of Landsat-9 covering the study area.</p>
Full article ">Figure 3
<p>False color composite imagery of remote sensing data used: (<b>a</b>) Sentinel-2A (bands 4-3-2); (<b>b</b>) Landsat-9 (bands 4-3-2); (<b>c</b>) ASTER (bands 3-2-1); and (<b>d</b>) 12.5 m spatial resolution high-precision ALOS PALSAR DEM.</p>
Full article ">Figure 4
<p>Vegetation masking workflow.</p>
Full article ">Figure 5
<p>The HAM structure. It comprises three sequential components: channel attention submodule, feature separation chamber, and spatial attention submodule. One-dimensional and two-dimensional feature maps are produced by the channel and spatial attention submodules, respectively.</p>
Full article ">Figure 6
<p>Framework of HAM’s channel attention submodule. Dimensional feature information is generated by both max-pooling and average-pooling operations. The resultant features are then fed through a one-dimensional convolution with a sigmoid activation to deduce the definitive channel feature.</p>
Full article ">Figure 7
<p>Framework of HAM’s spatial attention. Two feature clusters of partitioned refined channel features from the separation chamber are fed into the submodule. Average-pooling and max-pooling functions subsequently synthesize two pairs of 2D maps into a shared convolution layer to synthesize spatial attention maps.</p>
Full article ">Figure 8
<p>The structural framework of the proposed HAM-LSTM model.</p>
Full article ">Figure 9
<p>Gravity anomaly maps of the terrane used: (<b>a</b>) complete Bouguer anomaly; (<b>b</b>) residual gravity.</p>
Full article ">Figure 10
<p>Band imagery: (<b>a</b>) Landsat-9 band 5; (<b>b</b>) Sentinel-2A band 5; (<b>c</b>) ASTER band 5; (<b>d</b>) fused image; (<b>e</b>) partial magnification of (<b>a</b>) (<math display="inline"><semantics> <mrow> <mn>279</mn> <mo>×</mo> <mn>235</mn> </mrow> </semantics></math> pixels); (<b>f</b>) partial magnification of (<b>b</b>) (<math display="inline"><semantics> <mrow> <mn>279</mn> <mo>×</mo> <mn>235</mn> </mrow> </semantics></math> pixels); (<b>g</b>) partial magnification of (<b>c</b>) (<math display="inline"><semantics> <mrow> <mn>279</mn> <mo>×</mo> <mn>235</mn> </mrow> </semantics></math> pixels); and (<b>h</b>) partial magnification of (<b>d</b>) (<math display="inline"><semantics> <mrow> <mn>279</mn> <mo>×</mo> <mn>235</mn> </mrow> </semantics></math> pixels).</p>
Full article ">Figure 11
<p>Resultant multisource fusion imagery.</p>
Full article ">Figure 12
<p>Annotation map of the study area.</p>
Full article ">Figure 13
<p>An illustration of the sliding window method implementation.</p>
Full article ">Figure 14
<p>Graphs of training performance of the varied model implementations in this study: (<b>a</b>) accuracy and (<b>b</b>) loss.</p>
Full article ">Figure 15
<p>Classification maps derived from implementing (<b>a</b>) HAM-LSTM, (<b>b</b>) CBAM-LSTM, (<b>c</b>) SE-LSTM, (<b>d</b>) ViT, and (<b>e</b>) LSTM on the multisource fusion dataset.</p>
Full article ">Figure 16
<p>Confusion matrices of (<b>a</b>) HAM-LSTM, (<b>b</b>) CBAM-LSTM, (<b>c</b>) SE-LSTM, (<b>d</b>) LSTM, and (<b>e</b>) ViT implementation.</p>
Full article ">
31 pages, 8308 KiB  
Article
Topology Optimization, Part Orientation, and Symmetry Operations as Elements of a Framework for Design and Production Planning Process in Additive Manufacturing L-PBF Technology
by Slobodan Malbašić, Aleksandar Đorđević, Srđan Živković, Dragan Džunić and Vlada Sokolović
Symmetry 2024, 16(12), 1616; https://doi.org/10.3390/sym16121616 - 6 Dec 2024
Viewed by 459
Abstract
This paper investigates the possibility of the application of different optimization techniques in the design and production planning phase in the metal additive manufacturing process, specifically laser powder bed fusion (L-PBF) additive technology. This technology has a significant market share and belongs to the group of mature additive technology for the production of end-use metal parts. In the application of this technology, there is a space for additional cost/time reduction by simultaneously optimizing topology structure and part orientations. Simultaneous optimization reduces the production time and, indirectly, the cost of parts production, which is the goal of effective process planning. The novelty in this paper is the comparison of the part orientation solutions defined by the software algorithm and the experienced operator, where the optimal result was selected from the aspect of time and production costs. A feature recognition method together with symmetry operations in the part orientation process were also examined. A framework for the optimal additive manufacturing planning process has been proposed. This framework consists of design and production planning phases, within which there are several other activities: the redesign of the part, topological optimization, the creation of alternative build orientations (ABOs), and, as a final step, the selection of the optimal build orientation (OBO) using the multi-criteria decision method (MCDM). The results obtained after the MCDM hybrid method application clearly indicated that simultaneous topology optimization and part orientation has significant influence on the cost and time of the additive manufacturing process. The paper also proposed a further research direction that should take into consideration the mechanical as well as geometric, dimensioning and tolerances (GDT) characteristics of the part during the process of ABOs and OBO, as well as the uses of symmetry in these fields.
(This article belongs to the Special Issue Symmetry in Process Optimization)
Figures:
Figure 1: The overview of the new framework for the AM planning phase.
Figure 2: Steps in the topology optimization process.
Figure 3: Scheme of twofold orientation and an inversion: position A, original position; position B, rotation of object A by 180 degrees (twofold rotation, in the direction of the curved dotted arrow); position C, inversion of object B (in the direction of the straight dotted arrow); position D, reflection of object B, equivalent to the inversion of object A.
Figure 4: Scheme of the proposed MCDM process.
Figure 5: Procedure for the selection of the AM part for optimization/production.
Figure 6: MFMP display: (a) as part of the suspension system and (b) individually.
Figure 7: Color distribution of the displacement on the model.
Figure 8: Von Mises equivalent stress field before optimization.
Figure 9: Initial design (main and construction bodies) for topology optimization.
Figure 10: Fixed constraints and load force (arrow): the load force points upward, transferring ground movement up to the next support, where the shock absorber is attached.
Figure 11: Final optimized part.
Figure 12: Displacement on the optimized part.
Figure 13: Von Mises equivalent stress after optimization.
Figure 14: Optimized part mounted into the assembly.
Figure 15: Part orientation on the working plate chosen from the software solution: (a) orientation that provides the minimum support surface; (b) orientation that minimizes the XY projection.
Figure 16: Feature recognition on the optimized part.
Figure 17: Part orientation on the working plate defined by the operator: (a) planar feature 3 parallel with the working plate (orientation 3); (b) planar feature 3 rotated 180 degrees (orientation 4); (c) cylindrical features 1 and 2 with axes parallel to the build direction (orientation 5); (d) planar planes at an angle to the working plate (orientation 6); (e) axis of the main cylindrical feature parallel to the build direction, corresponding to the software's minimum Z-height orientation (orientation 7).
Figure 18: Use of symmetry operations in part orientation for the presented case study: (a) object a is inverted to obtain the symmetrical new object b, positioned 180 degrees opposite the original (in the direction of the dotted arrow); (b) the same symmetry operation (inversion) is applied to the asymmetric unit c to obtain the symmetric part e (in the direction of the dotted arrow).
24 pages, 11173 KiB  
Article
Advanced State-of-Health Estimation for Lithium-Ion Batteries Using Multi-Feature Fusion and KAN-LSTM Hybrid Model
by Zhao Zhang, Runrun Zhang, Xin Liu, Chaolong Zhang, Gengzhi Sun, Yujie Zhou, Zhong Yang, Xuming Liu, Shi Chen, Xinyu Dong, Pengyu Jiang and Zhexuan Sun
Batteries 2024, 10(12), 433; https://doi.org/10.3390/batteries10120433 - 6 Dec 2024
Viewed by 680
Abstract
Accurate assessment of battery State of Health (SOH) is crucial for the safe and efficient operation of electric vehicles (EVs), which play a significant role in reducing reliance on non-renewable energy sources. This study introduces a novel SOH estimation method combining Kolmogorov–Arnold Networks (KAN) and Long Short-Term Memory (LSTM) networks. The method is based on fully charged battery characteristics, extracting key parameters such as voltage, temperature, and charging data collected during cycles. Validation was conducted under a temperature range of 10 °C to 30 °C and different charge–discharge current rates. Notably, temperature variations were primarily caused by seasonal changes, enabling the experiments to more realistically simulate the battery’s performance in real-world applications. By enhancing dynamic modeling capabilities and capturing long-term temporal associations, experimental results demonstrate that the method achieves highly accurate SOH estimation under various charging conditions, with low mean absolute error (MAE) and root mean square error (RMSE) values and a coefficient of determination (R²) exceeding 97%, significantly improving prediction accuracy and efficiency.
(This article belongs to the Special Issue Control, Modelling, and Management of Batteries)
Figures:
Figure 1: Experimental equipment diagram.
Figure 2: Voltage and current curves while charging.
Figure 3: Variation of the voltage curve with charging times in the constant-current charging state.
Figure 4: Characteristic patterns. (a) Constant-current charging time; (b) the amount of electricity charged at constant voltage; (c) constant-voltage charging time; (d) integral of temperature.
Figure 5: Curve of current versus charging times in the constant-voltage charging state.
Figure 6: Relational graph.
Figure 7: KAN model structure.
Figure 8: LSTM model structure.
Figure 9: Flow chart of the experimental steps.
Figure 10: The curve of SOH versus the number of charging times.
Figure 11: Prediction results of the KAN-LSTM model.
Figure 12: Comparison of the prediction results of each model.
Figure 13: The curve of average temperature versus the number of charging times.
Figure 14: Prediction results of the KAN-LSTM model under the condition of missing temperature features.
Figure 15: Model performance on the NASA dataset.
Figure 16: Prediction results across batteries at different charging rates.
19 pages, 3861 KiB  
Article
A Novel Temporal Fusion Channel Network with Multi-Channel Hybrid Attention for the Remaining Useful Life Prediction of Rolling Bearings
by Cunsong Wang, Junjie Jiang, Heng Qi, Dengfeng Zhang and Xiaodong Han
Processes 2024, 12(12), 2762; https://doi.org/10.3390/pr12122762 - 5 Dec 2024
Viewed by 467
Abstract
The remaining useful life (RUL) prediction of rolling bearings is crucial for optimizing maintenance schedules, reducing downtime, and extending machinery lifespan. However, existing multi-channel feature fusion methods do not fully capture the correlations between channels and time points in multi-dimensional sensor data. To address the above problems, this paper proposes a multi-channel feature fusion algorithm based on a hybrid attention mechanism and temporal convolutional networks (TCNs), called MCHA-TFCN. The model employs a dual-channel hybrid attention mechanism, integrating self-attention and channel attention to extract spatiotemporal features from multi-channel inputs. It uses causal dilated convolutions in TCNs to capture long-term dependencies and incorporates enhanced residual structures for global feature fusion, effectively extracting high-level spatiotemporal degradation information. The experimental results on the PHM2012 dataset show that MCHA-TFCN achieves excellent performance, with an average Root-Mean-Square Error (RMSE) of 0.091, significantly outperforming existing methods like the DANN and CNN-LSTM.
Figures:
Figure 1: Schematic diagram of traditional convolution calculation.
Figure 2: Schematic diagram of dilated convolution calculation.
Figure 3: Schematic diagram of dilated causal convolution calculation.
Figure 4: Structural diagram of the TCN residual block.
Figure 5: Schematic diagram of dilated convolution with residual connection.
Figure 6: Schematic diagram of the channel attention mechanism.
Figure 7: The RUL prediction process of the proposed method.
Figure 8: Diagram of the DCHA structure.
Figure 9: MCHA-TFCN structure diagram.
Figure 10: Rolling bearing test bench.
Figure 11: Display of the bearing prediction effect under working condition 1.
Figure 12: Display of the bearing prediction effect under working condition 2.
17 pages, 9263 KiB  
Article
HHS-RT-DETR: A Method for the Detection of Citrus Greening Disease
by Yi Huangfu, Zhonghao Huang, Xiaogang Yang, Yunjian Zhang, Wenfeng Li, Jie Shi and Linlin Yang
Agronomy 2024, 14(12), 2900; https://doi.org/10.3390/agronomy14122900 - 4 Dec 2024
Viewed by 520
Abstract
Background: Given the severe economic burden that citrus greening disease imposes on fruit farmers and related industries, rapid and accurate disease detection is particularly crucial. This not only effectively curbs the spread of the disease, but also significantly reduces reliance on manual detection within extensive citrus planting areas. Objective: In response to this challenge, and to address the issues posed by resource-constrained platforms and complex backgrounds, this paper designs and proposes a novel method for the recognition and localization of citrus greening disease, named the HHS-RT-DETR model. The goal of this model is to achieve precise detection and localization of the disease while maintaining efficiency. Methods: Based on the RT-DETR-r18 model, the following improvements are made: the HS-FPN (high-level screening-feature pyramid network) is used to improve the feature fusion and feature selection part of the RT-DETR model, and the filtered feature information is merged with the high-level features by filtering out the low-level features, so as to enhance the feature selection ability and multi-level feature fusion ability of the model. In the feature fusion and feature selection sections, the HWD (hybrid wavelet-directional filter banks) downsampling operator is introduced to prevent the loss of effective information in the channel and reduce the computational complexity of the model. Through using the ShapeIoU loss function to enable the model to focus on the shape and scale of the bounding box itself, the prediction of the bounding box of the model will be more accurate. Conclusions and Results: This study has successfully developed an improved HHS-RT-DETR model which exhibits efficiency and accuracy on resource-constrained platforms and offers significant advantages for the automatic detection of citrus greening disease. Experimental results show that the improved model, when compared to the RT-DETR-r18 baseline model, has achieved significant improvements in several key performance metrics: the precision increased by 7.9%, the frame rate increased by 4 frames per second (f/s), the recall rose by 9.9%, and the average accuracy also increased by 7.5%, while the number of model parameters reduced by 0.137 × 10⁷. Moreover, the improved model has demonstrated outstanding robustness in detecting occluded leaves within complex backgrounds. This provides strong technical support for the early detection and timely control of citrus greening disease. Additionally, the improved model has showcased advanced detection capabilities on the PASCAL VOC dataset. Discussions: Future research plans include expanding the dataset to encompass a broader range of citrus species and different stages of citrus greening disease. In addition, the plans involve incorporating leaf images under various lighting conditions and different weather scenarios to enhance the model’s generalization capabilities, ensuring the accurate localization and identification of citrus greening disease in diverse complex environments. Lastly, the integration of the improved model into an unmanned aerial vehicle (UAV) system is envisioned to enable the real-time, regional-level precise localization of citrus greening disease.
(This article belongs to the Section Precision and Digital Agriculture)
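To make the downsampling idea above concrete: the point of a wavelet downsampling operator is that halving resolution with a wavelet transform keeps all of the input's information in extra channels instead of discarding it, unlike max pooling or strided convolution. The following is a minimal PyTorch sketch in the spirit of the HWD operator, using a single-level 2-D Haar transform; it is not the authors' implementation, and the module name, channel sizes, and the 1×1 Conv-BN-ReLU projection are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HaarDownsample(nn.Module):
    """Sketch of wavelet downsampling: a single-level 2-D Haar transform
    halves H and W while packing every pixel into one of four subbands
    (LL, LH, HL, HH), which are stacked on the channel axis and mixed
    back down to `out_ch` with a 1x1 convolution."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv2d(4 * in_ch, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split even/odd rows and columns (H and W must be even).
        a = x[..., 0::2, 0::2]
        b = x[..., 0::2, 1::2]
        c = x[..., 1::2, 0::2]
        d = x[..., 1::2, 1::2]
        ll = (a + b + c + d) / 2  # low-pass (scaled local average)
        lh = (a + b - c - d) / 2  # detail across rows
        hl = (a - b + c - d) / 2  # detail across columns
        hh = (a - b - c + d) / 2  # diagonal detail
        return self.proj(torch.cat([ll, lh, hl, hh], dim=1))

feat = torch.randn(1, 64, 80, 80)           # e.g. a mid-level feature map
down = HaarDownsample(in_ch=64, out_ch=128)
print(down(feat).shape)                      # torch.Size([1, 128, 40, 40])
```

In a detector like the one described above, an operator of this kind would stand in for the strided convolutions inside the feature selection and fusion path, which is how a claim of "preventing the loss of effective information in the channel" is typically realized.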
Figures:
Figure 1. Images of selected greening datasets (images of rock sugar oranges, Wokan oranges, and grapefruits in natural and simple backgrounds).
Figure 2. Partial results of dataset expansion.
Figure 3. HHS-RT-DETR model structure (structural diagram of the improved model).
Figure 4. Structure of the feature selection module (feature selection network structure in the HS-FPN network).
Figure 5. Structure of the SPFF feature fusion module (feature fusion network structure in the HS-FPN network).
Figure 6. ChannelAttention_HSFPN structure.
Figure 7. Feature selection module (feature selection network module in the ChannelAttention-HSFPN network).
Figure 8. Feature fusion module (feature fusion network module in the ChannelAttention-HSFPN network).
Figure 9. HWD module structure.
Figure 10. Comparison of detection results between the RT-DETR-r18 model and the HHS-RT-DETR model. (a) Detection results of the original RT-DETR-r18 model; (b) results of the improved HHS-RT-DETR model. The area indicated by the yellow arrow is missed by the original model but successfully detected by the improved model.
Figure 11. Context-information loss of the HWD module compared with other downsampling modules (max pooling, average pooling, and strided convolution).
Figure 12. Comparison curves of different loss functions.
Figure 13. Heatmap comparison between the original and improved models ((a) original image; (b) object-detection heatmap of the HHS-RT-DETR model; (c) object-detection heatmap of the RT-DETR-r18 baseline model).
Figure 14. Comparison curves of different models.
23 pages, 11893 KiB  
Article
A High-Impedance Fault Detection Method for Active Distribution Networks Based on Time–Frequency–Space Domain Fusion Features and Hybrid Convolutional Neural Network
by Chen Wang, Lijun Feng, Sizu Hou, Guohui Ren and Tong Lu
Processes 2024, 12(12), 2712; https://doi.org/10.3390/pr12122712 - 1 Dec 2024
Viewed by 499
Abstract
Traditional methods for detecting high-impedance faults (HIFs) in distribution networks primarily rely on fault diagnosis models built from one-dimensional zero-sequence current sequences, and a single diagnostic model often limits the deep exploration of fault characteristics. To improve the accuracy of HIF detection, a new method for detecting HIFs in active distribution networks is proposed. First, the continuous wavelet transform (CWT) is applied to zero-sequence currents collected under various operating conditions to obtain the time–frequency spectrum (TFS). A modified empirical wavelet transform (MEWT) is then used to denoise the zero-sequence current signals, yielding a series of intrinsic mode functions (IMFs). Second, the IMFs are mapped into a two-dimensional spatial-domain fused image using the symmetric dot pattern (SDP). Finally, the TFS and SDP images are fed jointly into a hybrid convolutional neural network (Hybrid-CNN) to fully explore the system's fault features, with a Sigmoid output layer performing HIF detection; the approach is then validated in simulation and experiment. The results indicate that the proposed method effectively overcomes the shortcomings of traditional methods, achieving a detection accuracy of up to 98.85% across different scenarios, a 2–7% improvement over single models. Full article
(This article belongs to the Section Advanced Digital and Other Processes)
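Among the steps above, the symmetric dot pattern (SDP) transform is the least standard: it maps a 1-D signal onto mirrored polar arms so that amplitude and a short-lag correlation pattern become a 2-D image a CNN can classify. Below is a minimal NumPy/Matplotlib sketch of that mapping under parameter choices common in the SDP literature (unit lag, a 36° gain angle, six-fold mirror symmetry); these values and the synthetic 50 Hz test trace are illustrative assumptions, not the paper's settings.

```python
import numpy as np
import matplotlib.pyplot as plt

def sdp_points(x, lag=1, gain_deg=36.0, arms=6):
    """Map a 1-D signal to symmetric-dot-pattern polar coordinates:
    radius r(i) is the amplitude of sample i normalised to [0, 1];
    the angle is the arm axis +/- gain * normalised amplitude of
    sample i+lag, mirrored around each of `arms` symmetry axes."""
    x = np.asarray(x, dtype=float)
    xn = (x - x.min()) / (x.max() - x.min() + 1e-12)
    r = xn[:-lag]                          # radius from sample i
    phi = np.deg2rad(gain_deg) * xn[lag:]  # deflection from sample i+lag
    thetas, radii = [], []
    for k in range(arms):
        axis = 2 * np.pi * k / arms
        thetas.append(axis + phi); radii.append(r)  # one side of the arm
        thetas.append(axis - phi); radii.append(r)  # mirrored side
    return np.concatenate(thetas), np.concatenate(radii)

# Render the SDP image of a noisy 50 Hz trace standing in for a
# zero-sequence current snippet.
t = np.linspace(0.0, 0.1, 2000)
sig = np.sin(2 * np.pi * 50 * t) + 0.2 * np.random.randn(t.size)
theta, r = sdp_points(sig)
ax = plt.subplot(projection="polar")
ax.scatter(theta, r, s=1)
ax.set_axis_off()
plt.savefig("sdp_example.png", dpi=150)
```

Different fault types deform the arm shapes differently, which is why the resulting images can be fed to the Hybrid-CNN alongside the CWT time–frequency spectra.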
Figures:
Figure 1. Structural model of 10 kV active distribution network.
Figure 2. Transient signal of HIF current.
Figure 3. Fault transient current signal time–frequency spectrum. (a) HIF; (b) NLLS; (c) CS; (d) LS; (e) IC.
Figure 4. Flowchart of the MEWT.
Figure 5. EWT decomposition results of HIF zero-sequence current. (a) EWT; (b) MEWT.
Figure 6. Basic principle of SDP.
Figure 7. Spatial domain images of fault features under various conditions. (a) HIF; (b) NLLS; (c) CS; (d) LS; (e) IC.
Figure 8. HIF detection scheme based on the hybrid convolutional network. (a) HIF detection framework; (b) HIF detection flowchart.
Figure 9. SE attention mechanism module network structure.
Figure 10. HIF Emanuel model.
Figure 11. Accuracy curves for different network structures. (a) Training sample accuracy; (b) validation sample accuracy.
Figure 12. Hybrid neural network feature visualization results: (a) input layer; (b) hybrid modular fully connected layer.
Figure 13. Model training accuracy: (a) training set confusion matrix; (b) test set confusion matrix.
Figure 14. Performance comparison of the model under different data split ratios: (a) accuracy; (b) loss.
Figure 15. CNN interpretability analysis results. (a) HIF; (b) NLLS; (c) CS; (d) LS; (e) IC.
Figure 16. VGG16 interpretability analysis results. (a) HIF; (b) NLLS; (c) CS; (d) LS; (e) IC.
Figure 17. Improved VGG16 interpretability analysis results. (a) HIF; (b) NLLS; (c) CS; (d) LS; (e) IC.
Figure 18. Interpretability analysis results of the proposed method. (a) HIF; (b) NLLS; (c) CS; (d) LS; (e) IC.
Figure 19. TFS of test data. (a) HIF; (b) NLLS; (c) CS; (d) LS; (e) IC.
Figure 20. System topology of scheme 1.
Figure 21. Real-type test site of the distribution network.
Figure 22. SDP of actual data. (a) HIF; (b) NLLS; (c) CS; (d) LS; (e) IC.
Figure 23. Comparison results of various algorithm test sets.
Figure 24. t-SNE visualization results of zero-sequence current characteristics for various methods. (a) VMD-SVD; (b) S-transformer; (c) ANN; (d) STFT; (e) the proposed method.
33 pages, 8365 KiB  
Article
The Intelligent Diagnosis of a Hydraulic Plunger Pump Based on the MIGLCC-DLSTM Method Using Sound Signals
by Liqiang Ma, Anqi Jiang and Wanlu Jiang
Machines 2024, 12(12), 869; https://doi.org/10.3390/machines12120869 - 29 Nov 2024
Viewed by 483
Abstract
To fully exploit the rich state and fault information embedded in the acoustic signals of a hydraulic plunger pump, this paper proposes an intelligent diagnostic method based on sound signal analysis. First, acoustic signals were collected under normal and various fault conditions. Then, four distinct acoustic features—Mel frequency cepstral coefficients (MFCCs), inverse Mel frequency cepstral coefficients (IMFCCs), Gammatone frequency cepstral coefficients (GFCCs), and linear prediction cepstral coefficients (LPCCs)—were extracted and integrated into a novel hybrid cepstral feature called MIGLCC. This fusion enhances the model's ability to distinguish both high- and low-frequency characteristics, resist noise interference, and capture resonance peaks, so the four features complement one another. Finally, the MIGLCC feature set was fed into a double-layer long short-term memory (DLSTM) network to enable intelligent recognition of the hydraulic plunger pump's operational states. The results indicate that the MIGLCC-DLSTM method achieved a diagnostic accuracy of 99.41% under test conditions. Validation on the CWRU bearing dataset and on operational data from a high-pressure servo motor in a turbine system yielded overall recognition accuracies of 99.64% and 98.07%, respectively, demonstrating the robustness and broad application potential of the MIGLCC-DLSTM method. Full article
(This article belongs to the Section Machines Testing and Maintenance)
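To make the fusion and classification steps concrete, the sketch below shows a "stack several cepstral feature families, then classify with a two-layer LSTM" pipeline in librosa and PyTorch. librosa ships only an MFCC extractor, so the IMFCC, GFCC, and LPCC channels appear as zero-filled placeholders of matching shape that mark where the real extractors would go; the hidden size, class count, and demo audio clip are illustrative assumptions rather than the authors' configuration.

```python
import librosa
import numpy as np
import torch
import torch.nn as nn

# --- Feature side: build the fused ("MIGLCC"-style) matrix. ---
y, sr = librosa.load(librosa.ex("trumpet"), duration=2.0)  # stand-in audio
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)         # (13, T)
imfcc = np.zeros_like(mfcc)  # placeholder for inverse-Mel cepstra
gfcc = np.zeros_like(mfcc)   # placeholder for Gammatone cepstra
lpcc = np.zeros_like(mfcc)   # placeholder for LPC cepstra
fused = np.concatenate([mfcc, imfcc, gfcc, lpcc], axis=0)  # (52, T)

# --- Model side: a double-layer LSTM ("DLSTM") classifier. ---
class DLSTM(nn.Module):
    def __init__(self, n_feat=52, hidden=128, n_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(n_feat, hidden, num_layers=2, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):               # x: (batch, time, n_feat)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])      # classify from the last time step

x = torch.tensor(fused.T, dtype=torch.float32).unsqueeze(0)  # (1, T, 52)
logits = DLSTM()(x)
print(logits.shape)  # torch.Size([1, 4]): normal + three fault classes
```

The appeal of the fusion is that each cepstral family emphasizes a different part of the spectrum (Mel for low frequencies, inverse-Mel for high, Gammatone for noise robustness, LPC for resonance peaks), so concatenating them gives the recurrent classifier a complementary view of the pump's acoustic state.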
Figures:
Figure 1. Cepstral feature extraction flowchart.
Figure 2. Distribution of Mel filter bank.
Figure 3. Distribution of inverse Mel filter bank.
Figure 4. Distribution of Gammatone filter bank.
Figure 5. LSTM network architecture.
Figure 6. DLSTM network schematic.
Figure 7. Flowchart of the intelligent diagnosis method for the hydraulic plunger pump based on sound signals.
Figure 8. Hydraulic plunger pump fault simulation test bench.
Figure 9. Hydraulic plunger pump experimental setup diagram. 1—oil tank; 2, 24—filter; 3—vane pump; 4, 25—gate valve; 5, 13—flow meter; 6, 15—pressure gauge switch; 7, 16—pressure gauge; 8, 18—relief valve; 9—hydraulic plunger pump; 10—accelerometer; 11—sound level meter; 12—check valve; 14—pressure sensor; 17, 22—accumulator; 19—solenoid valve; 20—electro-hydraulic servo valve; 21—hydraulic cylinder; 23—check throttle valve.
Figure 10. Physical images of faulty components in the hydraulic plunger pump: (a) swash plate wear; (b) slipper wear; and (c) loose slipper.
Figure 11. Time–domain waveform and power spectrum of hydraulic plunger pump sound signals: (a) normal; (b) swash plate wear; (c) slipper wear; and (d) loose slipper.
Figure 12. Four types of cepstral features in different states: (a) normal; (b) swash plate wear; (c) slipper wear; and (d) loose slipper.
Figure 13. Average classification accuracy of ten trials.
Figure 14. Confusion matrices for different features: (a) MFCC; (b) IMFCC; (c) MICC; (d) MIGCC; (e) MILCC; and (f) MIGLCC.
Figure 15. Performance comparison of different diagnostic methods.
Figure 16. Principles or network structures of various methods: (a) SVM; (b) 1D-CNN; and (c) RNN.
Figure 17. Performance comparison of LSTM networks with different layer numbers.
Figure 18. t-SNE feature visualization: (a) original data; (b) MIGLCC features; (c) LSTM1 layer; (d) LSTM2 layer; (e) FC layer.
Figure 19. CWRU bearing fault test bench.
Figure 20. Time–domain waveform and power spectrum of CWRU bearing vibration signals: (a) normal; (b) inner race fault; (c) outer race fault; and (d) rolling element fault.
Figure 21. Confusion matrix of CWRU bearing data diagnosis results.
Figure 22. t-SNE feature visualization before and after CWRU bearing diagnosis: (a) original data; (b) data classified by MIGLCC-DLSTM.
Figure 23. Servo motor fault test bench.
Figure 24. Servo motor test system schematic.
Figure 25. Time–domain waveform and power spectrum of servo motor pressure signals: (a) normal; (b) servo valve internal leakage; (c) spring breakage; (d) quick-closing solenoid valve throttling orifice blockage; (e) internal oil leakage; and (f) external oil leakage.
Figure 26. Confusion matrix of servo motor data diagnosis results.
Figure 27. t-SNE feature visualization before and after servo motor diagnosis: (a) original data; (b) data classified by MIGLCC-DLSTM.