Search Results (1,498)

Search Parameters:
Keywords = attention-based architecture

38 pages, 3841 KiB  
Review
Computer Vision-Based Gait Recognition on the Edge: A Survey on Feature Representations, Models, and Architectures
by Edwin Salcedo
J. Imaging 2024, 10(12), 326; https://doi.org/10.3390/jimaging10120326 - 18 Dec 2024
Abstract
Computer vision-based gait recognition (CVGR) is a technology that has gained considerable attention in recent years due to its non-invasive, unobtrusive, and difficult-to-conceal nature. Beyond its applications in biometrics, CVGR holds significant potential for healthcare and human–computer interaction. Current CVGR systems often transmit collected data to a cloud server for machine learning-based gait pattern recognition. While effective, this cloud-centric approach can result in increased system response times. Alternatively, the emerging paradigm of edge computing, which involves moving computational processes to local devices, offers the potential to reduce latency, enable real-time surveillance, and eliminate reliance on internet connectivity. Furthermore, recent advancements in low-cost, compact microcomputers capable of handling complex inference tasks (e.g., Jetson Nano Orin, Jetson Xavier NX, and Khadas VIM4) have created exciting opportunities for deploying CVGR systems at the edge. This paper reports the state of the art in gait data acquisition modalities, feature representations, models, and architectures for CVGR systems suitable for edge computing. Additionally, this paper addresses the general limitations and highlights new avenues for future research in the promising intersection of CVGR and edge computing. Full article
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)
Figures:
Figure 1: A comparative analysis illustrating the growing preference for DL architectures. The illustrations summarise the findings from the papers reviewed in this survey.
Figure 2: General structure of this survey paper and our proposed taxonomy of the existing technologies that facilitate on-device deployment of CVGR systems for real-time recognition.
Figure 3: Broad perspective on gait feature representations.
Figure 4: CVGR systems based on handcrafted representations typically employ one of two approaches: the systems extract silhouettes from 2D images (model-free) or rely on human body models (model-based). The video sample shown in the figure comes from the CASIA-B dataset [23].
Figure 5: DL-based end-to-end gait recognition scheme for CVGR systems. The sample with the walking subjects shown in the figure comes from the Penn–Fudan dataset [94].
Figure 6: Graphical depictions of various edge-oriented inference architectures.
Figure 7: A large-scale scalable framework to support gait recognition computations in a distributed manner. This framework would incorporate multiple nodes and an edge server to handle data acquisition, detection, segmentation, and classification, enabling more feasible real-time computation. The video sample shown in the figure comes from the CASIA-A dataset [24].
29 pages, 9712 KiB  
Article
Cloud–Edge–End Collaborative Federated Learning: Enhancing Model Accuracy and Privacy in Non-IID Environments
by Ling Li, Lidong Zhu and Weibang Li
Sensors 2024, 24(24), 8028; https://doi.org/10.3390/s24248028 - 16 Dec 2024
Viewed by 226
Abstract
Cloud–edge–end computing architecture is crucial for large-scale edge data processing and analysis. However, the diversity of terminal nodes and task complexity in this architecture often result in non-independent and identically distributed (non-IID) data, making it challenging to balance data heterogeneity and privacy protection. To address this, we propose a privacy-preserving federated learning method based on cloud–edge–end collaboration. Our method fully considers the three-tier architecture of cloud–edge–end systems and the non-IID nature of terminal node data. It enhances model accuracy while protecting the privacy of terminal node data. The proposed method groups terminal nodes based on the similarity of their data distributions and constructs edge subnetworks for training in collaboration with edge nodes, thereby mitigating the negative impact of non-IID data. Furthermore, we enhance WGAN-GP with an attention mechanism to generate balanced synthetic data while preserving key patterns from the original datasets, reducing the adverse effects of non-IID data on global model accuracy while preserving data privacy. In addition, we introduce data resampling and loss function weighting strategies to mitigate model bias caused by imbalanced data distribution. Experimental results on real-world datasets demonstrate that our proposed method significantly outperforms existing approaches in terms of model accuracy, F1-score, and other metrics. Full article
(This article belongs to the Section Sensor Networks)
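To make the attention-enhanced WGAN-GP idea above concrete, the following is a minimal PyTorch sketch of a SAGAN-style self-attention layer placed inside a small generator. It is illustrative only: layer sizes, the 28x28 output, and where the attention layer sits are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    """SAGAN-style self-attention over feature maps (illustrative)."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned blending weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)    # (b, hw, c//8)
        k = self.key(x).flatten(2)                       # (b, c//8, hw)
        v = self.value(x).flatten(2)                     # (b, c, hw)
        attn = torch.softmax(q @ k, dim=-1)              # (b, hw, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                      # residual connection

class Generator(nn.Module):
    """Tiny DCGAN-like generator with one self-attention layer (assumed sizes)."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 128, 7, 1, 0), nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
            SelfAttention2d(64),                         # attention on 14x14 feature maps
            nn.ConvTranspose2d(64, 1, 4, 2, 1), nn.Tanh(),  # 28x28 output (MNIST-sized)
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

print(Generator()(torch.randn(4, 100)).shape)  # torch.Size([4, 1, 28, 28])
```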
Figures:
Figure 1: Federated learning framework for cloud–edge–end architecture.
Figure 2: Illustration of non-IID client data in federated learning.
Figure 3: Generator structure of WGAN-GP after adding the self-attention layer.
Figure 4: Discriminator structure of WGAN-GP after adding the self-attention layer.
Figure 5: Examples of original MNIST dataset and WGAN-GP generated dataset.
Figure 6: Examples of AnnualCrop label category from original EuroSAT and WGAN-GP generated dataset of the same label category.
Figure 7: Performance of CEECFed, FedGS, FedAvg, and FedSGD based on the original MNIST dataset. (a) Accuracy, (b) Precision, (c) Recall, (d) F1-Score, (e) Average Loss.
Figure 8: Performance of CEECFed, FedGS, FedAvg, and FedSGD based on the original EuroSAT dataset. (a) Accuracy, (b) Precision, (c) Recall, (d) F1-Score, (e) Average Loss.
Figure 9: Performance of CEECFed based on original MNIST dataset and WGAN-GP generated datasets. (a) Accuracy, (b) Precision, (c) Recall, (d) F1-Score, (e) Average Loss.
Figure 10: Performance of FedAvg based on original MNIST dataset and WGAN-GP generated datasets. (a) Accuracy, (b) Precision, (c) Recall, (d) F1-Score, (e) Average Loss.
Figure 11: Performance of FedSGD based on original MNIST dataset and WGAN-GP generated datasets. (a) Accuracy, (b) Precision, (c) Recall, (d) F1-Score, (e) Average Loss.
34 pages, 10110 KiB  
Review
Recent Developments in Electrospun Nanofiber-Based Triboelectric Nanogenerators: Materials, Structure, and Applications
by Qinglong Wei, Yuying Cao, Xiao Yang, Guosong Jiao, Xiaowen Qi and Guilin Wen
Membranes 2024, 14(12), 271; https://doi.org/10.3390/membranes14120271 - 16 Dec 2024
Viewed by 463
Abstract
Triboelectric nanogenerators (TENGs) have garnered significant attention due to their high energy conversion efficiency and extensive application potential in energy harvesting and self-powered devices. Recent advancements in electrospun nanofibers, attributed to their outstanding mechanical properties and tailored surface characteristics, have made them a critical material for enhancing TENG performance. This review provides a comprehensive overview of the developments in electrospun nanofiber-based TENGs. It begins with an exploration of the fundamental principles behind electrospinning and triboelectricity, followed by a detailed examination of the application and performance of various polymer materials, including poly (vinylidene fluoride) (PVDF), polyamide (PA), thermoplastic polyurethane (TPU), polyacrylonitrile (PAN), and other significant polymers. Furthermore, this review analyzes the influence of diverse structural designs—such as fiber architectures, bionic configurations, and multilayer structures—on the performance of TENGs. Applications across self-powered devices, environmental energy harvesting, and wearable technologies are discussed. The review concludes by highlighting current challenges and outlining future research directions, offering valuable insights for researchers and engineers in the field. Full article
Figures:
Figure 1: Principle of electrospun nanofibers. (a) Schematic diagram of a basic electrospinning setup [34]. (b) Schematic diagram showing the path of an electrospun jet [35]. All essential copyrights and permissions received.
Figure 2: Working modes of TENGs (the red arrows indicate the direction of triboelectric layers movement; +: positive charge; −: negative charge) [24]. All essential copyrights and permissions received.
Figure 3: Fiber structure. (a) Surface roughness curves and fiber diameter histograms of electrospun fiber membranes at different humidity levels [122]. (b) Schematic diagram of the grating TENG, including top view, side view, and cross-sectional view, as well as a fiber cross-sectional view of PVDF (red dash lines denote the top view, side view, cross-sectional view and SEM images of the same sample position) [123]. (c) Wave-shaped TENG [127]. (d) Wrinkle-type TENG (yellow arrow indicates the distance between the upper and lower layers) [128]. (e) Stack configuration of electrospun PVDF with different dipole orientation and direction [130]. All essential copyrights and permissions received.
Figure 4: Bionic structure. (a) TENG based on petiole-shaped fiber mat [136]. (b) Janus textile inspired by the internal structures of plants (red dot lines mark a small part of the Janus textile that will be attached to the skin and point to the corresponding structure) [137]. (c) Structural design of the TENG-based e-skin (black dot lines mark the all-nanofiber TENG-based e-skin and point to the corresponding structure) [141]. (d) Bio-inspired hydrophobic/cancellous/hydrophilic Trimurti-based TENG [142]. (e) Silk-inspired nanofibers [143]. (f) Bioinspired soft TENG fabricated based on animal body structures [144]. All essential copyrights and permissions received.
Figure 5: Multilayer structure. (a) Schematic representation of the TENG construction [149]. (b) Structure of PT-NG [152]. (c) Schematic diagram of the hybrid generator [155]. (d) Schematic diagram of the double-layer nanofibrous TENG [156]. (e) Schematic illustration showing the layer-by-layer structure of the self-charging SPC [157]. (f) Structural model diagram of the MS-CES [158]. All essential copyrights and permissions received.
Figure 6: Self-powered devices based on electrospun nanofiber TENGs. (a) Schematic illustration of CSYF TENG as a self-powered humidity sensor [167]. (b) Output performance of STENG as visible-blind UV photodetector [169]. (c) A real-time smart home control system using an MOF/PVDF (MPVDF) NF-based TENG device (red circle marks the MPVDF NF-based TENG) [170]. (d) Schematic of a natural human breath test [171]. (e) Illustration of the integration of SUPS for noninvasive multi-indicator cardiovascular monitoring (red circle marks the SUPS) [172]. (f) Structure diagram of self-powered TENG and its principle diagram in wound healing (large grey arrow points to the position of TENG, indicating its location in the whole system; small grey arrows represent the healing of wounds from both sides, approaching towards the middle, showing the direction and process of wound healing) [173]. All essential copyrights and permissions received.
Figure 7: Environmental energy harvesting based on electrospun nanofiber TENGs. (a) Acoustic NFM TENG (yellow arrow points to the overall structure of TENG; red arrow indicates a small part of the PLA layer and the MWCNTs within it; red line represents the wire) [191]. (b) Wind-driven TENG for W/O emulsion separation (red dash line marks the copper electrode and shows the charge distribution within it) [193]. (c) Water energy harvesting mechanism of the SNF-TENG (purple arrow shows the direction of charge movement) [194]. (d) When the rain droplets roll down the MWTT, triboelectric electricity is generated [195]. (e) Schematics of the G-TENG array for harvesting water wave energy [126]. All essential copyrights and permissions received.
Figure 8: Wearable devices based on electrospun nanofiber TENGs. (a) The TENG integrated into the mask is used to monitor breathing after walking or running at different speeds on a treadmill [218]. (b) The voltage changes in our device attached on throat muscle movement [219]. (c) Schematic diagram of the communication system for the real-time monitoring of abdominal respiratory status by the TENG sensor using a wired transmission device [107]. (d) The applications of ALTFM-based wearable electronics for human motion monitoring (pink arrows serve as pointers) [125]. (e) Human body movement recognition and detection using PENG and TENG devices based on PAG2-10 NFs fixed on different locations [220]. All essential copyrights and permissions received.
20 pages, 8748 KiB  
Article
A Hysteresis Current Controller for the Interleaving Operation of the Paralleled Buck Converters Without Interconnecting Lines
by Ruwen Wang, Yu Chen, Jiashu Huang, Yitong Wu and Yong Kang
Electronics 2024, 13(24), 4928; https://doi.org/10.3390/electronics13244928 - 13 Dec 2024
Viewed by 352
Abstract
Paralleled buck converters have garnered significant attention for fulfilling the increasing demands of power supplies in modern applications. They have the advantages of increased current capacity and reduced current ripple. Under this architecture, applying distributed control brings the benefits of reliability and scalability. However, the interconnecting lines between the converters are required for achieving the interleaving operation, which reduces the reliability and scalability. To eliminate the interconnecting lines, this paper proposed a hysteresis current controller to achieve the symmetric interleaving operation. First, the parallel structure with a common output filter was proposed to provide the hardware basis for the proposed control method. Then, the hysteresis current controller was constructed based on the inductor voltage of each buck module to achieve the interleaving operation. The experiment of the three-module-paralleled buck converter was conducted. The experimental results show that the symmetric interleaving was achieved with a maximum phase shift error of 4.1%; the output of the parallel system can recover within 2.60 ms after the load transitions from 50% to 100%; and the parallel system can recover within 1.96 ms after one module removal. The simulation results of the six-module-paralleled buck converter show that the parallel system can recover within 1.00 ms after one module addition, which illustrates the scalability of the proposed method. Full article
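The basic hysteresis current control law described above can be illustrated with a short simulation of a single buck module's inductor current. The parameter values below are assumed, and the paper's shaped-inductor-voltage scheme for interleaving without interconnecting lines is not reproduced here.

```python
import numpy as np

# Minimal single-module hysteresis current control of a buck converter inductor
# current (illustrative parameter values only).
V_IN, V_OUT = 48.0, 12.0      # input/output voltage (V), assumed
L = 100e-6                    # inductance (H), assumed
I_REF, BAND = 5.0, 0.5        # current reference and hysteresis band (A), assumed
DT, T_END = 1e-7, 2e-3        # simulation step and duration (s)

i_L, switch_on = 0.0, True
trace = []
for _ in range(int(T_END / DT)):
    # Hysteresis law: turn the switch off above the upper band, on below the lower band.
    if i_L > I_REF + BAND / 2:
        switch_on = False
    elif i_L < I_REF - BAND / 2:
        switch_on = True
    v_L = (V_IN - V_OUT) if switch_on else -V_OUT   # inductor voltage
    i_L += v_L / L * DT                             # di/dt = v_L / L
    trace.append(i_L)

ripple = max(trace[-2000:]) - min(trace[-2000:])
print(f"steady-state ripple = {ripple:.2f} A around {I_REF} A")
```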
Figures:
Figure 1: Structure of the paralleled buck converters with the common output filter.
Figure 2: (a) Equivalent circuit of the paralleled buck converters with common output filter and (b) its voltage waveforms.
Figure 3: Construction of the hysteresis current controller. (a) Traditional hysteresis current controller, (b) hysteresis current controller with v_Li directly constructing the hysteresis band and (c) the proposed hysteresis current controller based on the shaped-inductor-voltage.
Figure 4: Key waveforms of the steady state and the transient state for D < 1/3.
Figure 5: Variation curve of ΔT_31off[k]/ΔT_31off[0].
Figure 6: Steady-state waveforms for the case of (a) 1/3 ≤ D ≤ 2/3 and (b) D > 2/3.
Figure 7: Simulation results of (a) the dynamic process with the disturbance of I_ref, (b,c) the zoomed views of the steady states and (d) the key waveforms of the proposed hysteresis current controller.
Figure 8: Steady-state waveforms for (a) 1/3 ≤ D ≤ 2/3 and (b) D > 2/3.
Figure 9: Simulation results of the module addition. (a) The dynamic process and the zoomed views of the stable state of (b) the five-module-paralleled system and (c) the six-module-paralleled system.
Figure 10: The circuit diagram of module i.
Figure 11: Prototype of the three-module-paralleled buck converter.
Figure 12: Measured waveforms of the dynamic process of (a) the load addition and (b) the load removal, and the zoomed views of the steady states of (c) full load and (d) half load.
Figure 13: Dynamic processes of (a) the module removal and its zoomed views (b,c), and (d) the module addition and its zoomed views (e,f).
Figure 14: Simulation of the dynamic processes of module addition and removal.
Figure A1: Variation curves of ΔT_31off[k]/ΔT_31off[0] with different parameters.
24 pages, 541 KiB  
Article
Temporal Logical Attention Network for Log-Based Anomaly Detection in Distributed Systems
by Yang Liu, Shaochen Ren, Xuran Wang and Mengjie Zhou
Sensors 2024, 24(24), 7949; https://doi.org/10.3390/s24247949 - 12 Dec 2024
Viewed by 290
Abstract
Detecting anomalies in distributed systems through log analysis remains challenging due to the complex temporal dependencies between log events, the diverse manifestation of system states, and the intricate causal relationships across distributed components. This paper introduces the Temporal Logical Attention Network (TLAN), a novel deep learning framework that integrates temporal sequence modeling with logical dependency analysis for robust anomaly detection in distributed system logs. Our approach makes three key contributions: (1) a temporal logical attention mechanism that explicitly models both time-series patterns and logical dependencies between log events across distributed components, (2) a multi-scale feature extraction module that captures system behaviors at different temporal granularities while preserving causal relationships, and (3) an adaptive threshold strategy that dynamically adjusts detection sensitivity based on system load and component interactions. Extensive experiments on a large-scale synthetic distributed system log dataset show that TLAN outperforms existing methods by achieving a 9.4% improvement in F1-score and reducing false alarms by 15.3% while maintaining low latency in real-time detection. The framework demonstrates particular effectiveness in identifying complex anomalies that involve multiple interacting components and cascading failures. Through comprehensive empirical analysis and case studies, we validate that TLAN can effectively capture both temporal patterns and logical correlations in log sequences, making it especially suitable for modern distributed architectures. Our approach also shows strong generalization capability across different system scales and deployment scenarios, supported by thorough ablation studies and performance evaluations. Full article
(This article belongs to the Section Sensor Networks)
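As a rough illustration of an attention mechanism that mixes temporal and logical information, the sketch below biases standard multi-head self-attention over log-event embeddings with a given logical-dependency matrix. The shapes, the time-gap embedding, and the additive-bias formulation are assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class TemporalLogicalAttention(nn.Module):
    """Illustrative sketch: self-attention over log-event embeddings whose scores
    are biased by a known logical-dependency matrix between events."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.time_proj = nn.Linear(1, d_model)   # embed inter-event time gaps

    def forward(self, events, time_gaps, logic_adj):
        # events:    (batch, seq, d_model) embedded log events
        # time_gaps: (batch, seq, 1) seconds since the previous event
        # logic_adj: (seq, seq) 0/1 matrix of known logical dependencies
        x = events + self.time_proj(time_gaps)
        # Turn the dependency matrix into an additive attention bias:
        # unrelated event pairs get a large negative score.
        bias = torch.full_like(logic_adj, -1e4)
        bias[logic_adj.bool()] = 0.0
        out, weights = self.attn(x, x, x, attn_mask=bias)
        return out, weights

# Toy usage with random data.
b, s, d = 2, 16, 64
module = TemporalLogicalAttention(d_model=d)
out, w = module(torch.randn(b, s, d), torch.rand(b, s, 1),
                (torch.rand(s, s) > 0.5).float())
print(out.shape, w.shape)   # torch.Size([2, 16, 64]) torch.Size([2, 16, 16])
```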
Figures:
Figure 1: Overview of the TLAN framework.
Figure 2: Performance comparison of different methods across various anomaly types. TLAN demonstrates superior detection capability, particularly for system crashes and memory leaks, while maintaining consistent performance across all anomaly categories.
Figure 3: Cumulative detection rate over time for different methods. The steeper curve of TLAN indicates its faster detection capability, with over 90% of anomalies being detected within 2.1 s of their onset.
Figure 4: Scalability analysis of TLAN: (a) Processing time shows linear growth with increasing log volume, demonstrating efficient handling of large-scale data streams; (b) Memory usage exhibits sub-linear growth with increasing number of components, indicating effective resource utilization in large distributed systems.
Figure 5: Visualization of attention weight distributions for different mechanisms during a cascading failure scenario. TLAN shows stronger joint temporal-logical patterns (darker colors indicate higher attention weights) compared to other methods, particularly in capturing cross-component interactions. The heatmaps demonstrate how TLAN effectively combines both temporal evolution (sequential patterns) and logical dependencies (component interactions) in its attention mechanism.
Figure 6: Visualization of cascading service failures: (a) Timeline of component status and interactions; (b) TLAN's attention weights highlighting critical dependencies; (c) Early warning indicators identified by the model.
Figure 7: Analysis of intermittent network anomalies: (a) Network performance metrics over time; (b) Multi-scale feature importance visualization; (c) Comparison of detection results between TLAN and baseline methods. The shaded regions indicate ground truth anomaly periods.
Figure 8: Resource contention analysis in database cluster: (a) Resource utilization patterns across nodes; (b) Cross-component correlation matrix; (c) Performance impact visualization. High correlation areas (in darker color) indicate potential resource contention points.
13 pages, 3483 KiB  
Article
Classification of English Words into Grammatical Notations Using Deep Learning Technique
by Muhammad Imran, Sajjad Hussain Qureshi, Abrar Hussain Qureshi and Norah Almusharraf
Information 2024, 15(12), 801; https://doi.org/10.3390/info15120801 - 11 Dec 2024
Viewed by 354
Abstract
The impact of artificial intelligence (AI) on English language learning has become the center of attention in the past few decades. This study, with its potential to transform English language instruction and offer various instructional approaches, provides valuable insights and knowledge. To fully grasp the potential advantages of AI, more research is needed to improve, validate, and test AI algorithms and architectures. Grammatical notations provide a word’s information to the readers. If a word’s images are properly extracted and categorized using a CNN, it can help non-native English speakers improve their learning habits. The classification of parts of speech into different grammatical notations is the major problem that non-native English learners face. This situation stresses the need to develop a computer-based system using a machine learning algorithm to classify words into proper grammatical notations. A convolutional neural network (CNN) was applied to classify English words into nine classes: noun, pronoun, adjective, determiner, verb, adverb, preposition, conjunction, and interjection. A simulation of the selected model was performed in MATLAB. The model achieved an overall accuracy of 97.22%. The CNN showed 100% accuracy for pronouns, determiners, verbs, adverbs, and prepositions; 95% for nouns, adjectives, and conjunctions; and 90% for interjections. The significant results (p < 0.0001) of the chi-square test supported the use of the CNN by non-native English learners. The proposed approach is an important source of word classification for non-native English learners by putting the word image into the model. This not only helps beginners in English learning but also helps in setting standards for evaluating documents. Full article
(This article belongs to the Special Issue Applications of Machine Learning and Convolutional Neural Networks)
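A minimal CNN of the kind described, mapping a grayscale word image to one of the nine grammatical classes, might look as follows in PyTorch; the 64x64 input size and layer widths are assumed, and the authors' MATLAB model is not reproduced.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 9   # noun, pronoun, adjective, determiner, verb, adverb,
                  # preposition, conjunction, interjection

pos_cnn = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16 -> 8
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(128, NUM_CLASSES),
)

x = torch.randn(4, 1, 64, 64)   # a batch of four 64x64 grayscale word images
print(pos_cnn(x).shape)          # torch.Size([4, 9])
# Training would pair this network with nn.CrossEntropyLoss() on labelled word images.
```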
Figures:
Figure 1: Conceptual design of the proposed model.
Figure 2: The proposed architecture of the lightweight model.
Figure 3: Feature mapping and max pooling technique of CNN model.
Figure 4: (a) Training and validation accuracy of the proposed model. (b) Training and validation loss of the proposed model.
Figure 5: Confusion matrix of parts of speech into nine classes.
Figure 6: Results of model predictions.
Figure 7: Comparison of grammatical notation recognition.
26 pages, 6713 KiB  
Article
Improved Field Obstacle Detection Algorithm Based on YOLOv8
by Xinying Zhou, Wenming Chen and Xinhua Wei
Agriculture 2024, 14(12), 2263; https://doi.org/10.3390/agriculture14122263 - 11 Dec 2024
Viewed by 511
Abstract
To satisfy the obstacle avoidance requirements of unmanned agricultural machinery during autonomous operation and address the challenge of rapid obstacle detection in complex field environments, an improved field obstacle detection model based on YOLOv8 was proposed. This model enabled the fast detection and recognition of obstacles such as people, tractors, and electric power pylons in the field. This detection model was built upon the YOLOv8 architecture with three main improvements. First, to adapt to different tasks and complex environments in the field, improve the sensitivity of the detector to various target sizes and positions, and enhance detection accuracy, the CBAM (Convolutional Block Attention Module) was integrated into the backbone layer of the benchmark model. Secondly, a BiFPN (Bi-directional Feature Pyramid Network) architecture took the place of the original PANet to enhance the fusion of features across multiple scales, thereby increasing the model’s capacity to distinguish between the background and obstacles. Third, WIoU v3 (Wise Intersection over Union v3) optimized the target boundary loss function, assigning greater focus to medium-quality anchor boxes and enhancing the detector’s overall performance. A dataset comprising 5963 images of people, electric power pylons, telegraph poles, tractors, and harvesters in a farmland environment was constructed. The training set comprised 4771 images, while the validation and test sets each consisted of 596 images. The results from the experiments indicated that the enhanced model attained precision, recall, and average precision scores of 85.5%, 75.1%, and 82.5%, respectively, on the custom dataset. This reflected increases of 1.3, 1.2, and 1.9 percentage points when compared to the baseline YOLOv8 model. Furthermore, the model reached 52 detection frames per second, thereby significantly enhancing the detection performance for common obstacles in the field. The model enhanced by the previously mentioned techniques guarantees a high level of detection accuracy while meeting the criteria for real-time obstacle identification in unmanned agricultural equipment during fieldwork. Full article
(This article belongs to the Section Digital Agriculture)
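For reference, the CBAM block mentioned above (channel attention followed by spatial attention) can be sketched in PyTorch as below. This follows the standard CBAM formulation; its exact placement between the C2f and SPPF modules in the authors' network is not shown, and the reduction ratio and kernel size are assumed defaults.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))     # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))      # global max pooling branch
        return torch.sigmoid(avg + mx).view(b, c, 1, 1) * x

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)
    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)      # per-pixel channel mean
        mx = x.amax(dim=1, keepdim=True)       # per-pixel channel max
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1))) * x

class CBAM(nn.Module):
    """Channel attention followed by spatial attention, as in the original CBAM paper."""
    def __init__(self, channels):
        super().__init__()
        self.ca, self.sa = ChannelAttention(channels), SpatialAttention()
    def forward(self, x):
        return self.sa(self.ca(x))

feat = torch.randn(1, 256, 20, 20)     # e.g. a backbone feature map
print(CBAM(256)(feat).shape)           # torch.Size([1, 256, 20, 20])
```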
Figures:
Figure 1: YOLOv8 network structure.
Figure 2: CBBI-YOLO network structure. We added a lime-green CBAM module between the C2f module and the SPPF module, replaced the original green Concat module with a green BiFPN module, and replaced the original CIoU loss function with the WIoU v3 loss function from Bbox Loss, while leaving the other modules unchanged.
Figure 3: CBAM attention mechanism. The blue module represents the original feature map; the orange module represents the channel attention module; the purple module represents the spatial attention module. The original input feature map is multiplied element-wise with the channel attention feature map, and the processed feature map is multiplied element-wise with the spatial attention feature map to obtain the pink module, which is the final feature map.
Figure 4: Channel attention module. The green module represents the input feature map, which undergoes average pooling and maximum pooling to obtain two feature maps (light pink and light purple modules); these are fed into the multilayer perceptron (MLP) to obtain two feature maps (pink-and-green fused module, purple-and-green fused module), which are then summed and passed through the activation function (purple ReLU module) to obtain the channel attention feature map (purple-and-pink fused module).
Figure 5: Spatial attention module. The green module represents the input feature map, which undergoes average pooling and maximum pooling to obtain two feature maps (pink and purple modules); these two feature maps are spliced and convolved (blue Conv module) to obtain a feature map (white module), which passes through the activation function to obtain the spatial attention feature map (white-and-grey fused module).
Figure 6: PANet structure: (a) FPN structure; (b) bottom-up structure. (a,b) together form the PANet structure. The red dashed line indicates that information is passed from the bottom feature map to the high-level feature map, which undergoes a large number of convolution operations; the green dashed line indicates that bottom information is fused into the current layer and the previous layer until the highest level is reached (from C2 to P2 and then to N2 until N5), which greatly reduces the number of convolution calculations.
Figure 7: BiFPN structure. P3, P4, and P5 are the outputs of the backbone network; two downsampling operations yield P6 and P7, and a convolution adjusts the channels to obtain the inputs P_n^in. The middle part corresponds to P_n^td in the fusion equations, where w_n is the weighting factor. The right part is P_n^out.
Figure 8: Five examples of obstacles.
Figure 9: The distribution of dataset labels. The left image shows the distribution of the center points of the target bounding boxes; the horizontal (x) and vertical (y) axes represent the width- and height-normalized coordinates of the image, respectively. The right image shows the distribution of the width and height of the target bounding boxes; the horizontal axis represents the relative width of the target box, the vertical axis represents the relative height, and the dark-colored regions are the locations with higher frequency.
Figure 10: Comparison of precision, recall, and mAP between YOLOv8 and CBBI-YOLO. (The x-axis denotes the number of training epochs, a dimensionless quantity; the y-axis shows a percentage that usually ranges from 0 to 1. Due to the early stopping strategy, training stopped at around 150 epochs.)
Figure 11: Some test results from the field.
Figure 12: Test results: (a) YOLOv8 model missed detection; (b) CBBI-YOLO model correct detection.
Figure 13: Test results: (a) YOLOv8 model confidence level; (b) CBBI-YOLO model confidence level.
Figure 14: (a) Precision–confidence curve, (b) precision–recall curve, and (c) recall–confidence curve during CBBI-YOLO model training.
Figure 15: Overview of precision, recall, and average precision during CBBI-YOLO model training. (The x-axis denotes the number of training epochs, a dimensionless quantity; the y-axis shows a percentage that usually ranges from 0 to 1. Due to the early stopping strategy, training stopped at around 150 epochs.)
20 pages, 6078 KiB  
Article
A Smart Motor Rehabilitation System Based on the Internet of Things and Humanoid Robotics
by Yasamin Moghbelan, Alfonso Esposito, Ivan Zyrianoff, Giulia Spaletta, Stefano Borgo, Claudio Masolo, Fabiana Ballarin, Valeria Seidita, Roberto Toni, Fulvio Barbaro, Giusy Di Conza, Francesca Pia Quartulli and Marco Di Felice
Appl. Sci. 2024, 14(24), 11489; https://doi.org/10.3390/app142411489 - 10 Dec 2024
Viewed by 506
Abstract
The Internet of Things (IoT) is gaining increasing attention in healthcare due to its potential to enable continuous monitoring of patients, both at home and in controlled medical environments. In this paper, we explore the integration of IoT with humanoid robotics in the context of motor rehabilitation for groups of patients performing moderate physical routines, focused on balance, stretching, and posture. Specifically, we propose the I-TROPHYTS framework, which introduces a step-change in motor rehabilitation by advancing towards more sustainable medical services and personalized diagnostics. Our framework leverages wearable sensors to monitor patients’ vital signs and edge computing to detect and estimate motor routines. In addition, it incorporates a humanoid robot that mimics the actions of a physiotherapist, adapting motor routines in real-time based on the patient’s condition. All data from physiotherapy sessions are modeled using an ontology, enabling automatic reasoning and planning of robot actions. In this paper, we present the architecture of the proposed framework, which spans four layers, and discuss its enabling components. Furthermore, we detail the current deployment of the IoT system for patient monitoring and automatic identification of motor routines via Machine Learning techniques. Our experimental results, collected from a group of volunteers performing balance and stretching exercises, demonstrate that we can achieve nearly 100% accuracy in distinguishing between shoulder abduction and shoulder flexion, using Inertial Measurement Unit data from wearable IoT devices placed on the wrist and elbow of the test subjects. Full article
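A minimal sketch of the kind of feed-forward classifier used for exercise recognition from IMU windows is shown below; the window length, the two-device wrist/elbow setup, and the layer sizes are assumptions for illustration, not the deployed pipeline.

```python
import torch
import torch.nn as nn

# Illustrative feed-forward classifier for fixed-length IMU windows
# (accelerometer + gyroscope, 6 channels) from two wearable devices,
# distinguishing two exercises such as shoulder abduction vs. shoulder flexion.
WINDOW, CHANNELS, DEVICES = 100, 6, 2        # 100 samples x 6 axes x 2 IMUs (assumed)

ffnn = nn.Sequential(
    nn.Flatten(),                            # (batch, DEVICES*CHANNELS*WINDOW)
    nn.Linear(DEVICES * CHANNELS * WINDOW, 128), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(128, 32), nn.ReLU(),
    nn.Linear(32, 2),                        # two exercise classes
)

x = torch.randn(8, DEVICES, CHANNELS, WINDOW)   # a batch of 8 labelled windows
print(ffnn(x).shape)                            # torch.Size([8, 2])
```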
Figures:
Figure 1: Framework and architecture of I-TROPHYTS.
Figure 2: Implementation of the first two layers of I-TROPHYTS.
Figure 3: Illustration of two exercises performed during the experiments.
Figure 4: Accelerometer and gyroscope raw data—AR exercise.
Figure 5: Accelerometer and gyroscope raw data—BL exercise.
Figure 6: Comparison of accuracy and F1-Score metrics in evaluating different learning algorithms.
Figure 7: Comparison of Accuracy for evaluating FFNN performance using different signals on a variable number of devices.
Figure 8: Comparison of F1-Score for evaluating FFNN performance using different signals on a variable number of devices.
Figure 9: Accuracy and F1-Score for predicting motion using FFNN across different time windows.
Figure 10: Heart rate comparison of two subjects—AL exercise.
Figure 11: Peak detection—AR exercise.
Figure 12: Predicted versus actual repetitions of exercises.
Figure 13: Agent-based cognitive architecture for structuring robotic systems that can monitor, suggest, and explain in complex scenarios.
16 pages, 3260 KiB  
Article
Online Purchase Behavior Prediction Model Based on Recurrent Neural Network and Naive Bayes
by Chaohui Zhang, Jiyuan Liu and Shichen Zhang
J. Theor. Appl. Electron. Commer. Res. 2024, 19(4), 3461-3476; https://doi.org/10.3390/jtaer19040168 - 9 Dec 2024
Viewed by 452
Abstract
In the current competition among e-commerce platforms, techniques and algorithms that can quickly grasp user needs and accurately recommend target commodities are the core tools of platform competition. At the same time, existing online purchase behavior prediction models lack consideration of time series features. This paper combines a Recurrent Neural Network, which is well suited to the commodity recommendation scenario of e-commerce platforms, with Naive Bayes, which is simple in logic and efficient in operation, and constructs RNN-NB, an online purchase behavior prediction model that accounts for time series features. The RNN-NB model is trained and tested using 3 million time series records with purchase behavior provided by the Ali Tianchi big data platform. The prediction performance of the RNN-NB model and the Naive Bayes model is evaluated and compared under the same experimental conditions. The results show that the overall prediction performance of the RNN-NB model is better and more stable. In addition, the analysis of user time series features shows that the likelihood of a user purchase is negatively correlated with the length of the time series, so merchants should pay more attention to users with shorter time series in commodity recommendation and targeted offers. The contributions of this paper are as follows: (1) Constructing the online purchase behavior model RNN-NB, which integrates an N vs. 1 Recurrent Neural Network with a Naive Bayes model, addresses the validity limitations of some single-architecture recommendation algorithms. (2) Building on the existing Naive Bayes model, the prediction accuracy of online purchase behavior is further improved. (3) The analysis based on time series features provides new ideas for future research and new guidance for the marketing of platform merchants. Full article
(This article belongs to the Topic Digital Marketing Dynamics: From Browsing to Buying)
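The N vs. 1 RNN plus Naive Bayes pipeline can be sketched as follows: an LSTM summarises each behaviour sequence into a single vector, and a Gaussian Naive Bayes classifier predicts purchase versus no purchase from that vector. Feature sizes, the toy data, and the hand-off between the two models are assumptions; in practice the encoder would be trained rather than used with random weights.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.naive_bayes import GaussianNB

class SequenceEncoder(nn.Module):
    """N vs. 1 style encoder: one output vector per input sequence."""
    def __init__(self, n_features=4, hidden=16):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
    def forward(self, x):                    # x: (batch, seq_len, n_features)
        _, (h, _) = self.lstm(x)
        return h[-1]                         # final hidden state summarises the sequence

encoder = SequenceEncoder()
sequences = torch.randn(200, 30, 4)          # 200 users, 30 time steps, 4 behaviour features
labels = np.random.randint(0, 2, 200)        # toy purchase / no-purchase labels

with torch.no_grad():
    feats = encoder(sequences).numpy()       # hand the sequence summaries to Naive Bayes

nb = GaussianNB().fit(feats, labels)
print(nb.predict(feats[:5]))
```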
Figures:
Figure 1: Overall framework of the RNN-NB model.
Figure 2: Structure diagram of the N vs. 1 RNN model.
Figure 3: Model training demonstration.
Figure 4: User behavior and commodity category association.
Figure 5: ROC curve comparison.
Figure 6: Analysis of the number of behaviors before buying.
28 pages, 7479 KiB  
Article
TUH-NAS: A Triple-Unit NAS Network for Hyperspectral Image Classification
by Feng Chen, Baishun Su and Zongpu Jia
Sensors 2024, 24(23), 7834; https://doi.org/10.3390/s24237834 - 7 Dec 2024
Viewed by 352
Abstract
Over the last few years, neural architecture search (NAS) technology has achieved good results in hyperspectral image classification. Nevertheless, existing NAS-based classification methods have not specifically focused on the complex connection between spectral and spatial data. Strengthening the integration of spatial and spectral features is crucial to boosting the overall classification efficacy of hyperspectral images. In this paper, a triple-unit hyperspectral NAS network (TUH-NAS) aimed at hyperspectral image classification is introduced, where the fusion unit emphasizes the enhancement of the intrinsic relationship between spatial and spectral information. We designed a new hyperspectral image attention mechanism module to increase the focus on critical regions and enhance sensitivity to priority areas. We also adopted a composite loss function to enhance the model’s focus on hard-to-classify samples. Experimental evaluations on three publicly accessible hyperspectral datasets demonstrated that, despite utilizing a limited number of samples, TUH-NAS outperforms existing NAS classification methods in recognizing object boundaries. Full article
(This article belongs to the Section Sensing and Imaging)
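The abstract mentions a composite loss that focuses the model on hard-to-classify samples; a focal-style re-weighting of the cross-entropy, sketched below, is one common way to achieve this. It is offered purely as an assumption, since the paper's exact composite loss is not given here, and the gamma/alpha values are illustrative.

```python
import torch
import torch.nn.functional as F

def focal_cross_entropy(logits, targets, gamma=2.0, alpha=0.25):
    """Focal-style re-weighting of cross-entropy that up-weights hard samples."""
    log_p = F.log_softmax(logits, dim=-1)
    log_p_t = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log prob. of true class
    p_t = log_p_t.exp()
    loss = -alpha * (1.0 - p_t) ** gamma * log_p_t              # easy samples are damped
    return loss.mean()

logits = torch.randn(8, 11, requires_grad=True)   # e.g. 11 land-cover classes
targets = torch.randint(0, 11, (8,))
print(focal_cross_entropy(logits, targets))
```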
Figures:
Figure 1: The overall algorithm flowchart of the proposed method.
Figure 2: Internal search architecture and search results. (a) illustrates the structure of the basic search cell; the red arrows represent the set of all candidate operations. (b) represents the structure search results, where each node retains only the two most valuable input paths; the red arrows represent the selected basic operations.
Figure 3: External search network framework.
Figure 4: The final optimized training network framework.
Figure 5: Diagram of the HSIAM structure.
Figure 6: Comparison of experimental results on MUUFL with 30 training samples per class. (a) False color composite image; (b) SpectralFormer; (c) SSFTT; (d) GAHT; (e) 3-D-ANAS; (f) Hyt-NAS; (g) TUH-NAS; (h) Ground-truth map.
Figure 7: Comparison of experimental results on Houston2018 with 30 training samples per class. (a) False color composite image; (b) SpectralFormer; (c) SSFTT; (d) GAHT; (e) 3-D-ANAS; (f) Hyt-NAS; (g) TUH-NAS; (h) Ground-truth map.
Figure 8: Comparison of experimental results on XiongAn with 30 training samples per class. (a) False color composite image; (b) SpectralFormer; (c) SSFTT; (d) GAHT; (e) 3-D-ANAS; (f) Hyt-NAS; (g) TUH-NAS; (h) Ground-truth map.
Figure 9: Final architectures found for three datasets. From top to bottom: MUUFL, Houston2018, and XiongAn. The green nodes are the main working units of the search network, the yellow nodes are the output nodes, and the blue nodes are the sub-nodes within the working units.
Figure 10: Line chart corresponding to the data in Table 11.
17 pages, 2598 KiB  
Article
From Binary to Multi-Class Classification: A Two-Step Hybrid CNN-ViT Model for Chest Disease Classification Based on X-Ray Images
by Yousra Hadhoud, Tahar Mekhaznia, Akram Bennour, Mohamed Amroune, Neesrin Ali Kurdi, Abdulaziz Hadi Aborujilah and Mohammed Al-Sarem
Diagnostics 2024, 14(23), 2754; https://doi.org/10.3390/diagnostics14232754 - 6 Dec 2024
Viewed by 482
Abstract
Background/Objectives: Chest disease identification for Tuberculosis and Pneumonia diseases presents diagnostic challenges due to overlapping radiographic features and the limited availability of expert radiologists, especially in developing countries. The present study aims to address these challenges by developing a Computer-Aided Diagnosis (CAD) system to provide consistent and objective analyses of chest X-ray images, thereby reducing potential human error. By leveraging the complementary strengths of convolutional neural networks (CNNs) and vision transformers (ViTs), we propose a hybrid model for the accurate detection of Tuberculosis and for distinguishing between Tuberculosis and Pneumonia. Methods: We designed a two-step hybrid model that integrates the ResNet-50 CNN with the ViT-b16 architecture. It uses the transfer learning on datasets from Guangzhou Women’s and Children’s Medical Center for Pneumonia cases and datasets from Qatar and Dhaka (Bangladesh) universities for Tuberculosis cases. CNNs capture hierarchical structures in images, while ViTs, with their self-attention mechanisms, excel at identifying relationships between features. Combining these approaches enhances the model’s performance on binary and multi-class classification tasks. Results: Our hybrid CNN-ViT model achieved a binary classification accuracy of 98.97% for Tuberculosis detection. For multi-class classification, distinguishing between Tuberculosis, viral Pneumonia, and bacterial Pneumonia, the model achieved an accuracy of 96.18%. These results underscore the model’s potential in improving diagnostic accuracy and reliability for chest disease classification based on X-ray images. Conclusions: The proposed hybrid CNN-ViT model demonstrates substantial potential in advancing the accuracy and robustness of CAD systems for chest disease diagnosis. By integrating CNN and ViT architectures, our approach enhances the diagnostic precision, which may help to alleviate the burden on healthcare systems in resource-limited settings and improve patient outcomes in chest disease diagnosis. Full article
(This article belongs to the Special Issue Artificial Intelligence in Clinical Medical Imaging: 2nd Edition)
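One way to combine ResNet-50 and ViT-B/16 features, in the spirit of the hybrid model described above, is to concatenate their pooled embeddings and attach a small classification head. The sketch below (using torchvision and timm) is an assumption about the fusion strategy, not the authors' two-step pipeline or their transfer-learning setup.

```python
import torch
import torch.nn as nn
import timm
from torchvision.models import resnet50

class HybridCnnVit(nn.Module):
    """Illustrative fusion of ResNet-50 and ViT-B/16 features for chest X-ray
    classification; the concatenation head and layer sizes are assumptions."""
    def __init__(self, num_classes=3):                 # e.g. TB / viral / bacterial pneumonia
        super().__init__()
        self.cnn = resnet50(weights=None)
        self.cnn.fc = nn.Identity()                    # expose 2048-d pooled features
        self.vit = timm.create_model("vit_base_patch16_224",
                                     pretrained=False, num_classes=0)  # 768-d features
        self.head = nn.Linear(2048 + 768, num_classes)

    def forward(self, x):                              # x: (batch, 3, 224, 224)
        return self.head(torch.cat([self.cnn(x), self.vit(x)], dim=1))

model = HybridCnnVit()
print(model(torch.randn(2, 3, 224, 224)).shape)        # torch.Size([2, 3])
```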
Figures:
Figure 1: The overall ensemble model architecture.
Figure 2: Example of Tuberculosis cases. (a) The original image; (b) the image after the application of CLAHE.
Figure 3: Ensemble model performance in terms of loss and accuracy for the binary classification.
Figure 4: Ensemble model performance in terms of precision and recall for the binary classification.
Figure 5: Ensemble model performance in terms of loss and accuracy for the multi-class classification.
Figure 6: Ensemble model performance in terms of precision and recall for the multi-class classification.
22 pages, 6998 KiB  
Article
VBCNet: A Hybird Network for Human Activity Recognition
by Fei Ge, Zhenyang Dai, Zhimin Yang, Fei Wu and Liansheng Tan
Sensors 2024, 24(23), 7793; https://doi.org/10.3390/s24237793 - 5 Dec 2024
Viewed by 360
Abstract
In recent years, research on human activity recognition based on Wi-Fi channel state information (CSI) has attracted growing attention, as it avoids deploying additional devices and reduces the risk of personal privacy leakage. In this paper, we propose a hybrid network architecture, named VBCNet, which can effectively identify human activity postures. Firstly, we extract CSI sequences from each antenna of the Wi-Fi signal, and the data are preprocessed and tokenised. Then, in the encoder part of the model, we introduce a layer of long short-term memory network to further extract the temporal features in the sequences and enhance the ability of the model to capture temporal information. Meanwhile, VBCNet employs a convolutional feed-forward network instead of the traditional feed-forward network to enhance the model’s ability to process local and multi-scale features. Finally, the model classifies the extracted features into human behaviours through a classification layer. To validate the effectiveness of VBCNet, we conducted experimental evaluations on the classical human activity recognition datasets UT-HAR and Widar3.0 and achieved accuracies of 98.65% and 77.92%, respectively. These results show that VBCNet exhibits extremely high effectiveness and robustness in human activity recognition tasks in complex scenarios. Full article
(This article belongs to the Section Sensor Networks)
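The encoder modifications described above (an LSTM layer plus a convolutional feed-forward in place of the usual MLP) can be sketched as a single PyTorch block as follows; the dimensions, kernel size, and exact ordering of sub-layers are assumptions rather than the published VBCNet design.

```python
import torch
import torch.nn as nn

class ConvFeedForward(nn.Module):
    """Convolutional feed-forward block standing in for a transformer MLP
    (kernel size and expansion factor are assumed)."""
    def __init__(self, d_model=64, expansion=4, kernel_size=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(d_model, d_model * expansion, kernel_size, padding=kernel_size // 2),
            nn.GELU(),
            nn.Conv1d(d_model * expansion, d_model, kernel_size, padding=kernel_size // 2))
    def forward(self, x):                        # x: (batch, seq, d_model)
        return self.net(x.transpose(1, 2)).transpose(1, 2)

class VBCBlock(nn.Module):
    """Illustrative encoder block: LSTM for temporal features, self-attention,
    then a convolutional feed-forward, each with a residual connection."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cff = ConvFeedForward(d_model)
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):                        # x: (batch, seq, d_model) CSI tokens
        x = x + self.lstm(x)[0]
        x = self.norm1(x + self.attn(x, x, x)[0])
        return self.norm2(x + self.cff(x))

tokens = torch.randn(2, 50, 64)                  # e.g. 50 CSI tokens of width 64
print(VBCBlock()(tokens).shape)                  # torch.Size([2, 50, 64])
```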
Figures:
Figure 1: CSI data collection.
Figure 2: Magnitude and phase waveforms of "Walk"; different colours represent different carriers. (a) Amplitude waveform, (b) phase waveform.
Figure 3: CSI amplitude waveforms of the "Walk" activity obtained from each of the three antennas.
Figure 4: Comparison before and after outlier processing. (a) Original CSI signal; (b) CSI signal after Hampel filtering.
Figure 5: CSI signals processed with different Symlets and decomposition layers.
Figure 6: (a) VBCNet model structure (the input data in the figure are an example only; the input is not a CSI data map but a sequence of CSI data). (b) Improved encoder. (c) Convolutional Feed-forward (CF) module.
Figure 7: Confusion matrix for six models on the UT-HAR dataset.
Figure 8: Training curves for the six models.
Figure 9: Accuracy of different models for 22 gestures on the Widar 3.0 dataset. (H) represents Horizontal and (V) represents Vertical.
Figure 10: VBCNet confusion matrix on 22 gestures.
Figure 11: Recall of 22 gestures.
Figure 12: Signal representation of "Stand up" and "Sit down" actions. The dashed frame shows the signal fluctuations when these two actions occur.
16 pages, 3599 KiB  
Article
Classification of Diabetic Retinopathy Based on Efficient Computational Modeling
by Jiao Xue, Jianyu Wu, Yingxu Bian, Shiyan Zhang and Qinsheng Du
Appl. Sci. 2024, 14(23), 11327; https://doi.org/10.3390/app142311327 - 4 Dec 2024
Viewed by 554
Abstract
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have long been the main backbone networks for visual classification in deep learning. Although ViTs have recently received more attention than CNNs due to their excellent fitting ability, their scalability is largely limited by the quadratic complexity of attention computation. Determining diabetic retinopathy requires characterizing fundus lesions as well as the width, angle, and branching pattern of retinal blood vessels. Inspired by the ability of Mamba and VMamba to efficiently model long sequences, this paper proposes VMamba-m, a generalized visual backbone designed to reduce computational complexity to linear while retaining the advantageous features of ViTs. By modifying the cross-entropy loss function, we enhance the model's attention to rare categories, especially in large-scale multi-category classification tasks. To improve the adaptability of VMamba-m in processing visual data, we introduce the SE channel attention mechanism, which enables the model to learn features in the channel dimension, estimate the importance of each channel, and assign each channel a different weight through the excitation stage. In addition, this paper refines the implementation details and architectural design by introducing a novel attention mechanism based on local windowing, which optimizes the model's handling of long sequence data to improve the performance and inference speed of VMamba-m. Extensive experimental results show that VMamba-m performs well in the diabetic retinopathy grading task, with significant advantages in accuracy and computation time over existing benchmark models. Full article
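Two ingredients named in the abstract, squeeze-and-excitation (SE) channel attention and a cross-entropy loss modified to emphasize rare categories, can be illustrated with the short sketch below. The reduction ratio, class weights, and placement inside the VSS module are assumptions for illustration only.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention: pool, bottleneck MLP, per-channel reweighting."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)               # squeeze: (B, C, H, W) -> (B, C, 1, 1)
        self.fc = nn.Sequential(                           # excitation: learn per-channel importance
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                       # assign each channel its learned weight


feat = torch.randn(2, 96, 56, 56)                          # a backbone feature map (hypothetical shape)
print(SEBlock(96)(feat).shape)                             # torch.Size([2, 96, 56, 56])

# Class-weighted cross-entropy is one simple way to emphasize rare severity grades;
# the weights below are hypothetical, e.g. inverse class frequencies.
class_weights = torch.tensor([1.0, 2.5, 1.5, 4.0, 5.0])
loss_fn = nn.CrossEntropyLoss(weight=class_weights)
```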
Show Figures

Figure 1: Structure of the VMamba model.
Figure 2: Improved VSS module.
Figure 3: Structure of the improved VSS module with the autonomously designed module.
Figure 4: Preprocessed dataset. (a) Diabetic retinopathy image before processing; (b) diabetic retinopathy image after processing.
Figure 5: Accuracy curves for the training and validation data.
Figure 6: Loss curve for the validation data.
Figure 7: ROC curves for the diabetic retinopathy severity grades based on the VMamba-m model.
Figure 8: Confusion matrix of VMamba-m.
24 pages, 10105 KiB  
Article
SiamRhic: Improved Cross-Correlation and Ranking Head-Based Siamese Network for Object Tracking in Remote Sensing Videos
by Afeng Yang, Zhuolin Yang and Wenqing Feng
Remote Sens. 2024, 16(23), 4549; https://doi.org/10.3390/rs16234549 - 4 Dec 2024
Viewed by 409
Abstract
Object tracking in remote sensing videos is a challenging task in computer vision. Recent advances in deep learning have sparked significant interest in tracking algorithms based on Siamese neural networks. However, many existing algorithms fail to deliver satisfactory performance in complex scenarios due to challenging conditions and limited computational resources, so enhancing tracking efficiency and improving algorithm responsiveness in complex scenarios are crucial. To address tracking drift caused by similar objects and background interference in remote sensing image tracking, we propose SiamRhic, an enhanced Siamese network incorporating improved cross-correlation and a ranking head for object tracking. We first use convolutional neural networks for feature extraction and integrate the CBAM (Convolutional Block Attention Module) to enhance the tracker's representational capacity, allowing it to focus more effectively on the objects. Additionally, we replace the original depth-wise cross-correlation operation with asymmetric convolution, improving both speed and performance. We also introduce a ranking loss to reduce the classification confidence of interfering objects, addressing the mismatch between classification and regression. We validate the proposed algorithm through experiments on the OTB100, UAV123, and OOTB remote sensing datasets. SiamRhic achieves success, normalized precision, and precision rates of 0.533, 0.786, and 0.812, respectively, on the OOTB benchmark; a success rate of 0.670 and a precision rate of 0.892 on OTB100; and a success rate of 0.621 and a precision rate of 0.823 on UAV123. These results demonstrate the algorithm's high precision and success rates, highlighting its practical value. Full article
(This article belongs to the Section Remote Sensing Image Processing)
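For context, the sketch below shows the standard depth-wise cross-correlation (DW-Xcorr) used by Siamese trackers, i.e. the operation the abstract says is replaced with an asymmetric-convolution variant. Feature shapes and channel counts are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F


def depthwise_xcorr(search, template):
    """Correlate each search-feature channel with the matching template channel (DW-Xcorr)."""
    b, c, h, w = search.shape
    x = search.reshape(1, b * c, h, w)                      # fold batch into the channel dimension
    kernel = template.reshape(b * c, 1, template.shape[2], template.shape[3])
    out = F.conv2d(x, kernel, groups=b * c)                 # one correlation per (batch, channel) pair
    return out.reshape(b, c, out.shape[2], out.shape[3])


template_feat = torch.randn(4, 256, 7, 7)                   # template-branch features (illustrative)
search_feat = torch.randn(4, 256, 31, 31)                   # search-region features (illustrative)
response = depthwise_xcorr(search_feat, template_feat)
print(response.shape)                                       # torch.Size([4, 256, 25, 25])
```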
Show Figures

Graphical abstract
Figure 1: The network architecture takes a template image and a search image and extracts deep features with an enhanced ResNet50 backbone; the CBAM attention mechanism is inserted between the third, fourth, and fifth convolutional layers. The features are fed into an adaptive head network for cross-correlation and multi-layer feature fusion, and a ranking loss suppresses the classification confidence of interfering items to reduce the mismatch between classification and regression.
Figure 2: Attention mechanism. Feature maps from the third, fourth, and fifth convolutional blocks pass through both channel and spatial attention before being sent to the head network; the red box marks the channel attention mechanism and the blue box the spatial attention mechanism.
Figure 3: Channel attention module (CAM) and spatial attention module (SAM).
Figure 4: Asymmetric convolution. (a) DW-Xcorr; (b) a naive approach for fusing feature maps of varying sizes; (c) symmetric convolution.
Figure 5: Ranking loss. Samples with high classification confidence and increased IoU are ranked higher, leveraging the relationship between the classification and regression branches; red points mark the object centres obtained by classification and red boxes the regressed bounding boxes.
Figure 6: Precision and success rates of our tracker and other trackers on the OTB100 dataset. (a) Success plots; (b) precision plots.
Figure 7: Success rates of our tracker and other trackers across the 11 challenges of the OTB100 dataset (in-plane rotation, fast motion, out-of-view, low resolution, occlusion, illumination variation, deformation, motion blur, out-of-plane rotation, scale variation, background clutter).
Figure 8: Precision of our tracker and other trackers across the 11 challenges of the OTB100 dataset (same challenge set as Figure 7).
Figure 9: Precision and success rates of our tracker and the comparison trackers on the UAV123 dataset. (a) Success plots; (b) precision plots.
Figure 10: Success rates of our tracker and the comparison trackers across the twelve challenges of the UAV123 dataset (viewpoint change, similar object, fast motion, out-of-view, full occlusion, illumination variation, background clutter, aspect ratio variation, scale variation, partial occlusion, low resolution, camera motion).
Figure 11: Precision of our tracker and the comparison trackers across the twelve challenges of the UAV123 dataset (same challenge set as Figure 10).
Figure 12: Precision, normalized precision, and success rates of our tracker and the comparison trackers on the OOTB dataset. (a) Precision plots; (b) normalized precision plots; (c) success plots.
Figure 13: Precision and success rates of our tracker and other trackers on the LaSOT dataset. (a) Success plots; (b) precision plots.
Figure 14: Tracking results of our tracker and the comparative trackers on four video sequences from the OOTB dataset (car_11_1, plane_1_1, ship_12_1, and train_1_1, shown from left to right and top to bottom).
Figure 15: Tracking results of our tracker and the comparative trackers on four video sequences from the OTB dataset.
Figure 16: Tracking results of our tracker and the comparative trackers on four video sequences from the UAV123 dataset.
14 pages, 1464 KiB  
Article
An Improved Neural Network Model Based on DenseNet for Fabric Texture Recognition
by Li Tan, Qiang Fu and Jing Li
Sensors 2024, 24(23), 7758; https://doi.org/10.3390/s24237758 - 4 Dec 2024
Viewed by 471
Abstract
In modern knitted garment production, accurate identification of fabric texture is crucial for enabling automation and ensuring consistent quality control. Traditional manual recognition methods not only demand considerable human effort but are also inefficient and prone to subjective errors. Although machine learning-based approaches have made notable advances, they typically rely on manual feature extraction, which is time-consuming and often limits recognition accuracy. To address these limitations, this paper introduces a novel model, the Differentiated Learning Weighted DenseNet (DLW-DenseNet), which builds upon the DenseNet architecture. Specifically, DLW-DenseNet introduces a learnable weight mechanism that uses channel attention to enhance the selection of relevant channels, reducing information redundancy and expanding the model's feature search space. To keep channel selection effective in the later stages of training, DLW-DenseNet incorporates a differentiated learning strategy: by assigning distinct learning rates to the learnable weights, the model ensures continuous and efficient channel selection throughout training, thus facilitating effective model pruning. Furthermore, in response to the absence of publicly available datasets for fabric texture recognition, we construct a new dataset named KF9 (knitted fabric). Compared with a fabric recognition network based on an improved ResNet, recognition accuracy increases by five percentage points. Experimental results demonstrate that DLW-DenseNet significantly outperforms other representative methods in terms of recognition accuracy on the KF9 dataset. Full article
(This article belongs to the Section Sensing and Imaging)
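The two ideas in the abstract, learnable per-channel weights on the concatenated dense features and a differentiated learning rate for those weights, can be sketched as follows. The growth rate, layer sizes, and learning rates are assumptions for illustration; the actual DLW-DenseNet wiring may differ.

```python
import torch
import torch.nn as nn


class WeightedDenseLayer(nn.Module):
    """Dense layer whose concatenated input channels are scaled by learnable weights."""

    def __init__(self, in_channels, growth=32):
        super().__init__()
        self.channel_weights = nn.Parameter(torch.ones(in_channels))   # one weight per input channel
        self.conv = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, growth, kernel_size=3, padding=1),
        )

    def forward(self, x):                                   # x: concatenation of all earlier features
        x = x * self.channel_weights.view(1, -1, 1, 1)      # reweight (and implicitly prune) channels
        return torch.cat([x, self.conv(x)], dim=1)          # DenseNet-style concatenation


layer = WeightedDenseLayer(64)

# Differentiated learning: give the channel weights a larger learning rate than the
# convolutional parameters so channel selection stays active late in training.
optimizer = torch.optim.SGD(
    [
        {"params": [layer.channel_weights], "lr": 1e-2},
        {"params": layer.conv.parameters(), "lr": 1e-3},
    ],
    momentum=0.9,
)

out = layer(torch.randn(1, 64, 32, 32))
print(out.shape)                                            # torch.Size([1, 96, 32, 32])
```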
Show Figures

Figure 1: The original 4-layer DenseNet block.
Figure 2: A 4-layer weighted DenseNet block. Each feature channel is assigned a weight w with two subscripts: the first indicates the layer to which the weight is applied and the second the specific feature channel (e.g., w_{4,1} is the weight applied by the fourth layer to the first feature channel). For clarity, only the weights on cross-layer feature channels are shown.
Figure 3: Input feature weighting: each feature channel is multiplied by its corresponding weight parameter, represented by the ⊗ symbol.
Figure 4: The 4-layer weighted DenseNet block with the differentiated learning strategy. Dashed lines indicate channels whose weights have been driven to near zero during training, effectively performing pruning.
Figure 5: Photography setup: “light” denotes the strong, uniform lighting, “camera” the high-resolution digital camera used to capture detailed images, “table” the platform holding the fabric samples, and “fabric” the knitted fabric textures being photographed.
Figure 6: Class cardinality of the categories in the KF9 knitted fabric dataset.
Figure 7: Legend for the nine categories of knitted fabric textures.
Figure 8: Performance comparison of the improved ResNet, VGGNet-16, and the proposed network; the horizontal axis is the number of iterations and the vertical axis classification accuracy (solid lines: smoothed data; dashed lines: original data).
Figure 9: Comparison between the original DenseNet and the weighted DenseNet.
Figure 10: Comparison between the weighted DenseNet and DLW-DenseNet.