Search Results (9,371)

Search Parameters:
Keywords = deep convolutional neural network

20 pages, 5327 KiB  
Article
Using a YOLO Deep Learning Algorithm to Improve the Accuracy of 3D Object Detection by Autonomous Vehicles
by Ramavhale Murendeni, Alfred Mwanza and Ibidun Christiana Obagbuwa
World Electr. Veh. J. 2025, 16(1), 9; https://doi.org/10.3390/wevj16010009 (registering DOI) - 27 Dec 2024
Abstract
This study presents an adaptation of the YOLOv4 deep learning algorithm for 3D object detection, addressing a critical challenge in autonomous vehicle (AV) systems: accurate real-time perception of the surrounding environment in three dimensions. Traditional 2D detection methods, while efficient, fall short in providing the depth and spatial information necessary for safe navigation. This research modifies the YOLOv4 architecture to predict 3D bounding boxes, object depth, and orientation. Key contributions include introducing a multi-task loss function that jointly optimizes 2D and 3D predictions and integrating sensor fusion techniques that combine RGB camera data with LIDAR point clouds for improved depth estimation. The adapted model, tested on real-world datasets, demonstrates a significant increase in 3D detection accuracy, achieving a mean average precision (mAP) of 85%, an intersection over union (IoU) of 78%, and near real-time performance, with detection confidence of 93–97% for vehicles and 75–91% for people. This approach balances high detection accuracy and real-time processing, making it highly suitable for AV applications. This study advances the field by showing how an efficient 2D detector can be extended to meet the complex demands of 3D object detection in real-world driving scenarios without sacrificing computational efficiency. Full article
(This article belongs to the Special Issue Motion Planning and Control of Autonomous Vehicles)
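The multi-task loss is the heart of the adaptation described above. A minimal PyTorch sketch of such a combined 2D/3D objective might look as follows; the prediction keys, loss choices, and weights are illustrative assumptions, not the paper's exact formulation.

```python
import torch.nn.functional as F

def multi_task_loss(pred, target, w2d=1.0, w3d=1.0, w_depth=0.5, w_orient=0.5):
    """Combined objective over 2D boxes, 3D boxes, depth, and orientation.

    `pred` and `target` are dicts of tensors; all keys and weights here
    are illustrative placeholders, not the paper's exact formulation.
    """
    l2d = F.smooth_l1_loss(pred["box2d"], target["box2d"])    # 2D box regression
    l3d = F.smooth_l1_loss(pred["box3d"], target["box3d"])    # 3D box regression
    l_depth = F.l1_loss(pred["depth"], target["depth"])       # per-object depth
    # Regressing (sin, cos) of the yaw angle avoids wrap-around at +/-pi.
    l_orient = F.mse_loss(pred["orient"], target["orient"])
    return w2d * l2d + w3d * l3d + w_depth * l_depth + w_orient * l_orient
```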
Figures:
Figure 1: YOLOv3, reprinted from Ref. [32].
Figure 2: SSD network structure, reprinted from Ref. [32].
Figure 3: RetinaNet structure, reprinted from Ref. [32].
Figure 4: Sample Image 1.
Figure 5: Camera-based and deep learning-based detection working together: the cars in the camera image are detected and marked with bounding boxes.
Figure 6: The YOLO deep learning algorithm proved a viable option for enhancing the accuracy of 3D object identification in self-driving vehicles.
Figure 7: Sample Image 2.
Figure 8: The sample image includes cars and people; the three closest people were identified with confidence scores between 79% and 91%, and the four closest vehicles with confidence scores from 93% to 95%.
Figure 9: Sample Image 3.
29 pages, 8224 KiB  
Article
Detection of Domain Name Server Amplification Distributed Reflection Denial of Service Attacks Using Convolutional Neural Network-Based Image Deep Learning
by Hoon Shin, Jaeyeong Jeong, Kyumin Cho, Jaeil Lee, Ohjin Kwon and Dongkyoo Shin
Electronics 2025, 14(1), 76; https://doi.org/10.3390/electronics14010076 - 27 Dec 2024
Abstract
Domain Name Server (DNS) amplification Distributed Reflection Denial of Service (DRDoS) attacks are a Distributed Denial of Service (DDoS) technique in which multiple IT systems forge the source IP of the target system, send requests to DNS servers, and thereby cause a large number of response packets to be sent to the target system. In this attack, it is difficult to identify the attacker because the source address is spoofed, and unlike TCP-based DDoS attacks, it usually uses the UDP protocol, which has a fast communication speed and amplifies network traffic through simple manipulation of options, making it one of the most widely used DDoS techniques. In this study, we propose a simple convolutional neural network (CNN) model that is designed to detect DNS amplification DRDoS attack traffic and whose hyperparameters were tuned through experiments. In evaluating the proposed CNN model's accuracy in detecting DNS amplification DRDoS attacks, the average accuracy across experiments was 0.9995, significantly outperforming several machine learning (ML) models. It also performed well compared with other deep learning (DL) models and, in particular, experiments confirmed that this simple CNN had the fastest execution time among the deep learning models tested. Full article
(This article belongs to the Special Issue Machine Learning and Cybersecurity—Trends and Future Challenges)
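For orientation, a "simple CNN" of the kind the abstract describes, two convolutional layers over 32 × 32 packet images feeding a dense layer with dropout, can be sketched in PyTorch as below. The figure captions indicate the paper swept kernel sizes, dense widths (64/1024/4096), and dropout rates (0.3/0.9), so the specific values here are just one assumed configuration.

```python
import torch.nn as nn

class PacketCNN(nn.Module):
    """Minimal two-convolution-layer CNN over 32x32 grayscale packet images."""
    def __init__(self, dense_out=64, dropout=0.3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, dense_out), nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(dense_out, 1), nn.Sigmoid(),  # attack vs. normal traffic
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```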
Figures:
Figure 1: DNS amplification DRDoS operation process.
Figure 2: t-SNE visualization results (L: normal, R: DNS amplification DRDoS).
Figure 3: PCAP file structure (packet header and packet data).
Figure 4: Example 32 × 32 DNS amplification DRDoS packet image.
Figure 5: Example 32 × 32 normal DNS packet image.
Figure 6: The basic CNN model structure.
Figure 7: DNS amplification DRDoS detection system configuration diagram.
Figure 8: ROC curve.
Figure 9: Performance comparison at different kernel sizes with 2 convolution layers (CLs).
Figure 10: Performance graph of dense layer output 64.
Figure 11: Performance graph of dense layer output 1024.
Figure 12: Performance graph of dense layer output 4096.
Figure 13: Performance graph of kernel size 3 × 3.
Figure 14: Performance graph of kernel size 7 × 7.
Figure 15: Performance comparison by epoch value (LR = 0.001).
Figure 16: Learning performance with a dropout of 0.3.
Figure 17: Learning performance with a dropout of 0.9.
Figure 18: Performance comparison by activation function.
Figure 19: Machine learning vs. CNN performance comparison.
Figure 20: Machine learning vs. CNN performance comparison (ROC).
22 pages, 4773 KiB  
Article
GFN: A Garbage Classification Fusion Network Incorporating Multiple Attention Mechanisms
by Zhaoqi Wang, Wenxue Zhou and Yanmei Li
Electronics 2025, 14(1), 75; https://doi.org/10.3390/electronics14010075 - 27 Dec 2024
Abstract
With the increasing global attention to environmental protection and the sustainable use of resources, waste classification has become a critical issue that needs urgent resolution in social development. Compared with the traditional manual waste classification methods, deep learning-based waste classification systems offer significant advantages. This paper proposes an innovative deep learning framework, Garbage FusionNet (GFN), aimed at tackling the waste classification challenge. GFN enhances classification performance by integrating the local feature extraction strengths of ResNet with the global information processing capabilities of the Vision Transformer (ViT). Furthermore, GFN incorporates the Pyramid Pooling Module (PPM) and the Convolutional Block Attention Module (CBAM), which collectively improve multi-scale feature extraction and emphasize critical features, thereby increasing the model’s robustness and accuracy. The experimental results on the Garbage Dataset and Trashnet demonstrate that GFN achieves superior performance compared with other comparison models. Full article
(This article belongs to the Section Artificial Intelligence)
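The core fusion idea, concatenating ResNet's local features with the ViT's global representation before a shared classification head, can be sketched as below; the PPM and CBAM modules are omitted for brevity, and the backbone sizes and class count are assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class GarbageFusionSketch(nn.Module):
    """Two-branch fusion classifier: ResNet for local features, ViT for
    global context, concatenated before the head. Expects 224x224 input."""
    def __init__(self, num_classes=12):
        super().__init__()
        resnet = models.resnet50(weights=None)
        self.cnn = nn.Sequential(*list(resnet.children())[:-1])  # -> (B, 2048, 1, 1)
        self.vit = models.vit_b_16(weights=None)
        self.vit.heads = nn.Identity()                           # -> (B, 768)
        self.head = nn.Linear(2048 + 768, num_classes)

    def forward(self, x):
        local_feat = self.cnn(x).flatten(1)   # local texture/edge features
        global_feat = self.vit(x)             # global attention features
        return self.head(torch.cat([local_feat, global_feat], dim=1))
```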
Figures:
Figure 1: Statistics on the number of labels in each class in the Garbage Dataset.
Figure 2: Sample images of waste categories in the Garbage Dataset.
Figure 3: Statistics on the number of labels in each class in Trashnet.
Figure 4: Sample images of waste categories in Trashnet.
Figure 5: The framework of the GFN.
Figure 6: The framework of the PPM.
Figure 7: The framework of the CBAM.
Figure 8: Self-attention mechanism flowchart.
Figure 9: Comparison study results on the Garbage Dataset.
Figure 10: Comparison study results on Trashnet.
Figure 11: Ablation study results on the Garbage Dataset.
Figure 12: Predictions and visualization analysis for different types of garbage. Column (a) shows the original images, column (b) the Grad-CAM visualization results, and column (c) the GFN model's prediction results.
Figure 13: Confusion matrix of GFN for garbage classification.
20 pages, 8443 KiB  
Article
Damage Detection and Identification on Elevator Systems Using Deep Learning Algorithms and Multibody Dynamics Models
by Josef Koutsoupakis, Dimitrios Giagopoulos, Panagiotis Seventekidis, Georgios Karyofyllas and Amalia Giannakoula
Sensors 2025, 25(1), 101; https://doi.org/10.3390/s25010101 - 27 Dec 2024
Abstract
Timely damage detection on a mechanical system can prevent the appearance of catastrophic damage in it, as well as allow for better scheduling of its maintenance and repair process. For this purpose, multiple signal analysis methods have been developed to help identify anomalies in a system, through quantities such as vibrations or deformations in its critical components. In most applications, however, these data may be scarce or inexistent, hindering the overall process. For this purpose, a novel approach for damage detection and identification on elevator systems is developed in this work, where vibration data obtained through physical measurements and high-fidelity multibody dynamics models are combined with deep learning algorithms. High-quality training data are first generated through multibody dynamics simulations and are then combined with healthy state vibration measurements to train an ensemble of autoencoders and convolutional neural networks for damage detection and classification. A dedicated data acquisition system is then developed and integrated with an elevator cabin, allowing for condition monitoring through this novel methodology. The results indicate that the developed framework can accurately identify damages in the system, hinting at its potential as a powerful structural health monitoring tool for such applications, where manual damage localization would otherwise be considerably time-consuming. Full article
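The detection half of such a framework reduces to a familiar pattern: train an autoencoder on healthy-state vibration windows, then flag windows whose reconstruction error exceeds a threshold calibrated on healthy data. A minimal sketch, with illustrative layer sizes rather than the paper's architecture:

```python
import torch
import torch.nn as nn

class VibrationAE(nn.Module):
    """Dense autoencoder over fixed-length vibration windows."""
    def __init__(self, n=1024, latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n, 256), nn.ReLU(), nn.Linear(256, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(), nn.Linear(256, n))

    def forward(self, x):
        return self.dec(self.enc(x))

def detect_damage(model, window, threshold):
    """Flag damage when reconstruction error on a healthy-trained AE
    exceeds a threshold calibrated on healthy-state measurements."""
    with torch.no_grad():
        err = torch.mean((model(window) - window) ** 2).item()
    return err > threshold
```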
Figures:
Figure 1: Kleemann test tower (a) and experiment floors (b).
Figure 2: Experimental elevator and subsystems. Subsystems (1)–(3) denote the elevator cabin, chassis, and doors; subsystem (4) denotes the building floor door.
Figure 3: Elevator cabin (a) and floor (b) door sliding mechanisms.
Figure 4: DAQ system: acceleration and proximity sensor placement.
Figure 5: DAQ and measurement storage protocol, doors opening.
Figure 6: Healthy elevator acceleration response in the time (left) and frequency (right) domain.
Figure 7: DAQ system GUI.
Figure 8: Artificial damage cases on cabin and floor door rails.
Figure 9: Elevator system MBD model, healthy state.
Figure 10: Elevator system MBD model, damaged states: (a) cabin lower rail, (b) floor lower rail, (c) cabin and floor lower rail, and (d) cabin upper rail.
Figure 11: Autoencoder-based damage detection framework.
Figure 12: Damage detection autoencoder architecture.
Figure 13: Health state classification CNN architecture.
Figure 14: Healthy state experimental and MBD model system response in the frequency domain.
Figure 15: Comparison between the experimental and MBD model frequency response data for damage cases 1–6 on the Y axis, doors opening.
Figure 16: DL-SHM framework prediction results: confusion matrix.
Figure 17: DL-SHM framework prediction results after additional training with physical measurements: confusion matrix.
21 pages, 40095 KiB  
Article
Enhanced Landslide Susceptibility Assessment in Western Sichuan Utilizing DCGAN-Generated Samples
by Yuanxin Tong, Hongxia Luo, Zili Qin, Hua Xia and Xinyao Zhou
Land 2025, 14(1), 34; https://doi.org/10.3390/land14010034 - 27 Dec 2024
Abstract
The scarcity of landslide samples poses a critical challenge, impeding the broad application of machine learning techniques in landslide susceptibility assessment (LSA). To address this issue, this study introduces a novel approach leveraging a deep convolutional generative adversarial network (DCGAN) for data augmentation aimed at enhancing the efficacy of various machine learning methods in LSA, including support vector machines (SVMs), convolutional neural networks (CNNs), and residual neural networks (ResNets). Experimental results present substantial enhancements across all three models, with accuracy improved by 2.18%, 2.57%, and 5.28%, respectively. In-depth validation based on large landslide image data demonstrates the superiority of the DCGAN-ResNet, achieving a remarkable landslide prediction accuracy of 91.31%. Consequently, the generation of supplementary samples via the DCGAN is an effective strategy for enhancing the performance of machine learning models in LSA, underscoring the promise of this methodology in advancing early landslide warning systems in western Sichuan. Full article
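A DCGAN generator of the usual shape (transposed convolutions with batch normalization and ReLU, tanh output, following Radford et al.) is sketched below; the channel counts and 32 × 32 output size are illustrative, not necessarily those used for the landslide samples.

```python
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN-style generator: latent vector z -> 32x32 image."""
    def __init__(self, z_dim=100, ch=64, out_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, ch * 4, 4, 1, 0),
            nn.BatchNorm2d(ch * 4), nn.ReLU(True),        # -> 4x4
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1),
            nn.BatchNorm2d(ch * 2), nn.ReLU(True),        # -> 8x8
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1),
            nn.BatchNorm2d(ch), nn.ReLU(True),            # -> 16x16
            nn.ConvTranspose2d(ch, out_ch, 4, 2, 1),
            nn.Tanh(),                                    # -> 32x32 in [-1, 1]
        )

    def forward(self, z):   # z: (B, z_dim, 1, 1)
        return self.net(z)
```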
Figures:
Figure 1: Overview of the study area. (a) Location of the study area; (b) elevation and historical landslide locations; (c) geological structure.
Figure 2: Distribution of landslides and non-landslides.
Figure 3: Environmental factor maps of the landslide events. (a) Elevation; (b) aspect; (c) plan curvature; (d) profile curvature; (e) slope; (f) SPI; (g) STI; (h) TWI; (i) relief amplitude; (j) distance to faults; (k) distance to road; (l) distance to river; (m) lithology; (n) landform; (o) land use; (p) soil; (q) precipitation; (r) NDVI.
Figure 4: Technological route.
Figure 5: Deep convolutional generative adversarial model architecture.
Figure 6: Results of GeoDetector analysis.
Figure 7: Accuracy and AUC of landslide susceptibility assessment models trained with additional samples.
Figure 8: ROC curve.
Figure 9: Landslide susceptibility maps using CNN and ResNet in western Sichuan. (a) Aba prefecture in Sichuan; (b) Panzhihua, Liangshan, and Ya'an.
Figure 10: Percentage of landslide sensitivity zones.
Figure 11: Validation of landslide susceptibility mapping results based on large landslide data. (a) Jiuzhaigou landslide group, (b) Maoxian Diexi mountain landslide, (c) Longxi mountain landslide in Wenchuan, (d) Jinchuan Danzhamu mountain landslide, (e) Han Yuan mountain landslide in Ya'an, (f) Jiulong County mountain landslide in Garze.
14 pages, 1424 KiB  
Article
Rice Disease Classification Using a Stacked Ensemble of Deep Convolutional Neural Networks
by Zhibin Wang, Yana Wei, Cuixia Mu, Yunhe Zhang and Xiaojun Qiao
Sustainability 2025, 17(1), 124; https://doi.org/10.3390/su17010124 - 27 Dec 2024
Abstract
Rice is a staple food for almost half of the world’s population, and the stability and sustainability of rice production plays a decisive role in food security. Diseases are a major cause of loss in rice crops. The timely discovery and control of diseases are important in reducing the use of pesticides, protecting the agricultural eco-environment, and improving the yield and quality of rice crops. Deep convolutional neural networks (DCNNs) have achieved great success in disease image classification. However, most models have complex network structures that frequently cause problems, such as redundant network parameters, low training efficiency, and high computational costs. To address this issue and improve the accuracy of rice disease classification, a lightweight deep convolutional neural network (DCNN) ensemble method for rice disease classification is proposed. First, a new lightweight DCNN model (called CG-EfficientNet), which is based on an attention mechanism and EfficientNet, was designed as the base learner. Second, CG-EfficientNet models with different optimization algorithms and network parameters were trained on rice disease datasets to generate seven different CG-EfficientNets, and a resampling strategy was used to enhance the diversity of the individual models. Then, the sequential least squares programming algorithm was used to calculate the weight of each base model. Finally, logistic regression was used as the meta-classifier for stacking. To verify the effectiveness, classification experiments were performed on five classes of rice tissue images: rice bacterial blight, rice kernel smut, rice false smut, rice brown spot, and healthy leaves. The accuracy of the proposed method was 96.10%, which is higher than the results of the classic CNN models VGG16, InceptionV3, ResNet101, and DenseNet201 and four integration methods. The experimental results show that the proposed method is not only capable of accurately identifying rice diseases but is also computationally efficient. Full article
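The weighting step, finding a convex combination of base-model outputs via sequential least squares programming, can be sketched with SciPy as below. The objective (validation log-loss) and the way the weighted predictions feed the logistic-regression meta-classifier are assumptions about the workflow, not the paper's exact recipe.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def fit_ensemble_weights(probas, y_val):
    """SLSQP search for simplex weights over base-model class probabilities.

    `probas`: list of (n_samples, n_classes) validation predictions, one
    per base CNN; returns one non-negative weight per base model.
    """
    k = len(probas)

    def objective(w):
        blend = sum(wi * p for wi, p in zip(w, probas))
        blend /= blend.sum(axis=1, keepdims=True)   # renormalize rows
        return log_loss(y_val, blend)

    res = minimize(objective, np.full(k, 1.0 / k), method="SLSQP",
                   bounds=[(0.0, 1.0)] * k,
                   constraints=({"type": "eq", "fun": lambda w: w.sum() - 1.0},))
    return res.x

# Hypothetical stacking step: weighted base outputs as meta-features.
# weights = fit_ensemble_weights(val_probas, y_val)
# meta_X = np.hstack([w * p for w, p in zip(weights, val_probas)])
# meta = LogisticRegression(max_iter=1000).fit(meta_X, y_val)
```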
Figures:
Figure 1: Representative rice images used in this study. (a) Rice bacterial blight, (b) rice brown spot, (c) rice kernel smut, (d) rice false smut, and (e) healthy leaves.
Figure 2: Framework of the proposed method.
Figure 3: Architecture of CG-EfficientNet. (a) Overall structure, (b) MBConv1, and (c) MBConv6.
Figure 4: Structure of the stacking ensemble algorithm.
22 pages, 2379 KiB  
Article
Harnessing Convolutional Neural Networks for Automated Wind Turbine Blade Defect Detection
by Mislav Spajić, Mirko Talajić and Mirjana Pejić Bach
Designs 2025, 9(1), 2; https://doi.org/10.3390/designs9010002 - 27 Dec 2024
Abstract
The shift towards renewable energy, particularly wind energy, is rapidly advancing globally, with Southeastern Europe and Croatia, in particular, experiencing a notable increase in wind turbine construction. The frequent exposure of wind turbine blades to environmental stressors and operational forces requires regular inspections to identify defects, such as erosion, cracks, and lightning damage, in order to minimize maintenance costs and operational downtime. This study aims to develop a machine learning model using convolutional neural networks to simplify the defect detection process for wind turbine blades, enhancing the efficiency and accuracy of inspections conducted by drones. The model leverages transfer learning on the YOLOv7 architecture and is trained on a dataset of 231 images with 246 annotated defects across eight categories, achieving a mean average precision of 0.76 at an intersection over the union threshold of 0.5. This research not only presents a robust framework for automated defect detection but also proposes a methodological approach for future studies in deep learning for structural inspections, highlighting significant economic benefits and improvements in inspection quality and speed. Full article
(This article belongs to the Special Issue Design and Analysis of Offshore Wind Turbines)
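The mAP figure above is evaluated at an intersection-over-union threshold of 0.5. For readers unfamiliar with the metric, a minimal IoU check for axis-aligned boxes looks like this (plain Python, (x1, y1, x2, y2) corner convention assumed):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2).
    A detection counts as a true positive at IoU >= 0.5, the threshold
    behind the mAP@0.5 figure quoted above."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, xb - xa) * max(0.0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```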
Figures:
Figure 1: Examples of four types of defects on blades [11], adapted by the authors.
Figure 2: Cropping the region of interest and resizing the image to suitable dimensions in cropall.
Figure 3: Labeling a defect in LabelImg; an erosion defect with its corresponding bounding box.
Figure 4: Examples of images without any augmentations and with labeled defects.
Figure 5: Result of mosaic augmentation for artificially enlarging the input data.
Figure 6: Precision–Recall curve per class.
Figure 7: Progression of the cost function over 400 training epochs, indicating no overfitting.
Figure 8: Methodological framework for future research.
22 pages, 5600 KiB  
Article
Coffee Rust Severity Analysis in Agroforestry Systems Using Deep Learning in Peruvian Tropical Ecosystems
by Candy Ocaña-Zuñiga, Lenin Quiñones-Huatangari, Elgar Barboza, Naili Cieza Peña, Sherson Herrera Zamora and Jose Manuel Palomino Ojeda
Agriculture 2025, 15(1), 39; https://doi.org/10.3390/agriculture15010039 - 27 Dec 2024
Abstract
Agroforestry systems can influence the occurrence and abundance of pests and diseases because integrating crops with trees or other vegetation can create diverse microclimates that may either enhance or inhibit their development. This study analyzes the severity of coffee rust in two agroforestry systems in the provinces of Jaén and San Ignacio in the department of Cajamarca (Peru). This research used a quantitative descriptive approach, and 319 photographs were collected with a professional camera during field trips. The photographs were segmented, classified, and analyzed using the MobileNet and VGG16 deep transfer learning models, with two methods for measuring rust severity, from SENASA Peru and SENASICA Mexico. The results reported that grade 1 is the most prevalent rust severity according to the SENASA methodology (1 to 5% of the leaf affected) and SENASICA Mexico (0 to 2% of the leaf affected). Moreover, the proposed MobileNet model presented the best classification accuracy rate, 94% over 50 epochs. This research demonstrates the capacity of machine learning algorithms for disease diagnosis, which could be an alternative to help experts quantify the severity of coffee rust in coffee trees, and it broadens the field of research for future low-cost computational tools for disease recognition and classification. Full article
(This article belongs to the Section Digital Agriculture)
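The transfer-learning setup reduces to freezing a pretrained backbone and retraining a small severity-grade head. Below is a sketch using torchvision's MobileNetV2 as a stand-in (the paper's exact MobileNet variant and training framework are not stated here, and the grade count depends on which severity scale is used):

```python
import torch.nn as nn
import torchvision.models as models

def build_severity_classifier(num_grades=5):
    """Freeze a pretrained MobileNetV2 backbone and attach a new head
    that predicts a rust severity grade."""
    model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
    for p in model.features.parameters():
        p.requires_grad = False             # keep pretrained features fixed
    model.classifier[1] = nn.Linear(model.last_channel, num_grades)
    return model
```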
Figures:
Figure 1: Location of the study area: (a,b) agroforestry system at Finca "La Palestina" and (c,d) agroforestry system at Cooperativa Agraria Cafetalera "La Prosperidad".
Figure 2: Methodological process to estimate rust severity using deep learning.
Figure 3: Coffee rust severity scale according to the methodologies of (A) SENASA of Peru and (B) SENASICA of Mexico.
Figure 4: Convolutional neural network structure for classifying severity grades.
Figure 5: Percentage of shade in batches: (a) using HabitApp, (b) using a visual template in Chirinos, (c) using HabitApp, (d) using a visual template in San Jose del Alto.
Figure 6: Characterization of agroforestry systems with coffee: (a) shade monoculture, (b) traditional polyculture, (c) rustic system, (d) traditional polyculture, (e) traditional polyculture, (f) traditional polyculture, (g) shade monoculture, (h) rustic system, (i) traditional polyculture, (j) shade monoculture.
Figure 7: Confusion matrices: (a,b) SENASA and (c,d) SENASICA, using MobileNet and VGG16.
Figure 8: Model accuracy for SENASA.
Figure 9: Model accuracy for SENASICA.
18 pages, 678 KiB  
Article
Privacy-Preserving Federated Learning-Based Intrusion Detection System for IoHT Devices
by Fatemeh Mosaiyebzadeh, Seyedamin Pouriyeh, Meng Han, Liyuan Liu, Yixin Xie, Liang Zhao and Daniel Macêdo Batista
Electronics 2025, 14(1), 67; https://doi.org/10.3390/electronics14010067 - 27 Dec 2024
Abstract
In recent years, Internet of Healthcare Things (IoHT) devices have attracted significant attention from computer scientists, healthcare professionals, and patients. These devices enable patients, especially in areas without access to hospitals, to easily record and transmit their health data to medical staff via the Internet. However, the analysis of sensitive health information necessitates a secure environment to safeguard patient privacy. Given the sensitivity of healthcare data, ensuring security and privacy is crucial in this sector. Federated learning (FL) provides a solution by enabling collaborative model training without sharing sensitive health data with third parties. Despite FL addressing some privacy concerns, the privacy of IoHT data remains an area needing further development. In this paper, we propose a privacy-preserving federated learning framework to enhance the privacy of IoHT data. Our approach integrates federated learning with ϵ-differential privacy to design an effective and secure intrusion detection system (IDS) for identifying cyberattacks on the network traffic of IoHT devices. In our FL-based framework, SECIoHT-FL, we employ deep neural network (DNN) including convolutional neural network (CNN) models. We assess the performance of the SECIoHT-FL framework using metrics such as accuracy, precision, recall, F1-score, and privacy budget (ϵ). The results confirm the efficacy and efficiency of the framework. For instance, the proposed CNN model within SECIoHT-FL achieved an accuracy of 95.48% and a privacy budget (ϵ) of 0.34 when detecting attacks on one of the datasets used in the experiments. To facilitate the understanding of the models and the reproduction of the experiments, we provide the explainability of the results by using SHAP and share the source code of the framework publicly as free and open-source software. Full article
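A rough sketch of the kind of aggregation such a framework performs, FedAvg over client updates that are clipped and noised in the spirit of ϵ-differential privacy, is given below. The clipping norm, noise scale, and per-tensor treatment are illustrative, and real DP accounting of the privacy budget ϵ is omitted.

```python
import copy
import torch

def fedavg_with_dp(global_model, client_states, clip=1.0, noise_std=0.01):
    """FedAvg over clipped, noised client updates; values are illustrative
    and formal privacy accounting is left out of this sketch."""
    g = global_model.state_dict()
    new_state = copy.deepcopy(g)
    for key in g:
        if not torch.is_floating_point(g[key]):
            continue                        # leave integer buffers as-is
        updates = []
        for cs in client_states:
            delta = cs[key] - g[key]
            norm = delta.norm()
            if norm > clip:                 # bound each client's influence
                delta = delta * (clip / norm)
            updates.append(delta)
        avg = torch.stack(updates).mean(dim=0)
        avg = avg + torch.randn_like(avg) * noise_std   # Gaussian noise
        new_state[key] = g[key] + avg
    return new_state
```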
Figures:
Figure 1: Architecture of SECIoHT-FL for anomaly detection.
Figure 2: SHAP values for detected attacks in wustl-ehms-2020.
Figure 3: SHAP values for detected attacks in ECU-IoHT.
42 pages, 7308 KiB  
Article
Vertical Force Monitoring of Racing Tires: A Novel Deep Neural Network-Based Estimation Method
by Semih Öngir, Egemen Cumhur Kaleli, Mehmet Zeki Konyar and Hüseyin Metin Ertunç
Appl. Sci. 2025, 15(1), 123; https://doi.org/10.3390/app15010123 - 27 Dec 2024
Abstract
This study aims to accurately estimate vertical tire forces on racing tires of specific stiffness using acceleration, pressure, and speed data measurements from a test rig. A hybrid model, termed Random Forest Assisted Deep Neural Network (RFADNN), is introduced, combining a novel deep learning framework with the Random Forest Algorithm to enhance estimation accuracy. By leveraging the Temporal Convolutional Network (TCN), Minimal Gated Unit (MGU), Long Short-Term Memory (LSTM), and Attention mechanisms, the deep learning framework excels in extracting complex features, which the Random Forest Model subsequently analyzes to improve the accuracy of estimating vertical tire forces. Validated with test data, this approach outperforms standard models, achieving an MAE of 0.773 kgf, demonstrating the advantage of the RFADNN method in required vertical force estimation tasks for race tires. This comparison emphasizes the significant benefits of incorporating advanced deep learning with traditional machine learning to provide a comprehensive and interpretable solution for complex estimation challenges in automotive engineering. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
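The hybrid structure can be reduced to a two-stage sketch: a deep network (standing in for the paper's TCN/MGU/LSTM/attention stack) turns raw sensor windows into feature vectors, and a Random Forest regresses vertical force from those features. The training outline in the comments is an assumed workflow, not the paper's exact recipe.

```python
import torch
from sklearn.ensemble import RandomForestRegressor

def hybrid_predict(feature_net, forest, signals):
    """Stage 1: a deep net maps raw sensor windows to feature vectors.
    Stage 2: a Random Forest regresses vertical force (kgf) from them."""
    feature_net.eval()
    with torch.no_grad():
        feats = feature_net(signals).cpu().numpy()   # (N, d) deep features
    return forest.predict(feats)

# Assumed training outline:
#   1. Train feature_net end-to-end on the labeled load classes.
#   2. Fit the forest on feature_net's penultimate-layer outputs:
#      forest = RandomForestRegressor(n_estimators=200).fit(train_feats, y_kgf)
```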
Figures:
Figure 1: Load estimation generation using the RFADNN.
Figure 2: Proposed deep neural network model (N) in the RFADNN.
Figure 3: Flowchart of the proposed load estimation mechanism with the RFADNN model.
Figure 4: Acceleration sensor and test racing tire used for data collection.
Figure 5: Experimental data acquisition setup.
Figure 6: Pre-labeled acceleration signal representing vertical acceleration over a 60 s period for a state (F_i, P_j, S_k) labeled as (F_i).
Figure 7: Labeled acceleration signal segments.
Figure 8: Deep learning model for automatic labeling.
Figure 9: The 251-point data segments and corresponding noisy augmented segments for (F_i, P_j, S_k).
Figure 10: Automatic labeling: confusion matrix on a subset of the test dataset.
Figure 11: Automatic labeling: True Positive (TP) data segment and corresponding attention heatmap.
Figure 12: Automatic labeling: True Negative (TN) data segment and corresponding attention heatmap.
Figures 13–22: Visualization of original data (left) and the corresponding attention heatmaps (right) for load classes 150, 180, 210, 240, 270, 300, 330, 360, 390, and 420 kgf after 250 training epochs.
Figures 23–32: Visualization of original data (left) and the corresponding attention heatmaps (right) for the same load classes after 1000 training epochs.
Figure 33: Performance of the proposed technique (RFADNN) at different speeds and pressures: (top) 30 km/h and 200 kPa, (second) 50 km/h and 200 kPa, (middle) 70 km/h and 200 kPa, (fourth) 50 km/h and 220 kPa, (bottom) 70 km/h and 220 kPa.
1 pages, 127 KiB  
Correction
Correction: Khalid et al. Real-Time Plant Health Detection Using Deep Convolutional Neural Networks. Agriculture 2023, 13, 510
by Mahnoor Khalid, Muhammad Shahzad Sarfraz, Uzair Iqbal, Muhammad Umar Aftab, Gniewko Niedbała and Hafiz Tayyab Rauf
Agriculture 2025, 15(1), 38; https://doi.org/10.3390/agriculture15010038 - 27 Dec 2024
Abstract
Affiliation Revision [...] Full article
(This article belongs to the Section Digital Agriculture)
19 pages, 5488 KiB  
Article
Aircraft Position Estimation Using Deep Convolutional Neural Networks for Low SNR (Signal-to-Noise Ratio) Values
by Przemyslaw Mazurek and Wojciech Chlewicki
Sensors 2025, 25(1), 97; https://doi.org/10.3390/s25010097 - 27 Dec 2024
Viewed by 4
Abstract
The safety of the airspace could be improved by the use of visual methods for the detection and tracking of aircraft. However, the small angular size of airplanes and high noise levels in the image can make sufficient use of such methods difficult. Using a ConvNN (Convolutional Neural Network), it is possible to obtain a detector that performs the segmentation task for aircraft images that are very small and lost in the background noise. A database of actual aircraft images was used in the learning process. Using the Monte Carlo method, three trivial detection algorithms, i.e., Max Pixel Value, Min Pixel Value, and Max Abs Pixel Value, were compared with ConvNN's feed-forward architecture. The results showed superior detection with ConvNN; for example, at a noise standard deviation of 0.1, its detection rate was twice that of the trivial algorithms. A deep dream analysis of the network layers is presented, which shows a preference for images with horizontal contrast lines. The proposed solution feeds the processed image values into a tracking process on the raw data using the Track-Before-Detect method. Full article
(This article belongs to the Section Sensing and Imaging)
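The three trivial baselines are simple enough to state exactly in a few lines; the Monte Carlo harness below estimates a hit rate for them under additive Gaussian noise. The mean-removal step and the 2-pixel hit tolerance are assumptions for illustration.

```python
import numpy as np

def trivial_detect(image, mode="max"):
    """Baseline detectors: position of the maximum, minimum, or
    maximum-absolute pixel after mean removal."""
    z = image - image.mean()
    if mode == "max":
        idx = np.argmax(z)
    elif mode == "min":
        idx = np.argmin(z)
    else:  # "max_abs"
        idx = np.argmax(np.abs(z))
    return np.unravel_index(idx, image.shape)

def monte_carlo_hit_rate(clean, true_pos, sigma, trials=1000, tol=2, mode="max"):
    """Fraction of noisy trials where the detected position falls within
    `tol` pixels of the true aircraft position; sigma is the noise std."""
    rng = np.random.default_rng(0)
    hits = 0
    for _ in range(trials):
        noisy = clean + rng.normal(0.0, sigma, clean.shape)
        y, x = trivial_detect(noisy, mode)
        hits += (abs(y - true_pos[0]) <= tol) and (abs(x - true_pos[1]) <= tol)
    return hits / trials
```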
Figures:
Figure 1: Sample images from the 'FGVC-Aircraft Benchmark' database. The upper row shows images rejected for this study; the lower row shows accepted images.
Figure 2: Example configuration of the ConvNN architecture, 3 × 3 128-64-32.
Figure 3: Exemplary results for small airplanes: noiseless reference (left top), noised image (std. dev. = 0.02) (right top), binary output decision (left bottom), and detection values (right bottom).
Figure 4: Exemplary results for large airplanes: noiseless reference (left top), noised image (std. dev. = 0.10) (right top), binary output decision (left bottom), and detection values (right bottom).
Figure 5: Exemplary reference images and detections (red markers) based on the maximal value response of ConvNN.
Figure 6: Monte Carlo results for four detection algorithms: trivial (Max, Min, and Max Abs) and convolutional neural networks (SGDM).
Figure 7: Monte Carlo results for four detection algorithms: trivial (Max, Min, and Max Abs) and convolutional neural networks (RMSprop).
Figure 8: Monte Carlo results for four detection algorithms: trivial (Max, Min, and Max Abs) and convolutional neural networks (ADAM).
Figure 9: Weight mask of the first convolutional layer of the network in row 2 (3 × 3 64-32-16) of Table 1.
Figure 10: Weight mask of the first convolutional layer of the network in row 3 (3 × 3 32-16-8) of Table 1.
Figure 11: Weight mask of the first convolutional layer of the network in row 4 (5 × 5 126-64-32) of Table 1.
Figure 12: Weight mask of the first convolutional layer of the network in row 5 (5 × 5 64-32-16) of Table 1.
Figure 13: Weight mask of the first convolutional layer of the network in row 6 (5 × 5 32-16-8) of Table 1.
Figure 14: Weight mask of the first convolutional layer of the network in row 1 (3 × 3 128-64-32) of Table 1.
Figure 15: Deep dream features for the convolutional layers of the network in row 4 (5 × 5 126-64-32) of Table 1.
Figure 16: Deep dream features for the convolutional layers of the network in row 5 (5 × 5 64-32-16) of Table 1.
Figure 17: Deep dream features for the convolutional layers of the network in row 6 (5 × 5 32-16-8) of Table 1.
Figure 18: Deep dream features for the convolutional layers of the network in row 1 (3 × 3 128-64-32) of Table 1.
Figure 19: Deep dream features for the convolutional layers of the network in row 2 (3 × 3 64-32-16) of Table 1.
Figure 20: Deep dream features for the convolutional layers of the network in row 3 (3 × 3 32-16-8) of Table 1.
Figure 21: Convergence of algorithms for different ConvNN configurations.
18 pages, 7873 KiB  
Article
Fault Diagnosis of Lithium Battery Modules via Symmetrized Dot Pattern and Convolutional Neural Networks
by Meng-Hui Wang, Jing-Xuan Hong and Shiue-Der Lu
Sensors 2025, 25(1), 94; https://doi.org/10.3390/s25010094 - 27 Dec 2024
Viewed by 86
Abstract
This paper proposes a hybrid algorithm combining the symmetrized dot pattern (SDP) method and a convolutional neural network (CNN) for fault detection in lithium battery modules. The study focuses on four fault types: overcharge, over-discharge, aging, and leakage caused by manual perforation. An 80.5 kHz high-frequency square wave signal is input into the battery module and recorded using a high-speed data acquisition card. The signal is processed by the SDP method to generate characteristic images for fault diagnosis. Finally, a deep learning algorithm is used to evaluate the state of the lithium battery. A total of 3000 samples were collected, with 400 samples used for training and 200 for testing for each fault type, achieving an overall identification accuracy of 99.9%, demonstrating the effectiveness of the proposed method. Full article
(This article belongs to the Special Issue Smart Sensors for Machine Condition Monitoring and Fault Diagnosis)
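The SDP step itself is a fixed, cheap transform: each sample's normalized amplitude sets a radius, a lagged sample sets an angular offset, and the dots are mirrored across several symmetry axes to form the "snowflake" image the CNN then classifies. A NumPy sketch, with the usual lag/gain/arm-count tuning parameters set to illustrative values:

```python
import numpy as np

def sdp_points(x, lag=1, gain=60.0, n_arms=6):
    """Symmetrized dot pattern: map a 1-D signal to polar dots mirrored
    across `n_arms` symmetry axes. Returns (2 * n_arms * len(x), 2) xy points."""
    x = np.asarray(x, dtype=float)
    r = (x - x.min()) / (x.max() - x.min() + 1e-12)   # radius in [0, 1]
    phi = gain * np.roll(r, -lag)                     # lag-based angle offset
    pts = []
    for k in range(n_arms):
        base = 360.0 * k / n_arms
        for sign in (+1, -1):                         # mirror each arm
            theta = np.deg2rad(base + sign * phi)
            pts.append(np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1))
    return np.concatenate(pts)
```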
Figures:
Figure 1: The snowflake pattern of the lithium battery under normal conditions.
Figure 2: Symmetrized dot pattern.
Figure 3: Convolutional neural network architecture.
Figure 4: Convolutional neural network fully connected layer architecture.
Figure 5: System architecture.
Figure 6: Lithium battery module testing platform.
Figure 7: Lithium battery pre-conditioning experiment.
Figure 8: Lithium battery discharge voltage capacity curve.
Figure 9: Normal appearance of the lithium battery.
Figure 10: Lithium battery over-discharge experiment.
Figure 11: Lithium battery overcharge experiment.
Figure 12: Lithium battery leakage experiment.
Figure 13: Lithium battery aging experiment.
Figure 14: Lithium battery discharge capacity test.
Figure 15: Signals of lithium battery fault types: (a) normal, (b) over-discharge, (c) overcharge, (d) leakage, and (e) aging.
Figure 16: Snowflake diagrams of lithium battery fault types: (a) normal, (b) over-discharge, (c) overcharge, (d) leakage, and (e) aging.
Figure 17: Confusion matrix for fault diagnosis of lithium battery modules based on convolutional neural networks.
Figure 18: Performance differences among different algorithms.
21 pages, 9375 KiB  
Article
Reconstruction of Optical Coherence Tomography Images from Wavelength Space Using Deep Learning
by Maryam Viqar, Erdem Sahin, Elena Stoykova and Violeta Madjarova
Sensors 2025, 25(1), 93; https://doi.org/10.3390/s25010093 - 27 Dec 2024
Viewed by 98
Abstract
Conventional Fourier domain Optical Coherence Tomography (FD-OCT) systems depend on resampling into a wavenumber (k) domain to extract the depth profile. This either necessitates additional hardware resources or amplifies the existing computational complexity. Moreover, the OCT images also suffer from speckle noise, due to systemic reliance on low-coherence interferometry. We propose a streamlined and computationally efficient approach based on Deep Learning (DL) which enables reconstructing speckle-reduced OCT images directly from the wavelength (λ) domain. For reconstruction, two encoder–decoder styled networks, namely Spatial Domain Convolution Neural Network (SD-CNN) and Fourier Domain CNN (FD-CNN), are used sequentially. The SD-CNN exploits the highly degraded images obtained by Fourier transforming the (λ) domain fringes to reconstruct the deteriorated morphological structures along with suppression of unwanted noise. The FD-CNN leverages this output to enhance the image quality further by optimization in the Fourier domain (FD). We quantitatively and visually demonstrate the efficacy of the method in obtaining high-quality OCT images. Furthermore, we illustrate the computational complexity reduction by harnessing the power of DL models. We believe that this work lays the framework for further innovations in the realm of OCT image reconstruction. Full article
(This article belongs to the Section Sensing and Imaging)
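The pipeline's shape, Fourier-transforming the raw λ-domain fringes without k-resampling, cleaning the degraded result with one encoder-decoder network, then refining with a second, can be caricatured as below. The tiny network and the spatial-only second stage are simplifications; the paper's FD-CNN performs its optimization in the Fourier domain.

```python
import torch
import torch.nn as nn

class TinyEncDec(nn.Module):
    """Toy encoder-decoder standing in for the paper's SD-CNN / FD-CNN;
    the real networks are far deeper, this only shows the pipeline."""
    def __init__(self, ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, ch, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def reconstruct(sd_cnn, fd_cnn, fringes_lambda):
    """fringes_lambda: (B, H, W) wavelength-domain fringes. FFT without
    k-resampling yields a degraded image; the first network restores
    morphology and the second refines it."""
    degraded = torch.fft.fft(fringes_lambda, dim=-1).abs()
    coarse = sd_cnn(degraded.unsqueeze(1))   # add channel dimension
    return fd_cnn(coarse)
```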
Figures:
Figure 1: Schematics of the proposed framework containing the Spatial Domain CNN (SD-CNN) and Fourier Domain CNN (FD-CNN); x_i and x'_i are the outputs and y_i and y'_i the ground truths for the SD-CNN and FD-CNN, respectively.
Figure 2: (a) The DL network used as the SD-CNN and FD-CNN; (b) one B-scan and image averaging using 5, 7, and 9 B-scans for vein, lemon, and cherry; (c) wavenumber layer interleaved with the input of the SD-CNN.
Figure 3: Comparison of k-linearization fringes for the different volumes used: vein, finger, lemon, tooth, cherry, flounder egg, and seed (pea).
Figure 4: Comparison of B-scans from five different volumes. (A) Ground truth (seven B-scans of OCT system output averaged), (B) OCT system output, (C) OCT system raw data input, (D) output of the proposed framework. The reconstruction (D) shows high similarity to the desired ground truth.
Figure 5: Line plots of the intensity (A.U.) variation for the central column of the red rectangle marked in Figure 4, for the (a) vein, (b) finger, (c) lemon, (d) tooth, and (e) cherry samples, comparing ground truth, input, OCT output, and the proposed framework's output.
Figure 6: Comparison between the ground truth and the outputs of the SD-CNN and FD-CNN for (a) vein, (b) finger, (c) lemon, (d) tooth, and (e) cherry, with magnified regions for better comparison. The combined SD-CNN + FD-CNN outperforms the SD-CNN alone, demonstrating the FD-CNN's better reconstruction of high-frequency details.
Figure 7: Cross-validation results on (a,c,e) flounder egg and (b,d) seed (pea) samples for ground truth, OCT output, and reconstructions from the proposed model, showing robustness and generalization on a completely unseen volume.
29 pages, 1433 KiB  
Article
Sparse Convolution FPGA Accelerator Based on Multi-Bank Hash Selection
by Jia Xu, Han Pu and Dong Wang
Micromachines 2025, 16(1), 22; https://doi.org/10.3390/mi16010022 - 27 Dec 2024
Viewed by 79
Abstract
Reconfigurable processor-based acceleration of deep convolutional neural network (DCNN) algorithms has emerged as a widely adopted technique, with particular attention on sparse neural network acceleration as an active research area. However, many computing devices that claim high computational power still struggle to execute neural network algorithms with optimal efficiency, low latency, and minimal power consumption. Consequently, there remains significant potential for further exploration into improving the efficiency, latency, and power consumption of neural network accelerators across diverse computational scenarios. This paper investigates three key techniques for hardware acceleration of sparse neural networks. The main contributions are as follows: (1) Most neural network inference tasks are typically executed on general-purpose computing devices, which often fail to deliver high energy efficiency and are not well-suited for accelerating sparse convolutional models. In this work, we propose a specialized computational circuit for the convolutional operations of sparse neural networks. This circuit is designed to detect and eliminate the computational effort associated with zero values in the sparse convolutional kernels, thereby enhancing energy efficiency. (2) The data access patterns in convolutional neural networks introduce significant pressure on the high-latency off-chip memory access process. Due to issues such as data discontinuity, the data reading unit often fails to fully exploit the available bandwidth during off-chip read and write operations. In this paper, we analyze bandwidth utilization in the context of convolutional accelerator data handling and propose a strategy to improve off-chip access efficiency. Specifically, we leverage a compiler optimization plugin developed for Vitis HLS, which automatically identifies and optimizes on-chip bandwidth utilization. (3) In coefficient-based accelerators, the synchronous operation of individual computational units can significantly hinder efficiency. Previous approaches have achieved asynchronous convolution by designing separate memory units for each computational unit; however, this method consumes a substantial amount of on-chip memory resources. To address this issue, we propose a shared feature map cache design for asynchronous convolution in the accelerators presented in this paper. This design resolves address access conflicts when multiple computational units concurrently access a set of caches by utilizing a hash-based address indexing algorithm. Moreover, the shared cache architecture reduces data redundancy and conserves on-chip resources. Using the optimized accelerator, we successfully executed ResNet50 inference on an Intel Arria 10 1150GX FPGA, achieving a throughput of 497 GOPS, or an equivalent computational power of 1579 GOPS, with a power consumption of only 22 watts. Full article
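Contribution (1), skipping multiply-accumulates for zero-valued kernel weights, is easiest to see in a software reference model; the hardware detection circuit performs the equivalent of the non-zero filtering below. Single channel, "valid" padding, pure NumPy.

```python
import numpy as np

def sparse_conv2d(fmap, kernel):
    """Zero-skipping convolution: iterate only over the kernel's non-zero
    weights, so zero entries cost no multiply-accumulate work."""
    kh, kw = kernel.shape
    nz = [(i, j, kernel[i, j]) for i in range(kh) for j in range(kw)
          if kernel[i, j] != 0.0]                  # detect and skip zeros
    oh, ow = fmap.shape[0] - kh + 1, fmap.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i, j, w in nz:                             # MACs only for non-zeros
        out += w * fmap[i:i + oh, j:j + ow]
    return out
```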
Figures:
Figure 1: TPU architecture diagram: (a) overall TPU architecture design; (b) structure of the compute unit in each PE [21].
Figure 2: Illustration of how sparse convolution is conducted.
Figure 3: Overall architecture of the proposed accelerator.
Figure 4: Per-channel non-zero counts (workload) in VGG16.
Figure 5: Channel work balance over PEs.
Figure 6: Bank execution order within a PE.
Figure 7: Prefetch window parallelism scheme.
Figure 8: Intra-channel array partitioning scheme.
Figure 9: Synchronization scheme of parallel convolution tasks.
Figure 10: Serialization of partial sums based on streaming.
Figure 11: Sliding-window-based fetching of feature map data.
Figure 12: Multiple bank read/write design.
Figure 13: Overall architecture of the hash shared-memory execution diagram.
Figure 14: Data structure of the stored weight file.
Figure 15: Weight encoding scheme.
Figure 16: Data format of the quantization table.
Figure 17: DSE of the shared memory bank selection.
Figure 18: DSE of the design parameters: (a) DSP usage over N_in after floorplanning, with the device roofline marked as a dotted line; (b) wall time of network inference over N_in, with the optimum design marked with a red star; (c) ALUT usage as parallelism increases; (d) on-chip RAM usage over N_in, with the optimum design, found from the device resource limits, marked with a red star.
Figure 19: Efficiency for different kernel and feature map sizes.
Figure 20: Comparison of the measured time, theoretical time, and efficiency.
Figure 21: Roofline model data point.