Search Results (100)

Search Parameters:
Keywords = PANet

17 pages, 6914 KiB  
Article
YOLO-TC: An Optimized Detection Model for Monitoring Safety-Critical Small Objects in Tower Crane Operations
by Dong Ding, Zhengrong Deng and Rui Yang
Algorithms 2025, 18(1), 27; https://doi.org/10.3390/a18010027 - 6 Jan 2025
Viewed by 263
Abstract
Ensuring operational safety within high-risk environments, such as construction sites, is paramount, especially for tower crane operations where distractions can lead to severe accidents. Despite existing behavioral monitoring approaches, the task of identifying small yet hazardous objects like mobile phones and cigarettes in real time remains a significant challenge in ensuring operator compliance and site safety. Traditional object detection models often fall short in crane operator cabins due to complex lighting conditions, cluttered backgrounds, and the small physical scale of target objects. To address these challenges, we introduce YOLO-TC, a refined object detection model tailored specifically for tower crane monitoring applications. Built upon the robust YOLOv7 architecture, our model integrates a novel channel–spatial attention mechanism, ECA-CBAM, into the backbone network, enhancing feature extraction without an increase in parameter count. Additionally, we propose the HA-PANet architecture to achieve progressive feature fusion, addressing scale disparities and prioritizing small object detection while reducing noise from unrelated objects. To improve bounding box regression, the MPDIoU Loss function is employed, resulting in superior accuracy for small, critical objects in dense environments. The experimental results on both the PASCAL VOC benchmark and a custom dataset demonstrate that YOLO-TC outperforms baseline models, showcasing its robustness in identifying high-risk objects under challenging conditions. This model holds significant promise for enhancing automated safety monitoring, potentially reducing occupational hazards by providing a proactive, resilient solution for real-time risk detection in tower crane operations. Full article
(This article belongs to the Special Issue Advances in Computer Vision: Emerging Trends and Applications)
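The abstract states that YOLO-TC adopts the MPDIoU loss for bounding-box regression. As a reference point, here is a minimal PyTorch-style sketch of MPDIoU as published (IoU penalized by the squared distances between matching top-left and bottom-right corners, normalized by the input image size); the function and argument names are illustrative, and this is not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): MPDIoU bounding-box loss.
# Boxes are (x1, y1, x2, y2); img_w/img_h are the input image dimensions.
import torch

def mpdiou_loss(pred, target, img_w, img_h, eps=1e-7):
    # Intersection area
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)

    # Union area and IoU
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter + eps
    iou = inter / union

    # Squared distances between matching top-left and bottom-right corners,
    # normalized by the squared image diagonal.
    d1 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    d2 = (pred[:, 2] - target[:, 2]) ** 2 + (pred[:, 3] - target[:, 3]) ** 2
    norm = img_w ** 2 + img_h ** 2

    mpdiou = iou - d1 / norm - d2 / norm
    return (1.0 - mpdiou).mean()
```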
Figures:
Figure 1: ECANet structure.
Figure 2: AFPN structure.
Figure 3: YOLO-TC network structure.
Figure 4: E-CAM.
Figure 5: ECA-CBAM structure.
Figure 6: HA-PANet structure.
Figure 7: Detection head reconstruction.
Figure 8: Partial dataset images.
Figure 9: Precision–recall curve for attentional mechanisms. (a) CBAM; (b) ECA-CBAM.
Figure 10: YOLO-TC accuracy–recall curve and visualized confusion matrix. (a) Accuracy–recall curve; (b) confusion matrix.
Figure 11: YOLO-TC vs. YOLOv7 and YOLOv8 class mAP plot on the PASCAL VOC dataset.
Figure 12: Comparison of detection results of the algorithm before and after improvement. (a) Original picture; (b) results of YOLOv7; (c) results of YOLO-TC.
21 pages, 6316 KiB  
Article
A Method for Tomato Ripeness Recognition and Detection Based on an Improved YOLOv8 Model
by Zhanshuo Yang, Yaxian Li, Qiyu Han, Haoming Wang, Chunjiang Li and Zhandong Wu
Horticulturae 2025, 11(1), 15; https://doi.org/10.3390/horticulturae11010015 - 27 Dec 2024
Viewed by 303
Abstract
With the rapid development of agriculture, tomatoes, as an important economic crop, require accurate ripeness recognition technology to enable selective harvesting. Therefore, intelligent tomato ripeness recognition plays a crucial role in agricultural production. However, factors such as lighting conditions and occlusion lead to issues such as low detection accuracy, false detections, and missed detections. Thus, a deep learning algorithm for tomato ripeness detection based on an improved YOLOv8n is proposed in this study. First, the improved YOLOv8 model is used for tomato target detection and ripeness classification. The RCA-CBAM (Region and Color Attention Convolutional Block Attention Module) module is introduced into the YOLOv8 backbone network to enhance the model’s focus on key features. By incorporating attention mechanisms across three dimensions—color, channel, and spatial attention—the model’s ability to recognize changes in tomato color and spatial positioning is improved. Additionally, the BiFPN (Bidirectional Feature Pyramid Network) module is introduced to replace the traditional PANet connection, which achieves efficient feature fusion across different scales of tomato skin color, size, and surrounding environment and optimizes the expression ability of the feature map. Finally, an Inner-FocalerIoU loss function is designed and integrated to address the difficulty of ripeness classification caused by class imbalance in the samples. The results show that the improved YOLOv8+ model is capable of accurately recognizing the ripeness level of tomatoes, achieving relatively high values of 95.8% precision value and 91.7% accuracy on the test dataset. It is concluded that the new model has strong detection performance and real-time detection. Full article
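The abstract replaces the PANet neck with a BiFPN. Below is a minimal sketch of the BiFPN-style weighted ("fast normalized") fusion node from EfficientDet, assuming ReLU-constrained learnable weights and a depthwise-separable convolution after fusion; the class name and layer choices are illustrative, not the paper's code.

```python
# Minimal sketch of BiFPN-style fast normalized fusion: each fusion node combines
# resized feature maps with learnable, non-negative weights. Names are illustrative.
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    def __init__(self, num_inputs, channels, eps=1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps
        # Depthwise-separable conv applied after fusion, as in BiFPN
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels, bias=False),
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
            nn.SiLU(),
        )

    def forward(self, feats):  # feats: list of tensors with identical shapes
        w = torch.relu(self.weights)
        w = w / (w.sum() + self.eps)           # fast normalized fusion
        fused = sum(wi * f for wi, f in zip(w, feats))
        return self.conv(fused)
```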
Figures:
Figure 1: Image processing process.
Figure 2: LabelImg annotation and classification.
Figure 3: RCA-CBAM attention module.
Figure 4: Structure of the Inner-FocalerIoU module.
Figure 5: Structure of BiFPN combined with RCA-CBAM detection.
Figure 6: The overall structure of the improved YOLOv8+.
Figure 7: Ablation experiments against the original model.
Figure 8: YOLOv8+ test results in different environments: (a) exposure environment; (b) localized occlusion; (c) backlit environment; (d) small target at a distance.
Figure 9: Changes in mAP@0.5 after training with different models.
Figure 10: Detection results of four models under different environments: (a) backlit environment; (b) small target at a distance; (c) exposure environment; (d) localized occlusion.
26 pages, 6713 KiB  
Article
Improved Field Obstacle Detection Algorithm Based on YOLOv8
by Xinying Zhou, Wenming Chen and Xinhua Wei
Agriculture 2024, 14(12), 2263; https://doi.org/10.3390/agriculture14122263 - 11 Dec 2024
Viewed by 813
Abstract
To satisfy the obstacle avoidance requirements of unmanned agricultural machinery during autonomous operation and address the challenge of rapid obstacle detection in complex field environments, an improved field obstacle detection model based on YOLOv8 was proposed. This model enabled the fast detection and recognition of obstacles such as people, tractors, and electric power pylons in the field. This detection model was built upon the YOLOv8 architecture with three main improvements. First, to adapt to different tasks and complex environments in the field, improve the sensitivity of the detector to various target sizes and positions, and enhance detection accuracy, the CBAM (Convolutional Block Attention Module) was integrated into the backbone layer of the benchmark model. Secondly, a BiFPN (Bi-directional Feature Pyramid Network) architecture took the place of the original PANet to enhance the fusion of features across multiple scales, thereby increasing the model’s capacity to distinguish between the background and obstacles. Third, WIoU v3 (Wise Intersection over Union v3) optimized the target boundary loss function, assigning greater focus to medium-quality anchor boxes and enhancing the detector’s overall performance. A dataset comprising 5963 images of people, electric power pylons, telegraph poles, tractors, and harvesters in a farmland environment was constructed. The training set comprised 4771 images, while the validation and test sets each consisted of 596 images. The results from the experiments indicated that the enhanced model attained precision, recall, and average precision scores of 85.5%, 75.1%, and 82.5%, respectively, on the custom dataset. This reflected increases of 1.3, 1.2, and 1.9 percentage points when compared to the baseline YOLOv8 model. Furthermore, the model reached 52 detection frames per second, thereby significantly enhancing the detection performance for common obstacles in the field. The model enhanced by the previously mentioned techniques guarantees a high level of detection accuracy while meeting the criteria for real-time obstacle identification in unmanned agricultural equipment during fieldwork. Full article
(This article belongs to the Section Digital Agriculture)
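Figures 3–5 walk through the CBAM block that the paper inserts into the YOLOv8 backbone. The sketch below follows the standard CBAM formulation (a shared MLP over average- and max-pooled channel descriptors, followed by a 7×7 spatial attention convolution); it is a generic reference implementation, not the authors' code, and the default reduction ratio is an assumption.

```python
# Sketch of a standard CBAM block: channel attention followed by spatial attention,
# matching the description in Figures 3-5. Not the authors' implementation.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP over avg- and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: conv over channel-wise average and max maps
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2, bias=False)

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)                        # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))              # spatial attention
```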
Figures:
Figure 1: YOLOv8 network structure.
Figure 2: CBBI-YOLO network structure. We added a lime-green CBAM module between the C2f module and the SPPF module, replaced the original green Concat module with a green BiFPN module, and replaced the original CIoU loss function with the WIoU v3 loss function in Bbox Loss, while leaving the other modules unchanged.
Figure 3: CBAM attention mechanism. The blue module represents the original feature map; the orange module represents the channel attention module; the purple module represents the spatial attention module. The original input feature map is multiplied with the channel attention feature map, and the processed feature map is multiplied with the spatial attention feature map to obtain the pink module, which is the final feature map.
Figure 4: Channel attention module. The green module represents the input feature map, which undergoes average pooling and maximum pooling to obtain two feature maps (light pink and light purple modules); these are fed into the multilayer perceptron (MLP) to obtain two feature maps (pink-and-green fused module, purple-and-green fused module), which are then summed and passed through the activation function (purple ReLU module) to obtain the channel attention feature map (purple-and-pink fused module).
Figure 5: Spatial attention module. The green module represents the input feature map, which undergoes average pooling and maximum pooling to obtain two feature maps (pink and purple modules); these are concatenated and convolved (blue Conv module) to obtain a feature map (white module) that passes through the activation function to yield the spatial attention feature map (white-and-grey fused module).
Figure 6: PANet structure: (a) FPN structure; (b) bottom-up structure. (a,b) together form the PANet structure. The red dashed line indicates that information is passed from the bottom feature map to the high-level feature map, which undergoes a large number of convolution operations; the green dashed line indicates that the bottom information is fused into the current layer and the previous layer until the highest level is reached (from C2 to P2 and then to N2 until N5), which greatly reduces the number of convolution calculations.
Figure 7: BiFPN structure. P3, P4, and P5 are the outputs of the backbone network; two downsampling operations yield P6 and P7, and a convolution operation adjusts the channels to obtain the input P_n^in. The middle part is P_n^td in the corresponding equation; w_n is the weighting factor. The right part is P_n^out.
Figure 8: Five examples of obstacles.
Figure 9: The distribution of dataset labels. The left image shows the distribution of the center points of the target bounding boxes; the horizontal (x) and vertical (y) axes represent the normalized width and height coordinates of the image, respectively. The right image shows the distribution of the width and height of the target bounding boxes; the horizontal axis represents the relative width of the target box, the vertical axis the relative height, and the dark-colored region marks locations with higher frequency.
Figure 10: Comparison of precision, recall, and mAP between YOLOv8 and CBBI-YOLO. (X-axis epochs denote the number of training epochs, a dimensionless quantity; the Y-axis shows a percentage that usually ranges from 0 to 1. Due to the early-stopping strategy, training stopped at around 150 epochs.)
Figure 11: Some test results from the field.
Figure 12: Test results: (a) YOLOv8 model missed detection; (b) CBBI-YOLO model correct detection.
Figure 13: Test results: (a) YOLOv8 model confidence level; (b) CBBI-YOLO model confidence level.
Figure 14: (a) Precision–confidence curve, (b) precision–recall curve, and (c) recall–confidence curve during CBBI-YOLO model training.
Figure 15: Overview of precision, recall, and average precision during CBBI-YOLO model training. (X-axis epochs denote the number of training epochs, a dimensionless quantity; the Y-axis shows a percentage that usually ranges from 0 to 1. Due to the early-stopping strategy, training stopped at around 150 epochs.)
15 pages, 9384 KiB  
Article
BSMD-YOLOv8: Enhancing YOLOv8 for Book Signature Marks Detection
by Long Guo, Lubin Wang, Qiang Yu and Xiaolan Xie
Appl. Sci. 2024, 14(23), 10829; https://doi.org/10.3390/app142310829 - 22 Nov 2024
Viewed by 535
Abstract
In the field of bookbinding, accurately and efficiently detecting signature sequences during the binding process is crucial for enhancing quality, improving production efficiency, and advancing industrial automation. Despite significant advancements in object detection technology, verifying the correctness of signature sequences remains challenging due to the small size, dense distribution, and abundance of low-quality signature marks. To tackle these challenges, we introduce the Book Signature Marks Detection (BSMD-YOLOv8) model, specifically designed for scenarios involving small, closely spaced objects such as signature marks. Our proposed backbone, the Lightweight Multi-scale Residual Network (LMRNet), achieves a lightweight network while enhancing the accuracy of small object detection. To address the issue of insufficient fusion of local and global feature information in PANet, we design the Low-stage gather-and-distribute (Low-GD) module and the High-stage gather-and-distribute (High-GD) module to enhance the model’s multi-scale feature fusion capabilities, thereby refining the integration of local and global features of signature marks. Furthermore, we introduce Wise-IoU (WIoU) as a replacement for CIoU, prioritizing anchor boxes with moderate quality and mitigating harmful gradients from low-quality examples. Experimental results demonstrate that, compared to YOLOv8n, BSMD-YOLOv8 reduces the number of parameters by 65%, increases the frame rate by 7 FPS, and enhances accuracy, recall, and mAP50 by 2.2%, 8.6%, and 3.9% respectively, achieving rapid and accurate detection of signature marks. Full article
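The Low-GD and High-GD modules follow a gather-and-distribute pattern: features from several scales are aligned to a common resolution, fused, and the fused context is injected back into each scale. The sketch below illustrates only that general pattern; every layer choice (1×1 alignment convolutions, bilinear resizing, additive injection) is an assumption for illustration and not the paper's exact design.

```python
# Rough sketch of a low-stage "gather-and-distribute" fusion step. Layer choices
# here are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowStageGD(nn.Module):
    def __init__(self, in_channels, fused_channels):
        super().__init__()
        self.align = nn.ModuleList([nn.Conv2d(c, fused_channels, 1) for c in in_channels])
        self.fuse = nn.Conv2d(fused_channels * len(in_channels), fused_channels, 3, padding=1)
        self.out = nn.ModuleList([nn.Conv2d(fused_channels, c, 1) for c in in_channels])

    def forward(self, feats):
        # Gather: resize every level to the highest-resolution map and fuse
        target = feats[0].shape[-2:]
        gathered = [F.interpolate(a(f), size=target, mode="bilinear", align_corners=False)
                    for a, f in zip(self.align, feats)]
        fused = self.fuse(torch.cat(gathered, dim=1))
        # Distribute: inject the fused global context back into each scale
        return [f + o(F.interpolate(fused, size=f.shape[-2:], mode="bilinear",
                                    align_corners=False))
                for o, f in zip(self.out, feats)]
```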
Figures:
Figure 1: Examples of mismatched detections and accurate detections.
Figure 2: Examples of the dataset: (a) before data augmentation; (b) after data augmentation.
Figure 3: Analysis results of the dataset: (a) information regarding the manual annotation process for objects in the dataset; (b) low-quality signature mark examples (areas circled in red).
Figure 4: BSMD-YOLOv8 network structure diagram.
Figure 5: Different backbone network designs: (a) CSP-Darknet53; (b) LMRNet.
Figure 6: Multi-scale residual convolution module structure diagram.
Figure 7: PANet and improved PANet structure diagram.
Figure 8: Low-stage gather-and-distribute module structure diagram.
Figure 9: High-stage gather-and-distribute module structure diagram.
Figure 10: Comparison of actual test results: (a) original image; (b) inference results for YOLOv8n; (c) inference results for Improved-YOLOv5s; (d) inference results for BSMD-YOLOv8.
Figure 11: Training progress plots comparing ablation experiments based on mAP50 and recall: (a) mAP50 vs. epochs; (b) recall vs. epochs.
Figure 12: Comparison of detection results between YOLOv8_L_I and BSMD-YOLOv8: (a) YOLOv8_L_I (using CIoU); (b) BSMD-YOLOv8 (using WIoU).
16 pages, 6553 KiB  
Article
Cucumber Leaf Segmentation Based on Bilayer Convolutional Network
by Tingting Qian, Yangxin Liu, Shenglian Lu, Linyi Li, Xiuguo Zheng, Qingqing Ju, Yiyang Li, Chun Xie and Guo Li
Agronomy 2024, 14(11), 2664; https://doi.org/10.3390/agronomy14112664 - 12 Nov 2024
Viewed by 714
Abstract
When monitoring crop growth using top-down images of the plant canopies, leaves in agricultural fields appear very dense and significantly overlap each other. Moreover, the image can be affected by external conditions such as background environment and light intensity, impacting the effectiveness of image segmentation. To address the challenge of segmenting dense and overlapping plant leaves under natural lighting conditions, this study employed a Bilayer Convolutional Network (BCNet) method for accurate leaf segmentation across various lighting environments. The major contributions of this study are as follows: (1) Utilized Fully Convolutional Object Detection (FCOS) for plant leaf detection, incorporating ResNet-50 with the Convolutional Block Attention Module (CBAM) and Feature Pyramid Network (FPN) to enhance Region of Interest (RoI) feature extraction from canopy top-view images. (2) Extracted the sub-region of the RoI based on the position of the detection box, using this region as input for the BCNet, ensuring precise segmentation. (3) Utilized instance segmentation of canopy top-view images using BCNet, improving segmentation accuracy. (4) Applied the Varifocal Loss Function to improve the classification loss function in FCOS, leading to better performance metrics. The experimental results on cucumber canopy top-view images captured in glass greenhouse and plastic greenhouse environments show that our method is highly effective. For cucumber leaves at different growth stages and under various lighting conditions, the Precision, Recall and Average Precision (AP) metrics for object recognition are 97%, 94% and 96.57%, respectively. For instance segmentation, the Precision, Recall and Average Precision (AP) metrics are 87%, 83% and 84.71%, respectively. Our algorithm outperforms commonly used deep learning algorithms such as Faster R-CNN, Mask R-CNN, YOLOv4 and PANet, showcasing its superior capability in complex agricultural settings. The results of this study demonstrate the potential of our method for accurate recognition and segmentation of highly overlapping leaves in diverse agricultural environments, significantly contributing to the application of deep learning algorithms in smart agriculture. Full article
(This article belongs to the Special Issue AI, Sensors and Robotics for Smart Agriculture—2nd Edition)
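The paper replaces the FCOS classification loss with the Varifocal Loss. Below is a sketch of the standard Varifocal Loss from VarifocalNet, in which positives are weighted by their IoU-aware target score and negatives are down-weighted by alpha * p^gamma; the alpha/gamma defaults shown are those of the original VarifocalNet paper, and this is not the authors' code.

```python
# Sketch of the standard Varifocal Loss (VarifocalNet); not the authors' implementation.
import torch
import torch.nn.functional as F

def varifocal_loss(pred_logits, target_score, alpha=0.75, gamma=2.0):
    # target_score: IoU-aware score in [0, 1]; > 0 for positives, 0 for negatives.
    pred = pred_logits.sigmoid()
    # Positives are weighted by their target score; negatives are down-weighted
    # by alpha * p^gamma.
    weight = torch.where(target_score > 0,
                         target_score,
                         alpha * pred.pow(gamma))
    bce = F.binary_cross_entropy_with_logits(pred_logits, target_score, reduction="none")
    return (weight * bce).sum()
```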
Figures:
Figure 1: Bilayer decomposition diagram.
Figure 2: Flow chart of image segmentation based on improved BCNet.
Figure 3: Schematic diagram of the image annotation method (inline legend symbols indicate labeled versus unlabeled leaves).
Figure 4: Schematic diagram of the image expansion scheme.
Figure 5: Segmentation effect of cucumber plant images in glass greenhouses. (a) Early growth stage, sunny; (b) early growth stage, cloudy; (c) metaphase growth stage, sunny; (d) metaphase growth stage, cloudy; (e) terminal growth stage, sunny; (f) terminal growth stage, cloudy.
Figure 6: Effect of cucumber plant image segmentation in plastic greenhouses. (a) Early growth stage, sunny; (b) early growth stage, cloudy; (c) terminal growth stage, sunny; (d) terminal growth stage, cloudy.
Figure 7: Target recognition and instance segmentation P-R curves of six models: (a) object detection P-R curve; (b) instance segmentation P-R curve.
Figure 8: Effect of cucumber plant image segmentation in glass greenhouse: (a) improved BCNet; (b) BCNet; (c) PANet; (d) Mask R-CNN; (e) YOLOv4; (f) Faster R-CNN.
19 pages, 8993 KiB  
Article
Segmentation-Based Detection for Luffa Seedling Grading Using the Seg-FL Model
by Sheng Jiang, Fangnan Xie, Jiangbo Ao, Yechen Wei, Jingye Lu, Shilei Lyu and Zhen Li
Agronomy 2024, 14(11), 2557; https://doi.org/10.3390/agronomy14112557 - 31 Oct 2024
Viewed by 489
Abstract
This study addresses the issue of inaccurate and error-prone grading judgments in luffa plug seedlings. A new Seg-FL seedling segmentation model is proposed as an extension of the YOLOv5s-Seg model. The small leaves of early-stage luffa seedlings are liable to be mistaken for impurities in the plug trays. To address this issue, cross-scale connections and weighted feature fusion are introduced in order to integrate feature information from different levels, thereby improving the recognition and segmentation accuracy of seedlings or details by refining the PANet structure. To address the ambiguity of seedling edge information during segmentation, an efficient channel attention module is incorporated to enhance the network’s focus on seedling edge information and suppress irrelevant features, thus sharpening the model’s focus on luffa seedlings. By optimizing the CIoU loss function, the calculation of overlapping areas, center point distances, and aspect ratios between predicted and ground truth boxes is preserved, thereby accelerating the convergence process and reducing the computational resource requirements on edge devices. The experimental results demonstrate that the proposed model attains a mean average precision of 97.03% on a self-compiled luffa plug seedling dataset, representing a 6.23 percentage point improvement over the original YOLOv5s-Seg. Furthermore, compared to the YOLACT++, FCN, and Mask R-CNN segmentation models, the improved model displays increases in mAP@0.5 of 12.93%, 13.73%, and 10.53%, respectively, and improvements in precision of 15.73%, 16.93%, and 13.33%, respectively. This research not only validates the viability of the enhanced model for luffa seedling grading but also provides tangible technical support for the automation of grading in agricultural production. Full article
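The "efficient channel attention module" mentioned in the abstract is the ECA block, which applies a 1D convolution with an adaptive kernel size over globally pooled channel descriptors. A minimal sketch under the usual ECA-Net defaults (gamma = 2, b = 1) follows; it is a generic reference, not the authors' implementation.

```python
# Generic ECA (Efficient Channel Attention) sketch; defaults follow the ECA-Net paper.
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        t = int(abs(math.log2(channels) / gamma + b / gamma))
        k = t if t % 2 else t + 1                  # kernel size must be odd
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                          # x: (N, C, H, W)
        y = x.mean(dim=(2, 3))                     # (N, C) channel descriptor
        y = self.conv(y.unsqueeze(1)).squeeze(1)   # cross-channel 1D convolution
        return x * torch.sigmoid(y)[:, :, None, None]
```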
Figures:
Figure 1: (a) Data collection points; (b) greenhouse; (c) typical hole tray seedling.
Figure 2: Image acquisition system.
Figure 3: (a) No seedlings; (b) weak seedlings; (c) strong seedlings.
Figure 4: (a) Horizontal flip; (b) gray processing; (c) random cropping.
Figure 5: YOLOv5s-Seg network structure.
Figure 6: (a) The architecture of PANet; (b) the architecture of BiFPN.
Figure 7: ECA mechanism structure diagram.
Figure 8: Schematic diagram of angle cost calculation.
Figure 9: Seg-FL algorithm structure diagram.
Figure 10: Network training loss curve.
Figure 11: P-R curves of two models. (a) No seedlings; (b) weak seedlings; (c) strong seedlings.
Figure 12: Segmentation result images before model improvement. (a) No seedlings; (b) weak seedlings; (c) strong seedlings.
Figure 13: Segmentation result images following model improvement. (a) No seedlings; (b) weak seedlings; (c) strong seedlings.
19 pages, 20082 KiB  
Article
An Ontology-Based Vehicle Behavior Prediction Method Incorporating Vehicle Light Signal Detection
by Xiaolong Xu, Xiaolin Shi, Yun Chen and Xu Wu
Sensors 2024, 24(19), 6459; https://doi.org/10.3390/s24196459 - 6 Oct 2024
Viewed by 991
Abstract
Although deep learning techniques have potential in vehicle behavior prediction, it is difficult to integrate traffic rules and environmental information. Moreover, its black-box nature leads to an opaque and difficult-to-interpret prediction process, limiting its acceptance in practical applications. In contrast, ontology reasoning, which can utilize human domain knowledge and mimic human reasoning, can provide reliable explanations for the speculative results. To address the limitations of the above deep learning methods in the field of vehicle behavior prediction, this paper proposes a front vehicle behavior prediction method that combines deep learning techniques with ontology reasoning. Specifically, YOLOv5s is first selected as the base model for recognizing the brake light status of vehicles. In order to further enhance the performance of the model in complex scenes and small target recognition, the Convolutional Block Attention Module (CBAM) is introduced. In addition, so as to balance the feature information of different scales more efficiently, a weighted bi-directional feature pyramid network (BIFPN) is introduced to replace the original PANet structure in YOLOv5s. Next, using a four-lane intersection as an application scenario, multiple factors affecting vehicle behavior are analyzed. Based on these factors, an ontology model for predicting front vehicle behavior is constructed. Finally, for the purpose of validating the effectiveness of the proposed method, we built our own brake light detection dataset. The accuracy and mAP@0.5 of the improved model on the self-made dataset are 3.9% and 2.5% higher than those of the original model, respectively. Afterwards, representative validation scenarios were selected for inference experiments. The ontology model created in this paper accurately reasoned out the behavior that the target vehicle would slow down until stopping and turning left. The reasonableness and practicality of the front vehicle behavior prediction method constructed in this paper are verified. Full article
(This article belongs to the Section Vehicular Sensing)
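The ontology stage combines the detected light state with scene context (lane, traffic light, signage) to infer the front vehicle's behavior. Since the paper's OWL ontology and rules are not reproduced in this listing, the snippet below is only a plain-Python stand-in that conveys the flavor of such rules; all rule contents, thresholds, and field names are illustrative assumptions.

```python
# Plain-Python stand-in for the ontology reasoning step: it combines the detector's
# light-state output with scene context to predict the front vehicle's behavior.
# The actual paper uses an OWL ontology with a reasoner; the rules here are illustrative only.
def predict_front_vehicle_behavior(lights: dict, scene: dict) -> list:
    behaviors = []
    if lights.get("brake"):
        if scene.get("traffic_light") == "red" or scene.get("distance_to_stop_line_m", 1e9) < 20:
            behaviors.append("decelerate until stopping")
        else:
            behaviors.append("decelerate")
    if lights.get("left_turn") and scene.get("lane") == "left-turn lane":
        behaviors.append("turn left")
    if lights.get("right_turn"):
        behaviors.append("turn right")
    return behaviors or ["maintain speed"]

# Example matching the scenario reasoned out in the paper
# (the target vehicle slows to a stop and then turns left):
print(predict_front_vehicle_behavior(
    {"brake": True, "left_turn": True},
    {"traffic_light": "red", "lane": "left-turn lane"},
))
```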
Figures:
Figure 1: Proposed workflow. (a) General diagram of the workflow, detailed in (b). The workflow consists of two modules: a brake light detection module based on YOLOv5s-C&B and an ontology reasoning module. First, the vehicle brake light state is detected; combined with the vehicle driving environment, the inference unit is generated, and the ontology inference is finally performed.
Figure 2: Improved YOLOv5s network structure diagram.
Figure 3: Flowchart of the channel attention module.
Figure 4: Flowchart of the spatial attention module.
Figure 5: Application scenario layout diagram. The yellow box is the target vehicle, the red box line indicates the road sign's role in guiding the vehicle, and the green box line represents the traffic light's control of the vehicle's behavior.
Figure 6: Visual illustration of the hierarchy of the front vehicle behavior prediction ontology model.
Figure 7: Performance of YOLOv5s-C&B. (a) Change curve of mAP@0.5; (b) change curve of bounding box loss on the training and test sets; (c) detection effect in a real scene.
Figure 8: Effect of continuous process detection.
Figure 9: Detection results for cases with a steering signal. (a,d) Brake light activated only; (b,e) brake light and turn signal operating simultaneously; (c,f) turn signal activated only.
Figure 10: Performance of different models.
Figure 11: Ontology model instance setup.
Figure 12: Ontology reasoning results. (a) The information utilized for the reasoning and the source of that information; (b) yellow underlining shows the model's reasoning results; (c) the subsequent behavior of the vehicle. The blue boxed line connects the inference result and its corresponding vehicle behavior.
11 pages, 1080 KiB  
Article
Real-World Outcomes of Incurable Cancer Patients Treated with Unlisted Anticancer Treatments in an Academic Center in Quebec, Canada
by Adam Miller, Francois Panet, Victoria Korsos, Wilson H. Miller and Gerald Batist
Curr. Oncol. 2024, 31(10), 5908-5918; https://doi.org/10.3390/curroncol31100440 - 1 Oct 2024
Viewed by 1164
Abstract
Medical oncology is a rapidly evolving field, with new medications being discovered yearly, contributing to increased survival rates. However, accessing drugs in a timely manner can be challenging. In Quebec, Canada, a physician can prescribe an unlisted anticancer treatment through a regulated pathway under exceptional circumstances. We conducted a quality improvement study describing the outcomes of incurable cancer patients receiving unlisted anticancer therapy at the Jewish General Hospital between 2018 and 2019. Though our study did not include a comparator arm, unlisted anticancer therapies were associated with interesting median progression-free survival (11 months) and overall survival (25 months). Moreover, a large proportion of treatments, 44%, were subsequently reimbursed in the province of Quebec. Given the delay in anticancer drug reimbursement, this pathway is essential for timely access to oncology drugs. Such ‘special access’ programs will likely become increasingly important as precision medicine becomes the standard of practice. Full article
Figures:
Figure 1: CONSORT diagram.
Figure 2: Kaplan–Meier curves describing PFS and OS in patients with incurable cancer receiving an unlisted anticancer treatment at the Jewish General Hospital between 2018 and 2019. (A) All patients; (B) classified by hematologic versus solid malignancy. Median PFS and OS are given below the curves.
Figure 3: Kaplan–Meier curves describing PFS and OS in patients with incurable cancer receiving an unlisted anticancer treatment at the Jewish General Hospital between 2018 and 2019, depending on whether the request was based on a phase III clinical trial or other types of evidence. Median PFS and OS are given below the curves. The p-value between the curves was calculated using a two-sided log-rank test.
Figure 4: Kaplan–Meier curves describing PFS and OS in patients with incurable cancer receiving an unlisted anticancer treatment at the Jewish General Hospital between 2018 and 2019, by type of unlisted medication used. Median PFS and OS are given below the curves.
20 pages, 18366 KiB  
Article
A Lightweight Insulator Defect Detection Model Based on Drone Images
by Yang Lu, Dahua Li, Dong Li, Xuan Li, Qiang Gao and Xiao Yu
Drones 2024, 8(9), 431; https://doi.org/10.3390/drones8090431 - 26 Aug 2024
Cited by 2 | Viewed by 1185
Abstract
With the continuous development and construction of new power systems, using drones to inspect the condition of transmission line insulators has become an inevitable trend. To facilitate the deployment of drone hardware equipment, this paper proposes IDD-YOLO (Insulator Defect Detection-YOLO), a lightweight insulator defect detection model. Initially, the backbone network of IDD-YOLO employs GhostNet for feature extraction. However, due to the limited feature extraction capability of GhostNet, we designed a lightweight attention mechanism called LCSA (Lightweight Channel-Spatial Attention), which is combined with GhostNet to capture features more comprehensively. Secondly, the neck network of IDD-YOLO utilizes PANet for feature transformation and introduces GSConv and C3Ghost convolution modules to reduce redundant parameters and lighten the network. The head network employs the YOLO detection head, incorporating the EIOU loss function and Mish activation function to optimize the speed and accuracy of insulator defect detection. Finally, the model is optimized using TensorRT and deployed on the NVIDIA Jetson TX2 NX mobile platform to test the actual inference speed of the model. The experimental results demonstrate that the model exhibits outstanding performance on both the proprietary ID-2024 insulator defect dataset and the public SFID insulator dataset. After optimization with TensorRT, the actual inference speed of the IDD-YOLO model reached 20.83 frames per second (FPS), meeting the demands for accurate and real-time inspection of insulator defects by drones. Full article
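IDD-YOLO's backbone is GhostNet, whose core building block generates part of each layer's feature maps with cheap depthwise convolutions. The sketch below shows a Ghost module under the common default ratio of 2 (output channels assumed divisible by the ratio); it is a generic reference, not the authors' configuration, and the paper's own LCSA attention is not reproduced here.

```python
# Generic sketch of the GhostNet "Ghost module": a small primary convolution produces
# intrinsic feature maps, and cheap depthwise convolutions generate the remaining
# ("ghost") maps, which are concatenated. Assumes out_ch is divisible by ratio.
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, in_ch, out_ch, ratio=2, dw_kernel=3):
        super().__init__()
        init_ch = out_ch // ratio                  # intrinsic channels
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, 1, bias=False),
            nn.BatchNorm2d(init_ch), nn.ReLU(inplace=True),
        )
        self.cheap = nn.Sequential(                # depthwise "cheap operation"
            nn.Conv2d(init_ch, out_ch - init_ch, dw_kernel,
                      padding=dw_kernel // 2, groups=init_ch, bias=False),
            nn.BatchNorm2d(out_ch - init_ch), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```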
Figures:
Figure 1: IDD-YOLO network architecture diagram. The orange box in the rightmost image shows the detection result.
Figure 2: IDD-YOLO basic module structure diagram.
Figure 3: Schematic diagram of the LCSA attention mechanism.
Figure 4: Structural diagram of the GSConv module.
Figure 5: Structural diagram of the C3Ghost module.
Figure 6: Comparison of the ReLU and Mish functions.
Figure 7: Real-life scene of insulator defects captured by drone.
Figure 8: Images of insulators under different conditions: (a) missing cap; (b) after standard fogging algorithm; (c) after atmospheric scattering model fogging; (d) with flashover; (e) normal; (f) broken.
Figure 9: Experimental comparison between LCSA and mainstream attention mechanisms.
Figure 10: Heatmap of an insulator with a missing cap.
Figure 11: Heatmap of an insulator with damage.
Figure 12: Experimental comparison of IDD-YOLO with mainstream lightweight detection models.
Figure 13: IDD-YOLO detection results. (a) Insulator flashover detection results; (b) insulator damage detection results; (c) detection results for normal insulators; (d) detection results for insulators with missing caps.
Figure 14: Jetson TX2 NX experimental platform and actual detection output.
15 pages, 5989 KiB  
Article
Instance Segmentation of Lentinus edodes Images Based on YOLOv5seg-BotNet
by Xingmei Xu, Xiangyu Su, Lei Zhou, Helong Yu and Jian Zhang
Agronomy 2024, 14(8), 1808; https://doi.org/10.3390/agronomy14081808 - 16 Aug 2024
Viewed by 920
Abstract
The shape and quantity of Lentinus edodes (commonly known as shiitake) fruiting bodies significantly affect their quality and yield. Accurate and rapid segmentation of these fruiting bodies is crucial for quality grading and yield prediction. This study proposed the YOLOv5seg-BotNet, a model for the instance segmentation of Lentinus edodes, to research its application for the mushroom industry. First, the backbone network was replaced with the BoTNet, and the spatial convolutions in the local backbone network were replaced with global self-attention modules to enhance the feature extraction ability. Subsequently, the PANet was adopted to effectively manage and integrate Lentinus edodes images in complex backgrounds at various scales. Finally, the Varifocal Loss function was employed to adjust the weights of different samples, addressing the issues of missed segmentation and mis-segmentation. The enhanced model demonstrated improvements in the precision, recall, Mask_AP, F1-Score, and FPS, achieving 97.58%, 95.74%, 95.90%, 96.65%, and 32.86 frames per second, respectively. These values represented the increases of 2.37%, 4.55%, 4.56%, 3.50%, and 2.61% compared to the original model. The model achieved dual improvements in segmentation accuracy and speed, exhibiting excellent detection and segmentation performance on Lentinus edodes fruiting bodies. This study provided technical fundamentals for future application of image detection and decision-making processes to evaluate mushroom production, including quality grading and intelligent harvesting. Full article
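The BoTNet backbone swap replaces spatial convolutions with global multi-head self-attention over the feature map. The sketch below flattens the H×W positions into tokens and applies standard multi-head attention; BoTNet additionally uses 2D relative position encodings, which are omitted here, so treat this as a simplified stand-in rather than the paper's block.

```python
# Simplified stand-in for a BoTNet-style MHSA block: standard multi-head self-attention
# over the spatial positions of a feature map (2D relative position encodings omitted).
import torch
import torch.nn as nn

class MHSA2d(nn.Module):
    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):                       # x: (N, C, H, W)
        n, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (N, H*W, C) spatial tokens
        out, _ = self.attn(tokens, tokens, tokens)
        return out.transpose(1, 2).reshape(n, c, h, w)
```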
Figures:
Figure 1: Example of enhanced image samples of Lentinus edodes. (a) Original picture; (b) brightness adjusted; (c) noise added; (d) panning; (e) value randomly changed; (f) horizontal flip.
Figure 2: Structure of YOLOv5seg. NMS is non-maximal suppression; ⊗ denotes matrix multiplication; Crop denotes zeroing of the mask outside the bounding box, and Threshold denotes binarization of the generated mask with a threshold of 0.5. The same annotations apply to the subsequent figures.
Figure 3: Scaled dot-product attention and MHSA module structure diagram. (a) Scaled dot-product attention; (b) MHSA layers running in parallel. Q, K, and V are matrices obtained from the input matrix by a linear transformation.
Figure 4: Structure of PANet. (a) FPN backbone; (b) bottom-up path enhancement; (c) adaptive feature pooling; (d) box branch; (e) fully connected fusion. For brevity, the channel dimensions of the feature maps in (a,b) are omitted.
Figure 5: Structure of YOLOv5seg-BotNet. Black arrows indicate the direction of data flow during network operation, with different colors representing different network modules; for example, the CBS module is shown in light blue and the C3 module in dark blue. The name of each module is located in the middle of its rectangle. The improved MHSA module is indicated by a red module, and PANet is marked with a red rectangle.
Figure 6: Comparison of Mask_AP for different modules. The horizontal axis represents the number of iterations; the vertical axis represents the value of Mask_AP.
Figure 7: Comparison of Mask_AP for different models. The horizontal axis represents the number of iterations; the vertical axis represents the value of Mask_AP.
Figure 8: Comparison of the segmentation effect of different models. The yellow circle indicates false segmentation, and the green circle indicates missing segmentation.
Figure 9: Comparison of the segmentation effect of different models at different angles. The yellow circle indicates false segmentation, and the green circle indicates missing segmentation.
Figure 10: Comparison of different loss functions. The horizontal axis represents the number of iterations; the vertical axis represents the loss value. The differences between the loss functions can be seen clearly in the local magnification.
Figure 11: Comparison of different modules. (a) Segmentation results of the original model; (b) segmentation results after introducing the VFL. The yellow circle indicates missing segmentation.
16 pages, 9003 KiB  
Article
SiM-YOLO: A Wood Surface Defect Detection Method Based on the Improved YOLOv8
by Honglei Xi, Rijun Wang, Fulong Liang, Yesheng Chen, Guanghao Zhang and Bo Wang
Coatings 2024, 14(8), 1001; https://doi.org/10.3390/coatings14081001 - 7 Aug 2024
Viewed by 1724
Abstract
Wood surface defect detection is a challenging task due to the complexity and variability of defect types. To address these challenges, this paper introduces a novel deep learning approach named SiM-YOLO, which is built upon the YOLOv8 object detection framework. A fine-grained convolutional structure, SPD-Conv, is introduced with the aim of preserving detailed defect information during the feature extraction process, thus enabling the model to capture the subtle variations and complex details of wood surface defects. In the feature fusion stage, a SiAFF-PANet-based wood defect feature fusion module is designed to improve the model’s ability to focus on local contextual information and enhance defect localization. For classification and regression tasks, the multi-attention detection head (MADH) is employed to capture cross-channel information and the accurate spatial localization of defects. In addition, MPDIoU is employed to optimize the loss function of the model to reduce the leakage of detection due to defect overlap. The experimental results show that SiM-YOLO achieves superior performance compared to the state-of-the-art YOLO algorithm, with a 9.3% improvement in mAP over YOLOX and a 4.3% improvement in mAP over YOLOv8. The Grad-CAM visualization further illustrates that SiM-YOLO provides more accurate defect localization and effectively reduces misdetection and omission issues. This study highlights the effectiveness of SiM-YOLO for wood surface defect detection and offers valuable insights for future research and practical applications in quality control. Full article
(This article belongs to the Section Surface Characterization, Deposition and Modification)
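SPD-Conv replaces strided downsampling with a space-to-depth rearrangement followed by a non-strided convolution, so fine detail is not discarded during downsampling. A minimal sketch for scale 2 is shown below using torch's pixel_unshuffle; the BatchNorm/SiLU choices are assumptions, not necessarily the authors' exact block.

```python
# Sketch of SPD-Conv (scale = 2): space-to-depth rearrangement followed by a
# stride-1 convolution. Normalization/activation choices are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPDConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch * 4, out_ch, 3, stride=1, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x):                              # (N, C, H, W) -> (N, out_ch, H/2, W/2)
        x = F.pixel_unshuffle(x, downscale_factor=2)   # space-to-depth: (N, 4C, H/2, W/2)
        return self.act(self.bn(self.conv(x)))
```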
Figures:
Figure 1: Structure of SiM-YOLO.
Figure 2: Convolution downsampling module.
Figure 3: Structure of SPD-Conv (scale = 2).
Figure 4: Structure of the multi-scale channel attention module. (a) MS-CAM; (b) SMS-CAM.
Figure 5: Structure of SiAFF.
Figure 6: Structure of MADH.
Figure 7: Structure of CA and CCA. (a) CA; (b) CCA.
Figure 8: Pine surface defect types in the dataset. (a) Live_Knot; (b) Marrow (Pith); (c) Resin (resin pocket); (d) Dead_Knot; (e) Knot_with_crack; (f) Knot_missing; (g) Crack.
Figure 9: Precision–recall (P-R) curves. (a) YOLOv7; (b) YOLOv8; (c) SiM-YOLO.
Figure 10: Portion of the visualization results.
Figure 11: Grad-CAM images for a portion of the experimental results. (a) Original image of a pine surface defect; (b) YOLOv7; (c) YOLOv8; (d) SiM-YOLO.
29 pages, 13503 KiB  
Article
YOSMR: A Ship Detection Method for Marine Radar Based on Customized Lightweight Convolutional Networks
by Zhe Kang, Feng Ma, Chen Chen and Jie Sun
J. Mar. Sci. Eng. 2024, 12(8), 1316; https://doi.org/10.3390/jmse12081316 - 3 Aug 2024
Cited by 1 | Viewed by 1282
Abstract
In scenarios such as nearshore and inland waterways, the ship spots in a marine radar are easily confused with reefs and shorelines, leading to difficulties in ship identification. In such settings, the conventional ARPA method based on fractal detection and filter tracking performs relatively poorly. To accurately identify radar targets in such scenarios, a novel algorithm, namely YOSMR, based on the deep convolutional network, is proposed. The YOSMR uses the MobileNetV3(Large) network to extract ship imaging data of diverse depths and acquire feature data of various ships. Meanwhile, taking into account the issue of feature suppression for small-scale targets in algorithms composed of deep convolutional networks, the feature fusion module known as PANet has been subject to a lightweight reconstruction leveraging depthwise separable convolutions to enhance the extraction of salient features for small-scale ships while reducing model parameters and computational complexity to mitigate overfitting problems. To enhance the scale invariance of convolutional features, the feature extraction backbone is followed by an SPP module, which employs a design of four max-pooling constructs to preserve the prominent ship features within the feature representations. In the prediction head, the Cluster-NMS method and α-DIoU function are used to optimize non-maximum suppression (NMS) and positioning loss of prediction boxes, improving the accuracy and convergence speed of the algorithm. The experiments showed that the recall, accuracy, and precision of YOSMR reached 0.9308, 0.9204, and 0.9215, respectively. The identification efficacy of this algorithm exceeds that of various YOLO algorithms and other lightweight algorithms. In addition, the parameter size and calculational consumption were controlled to only 12.4 M and 8.63 G, respectively, exhibiting an 80.18% and 86.9% decrease compared to the standard YOLO model. As a result, the YOSMR displays a substantial advantage in terms of convolutional computation. Hence, the algorithm achieves an accurate identification of ships with different trail features and various scenes in marine radar images, especially in different interference and extreme scenarios, showing good robustness and applicability. Full article
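The abstract describes an SPP module with several parallel max-pooling branches appended to the backbone. The sketch below shows a typical YOLO-style SPP block; the branch kernel sizes (5, 9, 13 plus the identity path) and the channel reduction are assumptions for illustration, not necessarily the exact four pooling constructs used in YOSMR.

```python
# Typical YOLO-style SPP block: parallel max-pooling branches concatenated with the
# input to widen the receptive field without changing spatial size. Kernel sizes are
# illustrative assumptions.
import torch
import torch.nn as nn

class SPP(nn.Module):
    def __init__(self, in_ch, out_ch, kernels=(5, 9, 13)):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, in_ch // 2, 1, bias=False)
        self.pools = nn.ModuleList(
            [nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in kernels])
        self.project = nn.Conv2d(in_ch // 2 * (len(kernels) + 1), out_ch, 1, bias=False)

    def forward(self, x):
        x = self.reduce(x)
        return self.project(torch.cat([x] + [p(x) for p in self.pools], dim=1))
```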
Show Figures

Figure 1

Figure 1
<p>The pipeline of the proposed algorithm. We present a novel detection algorithm grounded in the YOLO architecture, which we term YOSMR. The holistic architecture of YOSMR can be delineated into three core components: Backbone, Neck, and Head. Furthermore, the algorithm also encompasses loss functions and training strategies as pivotal elements. Relative to the standard YOLO framework, YOSMR has undertaken adaptive adjustments across its Backbone, Neck, Head, Loss function, and NMS components to better cater to the unique characteristics of radar-based applications. (a) Within the Backbone, YOSMR has integrated a mature feature extraction network, MobileNetV3(Large), and appended a feature enhancement module known as the Spatial Pyramid Pooling (SPP). (b) We leverage the efficient Depthwise Separable Convolution (DSC), a lightweight convolutional unit, to reconstruct the feature fusion network. This not only ensures the effective extraction of small-scale object features but also significantly reduces the convolution parameters. (c) In the Head structure, we have introduced three prediction channels of diverse scales to encompass the detection of various target types. (d) we have incorporated Cluster NMS and designed the α-DIoU loss to optimize the algorithm’s training and accelerate convergence.</p>
Full article ">Figure 2
<p>Structure of the Bneck module. In the process of forward convolution, this module employs an attention calculation mechanism and a residual edge structure to enhance the extraction of crucial features relevant to the targets.</p>
Full article ">Figure 3
<p>Convolutional heatmaps. It is apparent that MobileNetV3(Large) leverages the detection of radar spot features to discern the validity of targets. The heatmaps substantiate the remarkable precision of the feature network in localizing ships while effectively mitigating false positives.</p>
Full article ">Figure 3 Cont.
<p>Convolutional heatmaps. It is apparent that MobileNetV3(Large) leverages the detection of radar spot features to discern the validity of targets. The heatmaps substantiate the remarkable precision of the feature network in localizing ships while effectively mitigating false positives.</p>
Full article ">Figure 4
<p>Structure of the SPP. Through the concatenation of multiple scales of maximum pooling layers, this module captures the relatively prominent feature representations from diverse local regions of the feature map. This strategy ensures the positional invariance of the feature data and contributes to mitigating the risk of overfitting.</p>
Full article ">Figure 5
<p>Comparison of identification results with and without SPP. The SPP module, by extracting finer-grained target features, enables effective discrimination of adjacent spots in dense scenes, reducing the probability of misidentification and enhancing the model’s robustness.</p>
Full article ">Figure 6
<p>Structure of the LightPANet. In this network, the employment of Depthwise Separable Convolution (DSC) results in a remarkable reduction in parameter count. By decomposing the convolution operation into depthwise convolution and pointwise convolution, DSC achieves a significant decrease in parameters, thereby reducing model complexity and computational demands. Moreover, the independent processing of each input channel during the depthwise convolution allows for the extraction of highly discriminative features. This facilitates the network’s ability to capture spatial information within the input data and enhances its generalization capabilities.</p>
Full article ">Figure 7
<p>Comparison of identification results between LightPANet and PANet. The utilization of the optimized feature fusion network, empowered by the integration of the DSC module, yields higher precision in localizing ship spots and enhances the accuracy of identifying small-scale targets. Consequently, this leads to a reduced occurrence of false positive predictions.</p>
Full article ">Figure 8
<p>Comparison between Cluster-NMS and other methods. By comparison, Cluster-NMS stands out by utilizing an innovative weighted fusion approach to process candidate prediction boxes, leading to satisfactory precision in target localization.</p>
Full article ">Figure 9
<p>Key indicators of α-DIoU function. Through separate adjustments of a hyperparameter, this method effectively modifies the impact weight of the center point distance metric compared to the standard DIoU metric. This adjustment, made during the loss calculation, facilitates faster convergence of prediction boxes for small-scale targets.</p>
Full article ">Figure 10
<p>Marine radar images. The radar spots present in the image are characterized by their minuscule scale, while small islands and atmospheric clusters, due to their high feature similarity, significantly interfere with the accurate recognition of actual ships.</p>
Full article ">Figure 11
<p>Comparison of training processes of various algorithms. Throughout the entirety of the training process, the proposed algorithm exhibits an expedited and consistent convergence compared to standard methods, resulting in obviously lower overall loss computation values.</p>
Full article ">Figure 12
<p>The identification results of YOSMR for different marine radar images. Regardless of the ship scales, YOSMR exhibits remarkable efficiency in identifying ship spots, effectively capturing target information across various sizes. Moreover, in navigation-intensive environments, this model excels in accurately localizing targets and demonstrates a reduced occurrence of false positives.</p>
Figure 12">
Full article ">Figure 13
<p>Identification of YOSMR for different types of ships under various noises. YOSMR exhibits resilience to a certain degree of interference, as it can effectively discern and accurately locate the majority of authentic ship spots despite the varying impact on target confidence scores caused by different types of disturbances.</p>
Figure 13">
Full article ">Figure 14
<p>Comparison of different algorithms for small-scale ship identification. Note that the image above has been cropped and magnified from the original radar image to give a clearer view of the spot detection results. In the challenging task of identifying small-scale ships, typical models produce a considerable number of false positives, an issue that persists even in widely used recent models such as YOLOv8 and the latest YOLOv5. By comparison, the proposed YOSMR delivers markedly better detection performance, particularly in its ability to suppress false positive targets.</p>
Figure 14">
Full article ">
16 pages, 2681 KiB  
Article
Local and Global Context-Enhanced Lightweight CenterNet for PCB Surface Defect Detection
by Weixun Chen, Siming Meng and Xueping Wang
Sensors 2024, 24(14), 4729; https://doi.org/10.3390/s24144729 - 21 Jul 2024
Cited by 2 | Viewed by 1284
Abstract
Printed circuit board (PCB) surface defect detection is an essential part of the PCB manufacturing process. Currently, advanced CCD or CMOS sensors can capture high-resolution PCB images. However, the existing computer vision approaches for PCB surface defect detection require high computing effort, leading [...] Read more.
Printed circuit board (PCB) surface defect detection is an essential part of the PCB manufacturing process. Currently, advanced CCD or CMOS sensors can capture high-resolution PCB images. However, the existing computer vision approaches for PCB surface defect detection require high computing effort, leading to insufficient efficiency. To this end, this article proposes a local and global context-enhanced lightweight CenterNet (LGCL-CenterNet) to detect PCB surface defects in real time. Specifically, we propose a two-branch lightweight vision transformer module with local and global attention, named LGT, as a complement to extract high-dimension features and leverage context-aware local enhancement after the backbone network. In the local branch, we utilize coordinate attention to aggregate more powerful features of PCB defects with different shapes. In the global branch, Bi-Level Routing Attention with pooling is used to capture long-distance pixel interactions with limited computational cost. Furthermore, a Path Aggregation Network (PANet) feature fusion structure is incorporated to mitigate the loss of shallow features caused by the increase in model depth. Then, we design a lightweight prediction head by using depthwise separable convolutions, which further compresses the computational complexity and parameters while maintaining the detection capability of the model. In the experiment, the LGCL-CenterNet increased the mAP@0.5 by 2% and 1.4%, respectively, in comparison to CenterNet-ResNet18 and YOLOv8s. Meanwhile, our approach requires fewer model parameters (0.542M) than existing techniques. The results show that the proposed method improves both detection accuracy and inference speed and indicate that the LGCL-CenterNet has better real-time performance and robustness. Full article
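Since PANet is the common thread of these search results, a minimal sketch of the top-down plus bottom-up fusion pattern the abstract refers to may be helpful; the channel count, nearest-neighbor upsampling, and strided convolutions for downsampling are illustrative assumptions rather than the authors' exact design.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyPAN(nn.Module):
    """Minimal PANet-style neck: a top-down FPN pass followed by a bottom-up pass."""
    def __init__(self, channels=128):
        super().__init__()
        self.down3 = nn.Conv2d(channels, channels, 3, stride=2, padding=1)
        self.down4 = nn.Conv2d(channels, channels, 3, stride=2, padding=1)
        self.smooth = nn.ModuleList(nn.Conv2d(channels, channels, 3, padding=1) for _ in range(3))

    def forward(self, p3, p4, p5):
        # Top-down: propagate semantically strong features to higher resolutions.
        p4 = p4 + F.interpolate(p5, size=p4.shape[-2:], mode="nearest")
        p3 = p3 + F.interpolate(p4, size=p3.shape[-2:], mode="nearest")
        # Bottom-up: push precise localization cues from shallow layers back down.
        n3 = self.smooth[0](p3)
        n4 = self.smooth[1](p4 + self.down3(n3))
        n5 = self.smooth[2](p5 + self.down4(n4))
        return n3, n4, n5

# Feature maps at strides 8, 16, and 32 of a 640x640 input.
p3, p4, p5 = (torch.randn(1, 128, s, s) for s in (80, 40, 20))
print([t.shape[-1] for t in TinyPAN()(p3, p4, p5)])  # [80, 40, 20]
```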
(This article belongs to the Section Sensing and Imaging)
Show Figures

Figure 1

Figure 1
<p>Six different kinds of PCB defects.</p>
Full article ">Figure 2
<p>The overall network architecture of the proposed local and global context-enhanced lightweight CenterNet.</p>
Full article ">Figure 3
<p>Ratio of PCB defect bounding box area to total image area.</p>
Full article ">Figure 4
<p>The structure of local coordinate attention and global self-attention.</p>
Full article ">Figure 5
<p>Sparse attention is used to skip computations in the most irrelevant region, and pooling is used to downsample the key and value to reduce FLOPs.</p>
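As a rough illustration of the FLOP-saving idea in this caption, and not the paper's Bi-Level Routing Attention itself, the snippet below average-pools the key and value tensors before computing attention, so the attention matrix scales with the pooled token count rather than the full one. The pooling factor and tensor layout are assumptions.
```python
import torch
import torch.nn.functional as F

def pooled_attention(q, k, v, pool=4):
    """Scaled dot-product attention with average-pooled keys and values.

    q, k, v: (batch, length, dim) tensors laid out on a square 2D feature map,
    so length is assumed to be a perfect square.
    """
    b, n, d = k.shape
    side = int(n ** 0.5)
    # Downsample K and V on the spatial grid: n tokens -> n / pool**2 tokens.
    k2 = F.avg_pool2d(k.transpose(1, 2).reshape(b, d, side, side), pool).flatten(2).transpose(1, 2)
    v2 = F.avg_pool2d(v.transpose(1, 2).reshape(b, d, side, side), pool).flatten(2).transpose(1, 2)
    attn = torch.softmax(q @ k2.transpose(1, 2) / d ** 0.5, dim=-1)  # (b, n, n / pool**2)
    return attn @ v2

q = k = v = torch.randn(2, 64 * 64, 32)
print(pooled_attention(q, k, v).shape)  # torch.Size([2, 4096, 32])
```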
Full article ">Figure 6
<p>Detection results of different object detection algorithms. More detection results of the other defects can be found in the <a href="#app1-sensors-24-04729" class="html-app">Supplementary Materials</a>.</p>
Full article ">
20 pages, 5258 KiB  
Article
YOMO-Runwaynet: A Lightweight Fixed-Wing Aircraft Runway Detection Algorithm Combining YOLO and MobileRunwaynet
by Wei Dai, Zhengjun Zhai, Dezhong Wang, Zhaozi Zu, Siyuan Shen, Xinlei Lv, Sheng Lu and Lei Wang
Drones 2024, 8(7), 330; https://doi.org/10.3390/drones8070330 - 18 Jul 2024
Cited by 1 | Viewed by 1411
Abstract
The runway detection algorithm for fixed-wing aircraft is a hot topic in the field of aircraft visual navigation. High accuracy, high fault tolerance, and lightweight design are the core requirements in the domain of runway feature detection. This paper aims to address these [...] Read more.
The runway detection algorithm for fixed-wing aircraft is a hot topic in the field of aircraft visual navigation. High accuracy, high fault tolerance, and lightweight design are the core requirements in the domain of runway feature detection. This paper aims to address these needs by proposing a lightweight runway feature detection algorithm named YOMO-Runwaynet, designed for edge devices. The algorithm features a lightweight network architecture that follows the YOMO inference framework, combining the advantages of YOLO and MobileNetV3 in feature extraction and operational speed. Firstly, a lightweight attention module is introduced into MnasNet, and the improved MobileNetV3 is employed as the backbone network to enhance the feature extraction efficiency. Then, PANet and SPPnet are incorporated to aggregate the features from multiple effective feature layers. Subsequently, to reduce latency and improve efficiency, YOMO-Runwaynet generates a single optimal prediction for each object, eliminating the need for non-maximum suppression (NMS). Finally, experimental results on embedded devices demonstrate that YOMO-Runwaynet achieves a detection accuracy of over 89.5% on the ATD (Aerovista Runway Dataset), with a pixel error rate of less than 0.003 for runway keypoint detection, and an inference speed exceeding 90.9 FPS. These results indicate that the YOMO-Runwaynet algorithm offers high accuracy and real-time performance, providing effective support for the visual navigation of fixed-wing aircraft. Full article
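The abstract's claim of producing a single optimal prediction per object without NMS is reminiscent of keypoint-heatmap decoding; as an illustrative sketch under that assumption (not the authors' exact decoder), local maxima can be isolated by comparing a heatmap with its max-pooled version:
```python
import torch
import torch.nn.functional as F

def heatmap_peaks(heatmap, k=10, kernel=3):
    """Keep only local maxima of a detection heatmap, then take the top-k peaks.

    heatmap: (batch, classes, H, W) tensor of per-pixel object-center scores.
    Because each object contributes a single peak, no non-maximum suppression is needed.
    """
    pad = (kernel - 1) // 2
    pooled = F.max_pool2d(heatmap, kernel, stride=1, padding=pad)
    peaks = heatmap * (pooled == heatmap).float()   # zero out non-maximal pixels
    scores, idx = peaks.flatten(1).topk(k)          # flatten classes and positions
    h, w = heatmap.shape[2], heatmap.shape[3]
    classes = idx // (h * w)
    ys = (idx % (h * w)) // w
    xs = idx % w
    return scores, classes, ys, xs

scores, classes, ys, xs = heatmap_peaks(torch.randn(1, 2, 96, 96).sigmoid())
print(scores.shape, classes.shape)  # torch.Size([1, 10]) torch.Size([1, 10])
```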
(This article belongs to the Topic Civil and Public Domain Applications of Unmanned Aviation)
Show Figures

Figure 1

Figure 1
<p>MobileNetV3 block structure (using depthwise separable convolutions to enhance computational efficiency by separating spatial filtering from feature generation).</p>
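For orientation, a generic MobileNet-style inverted-residual block of the kind the caption describes might look like the sketch below; the channel width, expansion ratio, and activation are illustrative assumptions, not the exact MobileNetV3 configuration used in YOMO-Runwaynet.
```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Generic MobileNet-style block: 1x1 expansion, depthwise 3x3 spatial
    filtering, 1x1 projection, with a residual connection when shapes match."""
    def __init__(self, channels=64, expansion=4):
        super().__init__()
        hidden = channels * expansion
        self.block = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),        # feature generation
            nn.BatchNorm2d(hidden), nn.Hardswish(),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),  # spatial filtering
            nn.BatchNorm2d(hidden), nn.Hardswish(),
            nn.Conv2d(hidden, channels, 1, bias=False),        # projection
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)

print(InvertedResidual()(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```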
Full article ">Figure 2
<p>The YOMO-Runwaynet network framework.</p>
Full article ">Figure 3
<p>YOMO-Runwaynet module branch calculation method.</p>
Full article ">Figure 4
<p>Runwaynet internal structure.</p>
Full article ">Figure 5
<p>Definition of the runway feature corner point information.</p>
Full article ">Figure 6
<p>Loss function.</p>
Full article ">Figure 7
<p>Aerovista runway dataset (airport runway data for each scenario).</p>
Full article ">Figure 8
<p>Results after model convergence (in the indicators of YOMO-Runwaynet: (<b>a</b>) model recall convergence results; and (<b>b</b>) the mAP results of the model).</p>
Full article ">Figure 9
<p>YOMO-Runwaynet testing results on the ATD mountain scene dataset (data from different altitude segments in mountain scenes: each group from left to right corresponds to altitudes of 300 ft, 200 ft, and 150 ft. Each row represents a set of data corresponding to: (<b>a</b>) foggy day; (<b>b</b>) sunny day; and (<b>c</b>) rainy and snowy day).</p>
Figure 9">
Full article ">Figure 10
<p>YOMO-Runwaynet testing results on the ATD urban scene dataset (data from different altitude segments in urban scenes: each group starts from the first column with altitudes of 600 ft, 300 ft, and 150 ft. Each row represents a set of data corresponding to: (<b>a</b>) sunny day with dual runways, camera installation angle 0°; (<b>b</b>) sunny day with dual runways, camera installation angle 3.5°; (<b>c</b>) urban scene with infrared camera; (<b>d</b>) farmland scene, camera installation angle 5°).</p>
Figure 10">
Full article ">Figure 11
<p>Horizontal and longitudinal pixel error of runway feature points.</p>
Full article ">
23 pages, 17168 KiB  
Article
MEAG-YOLO: A Novel Approach for the Accurate Detection of Personal Protective Equipment in Substations
by Hong Zhang, Chunyang Mu, Xing Ma, Xin Guo and Chong Hu
Appl. Sci. 2024, 14(11), 4766; https://doi.org/10.3390/app14114766 - 31 May 2024
Cited by 1 | Viewed by 954
Abstract
Timely and accurately detecting personal protective equipment (PPE) usage among workers is essential for substation safety management. However, traditional algorithms encounter difficulties in substations due to issues such as varying target scales, intricate backgrounds, and many model parameters. Therefore, this paper proposes MEAG-YOLO, [...] Read more.
Timely and accurately detecting personal protective equipment (PPE) usage among workers is essential for substation safety management. However, traditional algorithms encounter difficulties in substations due to issues such as varying target scales, intricate backgrounds, and many model parameters. Therefore, this paper proposes MEAG-YOLO, an enhanced PPE detection model for substations built upon YOLOv8n. First, the model incorporates the Multi-Scale Channel Attention (MSCA) module to improve feature extraction. Second, it newly designs the EC2f structure with one-dimensional convolution to enhance feature fusion efficiency. Additionally, the study optimizes the Path Aggregation Network (PANet) structure to improve feature learning and the fusion of multi-scale targets. Finally, the GhostConv module is integrated to optimize convolution operations and reduce computational complexity. The experimental results show that MEAG-YOLO achieves a 2.4% increase in precision compared to YOLOv8n, with a 7.3% reduction in FLOPs. These findings suggest that MEAG-YOLO is effective in identifying PPE in complex substation scenarios, contributing to the development of smart grid systems. Full article
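As a rough sketch of the GhostConv idea mentioned in the abstract, in which half of the output channels are produced by an ordinary convolution and the rest by a cheap depthwise operation on those features, with channel counts chosen only for illustration:
```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost-style convolution: a primary convolution produces half of the
    output channels; a cheap depthwise convolution generates the rest."""
    def __init__(self, c_in, c_out, k=1, cheap_k=5):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Conv2d(c_in, c_half, k, padding=k // 2, bias=False)
        self.cheap = nn.Conv2d(c_half, c_half, cheap_k, padding=cheap_k // 2,
                               groups=c_half, bias=False)

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

print(GhostConv(64, 128)(torch.randn(1, 64, 40, 40)).shape)  # torch.Size([1, 128, 40, 40])
```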
Show Figures

Figure 1

Figure 1
<p>YOLOv8n model structure diagram.</p>
Full article ">Figure 2
<p>MEAG-YOLO model diagram.</p>
Full article ">Figure 3
<p>MSCA structure diagram.</p>
Full article ">Figure 4
<p>C2f structure.</p>
Full article ">Figure 5
<p>EC2f structure diagram.</p>
Full article ">Figure 6
<p>GAP structure diagram.</p>
Full article ">Figure 7
<p>ASFF module structure diagram.</p>
Full article ">Figure 8
<p>GhostConv module structure diagram.</p>
Full article ">Figure 9
<p>Data augmentation examples: (<b>a</b>) adjusting brightness; (<b>b</b>) adjusting saturation; (<b>c</b>) adding noise; and (<b>d</b>) random translation (panning).</p>
Full article ">Figure 10
<p>(<b>a</b>) PPE category diagram; (<b>b</b>) distribution plot of x and y coordinates.</p>
Full article ">Figure 11
<p>(<b>a</b>) YOLOv8n P-R curve; (<b>b</b>) MEAG-YOLO P-R curve.</p>
Full article ">Figure 12
<p>Ablation experiment results: (<b>a</b>) comparison of the accuracy of each model; (<b>b</b>) comparison of the FLOPs.</p>
Full article ">Figure 13
<p>Comparison of precision with different model FLOPs.</p>
Full article ">Figure 14
<p>Comparison of experimental results.</p>
Full article ">Figure 15
<p>Comparison of model detection results. (<b>a</b>) YOLOv8n’s detection results; (<b>b</b>) MEAG-YOLO’s detection results. (<b>c</b>) YOLOv8n’s detection results; (<b>d</b>) MEAG-YOLO’s detection results.</p>
Full article ">Figure 16
<p>Comparison of model detection results. (<b>a</b>) YOLOv8n’s detection results; (<b>b</b>) MEAG-YOLO’s detection results; (<b>c</b>) Local magnification of YOLOv8n’s detection results; (<b>d</b>) Local magnification of MEAG-YOLO’s detection results.</p>
Full article ">