DOI: 10.1145/3528114.3528131
research-article

FM-YOLO Object Detection Algorithm

Published: 24 June 2022

Abstract

To address the large parameter count and limited detection accuracy of YOLOv4, a lightweight object detection algorithm, FM-YOLO (Fused Mobile-You Only Look Once), is proposed. The algorithm improves the convolution layers of the deep feature-extraction network in two ways. First, MBConv is used to widen the convolution operation while reducing the number of model parameters. Second, Fused-MBConv is used in the shallow layers, where depthwise separable convolution cannot fully exploit GPU acceleration and therefore slows training. In addition, the activation function is changed from Mish to SiLU, and dropout is added to reduce over-fitting during training. On the VOC2007 dataset, FM-YOLO has 1.4M fewer parameters than YOLOv4, and its mAP is 1.45% higher. The algorithm thus improves detection accuracy while slightly reducing the number of model parameters.
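
To make the parameter trade-off between the block types concrete, the following is a minimal sketch rather than the authors' implementation. It counts only convolution weights, and it assumes a 3 x 3 kernel, an expansion ratio of 4, and no squeeze-and-excitation, bias, or batch-norm terms; the function names and these defaults are illustrative assumptions that the abstract does not specify. The two activation functions are included using their standard definitions.

```python
import math


def conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def mbconv_params(c_in, c_out, k=3, expand=4):
    """Simplified MBConv: 1 x 1 expansion, k x k depthwise, 1 x 1 projection.

    The expansion ratio is an assumed value, not one taken from the paper.
    """
    c_mid = c_in * expand
    return c_in * c_mid + k * k * c_mid + c_mid * c_out


def fused_mbconv_params(c_in, c_out, k=3, expand=4):
    """Simplified Fused-MBConv: one regular k x k convolution replaces the
    expansion and depthwise steps, followed by a 1 x 1 projection."""
    c_mid = c_in * expand
    return k * k * c_in * c_mid + c_mid * c_out


def silu(x):
    """SiLU activation: x * sigmoid(x). Fine for moderate |x|."""
    return x / (1.0 + math.exp(-x))


def mish(x):
    """Mish activation: x * tanh(softplus(x)). Fine for moderate |x|."""
    return x * math.tanh(math.log1p(math.exp(x)))


if __name__ == "__main__":
    # Compare weight counts at a few hypothetical channel widths.
    for channels in (32, 64, 256):
        print(
            channels,
            conv_params(channels, channels),
            depthwise_separable_params(channels, channels),
            mbconv_params(channels, channels),
            fused_mbconv_params(channels, channels),
        )
```

Under these assumptions, with 64 input and 64 output channels, a standard convolution has 36,864 weights, a depthwise separable convolution has 4,672, the simplified MBConv has 35,072, and the simplified Fused-MBConv has 163,840. This illustrates why a fused block costs more parameters than MBConv at equal width: its regular convolution trades parameter count for better GPU utilization, which is consistent with the abstract placing it only in the shallow layers.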

Cited By

  • (2023) PASFLN: Positional Association and Semantic Fusion Learning Network for Traffic Object Detection. 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC), pp. 329-334. DOI: 10.1109/ITSC57777.2023.10422508. Online publication date: 24-Sep-2023.

Published In

DSDE '22: Proceedings of the 2022 5th International Conference on Data Storage and Data Engineering
February 2022
124 pages
ISBN: 9781450395724
DOI: 10.1145/3528114
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Fused-MBConv
  2. Lightweight Network
  3. MBConv
  4. Object Detection
  5. YOLOv4

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Chongqing University of Technology Postgraduate Innovation Project Fund
  • Chongqing Municipal Education Commission's Social Network Community Division and Influence Maximization Method Research Project Fund in a Competitive Environment

Conference

DSDE 2022

Article Metrics

  • Downloads (last 12 months): 26
  • Downloads (last 6 weeks): 1

Reflects downloads up to 19 Dec 2024.
