UAV-YOLO: Small Object Detection on Unmanned Aerial Vehicle Perspective
Figure 1. The structure of our proposed UAV-YOLO.
Figure 2. The backbone structure of YOLOv3 and UAV-YOLO. (a) Original YOLOv3 structure. (b) The proposed UAV-YOLO structure.
Figure 3. The proposed training method and steps to enhance YOLOv3 performance on UAV-viewed human detection.
Figure 4. Clustering results for different numbers of anchor boxes.
Figure 5. Detection results of different methods on the UAV-viewed dataset. (a) Original YOLOv3; (b) YOLOv3 with the optimized training method; (c) UAV-YOLO.
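The caption on anchor-box clustering refers to a step that YOLO-family detectors commonly perform with k-means over the ground-truth box sizes, using d = 1 - IoU as the distance. The following is a minimal sketch of that general technique, not the authors' code: the function names `wh_iou`, `kmeans_anchors`, and `mean_best_iou`, the random initialization, and the mean-based centroid update are all assumptions made for illustration.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors with distance d = 1 - IoU.

    Hypothetical helper: `boxes` is a list of (width, height) tuples with
    positive entries, and k must not exceed len(boxes).
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct training boxes
    for _ in range(iters):
        # Assign each box to the anchor with the highest IoU (smallest 1 - IoU).
        groups = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: wh_iou(b, anchors[i]))
            groups[best].append(b)
        # Move each anchor to the mean size of its group; keep it if the group is empty.
        new_anchors = []
        for a, g in zip(anchors, groups):
            if g:
                mean_w = sum(w for w, _ in g) / len(g)
                mean_h = sum(h for _, h in g) / len(g)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(a)
        if new_anchors == anchors:  # assignments are stable, so stop early
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])  # smallest area first


def mean_best_iou(boxes, anchors):
    """Average, over all boxes, of the IoU with the best-matching anchor."""
    return sum(max(wh_iou(b, a) for a in anchors) for b in boxes) / len(boxes)
```

Running `kmeans_anchors` for several values of k (for example 3, 6, and 9, matching the Anchor3, Anchor6, and Anchor9 rows in the results table) and comparing `mean_best_iou` is one way to compare different anchor counts.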
Abstract
1. Introduction
2. Related Work
3. Proposed Method
3.1. YOLOv3 Used for Human Detection on UAV Perspective
3.2. UAV-YOLO: Optimized YOLOv3 for UAV-Viewed Human Detection
3.3. Optimized UAV-YOLO Training for UAV-Viewed Human Detection
4. Experimental Results
4.1. UAV-Viewed Dataset and Implementation Detail
4.2. Experimental Results of Optimized Training Method Using YOLOv3
4.3. State-of-the-Art Comparison
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References

Optimized Method | UAV-Viewed mAP/% | UAV-Viewed IOU/% | Normal mAP/% | Normal IOU/% | Games mAP/% | Games IOU/% | Far mAP/% | Far IOU/% |
---|---|---|---|---|---|---|---|---|
Original data | 51.41 | 66.17 | 90.81 | 85.75 | 94.32 | 91.44 | 15.58 | 29.12 |
Classified data | 90.88 | 78.20 | 90.83 | 80.40 | 90.76 | 74.11 | 56.44 | 55.02 |
Anchor3 | 90.84 | 79.07 | 90.86 | 80.20 | 90.91 | 72.62 | 59.60 | 55.83 |
Anchor6 | 90.88 | 78.77 | 90.85 | 82.13 | 87.35 | 71.05 | 57.15 | 56.43 |
Anchor9 | 90.91 | 80.59 | 90.91 | 84.16 | 90.91 | 74.40 | 60.67 | 57.43 |
Mining | 90.89 | 80.29 | 90.90 | 83.85 | 90.48 | 74.11 | 61.72 | 61.84 |

Optimized Method | UAV-Viewed mAP/% | UAV-Viewed IOU/% | Normal mAP/% | Normal IOU/% | Games mAP/% | Games IOU/% | Far mAP/% | Far IOU/% |
---|---|---|---|---|---|---|---|---|
UAV-YOLO | 90.86 | 80.42 | 90.90 | 84.11 | 90.62 | 76.54 | 64.42 | 68.02 |
YOLOv3 | 90.89 | 80.29 | 90.90 | 83.85 | 90.48 | 74.11 | 61.72 | 61.84 |
SSD300 | 89.87 | 72.34 | 90.68 | 76.45 | 89.19 | 68.21 | 56.01 | 52.98 |
SSD512 | 90.92 | 74.23 | 90.89 | 78.86 | 90.71 | 70.08 | 61.09 | 56.84 |

Methods | mAP/% | IOU/% | Speed/fps |
---|---|---|---|
UAV-YOLO | 72.54 | 70.05 | 20 |
YOLOv3 | 72.21 | 68.43 | 20 |
SSD300 | 62.94 | 60.72 | 23 |
SSD512 | 72.54 | 60.72 | 23 |
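
The IOU/% and mAP/% columns in the tables above depend on how box overlap and average precision are computed, which this excerpt does not specify. As a rough illustration only, the sketch below shows a standard box IoU and the 11-point interpolated AP used in PASCAL VOC 2007; the function names, the (x1, y1, x2, y2) box format, and the choice of the 11-point variant are assumptions. The many mAP entries near 90.91% are consistent with 10/11, which is what 11-point interpolation gives when precision is perfect at every recall threshold except 1.0, but that is an observation, not a confirmed detail of the authors' protocol.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def eleven_point_ap(scores, is_true_positive, num_ground_truth):
    """11-point interpolated AP for one class (PASCAL VOC 2007 style).

    Hypothetical helper: `is_true_positive[i]` is assumed to have been decided
    beforehand, e.g. by thresholding `box_iou` against a matched ground truth.
    """
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Average the best precision reachable at recall >= 0.0, 0.1, ..., 1.0.
    total = 0.0
    for k in range(11):
        threshold = k / 10
        reachable = [p for p, r in zip(precisions, recalls) if r >= threshold]
        total += max(reachable) if reachable else 0.0
    return total / 11
```

With ten ground-truth objects and nine detections that are all correct, recall tops out at 0.9, so the threshold at 1.0 contributes zero and the result is 10/11.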
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Liu, M.; Wang, X.; Zhou, A.; Fu, X.; Ma, Y.; Piao, C. UAV-YOLO: Small Object Detection on Unmanned Aerial Vehicle Perspective. Sensors 2020, 20, 2238. https://doi.org/10.3390/s20082238