DeepPilot: A CNN for Autonomous Drone Racing
Figure 1. Overview of our approach, which consists of 4 steps: (1) data acquisition using the drone's onboard camera; (2) real-time generation of a mosaic composed of 6 frames; (3) flight command prediction using our proposed CNN, DeepPilot, where the commands are represented by the tuple (ϕ, θ, ψ, h); (4) a filter to smooth the predicted signals. A video illustrating the performance of our proposed DeepPilot can be found at https://youtu.be/Qo48pRCxM40.
Figure 2. Quadcopter body frame: ϕ is the rotation about the X-axis, θ is the rotation about the Y-axis, and ψ is the rotation about the Z-axis.
Figure 3. DeepPilot architecture: our proposed DeepPilot runs 3 specialized models in parallel. The first predicts the ϕ and θ angular positions of the body frame; the second predicts ψ, the rotational speed about the Z-axis; and the third predicts h, the vertical speed. The kernel sizes are indicated in the colored boxes at the bottom left.
Figure 4. Racetracks in Gazebo for data collection. (a) This racetrack is composed of 7 gates, each 2 m in height; the track spans a surface of 53.5 m × 9.6 m, with spacing between gates of 10 m to 12 m. (b) A second racetrack composed of 3 gates 3.5 m in height, 4 gates 2 m in height, and 4 gates 1.2 m in height, randomly positioned; the track spans a surface of 72 m × 81 m, with spacing between gates of 2 m to 12 m.
Figure 5. Schematic of the drone's lateral motion: (a) outside view of the side motion corresponding to the gate appearing to the left of the image (b); (c) the side motion corresponding to the gate appearing to the right of the image (d). The flight command ϕ takes values in the range [−1, 1]; the values for (θ, ψ, h) are set to zero.
Figure 6. Schematic of the drone's forward motion: (a) outside view of the forward motion corresponding to the gate appearing in the image center (b,c). The flight command θ takes values in the range [0, 1]; the values for (ϕ, ψ, h) are set to zero.
Figure 7. Schematic of the drone's rotational motion: (a) outside view of the rotational motion in the yaw angle corresponding to the gate appearing skewed towards the right of the image (b); (c) the yaw motion corresponding to the gate appearing skewed towards the left of the image (d). The flight command ψ takes values in the range [−1, 1]; the values for (ϕ, θ, h) are set to zero.
Figure 8. Schematic of the drone's vertical motion: (a) outside view of the drone flying upwards when the gate appears at the bottom of the image (b); (c) downwards motion when the gate appears at the top of the image (d). The flight command h takes values in the range [−1, 1]; the values for (ϕ, θ, ψ) are set to zero.
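Figures 5–8 effectively define how each training sample is labelled: during data collection a single command dominates at a time, its value is restricted to the range given in the caption, and the remaining three commands are set to zero. The snippet below is a minimal sketch of that convention with hypothetical names; it is illustrative, not the authors' labelling code.

```python
import numpy as np

# Command ranges from Figures 5-8: roll, yaw and altitude in [-1, 1], pitch in [0, 1].
RANGES = {"phi": (-1.0, 1.0), "theta": (0.0, 1.0), "psi": (-1.0, 1.0), "h": (-1.0, 1.0)}

def make_label(active_command, value):
    """Return the (phi, theta, psi, h) label for a frame where one command is active."""
    lo, hi = RANGES[active_command]
    label = {name: 0.0 for name in RANGES}
    label[active_command] = float(np.clip(value, lo, hi))   # clamp to the allowed range
    return label["phi"], label["theta"], label["psi"], label["h"]

# e.g. a lateral correction of 0.4 -> (0.4, 0.0, 0.0, 0.0)
print(make_label("phi", 0.4))
```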
Figure 9. Examples from the dataset collected during the pilot's flights.
Figure 10. Mosaic image examples where each mosaic image (a)–(d) is composed of 2 frames.
Figure 11. Mosaic image examples where each mosaic image (a)–(d) is composed of 4 frames.
Figure 12. Mosaic image examples where each mosaic image (a)–(d) is composed of 6 frames.
Figure 13. Mosaic image examples where each mosaic image (a)–(d) is composed of 8 frames.
Figure 14. To compose the mosaic, the slots in the mosaic image are filled every 5 frames. (a) Once the mosaic is full, the first frame is removed from position 1 and all frames shift to the left; that is, the frame in position 2 moves to position 1, the frame in position 3 moves to position 2, and so on. (b) The same process for a mosaic composed of 4 images; note that the image in position 3 moves to position 2 when the shift occurs. (c,d) A similar process when the mosaic is composed of 6 and 8 images, respectively.
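The sliding-window behaviour described in the Figure 14 caption can be captured with a small buffer. Below is a minimal sketch, assuming incoming frames already match the tile size implied by the mosaic resolutions reported in the dataset tables (640 × 360 tiles); class and parameter names are illustrative, not the authors' implementation.

```python
from collections import deque
import numpy as np

class MosaicBuffer:
    """Sketch of the mosaic composition in Figure 14 (illustrative only).

    A new frame is stored every `stride` camera frames; once the buffer is full,
    appending drops the oldest frame, which is equivalent to shifting every frame
    one position to the left. Frames are tiled row by row: 2 frames in one row,
    4/6/8 frames in two rows, matching the reported mosaic resolutions
    (1280x360, 1280x720, 1920x720, 2560x720)."""

    def __init__(self, num_frames=6, tile_size=(640, 360), stride=5):
        self.num_frames = num_frames
        self.tile_w, self.tile_h = tile_size
        self.stride = stride
        self.rows = 1 if num_frames == 2 else 2
        self.cols = num_frames // self.rows
        self.frames = deque(maxlen=num_frames)     # deque performs the left shift for us
        self._seen = 0

    def update(self, frame):
        """Store `frame` every `stride` calls; return True if the mosaic changed."""
        self._seen += 1
        if self._seen % self.stride != 0:
            return False
        self.frames.append(frame)                  # oldest frame is dropped when full
        return True

    def image(self):
        """Tile the stored frames into a single mosaic image."""
        mosaic = np.zeros((self.rows * self.tile_h, self.cols * self.tile_w, 3),
                          dtype=np.uint8)
        for i, frame in enumerate(self.frames):
            r, c = divmod(i, self.cols)
            mosaic[r * self.tile_h:(r + 1) * self.tile_h,
                   c * self.tile_w:(c + 1) * self.tile_w] = frame
        return mosaic
```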
Figure 15. General control architecture of our approach, based on an open-loop controller. The mosaic image corresponds to the observations of the world, i.e., the racetrack. This image is passed to our CNN-based approach, DeepPilot, to predict the flight commands (ϕ, θ, ψ, h), which are fed into the drone's inner-loop controller. As a result, the drone is commanded to fly towards the gate. This controller can be seen as a reactive controller.
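Because the controller is reactive, the whole pipeline reduces to a subscribe–predict–publish loop. The sketch below assumes a ROS setup such as tum_simulator (listed in the References); the topic names, the Twist mapping, and `predict_fn` are assumptions for illustration, not the authors' exact code, and `MosaicBuffer` refers to the earlier sketch.

```python
#!/usr/bin/env python
# Illustrative sketch of the reactive loop in Figure 15 (topic names and the
# Twist mapping are assumptions).
import rospy
from geometry_msgs.msg import Twist
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

class ReactivePilotNode:
    def __init__(self, predict_fn, mosaic):
        self.predict_fn = predict_fn            # mosaic image -> (phi, theta, psi, h)
        self.mosaic = mosaic                    # e.g. MosaicBuffer(num_frames=6)
        self.bridge = CvBridge()
        self.cmd_pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
        rospy.Subscriber("/ardrone/front/image_raw", Image, self.on_image, queue_size=1)

    def on_image(self, msg):
        frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
        if not self.mosaic.update(frame):       # mosaic only refreshes every few frames
            return
        phi, theta, psi, h = self.predict_fn(self.mosaic.image())
        cmd = Twist()
        cmd.linear.y = phi                      # roll  -> lateral velocity command
        cmd.linear.x = theta                    # pitch -> forward velocity command
        cmd.linear.z = h                        # vertical speed command
        cmd.angular.z = psi                     # yaw rate command
        self.cmd_pub.publish(cmd)               # the drone's inner loop handles stabilization

if __name__ == "__main__":
    rospy.init_node("deeppilot_reactive")
    # node = ReactivePilotNode(predict_fn=..., mosaic=MosaicBuffer(num_frames=6))
    # rospy.spin()
```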
Figure 16. Zigzag racetrack in Gazebo used to evaluate our proposed DeepPilot architecture and compare it against PoseNet [6]. The track spans a surface of 60 m × 7 m, with 11 m of space between gates. We evaluated: (1) PoseNet trained to predict the 4 flight commands (ϕ, θ, ψ, h); (2) PoseNet trained to predict only (ϕ, θ); (3) DeepPilot trained to predict the 4 flight commands. We used a mosaic image composed of 6 frames as input to the networks.
Figure 17. Top view (a) and side view (b) of the performance of PoseNet, trained to predict the 4 flight commands, for 10 runs in the zigzag racetrack. Note that PoseNet failed to cross the first gate in all runs. We used a mosaic image composed of 6 frames as input to the network.
Figure 18. Top view (a) and side view (b) of the performance of our proposed DeepPilot, trained to predict the 4 flight commands (ϕ, θ, ψ, h), for 10 runs in the zigzag track; every run was successful. The average time for the drone to complete the track was 2 min 31 s, and the command prediction ran at 25 fps.
Figure 19. First racetrack used to evaluate the mosaic, with a total of 18 gates. The track spans a surface of 62 m × 44 m, with spacing between gates of 7 m to 9 m. The gates' heights are as follows: 7 gates 2 m in height, 8 gates 2.5 m in height, and 3 gates 3 m in height, randomly positioned. Note that this track was not used for training. A video illustrating the performance of our proposed DeepPilot on this racetrack can be found at https://youtu.be/Qo48pRCxM40.
Figure 20. Top view of 10 runs performed on the track shown in Figure 19. Each sub-image shows DeepPilot's performance when using a mosaic image composed of: (a) 2 frames; (b) 4 frames; (c) 6 frames; (d) 8 frames. Note that (c,d) are successful and similar in performance.
Figure 21. Side view of the 10 runs shown in Figure 20. Note that mosaic images of 2 or 4 frames lead to unstable flight and do not allow the drone to finish the racetrack.
Figure 22. Second racetrack used to evaluate DeepPilot with a mosaic of 6 frames. The track spans a surface of 50 m × 23 m, with spacing between gates of 5 m to 9 m. Note that this track was not used for training. (a) The track is formed by 10 gates: 4 gates of 1.2 m in height and 6 gates of 2 m in height. (b) Drone trajectories obtained for 5 runs on this racetrack. DeepPilot piloted the drone on the track by providing the flight commands (ϕ, θ, ψ, h). A video illustrating the performance of our proposed DeepPilot on this racetrack can be found at https://youtu.be/Qo48pRCxM40.
Figure 23. Third racetrack used to evaluate DeepPilot with a mosaic of 6 frames. The track spans a surface of 70 m × 40 m, with spacing between gates of 6 m to 15 m. Note that this track was not used for training. (a) The track is formed by 16 gates: 8 gates of 1.2 m in height and 8 gates of 2 m in height. (b) Drone trajectories obtained for 5 runs on this racetrack. DeepPilot piloted the drone on the track by providing the flight commands (ϕ, θ, ψ, h). A video illustrating the performance of our proposed DeepPilot on this racetrack can be found at https://youtu.be/Qo48pRCxM40.
Abstract
1. Introduction
- We present a new CNN architecture to predict flight commands from either a single image or a single mosaic image, in both cases captured only with a monocular camera onboard the drone.
- We present a thorough evaluation in simulated racetracks where gates have different heights and variations in the yaw angle. Unlike our previous work, where the number of frames used to create the mosaic was arbitrarily fixed at 6, we evaluate mosaics composed of 2, 4, 6 or 8 frames and compare them against using a single image as input to the network. We aim to show that temporality is essential for our approach to succeed, since a single image may not be enough for the CNN to predict the drone's flight commands. In contrast, we argue that consecutive frames act as a memory that captures a motion trend that the network can learn, thus helping it to predict adequate flight commands.
- We show that successful prediction of the flight commands can be achieved if we decouple the yaw and altitude commands from roll and pitch. This means that our model runs 3 parallel branches: one infers roll and pitch together, while yaw and altitude are each inferred separately (see the architecture sketch after this list). This design is inspired by works on visual SLAM and visual odometry, where orientation and translation are decoupled to obtain better prediction results [8,9]. Our experiments indicate that this decoupling is also beneficial in our approach.
- The following items are publicly available at https://github.com/QuetzalCpp/DeepPilot.git: our training and test datasets; our trained DeepPilot model; a set of racetrack worlds in Gazebo; and Python scripts to run our model.
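To make the decoupling above concrete, the following is a minimal sketch of the three specialized regression models. The backbone layers, kernel sizes, output activation and input resolution here are placeholders, not the exact DeepPilot configuration shown in Figure 3.

```python
# Minimal sketch of the decoupled, three-model idea (placeholder backbone and
# input shape; the real DeepPilot layers are those shown in Figure 3).
from tensorflow.keras import layers, Model

def build_branch(n_outputs, name, input_shape=(120, 320, 3)):
    """Small convolutional regressor for a subset of the flight commands."""
    inp = layers.Input(shape=input_shape)
    x = inp
    for filters in (32, 64, 128):                      # placeholder backbone
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(256, activation="relu")(x)
    out = layers.Dense(n_outputs, activation="tanh", name=name)(x)   # commands in [-1, 1]
    return Model(inp, out, name="deeppilot_" + name)

roll_pitch_model = build_branch(2, "roll_pitch")   # predicts (phi, theta)
yaw_model        = build_branch(1, "yaw")          # predicts psi
altitude_model   = build_branch(1, "altitude")     # predicts h
```

At inference time the three models are evaluated on the same mosaic image and their outputs are combined into the tuple (ϕ, θ, ψ, h).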
2. Related Work
3. Proposed Framework
3.1. Quadcopter Model
3.2. DeepPilot
3.3. Data Acquisition
3.4. Mosaic Generation
3.5. Noise Filter
4. Results and Discussion
4.1. DeepPilot Performance with a Validation Dataset
4.2. DeepPilot vs PoseNet in a Zigzag Track
4.3. Mosaic Evaluation
4.4. Discussion on Porting Over to the Physical Drone
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Moon, H.; Martinez-Carranza, J.; Cieslewski, T.; Faessler, M.; Falanga, D.; Simovic, A.; Scaramuzza, D.; Li, S.; Ozo, M.; De Wagter, C.; et al. Challenges and implemented technologies used in autonomous drone racing. Intell. Serv. Rob. 2019, 12, 137–148.
- Jung, S.; Hwang, S.; Shin, H.; Shim, D.H. Perception, guidance, and navigation for indoor autonomous drone racing using deep learning. IEEE Rob. Autom. Lett. 2018, 3, 2539–2544.
- Kaufmann, E.; Gehrig, M.; Foehn, P.; Ranftl, R.; Dosovitskiy, A.; Koltun, V.; Scaramuzza, D. Beauty and the beast: Optimal methods meet learning for drone racing. arXiv 2018, arXiv:1810.06224.
- Foehn, P.; Brescianini, D.; Kaufmann, E.; Cieslewski, T.; Gehrig, M.; Muglikar, M.; Scaramuzza, D. AlphaPilot: Autonomous Drone Racing. arXiv 2020, arXiv:2005.12813.
- Madaan, R.; Gyde, N.; Vemprala, S.; Brown, M.; Nagami, K.; Taubner, T.; Cristofalo, E.; Scaramuzza, D.; Schwager, M.; Kapoor, A. AirSim Drone Racing Lab. arXiv 2020, arXiv:2003.05654.
- Kendall, A.; Grimes, M.; Cipolla, R. PoseNet: A convolutional network for real-time 6-DOF camera relocalization. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 2938–2946.
- Rojas-Perez, L.O.; Martinez-Carranza, J. A Temporal CNN-based Approach for Autonomous Drone Racing. In Proceedings of the 2019 Workshop on Research, Education and Development of Unmanned Aerial Systems (RED UAS), Cranfield, UK, 25–27 November 2019; pp. 70–77.
- Silveira, G.; Malis, E.; Rives, P. An efficient direct approach to visual SLAM. IEEE Trans. Rob. 2008, 24, 969–979.
- Xue, F.; Wang, Q.; Wang, X.; Dong, W.; Wang, J.; Zha, H. Guided feature selection for deep visual odometry. In Proceedings of the Asian Conference on Computer Vision, Perth, Australia, 2–6 December 2018; pp. 293–308.
- Jung, S.; Cho, S.; Lee, D.; Lee, H.; Shim, D.H. A direct visual servoing-based framework for the 2016 IROS Autonomous Drone Racing Challenge. J. Field Rob. 2018, 35, 146–166.
- Moon, H.; Sun, Y.; Baltes, J.; Kim, S.J. The IROS 2016 Competitions [Competitions]. IEEE Rob. Autom. Mag. 2017, 24, 20–29.
- Illingworth, J.; Kittler, J. A Survey of the Hough Transform. Comput. Vision Graph. Image Process. 1988, 44, 87–116.
- Rojas-Perez, L.O.; Martinez-Carranza, J. Metric monocular SLAM and colour segmentation for multiple obstacle avoidance in autonomous flight. In Proceedings of the 2017 Workshop on Research, Education and Development of Unmanned Aerial Systems (RED-UAS), Linkoping, Sweden, 3–5 October 2017; pp. 234–239.
- Mur-Artal, R.; Tardós, J.D. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Rob. 2017, 33, 1255–1262.
- Li, S.; Ozo, M.M.; De Wagter, C.; de Croon, G.C. Autonomous drone race: A computationally efficient vision-based navigation and control strategy. arXiv 2018, arXiv:1809.05958.
- Li, S.; van der Horst, E.; Duernay, P.; De Wagter, C.; de Croon, G.C. Visual Model-predictive Localization for Computationally Efficient Autonomous Racing of a 72-gram Drone. arXiv 2019, arXiv:1905.10110.
- De Croon, G.C.; De Wagter, C.; Remes, B.D.; Ruijsink, R. Sub-sampling: Real-time vision for micro air vehicles. Rob. Autom. Syst. 2012, 60, 167–181.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the Neural Information Processing Systems Conference, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105.
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
- Li, Y.; Huang, H.; Xie, Q.; Yao, L.; Chen, Q. Research on a surface defect detection algorithm based on MobileNet-SSD. Appl. Sci. 2018, 8, 1678.
- Kehl, W.; Manhardt, F.; Tombari, F.; Ilic, S.; Navab, N. SSD-6D: Making RGB-based 3D detection and 6D pose estimation great again. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1521–1529.
- Poirson, P.; Ammirato, P.; Fu, C.Y.; Liu, W.; Kosecka, J.; Berg, A.C. Fast single shot detection and pose estimation. In Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016; pp. 676–684.
- Cabrera-Ponce, A.A.; Rojas-Perez, L.O.; Carrasco-Ochoa, J.A.; Martinez-Trinidad, J.F.; Martinez-Carranza, J. Gate Detection for Micro Aerial Vehicles using a Single Shot Detector. IEEE Lat. Am. Trans. 2019, 17, 2045–2052.
- Kaufmann, E.; Loquercio, A.; Ranftl, R.; Dosovitskiy, A.; Koltun, V.; Scaramuzza, D. Deep drone racing: Learning agile flight in dynamic environments. arXiv 2018, arXiv:1806.08548.
- Cocoma-Ortega, J.A.; Martinez-Carranza, J. A CNN based Drone Localisation Approach for Autonomous Drone Racing. In Proceedings of the 11th International Micro Air Vehicle Competition and Conference, Madrid, Spain, 30 September–4 October 2019.
- Cocoma-Ortega, J.A.; Martínez-Carranza, J. Towards high-speed localisation for autonomous drone racing. In Mexican International Conference on Artificial Intelligence; Springer: Xalapa, Mexico, 2019; pp. 740–751.
- Bojarski, M.; Del Testa, D.; Dworakowski, D.; Firner, B.; Flepp, B.; Goyal, P.; Jackel, L.D.; Monfort, M.; Muller, U.; Zhang, J.; et al. End to end learning for self-driving cars. arXiv 2016, arXiv:1604.07316.
- Smolyanskiy, N.; Kamenev, A.; Smith, J.; Birchfield, S. Toward low-flying autonomous MAV trail navigation using deep neural networks for environmental awareness. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 4241–4247.
- Muller, M.; Li, G.; Casser, V.; Smith, N.; Michels, D.L.; Ghanem, B. Learning a Controller Fusion Network by Online Trajectory Filtering for Vision-based UAV Racing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019; pp. 573–581.
- Wang, L.; Xiong, Y.; Wang, Z.; Qiao, Y.; Lin, D.; Tang, X.; Van Gool, L. Temporal segment networks: Towards good practices for deep action recognition. In European Conference on Computer Vision; Springer: Amsterdam, The Netherlands, 2016; pp. 20–36.
- Huang, H.; Sturm, J. tum_simulator. 2014. Available online: http://wiki.ros.org/tum_simulator (accessed on 17 February 2018).
DeepPilot Models | Parameters | Value
---|---|---
(1) Roll & pitch, (2) Yaw, (3) Altitude | Optimizer | Adam
 | Epochs | 500
 | Batch size | 32
 | Activation function | ReLU
 | Learning rate | 0.001
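As a hedged illustration of how the hyperparameters above could be applied in Keras (the MSE loss and the data arrays are assumptions; `roll_pitch_model` refers to the architecture sketch in the Introduction):

```python
# Illustrative use of the hyperparameters above (Adam, lr 0.001, 500 epochs,
# batch size 32); the MSE loss and the data arrays are assumptions.
from tensorflow.keras.optimizers import Adam

def train_branch(model, x_train, y_train, x_val, y_val):
    model.compile(optimizer=Adam(learning_rate=0.001), loss="mse")
    return model.fit(x_train, y_train,
                     validation_data=(x_val, y_val),
                     epochs=500, batch_size=32)

# Each specialized model is trained on its own labels, e.g.:
# train_branch(roll_pitch_model, train_mosaics, train_labels[:, :2],
#              val_mosaics, val_labels[:, :2])
```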
Command | Training Mean | Training Std | Training Max | Training Min | Validation Mean | Validation Std | Validation Max | Validation Min
---|---|---|---|---|---|---|---|---
Roll | 0.028 | 0.296 | ±0.9 | ±0.1 | 0.053 | 0.299 | ±1 | ±0.1
Pitch | 0.232 | 0.357 | +1 | +0.1 | 0.696 | 0.422 | +1 | +0.1
Yaw | 0.0 | 0.036 | ±0.1 | ±0.05 | 0.005 | 0.037 | ±0.1 | ±0.05
Altitude | 0.0023 | 0.1409 | ±0.1 | ±0.05 | 0.0013 | 0.0925 | ±0.1 | ±0.05
Datasets | ϕ & θ | ψ | h | Resolution
---|---|---|---|---
Mosaic composed of 2 frames | 5298 | 2928 | 938 | 1280 × 360
Mosaic composed of 4 frames | 5296 | 2926 | 936 | 1280 × 720
Mosaic composed of 6 frames | 5294 | 2924 | 934 | 1920 × 720
Mosaic composed of 8 frames | 5292 | 2922 | 932 | 2560 × 720
Datasets | ϕ & θ | ψ | h | Resolution
---|---|---|---|---
Mosaic composed of 2 frames | 508 | 448 | 238 | 1280 × 360
Mosaic composed of 4 frames | 506 | 446 | 236 | 1280 × 720
Mosaic composed of 6 frames | 504 | 444 | 234 | 1920 × 720
Mosaic composed of 8 frames | 502 | 442 | 232 | 2560 × 720
Image Input | MSE ϕ | Std ϕ | MSE θ | Std θ | MSE ψ | Std ψ | MSE h | Std h
---|---|---|---|---|---|---|---|---
Mosaic_6 | 0.139 | 0.372 | 0.144 | 0.370 | 0.0135 | 0.0965 | 0.0029 | 0.0507
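For reference, a small sketch of how per-command MSE and standard deviation like those in the table above could be computed from validation predictions (illustration only; whether the reported Std is taken over the error or over the raw predictions is not restated here, so the choice below is an assumption):

```python
import numpy as np

def per_command_errors(y_true, y_pred, names=("phi", "theta", "psi", "h")):
    """MSE and Std of the prediction error for each flight command (columns of y)."""
    err = np.asarray(y_pred) - np.asarray(y_true)      # shape (n_samples, 4)
    mse = np.mean(err ** 2, axis=0)
    std = np.std(err, axis=0)
    return {name: {"mse": float(m), "std": float(s)}
            for name, m, s in zip(names, mse, std)}
```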
Network | Mean fps | Std
---|---|---
PoseNet - Mosaic_6 | 21.3 | 0.1761
DeepPilot - Mosaic_6 | 25.4 | 0.1100
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).