Deep RetinaNet-Based Detection and Classification of Road Markings by Visible Light Camera Sensors
Figure 1. Road marking objects. (a) Bike. (b) Forward arrow. (c) Forward-left arrow. (d) Forward-right arrow. (e) Forward-left-right arrow. (f) Left arrow. (g) Left-right arrow. (h) Right arrow.
Figure 2. Different shapes of arrow markings observed in the image from the front-view camera in the vehicle for different datasets. (a) Malaga urban dataset image captured in Spain [16]. (b) Daimler dataset image captured in Germany [17]. (c) Cambridge dataset image captured in the UK [18].
Figure 3. Proposed method for the detection and classification of road markings based on deep RetinaNet.
Figure 4. Architecture of RetinaNet: (a) multi-scale convolutional feature pyramid produced by a residual network (ResNet) as an encoder (left) and a feature pyramid network (FPN) as a decoder (right); (b) class subnet for classifying anchor boxes (top), and box subnet for regressing from anchor boxes to ground-truth object boxes (bottom).
Figure 5. Architecture of deep RetinaNet with revised ResNet (block in orange box) and FPN (block in gray box). M3~5 denote the feature maps obtained from Conv3~5, respectively, whereas P3~5 denote the feature maps used for prediction.
Figure 6. Lateral connections between the ResNet backbone and FPN, and top-down pathway merged by addition.
Figure 7. Examples of images from the open datasets. (a) Cambridge dataset. (b) Daimler dataset. (c) Malaga urban dataset.
Figure 8. Examples of augmented images. (a) Original images. (b) Flipped images.
Figure 9. Convergence graphs depicting losses from the training process. (a) First set. (b) Second set.
Figure 10. Faded bike markings on the road (red box). (a) Example 1 and (b) Example 2.
Figure 11. Examples of correct detection and classification cases. (a) Seq01TP, (b) Seq05VD, (c) Seq06R0, and (d) Seq16E5 from the Cambridge dataset; (e) Test2, (f) Train1, and (g) Train3 from the Daimler dataset; (h) Malaga urban dataset. In (a–h), true positive cases are shown by boxes of various colors.
Figure 12. Examples of incorrect detection and classification. (a) Test2 of the Daimler dataset. (b) Seq06R0 of the Cambridge dataset. In (a,b), the red boxes with solid lines indicate false negatives, whereas boxes of other colors represent true positives.
Figure 13. Comparison of road marking detection: (a) proposed method, (b) Faster R-CNN, (c) YOLOv3.
Figure 14. Examples of the original (left image) and its corresponding IPM image (right image).
Figure 15. Examples of detection results from original or IPM images. Result from (a) an original image and (b) its IPM image; result from (c) an original image and (d) its IPM image.
Figure 16. Jetson TX2 embedded system.
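Figures 4–6 describe the backbone–decoder structure: ResNet feature maps M3–M5 are merged with the top-down pathway through lateral connections and element-wise addition to produce the prediction maps P3–P5. As a concrete illustration, the following is a minimal Keras-style sketch of one such merge; the 256-channel width and layer choices are our assumptions, not the authors' exact configuration.

```python
# Minimal Keras-style sketch of one FPN lateral connection merged by addition
# (e.g., M4 from the backbone merged with the upsampled P5). The 256-channel
# width and layer choices are assumptions, not the authors' exact configuration.
from tensorflow.keras import layers

def fpn_merge(c_backbone, p_above, channels=256):
    lateral = layers.Conv2D(channels, 1, padding='same')(c_backbone)  # 1x1 lateral conv
    top_down = layers.UpSampling2D(size=2)(p_above)                   # top-down pathway
    merged = layers.Add()([lateral, top_down])                        # element-wise addition
    # 3x3 conv to smooth the merged map; assumes the spatial sizes align
    # (real implementations crop or pad when they do not).
    return layers.Conv2D(channels, 3, padding='same')(merged)
```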
Abstract
1. Introduction
2. Related Works
3. Contributions
- This is the first approach to use a one-stage deep CNN for the detection and classification of road markings. The method achieves high detection and classification accuracy under complex conditions such as extreme illumination change, occlusion, and distant markings (a standard form of the focal loss behind the one-stage RetinaNet design is sketched after this list).
- The proposed system does not require any pre-processing, such as image rectification or enhancement, or any post-processing for the detection and classification of road markings.
- We determined that a converted bird's eye view image cannot cover the whole drivable region, so parts of the original road markings disappear; this negatively influences the training of the CNN model.
- Considering the application to autonomous vehicles in real environments, we tested the trained CNN model not only on a desktop computer but also on an NVIDIA Jetson TX2 embedded system [24], which is widely used as an onboard platform in autonomous vehicles.
- Finally, although the open databases used in our experiments have been widely used in previous studies, they do not provide annotated information for road markings, which increases the time and effort needed for system implementation. We therefore provide the manually annotated road-marking information for the Malaga urban, Daimler, and Cambridge datasets on our website [25]. We also provide our trained models, based on different backbones with and without pre-trained weights, to other researchers for fair comparison.
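The first point above refers to RetinaNet's one-stage design, whose defining component is the focal loss of Lin et al., which down-weights easy background anchors so that dense one-stage detection is not overwhelmed by class imbalance. Its standard form is shown below for reference; the hyperparameter values used in this work are not restated here, so γ = 2 and α_t = 0.25 (the commonly used defaults) should be read as assumptions.

```latex
% Standard focal loss for a binary class probability p (Lin et al.).
% gamma = 2 and alpha_t = 0.25 are the usual defaults, assumed here.
\mathrm{FL}(p_t) = -\,\alpha_t \,(1 - p_t)^{\gamma}\,\log(p_t),
\qquad
p_t =
\begin{cases}
p, & \text{if } y = 1,\\
1 - p, & \text{otherwise.}
\end{cases}
```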
4. Proposed Method Using Deep RetinaNet
4.1. Overview of Proposed Method
4.2. Architecture of the Deep RetinaNet Model
5. Experimental Results
5.1. Experimental Dataset
5.2. Training Process
5.3. Testing of the Proposed Method
5.3.1. Accuracies According to Databases and Classes
5.3.2. Comparisons of Accuracies by Deep RetinaNet with Those by One-Stage and Two-Stage Methods
5.3.3. Comparisons of Accuracies Using Original Images with Those Using Bird's-Eye View Images
5.3.4. Measurement of Processing Speed and Evaluation on the Embedded System
6. Conclusions
Author Contributions
Acknowledgments
Conflicts of Interest
References
- Self-Driving Cars Will Change Your Life More Than You Can Ever Imagine. Available online: https://money.cnn.com/technology/our-driverless-future/self-driving-cars-will-change-your-life/ (accessed on 28 September 2018).
- Benligiray, B.; Topal, C.; Akinlar, C. Video-Based Lane Detection Using a Fast Vanishing Point Estimation Method. In Proceedings of the IEEE International Symposium on Multimedia, Irvine, CA, USA, 10–12 December 2012; pp. 348–351. [Google Scholar]
- Hoang, T.M.; Baek, N.R.; Cho, S.W.; Kim, K.W.; Park, K.R. Road Lane Detection Robust to Shadows Based on a Fuzzy System Using a Visible Light Camera Sensor. Sensors 2017, 17, 2475. [Google Scholar] [CrossRef] [PubMed]
- Suchitra, S.; Satzoda, R.K.; Srikanthan, T. Detection & Classification of Arrow Markings on Roads Using Signed Edge Signatures. In Proceedings of the IEEE Intelligent Vehicles Symposium, Alcala de Henares, Spain, 3–7 June 2012; pp. 796–801. [Google Scholar]
- Foucher, P.; Sebsadji, Y.; Tarel, J.-P.; Charbonnier, P.; Nicolle, P. Detection and Recognition of Urban Road Markings Using Images. In Proceedings of the 14th IEEE International Conference on Intelligent Transportation Systems, Washington, DC, USA, 5–7 October 2011; pp. 1747–1752. [Google Scholar]
- Li, Z.; Cai, Z.-X.; Xie, J.; Ren, X.-P. Road Markings Extraction Based on Threshold Segmentation. In Proceedings of the 9th International Conference on Fuzzy Systems and Knowledge Discovery, Sichuan, China, 29–31 May 2012; pp. 1924–1928. [Google Scholar]
- Yoo, H.; Yang, U.; Sohn, K. Gradient-Enhancing Conversion for Illumination-Robust Lane Detection. IEEE Trans. Intell. Transp. Syst. 2013, 14, 1083–1094. [Google Scholar] [CrossRef]
- Sun, T.-Y.; Tsai, S.-J.; Chan, V. HSI Color Model Based Lane-Marking Detection. In Proceedings of the IEEE Intelligent Transportation Systems Conference, Toronto, ON, Canada, 17–20 September 2006; pp. 1168–1172. [Google Scholar]
- Gurghian, A.; Koduri, T.; Bailur, S.V.; Carey, K.J.; Murali, V.N. DeepLanes: End-To-End Lane Position Estimation Using Deep Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 38–45. [Google Scholar]
- Li, J.; Mei, X.; Prokhorov, D.; Tao, D. Deep Neural Network for Structural Prediction and Lane Detection in Traffic Scene. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 690–703. [Google Scholar] [CrossRef] [PubMed]
- Vokhidov, H.; Hong, H.G.; Kang, J.K.; Hoang, T.M.; Park, K.R. Recognition of Damaged Arrow-Road Markings by Visible Light Camera Sensor Based on Convolutional Neural Network. Sensors 2016, 16, 2160. [Google Scholar] [CrossRef] [PubMed]
- Lee, S.; Kim, J.; Yoon, J.S.; Shin, S.; Bailo, O.; Kim, N.; Lee, T.-H.; Hong, H.S.; Han, S.-H.; Kweon, I.S. VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1965–1973. [Google Scholar]
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2999–3007. [Google Scholar]
- The Málaga Stereo and Laser Urban Data Set—MRPT. Available online: https://www.mrpt.org/MalagaUrbanDataset (accessed on 1 October 2018).
- Daimler Urban Segmentation Dataset. Available online: http://www.6d-vision.com/scene-labeling (accessed on 1 October 2018).
- Cambridge-Driving Labeled Video Database (CamVid). Available online: http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/ (accessed on 1 October 2018).
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105. [Google Scholar]
- ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Available online: http://www.image-net.org/challenges/LSVRC/ (accessed on 2 October 2018).
- Chen, T.; Chen, Z.; Shi, Q.; Huang, X. Road Marking Detection and Classification Using Machine Learning Algorithms. In Proceedings of the IEEE Intelligent Vehicles Symposium, Seoul, Korea, 28 June–1 July 2015; pp. 617–621. [Google Scholar]
- He, B.; Ai, R.; Yan, Y.; Lang, X. Accurate and Robust Lane Detection Based on Dual-View Convolutional Neutral Network. In Proceedings of the IEEE Intelligent Vehicles Symposium, Gothenburg, Sweden, 19–22 June 2016; pp. 1041–1046. [Google Scholar]
- Huval, B.; Wang, T.; Tandon, S.; Kiske, J.; Song, W.; Pazhayampallil, J.; Andriluka, M.; Rajpurkar, P.; Migimatsu, T.; Cheng-Yue, R.; et al. An Empirical Evaluation of Deep Learning on Highway Driving. ArXiv, 2015; arXiv:1504.01716v3. [Google Scholar]
- Al-Qizwini, M.; Barjasteh, I.; Al-Qassab, H.; Radha, H. Deep Learning Algorithm for Autonomous Driving Using GoogLeNet. In Proceedings of the IEEE Intelligent Vehicles Symposium, Redondo Beach, CA, USA, 11–14 June 2017; pp. 89–96. [Google Scholar]
- Bailo, O.; Lee, S.; Rameau, F.; Yoon, J.S.; Kweon, I.S. Robust Road Marking Detection and Recognition Using Density-Based Grouping and Machine Learning Techniques. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Santa Rosa, CA, USA, 24–31 March 2017; pp. 760–768. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. ArXiv, 2018; arXiv:1804.02767v1. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. ArXiv, 2016; arXiv:1512.02325v5. [Google Scholar]
- Chira, I.M.; Chibulcutean, A.; Danescu, R.G. Real-Time Detection of Road Markings for Driving Assistance Applications. In Proceedings of the International Conference on Computer Engineering & Systems, Cairo, Egypt, 30 November–2 December 2010; pp. 158–163. [Google Scholar]
- Jetson TX2 Module. Available online: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems-dev-kits-modules/ (accessed on 1 October 2018).
- Soviany, P.; Ionescu, R.T. Optimizing the Trade-off Between Single-Stage and Two-Stage Object Detectors Using Image Difficulty Prediction. ArXiv, 2018; arXiv:1803.08707v3. [Google Scholar]
- Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 936–944. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- NVIDIA GeForce® GTX 1070. Available online: https://www.nvidia.com/en-us/geforce/products/10series/geforce-gtx-1070/ (accessed on 1 October 2018).
- Keras: The Python Deep Learning Library. Available online: https://keras.io/ (accessed on 1 October 2018).
- F1 Score. Available online: https://en.wikipedia.org/wiki/F1_score (accessed on 1 October 2018).
- Henon, Y. Keras-Faster R-CNN. Available online: https://github.com/yhenon/keras-frcnn (accessed on 15 October 2018).
- Darknet: Open Source Neural Networks in C. Available online: https://pjreddie.com/darknet/ (accessed on 15 October 2018).
- Understanding Convolutional Layers in Convolutional Neural Networks (CNNs). Available online: http://machinelearninguru.com/computer_vision/basics/convolution/convolution_layer.html (accessed on 24 October 2018).
- CS231n Convolutional Neural Networks for Visual Recognition. Available online: http://cs231n.github.io/convolutional-networks/#conv (accessed on 24 October 2018).
- Muad, A.M.; Hussain, A.; Samad, S.A.; Mustaffa, M.M.; Majlis, B.Y. Implementation of Inverse Perspective Mapping Algorithm for the Development of an Automatic Lane Tracking System. In Proceedings of the IEEE Region 10 Conference TENCON, Chiang Mai, Thailand, 24 November 2004; pp. 207–210. [Google Scholar]
- Dongguk RetinaNet for Detecting Road Marking Objects with Algorithms and Annotated Files for Open Databases. Available online: http://dm.dgu.edu/link.html (accessed on 24 October 2018).
- Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-scale Image Recognition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015; pp. 1–14. [Google Scholar]
- Smooth L1 Loss. Available online: https://stats.stackexchange.com/questions/351874/how-to-interpret-smooth-l1-loss?rq=1 (accessed on 14 November 2018).
- Ubuntu 16.04. Available online: https://en.wikipedia.org/wiki/Ubuntu_version_history#Ubuntu_16.04_LTS_.28Xenial_Xerus.29 (accessed on 16 November 2018).
- Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015, 115, 211–252. [Google Scholar] [CrossRef] [Green Version]
- Oliveira, M.; Santos, V.; Sappa, A.D. Multimodal Inverse Perspective Mapping. Inf. Fusion 2015, 24, 108–121. [Google Scholar] [CrossRef]
- Pendleton, S.D.; Andersen, H.; Du, X.; Shen, X.; Meghjani, M.; Eng, Y.H.; Rus, D.; Ang, M.H., Jr. Perception, Planning, Control, and Coordination for Autonomous Vehicles. Machines 2017, 5, 6. [Google Scholar] [CrossRef]
- Tayara, H.; Chong, K.T. Object Detection in Very High-Resolution Aerial Images Using One-Stage Densely Connected Feature Pyramid Network. Sensors 2018, 18, 3341. [Google Scholar] [CrossRef] [PubMed]
Category | Method | Advantage | Disadvantage
---|---|---|---
Handcrafted features-based | Uses color spaces different from RGB [7,8]; local adaptive threshold and edge detector [6]; line segment detector [2,3]; marking pixel extraction and pattern comparison [5] | No extensive training is required; simple algorithm with low processing time | Performs well only under specific conditions; intensive pre- and post-processing is required; poor performance under extreme conditions; original image must be converted to a bird's eye view to detect straight edge line segments
Deep features-based | VPGNet [12] for vanishing point detection and the detection and classification of road markings; BING and PCANet [19]; DVCNN [20]; deep CNN [9,11,21], CNN with RNN [10], and GLAD [22]; density-based grouping with a shallow CNN [23] | Outperforms handcrafted features-based methods; works well with various shapes and types of road markings in extreme weather conditions | Additional pre-processing is required [10,11,19,20,23]; evaluations were not performed on multiple datasets collected from different countries, including various shapes and types of road markings [9,10,12,19,20,21,22,23]
 | One-stage deep CNN (proposed method) | Does not require pre- or post-processing; evaluated on multiple datasets collected from different countries, including various shapes and types of road markings | Requires more intensive training of a deeper CNN model than previous deep features-based methods
Layer Name | # of Iterations | Kernel Size | # of Filters | Stride | Size of Feature Map (Height × Width × Channel)
---|---|---|---|---|---
Input layer | | | | | 720 × 960 × 3
Conv1 | 1 | 7 × 7 × 3 | 64 | 2 | 360 × 480 × 64
Max pool | 1 | 3 × 3 | 1 | 2 | 180 × 240 × 64
Conv2_x | ×3 | 1 × 1 × 64 | 64 | 1 | 180 × 240 × 64
 | | 3 × 3 × 64 | 64 | 1 | 180 × 240 × 64
 | | 1 × 1 × 64 | 256 | 1 | 180 × 240 × 256
Conv3_x | ×4 | 1 × 1 × 256 | 128 | 2/1 * | 90 × 120 × 128
 | | 3 × 3 × 128 | 128 | 1 | 90 × 120 × 128
 | | 1 × 1 × 128 | 512 | 1 | 90 × 120 × 512
Conv4_x | ×6 | 1 × 1 × 512 | 256 | 2/1 * | 45 × 60 × 256
 | | 3 × 3 × 256 | 256 | 1 | 45 × 60 × 256
 | | 1 × 1 × 256 | 1024 | 1 | 45 × 60 × 1024
Conv5_x | ×3 | 1 × 1 × 1024 | 512 | 2/1 * | 23 × 30 × 512
 | | 3 × 3 × 512 | 512 | 1 | 23 × 30 × 512
 | | 1 × 1 × 512 | 2048 | 1 | 23 × 30 × 2048
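Each Conv2_x–Conv5_x stage in the table repeats the standard ResNet bottleneck pattern (1 × 1 reduce, 3 × 3, 1 × 1 expand to 4× the channels); the stride entry "2/1 *" presumably means a stride of 2 in the first repetition of the stage and 1 in the rest. The following is a minimal Keras-style sketch of one Conv3_x unit; layer names, batch-normalization placement, and the projection shortcut are our assumptions, not the authors' code.

```python
# Minimal Keras-style sketch of one bottleneck unit from the Conv3_x stage in
# the table above (1x1 reduce -> 3x3 -> 1x1 expand, plus a residual shortcut).
from tensorflow.keras import layers

def bottleneck(x, filters=128, stride=2, project=True):
    shortcut = x
    y = layers.Conv2D(filters, 1, strides=stride, padding='same')(x)  # 1x1x256 -> 128
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, strides=1, padding='same')(y)       # 3x3x128 -> 128
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(4 * filters, 1, strides=1, padding='same')(y)   # 1x1x128 -> 512
    y = layers.BatchNormalization()(y)
    if project:  # first repetition: project the shortcut (stride 2, 4x channels)
        shortcut = layers.Conv2D(4 * filters, 1, strides=stride, padding='same')(x)
        shortcut = layers.BatchNormalization()(shortcut)
    return layers.ReLU()(layers.Add()([y, shortcut]))
```

Subsequent repetitions within the stage would be called with `stride=1` and `project=False`.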
Dataset | Sub-Dataset | Image Size (Pixels) | Number of Images (Frames) | Total (Frames)
---|---|---|---|---
Cambridge | Seq01TP | 960 × 720 | 216 | 3572
 | Seq05VD | | 162 |
 | Seq06R0 | | 1518 |
 | Seq16E5 | | 1676 |
Daimler | Test2 | 1012 × 328 | 470 | 898
 | Train1 | | 362 |
 | Train3 | | 66 |
Malaga urban | | 800 × 600 | 9120 | 9120
Dataset | Original Training Set | Augmented Training Set | Testing Set |
---|---|---|---|
Cambridge | 1786 | 32,148 | 1786 |
Daimler | 449 | 8082 | 449 |
Malaga urban | 4560 | 82,080 | 4560 |
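The augmented training sets above are roughly 18 times the size of the original sets. The full augmentation recipe is not listed in this excerpt, but Figure 8 shows horizontal flipping; a minimal sketch of flip augmentation follows. Note that mirroring directional classes (e.g., a left arrow becoming a right arrow) requires relabeling, which is our assumption about how flipped samples would be annotated.

```python
# Minimal horizontal-flip augmentation sketch, illustrating only the flipping
# shown in Figure 8; the full recipe behind the ~18x augmentation is not shown
# in this excerpt. Label names in FLIP_LABEL are hypothetical.
import cv2

FLIP_LABEL = {'left': 'right', 'right': 'left',
              'forward-left': 'forward-right', 'forward-right': 'forward-left'}

def flip_sample(image, boxes):
    """Mirror an image and its (x1, y1, x2, y2, label) boxes left-right."""
    h, w = image.shape[:2]
    flipped = cv2.flip(image, 1)  # flipCode=1: horizontal flip
    mirrored = [(w - x2, y1, w - x1, y2, FLIP_LABEL.get(label, label))
                for (x1, y1, x2, y2, label) in boxes]
    return flipped, mirrored
```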
Dataset | Sub-Dataset | Precision | Recall | Accuracy | F_score
---|---|---|---|---|---
Cambridge | Seq01TP | 1.000 | 1.000 | 1.000 | 1.000
 | Seq05VD | 0.988 | 0.904 | 0.895 | 0.944
 | Seq06R0 | 0.999 | 0.864 | 0.863 | 0.926
 | Seq16E5 | 0.999 | 0.953 | 0.952 | 0.976
Daimler | Test2 | 0.989 | 0.750 | 0.744 | 0.853
 | Train1 | 1.000 | 0.955 | 0.955 | 0.977
 | Train3 | 1.000 | 1.000 | 1.000 | 1.000
Malaga | | 0.993 | 0.973 | 0.966 | 0.983
Average | | 0.996 | 0.925 | 0.922 | 0.957
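The Precision, Recall, and F_score columns follow the usual detection definitions in terms of true positives (TP), false positives (FP), and false negatives (FN); the Accuracy column is consistent with counting a marking as correct only when it is both detected and correctly classified, i.e., TP/(TP + FP + FN). This is our reading of the reported numbers; as a check, for Seq05VD the F_score 2 × 0.988 × 0.904/(0.988 + 0.904) ≈ 0.944 matches the table.

```latex
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
\text{F\_score} = \frac{2\,\text{Precision}\cdot\text{Recall}}{\text{Precision} + \text{Recall}}, \qquad
\text{Accuracy} = \frac{TP}{TP + FP + FN}
```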
Classes | | Precision | Recall | Accuracy | F_score
---|---|---|---|---|---
Bike (B) | | 0.999 | 0.753 | 0.752 | 0.859
Arrow | Forward (F) | 0.995 | 0.873 | 0.869 | 0.930
 | Forward–left (FL) | 0.993 | 0.989 | 0.982 | 0.991
 | Forward–left–right (FLR) | 0.987 | 1.000 | 0.987 | 0.993
 | Forward–right (FR) | 0.997 | 0.945 | 0.942 | 0.970
 | Left (L) | 1.000 | 0.982 | 0.982 | 0.991
 | Left–right (LR) | 0.967 | 1.000 | 0.967 | 0.983
 | Right (R) | 0.992 | 0.948 | 0.940 | 0.970
Average | | 0.991 | 0.936 | 0.928 | 0.961
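Computing the per-class TP/FP/FN counts behind these numbers requires matching each detected box to a ground-truth box of the same class. A common protocol, assumed here rather than stated in this excerpt, is greedy matching at IoU ≥ 0.5; a minimal Python sketch:

```python
# Minimal sketch of IoU-based matching used to count TP/FP/FN for the tables
# above. The IoU >= 0.5 threshold and greedy matching are assumptions, not
# necessarily the authors' exact evaluation protocol.
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def count_matches(detections, ground_truths, thr=0.5):
    """detections/ground_truths: lists of ((x1, y1, x2, y2), label)."""
    matched, tp = set(), 0
    for dbox, dlabel in detections:
        best, best_iou = None, thr
        for i, (gbox, glabel) in enumerate(ground_truths):
            if i in matched or glabel != dlabel:
                continue
            o = iou(dbox, gbox)
            if o >= best_iou:
                best, best_iou = i, o
        if best is not None:
            matched.add(best)
            tp += 1
    return tp, len(detections) - tp, len(ground_truths) - tp  # TP, FP, FN
```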
Criterion | Methods | Cambridge Seq01TP | Cambridge Seq05VD | Cambridge Seq06R0 | Cambridge Seq16E5 | Daimler Test2 | Daimler Train1 | Daimler Train3 | Malaga | Avg.
---|---|---|---|---|---|---|---|---|---|---
Precision | Ours (Retina_1) | 1.000 | 0.988 | 0.999 | 0.999 | 0.989 | 1.000 | 1.000 | 0.993 | 0.996
 | Ours (Retina_2) | 0.985 | 0.988 | 0.995 | 0.997 | 0.996 | 1.000 | 1.000 | 0.993 | 0.994
 | Ours (Retina_3) | 0.992 | 0.988 | 1.000 | 0.997 | 0.996 | 1.000 | 1.000 | 0.993 | 0.996
 | Faster R-CNN [28,45] | 0.966 | 0.988 | 0.985 | 0.974 | 0.966 | 0.931 | 0.955 | 0.979 | 0.968
 | YOLOv3 [30,46] | 0.682 | 0.628 | 0.869 | 0.623 | 0.841 | 0.543 | 0.719 | 0.771 | 0.710
Recall | Ours (Retina_1) | 1.000 | 0.904 | 0.864 | 0.953 | 0.750 | 0.955 | 1.000 | 0.973 | 0.925
 | Ours (Retina_2) | 0.992 | 0.940 | 0.862 | 0.953 | 0.739 | 0.972 | 1.000 | 0.973 | 0.929
 | Ours (Retina_3) | 0.985 | 0.904 | 0.867 | 0.953 | 0.744 | 0.955 | 1.000 | 0.973 | 0.923
 | Faster R-CNN [28,45] | 0.851 | 0.883 | 0.782 | 0.810 | 0.859 | 0.874 | 1.000 | 0.498 | 0.820
 | YOLOv3 [30,46] | 1.000 | 0.989 | 0.999 | 0.997 | 0.894 | 0.919 | 1.000 | 0.985 | 0.973
Accuracy | Ours (Retina_1) | 1.000 | 0.895 | 0.863 | 0.952 | 0.744 | 0.955 | 1.000 | 0.966 | 0.922
 | Ours (Retina_2) | 0.977 | 0.884 | 0.859 | 0.950 | 0.737 | 0.972 | 1.000 | 0.966 | 0.918
 | Ours (Retina_3) | 0.978 | 0.895 | 0.867 | 0.951 | 0.742 | 0.955 | 1.000 | 0.966 | 0.919
 | Faster R-CNN [28,45] | 0.826 | 0.874 | 0.771 | 0.793 | 0.834 | 0.821 | 0.955 | 0.493 | 0.796
 | YOLOv3 [30,46] | 0.682 | 0.624 | 0.868 | 0.622 | 0.765 | 0.518 | 0.719 | 0.763 | 0.695
F_score | Ours (Retina_1) | 1.000 | 0.944 | 0.926 | 0.976 | 0.853 | 0.977 | 1.000 | 0.983 | 0.958
 | Ours (Retina_2) | 0.989 | 0.963 | 0.924 | 0.975 | 0.848 | 0.986 | 1.000 | 0.983 | 0.958
 | Ours (Retina_3) | 0.989 | 0.944 | 0.929 | 0.975 | 0.852 | 0.977 | 1.000 | 0.983 | 0.956
 | Faster R-CNN [28,45] | 0.905 | 0.932 | 0.870 | 0.884 | 0.909 | 0.901 | 0.977 | 0.661 | 0.880
 | YOLOv3 [30,46] | 0.811 | 0.769 | 0.929 | 0.767 | 0.867 | 0.683 | 0.837 | 0.865 | 0.816
Criterion | Methods | Cambridge Seq01TP | Cambridge Seq05VD | Cambridge Seq06R0 | Cambridge Seq16E5 | Daimler Test2 | Daimler Train1 | Daimler Train3 | Malaga | Avg.
---|---|---|---|---|---|---|---|---|---|---
Precision | Original image | 1.000 | 0.988 | 0.999 | 0.999 | 0.989 | 1.000 | 1.000 | 0.993 | 0.996
 | IPM image [48] | 0.957 | 0.959 | 0.994 | 0.991 | 0.973 | 0.964 | 0.927 | 0.990 | 0.969
Recall | Original image | 1.000 | 0.904 | 0.864 | 0.953 | 0.750 | 0.955 | 1.000 | 0.973 | 0.925
 | IPM image [48] | 0.827 | 0.823 | 0.848 | 0.847 | 0.712 | 0.802 | 0.821 | 0.816 | 0.812
Accuracy | Original image | 1.000 | 0.895 | 0.863 | 0.952 | 0.744 | 0.955 | 1.000 | 0.966 | 0.922
 | IPM image [48] | 0.797 | 0.795 | 0.844 | 0.840 | 0.698 | 0.779 | 0.771 | 0.809 | 0.792
F_score | Original image | 1.000 | 0.944 | 0.926 | 0.976 | 0.853 | 0.977 | 1.000 | 0.983 | 0.958
 | IPM image [48] | 0.887 | 0.886 | 0.916 | 0.913 | 0.822 | 0.876 | 0.871 | 0.895 | 0.883
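The IPM (inverse perspective mapping, bird's-eye view) images compared above are conventionally generated with a planar homography of the road surface. A minimal OpenCV sketch is shown below; the source trapezoid is a placeholder, not the calibration actually used to produce the IPM images in Figures 14 and 15.

```python
# Minimal inverse perspective mapping (IPM) sketch with OpenCV. The source
# trapezoid below is a placeholder; the calibration actually used to produce
# the IPM images in Figures 14 and 15 is not reproduced here.
import cv2
import numpy as np

def to_birds_eye(image, out_size=(400, 600)):
    h, w = image.shape[:2]
    # Placeholder quadrilateral on the road surface (image coordinates).
    src = np.float32([[0.40 * w, 0.65 * h], [0.60 * w, 0.65 * h],
                      [0.95 * w, 0.95 * h], [0.05 * w, 0.95 * h]])
    dst = np.float32([[0, 0], [out_size[0], 0],
                      [out_size[0], out_size[1]], [0, out_size[1]]])
    H = cv2.getPerspectiveTransform(src, dst)       # 3x3 road-plane homography
    return cv2.warpPerspective(image, H, out_size)  # top-down (bird's-eye) view
```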
Dataset | Sub-Dataset | Processing Time (Proposed Method) | Processing Time (Faster R-CNN [28,45]) | Processing Time (YOLOv3 [30,46])
---|---|---|---|---
Cambridge | Seq01TP | 50 | 291 | 49
 | Seq05VD | 50 | 297 | 50
 | Seq06R0 | 47 | 279 | 47
 | Seq16E5 | 50 | 319 | 50
Daimler | Test2 | 31 | 656 | 40
 | Train1 | 35 | 631 | 40
 | Train3 | 37 | 668 | 40
Malaga | | 42 | 278 | 39
Average | | 42.75 | 427.38 | 44.38
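The average processing times above (and those measured on the Jetson TX2 later in this section) can be reproduced with a simple warm-up-then-average loop. A minimal sketch follows; `model.predict` and `images` are placeholders for the trained detector and the test frames, not the authors' actual code.

```python
# Minimal sketch for measuring average per-image processing time in
# milliseconds. `model.predict` and `images` are placeholders.
import time

def average_inference_ms(model, images, warmup=5):
    for img in images[:warmup]:              # warm-up runs excluded from timing
        model.predict(img)
    start = time.perf_counter()
    for img in images[warmup:]:
        model.predict(img)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / max(1, len(images) - warmup)
```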
Specifications of the Jetson TX2 embedded system:

Component | Specification
---|---
GPU | NVIDIA Pascal™, 256 CUDA cores
CPU | HMP Dual Denver 2 (2 MB L2) + Quad ARM® A57 (2 MB)
Memory | 8 GB
Data storage | 32 GB
Operating system | Linux for Tegra R28.1 (L4T 28.1)
Dimensions (width × height × depth) | 50 mm × 87 mm × 10.4 mm
Dataset | Sub-Dataset | Processing Time (Proposed Method) | Processing Time (Faster R-CNN [24,37]) | Processing Time (YOLOv3 [27,38])
---|---|---|---|---
Cambridge | Seq01TP | 50 | 297 | 50
 | Seq05VD | 57 | 297 | 54
 | Seq06R0 | 54 | 286 | 50
 | Seq16E5 | 54 | 319 | 52
Daimler | Test2 | 38 | 662 | 42
 | Train1 | 39 | 638 | 45
 | Train3 | 38 | 675 | 43
Malaga | | 44 | 281 | 43
Average | | 46.75 | 431.875 | 47.375
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Hoang, T.M.; Nguyen, P.H.; Truong, N.Q.; Lee, Y.W.; Park, K.R. Deep RetinaNet-Based Detection and Classification of Road Markings by Visible Light Camera Sensors. Sensors 2019, 19, 281. https://doi.org/10.3390/s19020281