
Object Detection Using Deep Learning Methods in Traffic Scenarios

Published: 05 March 2021

Abstract

The recent boom in autonomous driving has made object detection in traffic scenes a hot research topic. Object detection, which aims to classify and locate object instances in an image, is a basic but challenging task in computer vision. Deep learning, with its powerful feature extraction abilities, which are vital for object detection, has been widely applied to this field over the past several years and has achieved breakthroughs. However, even with such powerful approaches, traffic scenarios pose their own specific challenges, such as real-time detection requirements, changeable weather, and complex lighting conditions. This survey summarizes recent research on applying deep learning to object detection in the transportation environment. It covers more than 100 research papers and addresses key generic object detection frameworks, categorized object detection applications in traffic scenarios, evaluation metrics, and classified datasets. Some open research fields are also outlined. We believe that this is the first survey focusing on deep learning-based object detection in traffic scenarios.

References

[1]
La Route Automatisée. 2019. Traffic Lights Recognition (TLR) public benchmarks. Retrieved from http://www.lara.prd.fr/benchmarks/trafficlightsrecognition.
[2]
Martin Bach, Daniel Stumper, and Klaus Dietmayer. 2018. Deep convolutional traffic light recognition for automated driving. In Proceedings of the 21st International Conference on Intelligent Transportation Systems (ITSC’18). IEEE, 851--858.
[3]
Karsten Behrendt and Libor Novak. [n.d.]. A deep learning approach to traffic lights: Detection, tracking, and classification. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA’17). IEEE.
[4]
Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. 2018. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 40, 4 (2018), 834--848.
[5]
Xiaozhi Chen, Kaustav Kundu, Ziyu Zhang, Huimin Ma, Sanja Fidler, and Raquel Urtasun. 2016. Monocular 3D object detection for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2147--2156.
[6]
Xiaozhi Chen, Kaustav Kundu, Yukun Zhu, Huimin Ma, Sanja Fidler, and Raquel Urtasun. 2017. 3D object proposals using stereo imagery for accurate object class detection. IEEE Trans. Pattern Anal. Mach. Intell. 40, 5 (2017), 1259--1272.
[7]
Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, and Tian Xia. 2017. Multi-view 3D object detection network for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1907--1915.
[8]
Peng Cheng, Wu Liu, Yifan Zhang, and Huadong Ma. 2018. LOCO: Local context-based faster R-CNN for small traffic sign detection. In Proceedings of the International Conference on Multimedia Modeling. Springer, 329--341.
[9]
Wenqing Chu, Yao Liu, Chen Shen, Deng Cai, and Xian-Sheng Hua. 2018. Multi-task vehicle detection with region-of-interest voting. IEEE Trans. Image Process. 27, 1 (2018), 432--441.
[10]
Embedded Computing Lab. 2019. WPI traffic light dataset. Retrieved from http://computing.wpi.edu/dataset.html.
[11]
Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. 2016. The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3213--3223.
[12]
Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. 2019. Cityscapes dataset. Retrieved from https://www.cityscapes-dataset.com/.
[13]
Thomas Cover and Peter Hart. 1967. Nearest neighbor pattern classification. IEEE Trans. Info. Theory 13, 1 (1967), 21--27.
[14]
David R. Cox. 1958. The regression analysis of binary sequences. J. Roy. Stat. Soc.: Ser. B (Methodol.) 20, 2 (1958), 215--232.
[15]
Autti CrowdAI. 2017. Udacity labeled dataset. Retrieved from https://github.com/udacity/self-driving-car/tree/master/annotations.
[16]
Yaodong Cui, Ren Chen, Wenbo Chu, Long Chen, Daxin Tian, and Dongpu Cao. 2020. Deep learning for image and point cloud fusion in autonomous driving: A review. Retrieved from https://Arxiv:2004.05224.
[17]
Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. 2016. R-fcn: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems. MIT Press, 379--387.
[18]
Navneet Dalal and Bill Triggs. 2005. Histograms of oriented gradients for human detection. In Proceedings of the International Conference on Computer Vision & Pattern Recognition (CVPR’05), Vol. 1. IEEE, 886--893.
[19]
Navneet Dalal and Bill Triggs. 2019. INRIA Person Dataset. Retrieved from http://pascal.inrialpes.fr/data/human/.
[20]
Jesse Davis and Mark Goadrich. 2006. The relationship between precision-recall and ROC curves. In Proceedings of the 23rd International Conference on Machine Learning. ACM, 233--240.
[21]
Grupo de Tratamiento de Imagenes (GTI). 2012. GTI vehicle image database. Retrieved from http://www.gti.ssr.upm.es/data/Vehicle_database.html.
[22]
Moises Diaz, Pietro Cerri, Giuseppe Pirlo, Miguel A. Ferrer, and Donato Impedovo. 2015. A survey on traffic light detection. In Proceedings of the International Conference on Image Analysis and Processing. Springer, 201--208.
[23]
P. Dollár, C. Wojek, B. Schiele, and P. Perona. 2009. Pedestrian detection: A benchmark. In Proceedings of the International Conference on Computer Vision & Pattern Recognition (CVPR’09).
[24]
Piotr Dollar, Christian Wojek, Bernt Schiele, and Pietro Perona. 2011. Pedestrian detection: An evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34, 4 (2011), 743--761.
[25]
Piotr Dollár, Christian Wojek, Bernt Schiele, and Pietro Perona. 2012. Pedestrian detection: An evaluation of the state of the art. PAMI 34 (2012).
[26]
Piotr Dollár, Christian Wojek, Bernt Schiele, and Pietro Perona. 2019. Caltech Pedestrian Detection Benchmark. Retrieved from http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/.
[27]
Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Hausser, Caner Hazirbas, Vladimir Golkov, Patrick Van Der Smagt, Daniel Cremers, and Thomas Brox. 2015. Flownet: Learning optical flow with convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision. 2758--2766.
[28]
Xianzhi Du, Mostafa El-Khamy, Jungwon Lee, and Larry Davis. 2017. Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV’17). IEEE, 953--961.
[29]
Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, and Qi Tian. 2019. CenterNet: Keypoint triplets for object detection. Retrieved from https://Arxiv:1904.08189.
[30]
Markus Enzweiler and Dariu M. Gavrila. 2008. Monocular pedestrian detection: Survey and experiments. IEEE Trans. Pattern Anal. Mach. Intell. 31, 12 (2008), 2179--2195.
[31]
Andreas Ess, Bastian Leibe, and Luc Van Gool. 2007. Depth and appearance for mobile scene analysis. In Proceedings of the IEEE 11th International Conference on Computer Vision. IEEE, 1--8.
[32]
Andreas Ess, Bastian Leibe, and Luc Van Gool. 2019. Robust Multi-Person Tracking from Mobile Platforms. Retrieved from https://data.vision.ee.ethz.ch/cvl/aess/dataset/.
[33]
Mark Everingham, Luc van Gool, Chris Williams, John Winn, and Andrew Zisserman. 2005. Visual Object Classes Challenge 2012 (VOC2012). Retrieved from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/.
[34]
Quanfu Fan, Lisa Brown, and John Smith. 2016. A closer look at faster R-CNN for vehicle detection. In Proceedings of the IEEE Intelligent Vehicles Symposium (IV’16). IEEE, 124--129.
[35]
Pedro Felzenszwalb, David McAllester, and Deva Ramanan. 2008. A discriminatively trained, multiscale, deformable part model. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1--8.
[36]
Heidelberg Collaboratory for Image Processing. 2019. Bosch Small Traffic Lights Dataset. Retrieved from https://hci.iwr.uni-heidelberg.de/node/6132.
[37]
The Laboratory for Intelligent and Safe Automobiles. 2010. Vehicle Detection Dataset. Retrieved from http://cvrr.ucsd.edu/LISA/vehicledetection.html.
[38]
Vision for Intelligent Vehicles and Applications. 2006. VIVA traffic light detection benchmark. Retrieved from http://cvrr.ucsd.edu/vivachallenge/index.php/traffic-light/traffic-light-detection/.
[39]
Yoav Freund and Robert E. Schapire. 1997. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 1 (1997), 119--139.
[40]
Jannik Fritsch, Tobias Kuehnl, and Andreas Geiger. 2013. A new performance measure and evaluation benchmark for road detection algorithms. In Proceedings of the International Conference on Intelligent Transportation Systems (ITSC’13).
[41]
Meng-Yin Fu and Yuan-Shui Huang. 2010. A survey of traffic sign recognition. In Proceedings of the International Conference on Wavelet Analysis and Pattern Recognition. IEEE, 119--124.
[42]
Mingfei Gao, Ruichi Yu, Ang Li, Vlad I. Morariu, and Larry S. Davis. 2018. Dynamic zoom-in network for fast object detection in large images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6926--6935.
[43]
Andreas Geiger, Philip Lenz, and Raquel Urtasun. 2012. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR’12).
[44]
David Geronimo, Antonio M. Lopez, Angel D. Sappa, and Thorsten Graf. 2009. Survey of pedestrian detection for advanced driver assistance systems. IEEE Trans. Pattern Anal. Mach. Intell. 32, 7 (2009), 1239--1258.
[45]
Ross Girshick. 2015. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision. 1440--1448.
[46]
Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 580--587.
[47]
Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2016. Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 38, 1 (2016), 142--158.
[48]
Jack Greenhalgh and Majid Mirmehdi. 2012. Real-time detection and recognition of road traffic signs. IEEE Trans. Intell. Transport. Syst. 13, 4 (2012), 1498--1506.
[49]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Machine Intell. 37, 9 (2015), 1904--1916.
[50]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770--778.
[51]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Identity mappings in deep residual networks. In Proceedings of the European Conference on Computer Vision. Springer, 630--645.
[52]
Duyoung Heo, Eunju Lee, and Byoung Chul Ko. 2018. Pedestrian detection at night using deep neural networks and saliency maps. Electron. Imag. 2018, 17 (2018), 1--9.
[53]
Congrui Hetang, Hongwei Qin, Shaohui Liu, and Junjie Yan. 2017. Impression network for video object detection. Retrieved from https://Arxiv:1712.05896.
[54]
Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Comput. 9, 8 (1997), 1735--1780.
[55]
Sebastian Houben, Johannes Stallkamp, Jan Salmen, Marc Schlipsing, and Christian Igel. 2013. Detection of traffic signs in real-world images: The German traffic sign detection benchmark. In Proceedings of the International Joint Conference on Neural Networks.
[56]
Sebastian Houben, Johannes Stallkamp, Jan Salmen, Marc Schlipsing, and Christian Igel. 2013. German Traffic Sign Detection Benchmark. Retrieved from http://benchmark.ini.rub.de/?section=gtsdb&subsection==news.
[57]
Chaowei Hu, Yunpeng Wang, Guizhen Yu, Zhangyu Wang, Ao Lei, and Zhehua Hu. 2018. Embedding CNN-based Fast Obstacles Detection for Autonomous Vehicles. Technical Report. SAE Technical Paper.
[58]
Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q. Weinberger. 2017. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4700--4708.
[59]
Hairu Huang, Ming Yang, Chunxiang Wang, and Bing Wang. 2018. A unified hierarchical convolutional neural network for fine-grained traffic sign detection. In Proceedings of the Chinese Automation Congress (CAC’18). IEEE, 2733--2738.
[60]
ILSVRC. 2019. ImageNet Large Scale Visual Recognition Challenge. Retrieved from http://www.image-net.org/challenges/LSVRC/.
[61]
Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. Retrieved from https://Arxiv:1502.03167.
[62]
Morten B. Jensen, Kamal Nasrollahi, and Thomas B. Moeslund. 2017. Evaluating state-of-the-art object detector on challenging traffic light data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 9--15.
[63]
Morten Bornø Jensen, Mark Philip Philipsen, Andreas Møgelmose, Thomas Baltzer Moeslund, and Mohan Manubhai Trivedi. 2016. Vision for looking at traffic lights: Issues, survey, and perspectives. IEEE Trans. Intell. Transport. Syst. 17, 7 (2016), 1800--1815.
[64]
Licheng Jiao, Fan Zhang, Fang Liu, Shuyuan Yang, Lingling Li, Zhixi Feng, and Rong Qu. 2019. A survey of deep learning-based object detection. IEEE Access 7 (2019), 128837--128868.
[65]
KangUn Jo, JungHyuk Im, Jingu Kim, and Dae-Shik Kim. 2017. A real-time multi-class multi-object tracker using YOLOv2. In Proceedings of the IEEE International Conference on Signal and Image Processing Applications (ICSIPA’17). IEEE, 507--511.
[66]
Narendra Kumar Kamila. 2015. Handbook of Research on Emerging Perspectives in Intelligent Pattern Recognition, Analysis, and Image Processing. IGI Global.
[67]
Makoto Kawano, Kazuhiro Mikami, Satoshi Yokoyama, Takuro Yonezawa, and Jin Nakazawa. 2017. Road marking blur detection with drive recorder. In Proceedings of the IEEE International Conference on Big Data (BigData’17). IEEE, 4092--4097.
[68]
Huieun Kim, Youngwan Lee, Byeounghak Yim, Eunsoo Park, and Hakil Kim. 2016. On-road object detection using deep neural network. In Proceedings of the IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia). IEEE, 1--4.
[69]
Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, and Jianbo Shi. 2019. FoveaBox: Beyond anchor-based object detector. Retrieved from https://Arxiv:1904.03797.
[70]
Philipp Krähenbühl and Vladlen Koltun. 2011. Efficient inference in fully connected crfs with gaussian edge potentials. In Advances in Neural Information Processing Systems. MIT Press, 109--117.
[71]
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems. MIT Press, 1097--1105.
[72]
Jason Ku, Melissa Mozifian, Jungwook Lee, Ali Harakeh, and Steven L. Waslander. 2018. Joint 3D proposal generation and object detection from view aggregation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’18). IEEE, 1--8.
[73]
Abhijit Kundu, Yin Li, and James M. Rehg. 2018. 3D-RCNN: Instance-level 3D object reconstruction via render-and-compare. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3559--3568.
[74]
Fredrik Larsson, Michael Felsberg, and Per-Erik Forssen. 2011. Correlating fourier descriptors of local patches for road sign recognition. IET Comput. Vision 5, 4 (2011), 244--254.
[75]
Fredrik Larsson, Michael Felsberg, and Per-Erik Forssen. 2019. Swedish Traffic Signs Dataset. Retrieved from http://www.cvl.isy.liu.se/research/datasets/traffic-signs-dataset/.
[76]
Hei Law and Jia Deng. 2018. Cornernet: Detecting objects as paired keypoints. In Proceedings of the European Conference on Computer Vision (ECCV’18). 734--750.
[77]
Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce. 2006. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), Vol. 2. IEEE, 2169--2178.
[78]
Yann LeCun et al. 2015. LeNet-5, convolutional neural networks. Retrieved from http://yann. lecun. com/exdb/lenet.
[79]
Yann LeCun, Corinna Cortes, and Christopher J. C. Burges. 1998. The MNIST database of handwritten digits. Retrieved from https://http://yann.lecun.com/exdb/mnist/.
[80]
Bo Li. 2017. 3D fully convolutional network for vehicle detection in point cloud. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’17). IEEE, 1513--1518.
[81]
Bo Li, Tianlei Zhang, and Tian Xia. 2016. Vehicle detection from 3D lidar using fully convolutional network. Retrieved from https://Arxiv:1608.07916.
[82]
Dong Li, Dongbin Zhao, Yaran Chen, and Qichao Zhang. 2018. Deepsign: Deep learning-based traffic sign recognition. In Proceedings of the International Joint Conference on Neural Networks (IJCNN’18). IEEE, 1--6.
[83]
Jianan Li, Xiaodan Liang, ShengMei Shen, Tingfa Xu, Jiashi Feng, and Shuicheng Yan. 2018. Scale-aware fast R-CNN for pedestrian detection. IEEE Trans. Multimedia 20, 4 (2018), 985--996.
[84]
Peiliang Li, Xiaozhi Chen, and Shaojie Shen. 2019. Stereo R-CNN-based 3D object detection for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7644--7652.
[85]
Qingpeng Li, Lichao Mou, Qizhi Xu, Yun Zhang, and Xiao Xiang Zhu. 2018. R3-Net: A deep network for multi-oriented vehicle detection in aerial images and videos. Retrieved from https://Arxiv:1808.05560.
[86]
Xiang Li, Jun Li, Xiaolin Hu, and Jian Yang. 2019. Line-CNN: End-to-end traffic line detection with line proposal unit. IEEE Trans. Intell. Transport. Syst. (2019).
[87]
Ming Liang, Bin Yang, Shenlong Wang, and Raquel Urtasun. 2018. Deep continuous fusion for multi-sensor 3D object detection. In Proceedings of the European Conference on Computer Vision (ECCV’18). 641--656.
[88]
Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. 2017. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2117--2125.
[89]
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. 2014. Microsoft coco: Common objects in context. In Proceedings of the European Conference on Computer Vision. Springer, 740--755.
[90]
Li Liu, Wanli Ouyang, Xiaogang Wang, Paul Fieguth, Jie Chen, Xinwang Liu, and Matti Pietikäinen. 2018. Deep learning for generic object detection: A survey. Retrieved from Arxiv:1809.02165.
[91]
Li Liu, Wanli Ouyang, Xiaogang Wang, Paul Fieguth, Jie Chen, Xinwang Liu, and Matti Pietikäinen. 2020. Deep learning for generic object detection: A survey. Int. J. Comput. Vision 128, 2 (2020), 261--318.
[92]
Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. 2016. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision. Springer, 21--37.
[93]
David Fernández Llorca, R. Arroyo, and Miguel Angel Sotelo. 2013. Vehicle logo recognition in traffic images using HOG features and SVM. In Proceedings of the 16th International IEEE Conference on Intelligent Transportation Systems (ITSC’13). IEEE, 2229--2234.
[94]
Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3431--3440.
[95]
Jiri Matas, Ondrej Chum, Martin Urban, and Tomás Pajdla. 2004. Robust wide-baseline stereo from maximally stable extremal regions. Image Vision Comput. 22, 10 (2004), 761--767.
[96]
Zibo Meng, Xiaochuan Fan, Xin Chen, Min Chen, and Yan Tong. 2017. Detecting small signs from large images. In Proceedings of the IEEE International Conference on Information Reuse and Integration (IRI’17). IEEE, 217--224.
[97]
Ala Mhalla, Thierry Chateau, Sami Gazzah, and Najoua Essoukri Ben Amara. 2018. An embedded computer-vision system for multi-object detection in traffic surveillance. IEEE Trans. Intell. Transport. Syst. (2018).
[98]
Zhao Min, Jia Jian, Sun Dihua, and Tang Yi. 2018. Vehicle detection method based on deep learning and multi-layer feature fusion. In Proceedings of the Chinese Control And Decision Conference (CCDC’18). IEEE, 5862--5867.
[99]
Hans Moravec. 1988. Moravec’s paradox. Retrieved from https://en.wikipedia.org/wiki/Moravec%27s_paradox#CITEREFMoravec1988.
[100]
Arsalan Mousavian, Dragomir Anguelov, John Flynn, and Jana Kosecka. 2017. 3D bounding box estimation using deep learning and geometry. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7074--7082.
[101]
Julian Müller and Klaus Dietmayer. 2018. Detecting traffic lights by single shot detection. In Proceedings of the 21st International Conference on Intelligent Transportation Systems (ITSC’18). IEEE, 266--273.
[102]
The Chinese University of Hong Kong Multimedia Laboratory. 2018. CULane Dataset. Retrieved from https://xingangpan.github.io/projects/CULane.html.
[103]
Pauline C. Ng and Steven Henikoff. 2003. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 31, 13 (2003), 3812--3814.
[104]
Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago. 2012. KITTI vision benchmark suite. Retrieved from http://www.cvlibs.net/datasets/kitti/eval_object.php.
[105]
Gabriel L. Oliveira, Wolfram Burgard, and Thomas Brox. 2016. Efficient deep models for monocular road segmentation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’16). IEEE, 4885--4891.
[106]
Xingang Pan, Jianping Shi, Ping Luo, Xiaogang Wang, and Xiaoou Tang. 2018. Spatial as deep: Spatial cnn for traffic scene understanding. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence.
[107]
Yanwei Pang, Yuan Yuan, Xuelong Li, and Jing Pan. 2011. Efficient HOG human detection. Signal Process. 91, 4 (2011), 773--781.
[108]
Prashant W. Patil and Subrahmanyam Murala. 2018. MSFgNet: A novel compact end-to-end deep network for moving object detection. IEEE Trans. Intell. Transport. Syst. (2018).
[109]
Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, and Leonidas J. Guibas. 2018. Frustum pointnets for 3D object detection from RGB-D data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 918--927.
[110]
Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. 2017. Pointnet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 652--660.
[111]
Charles R. Qi, Hao Su, Matthias Nießner, Angela Dai, Mengyuan Yan, and Leonidas J. Guibas. 2016. Volumetric and multi-view cnns for object classification on 3D data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5648--5656.
[112]
Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J. Guibas. 2017. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. In Advances in Neural Information Processing Systems. MIT Press, 5099--5108.
[113]
Rongqiang Qian, Qianyu Liu, Yong Yue, Frans Coenen, and Bailing Zhang. 2016. Road surface traffic sign detection with hybrid region proposal and fast R-CNN. In Proceedings of the 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD’16). IEEE, 555--559.
[114]
Hongquan Qu, Tongyang Yuan, Zhiyong Sheng, and Yuan Zhang. 2018. A pedestrian detection method based on YOLOv3 model and image enhanced by retinex. In Proceedings of the 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI). IEEE, 1--5.
[115]
Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. 2016. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 779--788.
[116]
Joseph Redmon and Ali Farhadi. 2017. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7263--7271.
[117]
Joseph Redmon and Ali Farhadi. 2018. Yolov3: An incremental improvement. Retrieved from https://Arxiv:1804.02767.
[118]
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems. MIT Press, 91--99.
[119]
Frank Rosenblatt. 1957. The Perceptron, a Perceiving and Recognizing Automaton Project Para. Cornell Aeronautical Laboratory.
[120]
Khaled Saleh, Mohammed Hossny, Ahmed Hossny, and Saeid Nahavandi. 2017. Cyclist detection in LIDAR scans using faster R-CNN and synthetic depth images. In Proceedings of the IEEE 20th International Conference on Intelligent Transportation Systems (ITSC’17). IEEE, 1--6.
[121]
Bernhard Scholkopf and Alexander J. Smola. 2001. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press.
[122]
Kiwoo Shin, Youngwook Paul Kwon, and Masayoshi Tomizuka. 2019. Roarnet: A robust 3D object detection based on region approximation refinement. In Proceedings of the IEEE Intelligent Vehicles Symposium (IV’19). IEEE, 2510--2515.
[123]
Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. Retrieved from https://Arxiv:1409.1556.
[124]
Sayanan Sivaraman and Mohan Manubhai Trivedi. 2010. A general active-learning framework for on-road vehicle recognition and tracking. IEEE Trans. Intell. Transport. Syst. 11, 2 (2010), 267--276.
[125]
Sayanan Sivaraman and Mohan Manubhai Trivedi. 2013. Looking at vehicles on the road: A survey of vision-based vehicle detection, tracking, and behavior analysis. IEEE Trans. Intell. Transport. Syst. 14, 4 (2013), 1773--1795.
[126]
Shuran Song and Jianxiong Xiao. 2016. Deep sliding shapes for amodal 3D object detection in RGB-D images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 808--816.
[127]
Johannes Stallkamp, Marc Schlipsing, Jan Salmen, and Christian Igel. 2012. German Traffic Sign Recognition Benchmark. Retrieved from http://benchmark.ini.rub.de/?section=gtsrb&subsection=news.
[128]
Johannes Stallkamp, Marc Schlipsing, Jan Salmen, and Christian Igel. 2012. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural Netw. 32 (2012), 323--332.
[129]
Farhana Sultana, Abu Sufian, and Paramartha Dutta. 2020. A review of object detection models based on convolutional neural network. In Intelligent Computing: Image Processing Based Applications. Springer, 1--16.
[130]
Zehang Sun, George Bebis, and Ronald Miller. 2006. On-road vehicle detection: A review. IEEE Trans. Pattern Anal. Mach. Intell. 28, 5 (2006), 694--711.
[131]
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR’15). http://arxiv.org/abs/1409.4842
[132]
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. 2016. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2818--2826.
[133]
Tianyu Tang, Shilin Zhou, Zhipeng Deng, Huanxin Zou, and Lin Lei. 2017. Vehicle detection in aerial images based on region convolutional neural networks and hard negative example mining. Sensors 17, 2 (2017), 336.
[134]
Jing Tao, Hongbo Wang, Xinyu Zhang, Xiaoyu Li, and Huawei Yang. 2017. An object detection system based on YOLO in traffic scene. In Proceedings of the 6th International Conference on Computer Science and Network Technology (ICCSNT’17). IEEE, 315--319.
[135]
Zhi Tian, Chunhua Shen, Hao Chen, and Tong He. 2019. FCOS: Fully convolutional one-stage object detection. Retrieved from https://Arxiv:1904.01355.
[136]
Tusimple. 2017. Tusimple Benchmark. Retrieved from http://benchmark.tusimple.ai/#/.
[137]
Jasper R. R. Uijlings, Koen E. A. Van De Sande, Theo Gevers, and Arnold W. M. Smeulders. 2013. Selective search for object recognition. Int. J. Comput. Vision 104, 2 (2013), 154--171.
[138]
Wiebe Van Ranst, Floris De Smedt, Jonathan Berte, Toon Goedemé, and Technologiepark-Zwijnaarde Robo-Vision. 2018. Fast simultaneous people detection and re-identification in a single shot network. In Proceedings of the IEEE International Conference on Advanced Video and Signal-based Surveillance. IEEE.
[139]
Safat B. Wali, Mohammad A. Hannan, Aini Hussain, and Salina A. Samad. 2015. Comparative survey on traffic sign detection and recognition: A review. Przeglad Elektrotechn. 1, 12 (2015), 40--44.
[140]
Shiyao Wang, Hongchao Lu, Pavel Dmitriev, and Zhidong Deng. 2018. Fast object detection in compressed video. Retrieved from http://arxiv.org/abs/1811.11057.
[141]
Shiyao Wang, Yucong Zhou, Junjie Yan, and Zhidong Deng. 2018. Fully motion-aware network for video object detection. In Proceedings of the European Conference on Computer Vision (ECCV’18). 542--557.
[142]
Xiaogang Wang, Xiaoxu Ma, and W. Eric L. Grimson. 2019. MIT Traffic Data Set. Retrieved from http://www.ee.cuhk.edu.hk/ xgwang/MITtraffic.html.
[143]
Michael Weber, Matthias Huber, and J. Marius Zöllner. 2018. HDTLR: A CNN-based hierarchical detector for traffic lights. In Proceedings of the 21st International Conference on Intelligent Transportation Systems (ITSC’18). IEEE, 255--260.
[144]
Bichen Wu, Forrest Iandola, Peter H. Jin, and Kurt Keutzer. 2017. Squeezedet: Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 129--137.
[145]
Chao-Yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, and Philipp Krähenbühl. 2018. Compressed video action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6026--6035.
[146]
Lele Xie, Tasweer Ahmad, Lianwen Jin, Yuliang Liu, and Sheng Zhang. 2018. A new CNN-based method for multi-directional car license plate detection. IEEE Trans. Intell. Transport. Syst. 19, 2 (2018), 507--517.
[147]
Bin Xu and Zhenzhong Chen. 2018. Multi-level fusion-based 3D object detection from monocular images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2345--2353.
[148]
Danfei Xu, Dragomir Anguelov, and Ashesh Jain. 2018. Pointfusion: Deep sensor fusion for 3D bounding box estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 244--253.
[149]
Yongzheng Xu, Guizhen Yu, Yunpeng Wang, Xinkai Wu, and Yalong Ma. 2017. Car detection from low-altitude UAV imagery with the faster R-CNN. J. Adv. Transport. 2017 (2017).
[150]
Bin Yang, Wenjie Luo, and Raquel Urtasun. 2018. Pixor: Real-time 3D object detection from point clouds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7652--7660.
[151]
Tingting Yang, Xiang Long, Arun Kumar Sangaiah, Zhigao Zheng, and Chao Tong. 2018. Deep detection network for real-life traffic sign in vehicular networks. Comput. Netw. 136 (2018), 95--104.
[152]
Wei Yang, Ji Zhang, Hongyuan Wang, and Zhongbao Zhang. 2018. A vehicle real-time detection algorithm based on YOLOv2 framework. In Real-Time Image and Video Processing 2018, Vol. 10670. International Society for Optics and Photonics, 106700N.
[153]
Hui Zhang, Yu Du, Shurong Ning, Yonghua Zhang, Shuo Yang, and Chen Du. 2017. Pedestrian detection method based on faster R-CNN. In Proceedings of the 13th International Conference on Computational Intelligence and Security (CIS’17). IEEE, 427--430.
[154]
Jianming Zhang, Manting Huang, Xiaokang Jin, and Xudong Li. 2017. A real-time Chinese traffic sign detection algorithm based on modified YOLOv2. Algorithms 10, 4 (2017), 127.
[155]
Shanshan Zhang, Rodrigo Benenson, and Bernt Schiele. 2017. Citypersons: A diverse dataset for pedestrian detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3213--3221.
[156]
Shanshan Zhang, Rodrigo Benenson, and Bernt Schiele. 2019. Citypersons Data Set. Retrieved from https://bitbucket.org/shanshanzhang/citypersons.
[157]
Shifeng Zhang, Longyin Wen, Xiao Bian, Zhen Lei, and Stan Z. Li. 2018. Occlusion-aware R-CNN: Detecting pedestrians in a crowd. In Proceedings of the European Conference on Computer Vision (ECCV’18). 637--653.
[158]
Xin Zhang, Yee-Hong Yang, Zhiguang Han, Hui Wang, and Chao Gao. 2013. Object class detection: A survey. ACM Comput. Surveys 46, 1 (2013), 1--53.
[159]
Xiaotong Zhao, Wei Li, Yifang Zhang, T. Aaron Gulliver, Shuo Chang, and Zhiyong Feng. 2016. A faster RCNN-based pedestrian detection system. In Proceedings of the IEEE 84th Vehicular Technology Conference (VTC-Fall’16). IEEE, 1--5.
[160]
Zhong-Qiu Zhao, Peng Zheng, Shou-tao Xu, and Xindong Wu. 2019. Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 30, 11 (2019), 3212--3232.
[161]
Wang Zhiqiang and Liu Jun. 2017. A review of object detection based on convolutional neural network. In Proceedings of the 36th Chinese Control Conference (CCC’17). IEEE, 11104--11109.
[162]
Xingyi Zhou, Jiacheng Zhuo, and Philipp Krähenbühl. 2019. Bottom-up object detection by grouping extreme and center points. Retrieved from arXiv:1901.08043.
[163]
Yin Zhou and Oncel Tuzel. 2018. Voxelnet: End-to-end learning for point cloud-based 3D object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4490--4499.
[164]
Chenchen Zhu, Yihui He, and Marios Savvides. 2019. Feature selective anchor-free module for single-shot object detection. Retrieved from arXiv:1903.00621.
[165]
Xizhou Zhu, Jifeng Dai, Lu Yuan, and Yichen Wei. 2018. Towards high performance video object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7210--7218.
[166]
Xizhou Zhu, Jifeng Dai, Xingchi Zhu, Yichen Wei, and Lu Yuan. 2018. Towards high performance video object detection for mobiles. Retrieved from http://arxiv.org/abs/1804.05830.
[167]
Xizhou Zhu, Yujie Wang, Jifeng Dai, Lu Yuan, and Yichen Wei. 2017. Flow-guided feature aggregation for video object detection. In Proceedings of the IEEE International Conference on Computer Vision. 408--417.
[168]
Xizhou Zhu, Yuwen Xiong, Jifeng Dai, Lu Yuan, and Yichen Wei. 2017. Deep feature flow for video recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2349--2358.
[169]
Yingying Zhu, Minghui Liao, Mingkun Yang, and Wenyu Liu. 2018. Cascaded segmentation-detection networks for text-based traffic sign detection. IEEE Trans. Intell. Transport. Syst. 19, 1 (2018), 209--219.
[170]
Zhe Zhu, Dun Liang, Songhai Zhang, Xiaolei Huang, Baoli Li, and Shimin Hu. 2016. Traffic-sign detection and classification in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR’16).
[171]
Zhe Zhu, Dun Liang, Songhai Zhang, Xiaolei Huang, Baoli Li, and Shimin Hu. 2016. Tsinghua-Tencent 100K dataset. Retrieved from https://cg.cs.tsinghua.edu.cn/traffic-sign/.
[172]
Zhongrong Zuo, Kai Yu, Qiao Zhou, Xu Wang, and Ting Li. 2017. Traffic signs detection based on faster R-CNN. In Proceedings of the IEEE 37th International Conference on Distributed Computing Systems Workshops (ICDCSW’17). IEEE, 286--288.

Published In

ACM Computing Surveys  Volume 54, Issue 2
March 2022
800 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3450359
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 March 2021
Accepted: 01 November 2020
Revised: 01 September 2020
Received: 01 September 2019
Published in CSUR Volume 54, Issue 2

Author Tags

  1. Object detection
  2. autonomous driving system
  3. convolutional neural networks
  4. deep learning
  5. vehicle detection

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • NSERC-SPG, NSERC-DISCOVERY, Canada Research Chairs Program, and NSERC-CREATE TRANSIT Funds
