Authors:
Gurjeet Singh 1; Sunmiao 2; Shi Shi 2 and Patrick Chiang 1,2,3
Affiliations:
1 Dept. of EECS, Oregon State University, Corvallis, U.S.A.
2 State Key Laboratory of ASIC & System, Fudan University, Shanghai, China
3 PhotonIC Technologies, Shanghai, China
Keyword(s):
Object Detection, 3D Data, Hardware, Depth Sensors.
Abstract:
Object detection and classification is one of the most important computer vision problems. Since the introduction of deep learning, object detection accuracy has increased dramatically. However, most of these improvements have come from conventional 2D image processing. Recently, low-cost 3D image sensors, such as the Microsoft Kinect (Time-of-Flight) or the Apple FaceID (Structured-Light) camera, have made 3D depth or point-cloud data available as an extra set of dimensions for a convolutional neural network. We propose a hardware-based approach to object detection that moves region-of-interest identification closer to the sensor node. As a result, no large depth-image dataset is needed to retrain the network. Our 2D + 3D system uses the 3D data to determine the object region, followed by any conventional 2D DNN, such as AlexNet. This approach cleanly dissociates the information collected from the point cloud and the 2D image data and combines the two operations later. Hence, our system can use any existing 2D network trained on a large image dataset and does not require a large 3D-depth dataset for new training. Experimental object detection results across 30 images show an accuracy of 0.67, compared with 0.54 for Faster R-CNN and 0.51 for YOLO.
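As an illustration only, the following is a minimal software sketch of the 2D + 3D pipeline described above, assuming an RGB frame with a pixel-aligned depth map. The depth-threshold ROI rule, the near/far parameters, and the helper names (depth_roi, classify_roi) are hypothetical choices for this sketch; in the paper, region-of-interest identification is performed in hardware near the sensor, and any pretrained 2D network can stand in for AlexNet.

```python
# Sketch of the 2D + 3D idea: use depth data to find the object region,
# then classify the cropped RGB region with an existing pretrained 2D CNN.
import numpy as np
import torch
from torchvision import models, transforms

def depth_roi(depth, near=0.3, far=1.5):
    """Bounding box of pixels whose depth lies in [near, far] metres
    (an illustrative thresholding rule, not the paper's hardware method)."""
    ys, xs = np.nonzero((depth > near) & (depth < far))
    if len(xs) == 0:
        return None
    return xs.min(), ys.min(), xs.max(), ys.max()

def classify_roi(rgb, depth):
    """Crop the RGB frame to the depth-derived ROI and classify the crop."""
    box = depth_roi(depth)
    if box is None:
        return None
    x0, y0, x1, y1 = box
    crop = rgb[y0:y1 + 1, x0:x1 + 1]

    preprocess = transforms.Compose([
        transforms.ToPILImage(),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    # Any existing trained 2D network works here; no depth retraining needed.
    net = models.alexnet(pretrained=True).eval()
    with torch.no_grad():
        logits = net(preprocess(crop).unsqueeze(0))
    return int(logits.argmax(dim=1))

# Synthetic example: a 480x640 RGB frame and an aligned depth map in metres,
# with one "near" object that the depth-ROI step should isolate.
rgb = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
depth = np.full((480, 640), 3.0, dtype=np.float32)
depth[100:300, 200:400] = 1.0
print(classify_roi(rgb, depth))
```

Because the depth step only selects a region, the 2D network and the 3D sensing remain fully decoupled: the CNN can be swapped or updated without touching the depth-based ROI stage.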