Object Detection and Classification by Decision-Level Fusion for Intelligent Vehicle Systems
Figure 1. Overview of our work. Red arrows denote the processing of the unary classifier for each sensor, and green arrows denote the fusion processing.
Figure 2. Procedure from pre-processing to semantic grouping on CCD image data. (a) Input image data; (b) color-flattened image; (c) segmented image using the graph-segmentation method; (d) semantic grouping using the dissimilarity cost function.
Figure 3. Segment generation on 3D point clouds. (a) 2D occupancy grid mapping results and (b) segmentation result on 3D point clouds.
Figure 4. Proposed network architecture as unary classifiers.
Figure 5. Architecture of the fusion network. Bbox denotes bounding box (Section 6.2).
Figure 6. Qualitative results of our proposed method. We projected the classification results onto the image data. (a) The results of the CCD unary classifier. (b) The results of the LiDAR unary classifier. (c) The results of $model_{17}$. (d) The results of the proposed method. Each box indicates the following: yellow box: correctly detected and classified objects; red box: failures; green box: undetected objects.
1. Introduction
2. Related Work
3. Overview
4. Pre-Processing
4.1. Norm-Based Color Flattening
Algorithm 1. Split Bregman for color-flattening.
4.2. The 3D Occupancy Voxel Spaces
5. Object-Region Proposal Generation
5.1. Object-Region Proposal from the CCD Sensor
5.2. Object-Region Proposal from the LiDAR Sensor
6. Classifying Object-Region Proposals
6.1. Unary Classifier
6.2. Fusion Classifier
7. Experimental Results
7.1. Setup
7.2. Evaluation
8. Conclusions and Future Works
Author Contributions
Conflicts of Interest
Section | Parameters or Functions | Descriptions |
4.1 | | Energy function to generate color-flattening. |
4.1 | | Data term of energy function for pixel-wise intrinsic similarity. |
4.1 | | A concatenated vector of all pixel values in the transformed image. |
4.1 | z | A concatenated vector of all pixel values in original image I. |
4.1 | | Smoothness term of energy function. |
4.1 | | A 3-dimensional vector of the RGB values at a pixel position of the transformed image. |
4.1 | | Weights on the difference between a pixel and its neighboring pixel. |
4.1 | | A 3-dimensional vector in the CIELab color space. |
4.1 | κ | A constant related to the luminance variations. |
4.1 | M | The matrix consists of and . |
4.1 | and | Intermediate variables of the split Bregman method. |
4.2 | | The i-th 3D point of the 3D point cloud. |
4.2 | | The i-th voxel includes the reflected particles with a size of . |
4.2 | | The number of voxels in a 3D point cloud. |
4.2 | | The possible number of reflectance particles in a voxel. |
5.1 | | A set of segmented partitions of the color-flattened image. |
5.1 | | The number of segmented partitions. |
5.1 | | Set of spatially-connected neighboring partitions of a partition. |
5.1 | | The dissimilarity function to group the adjacent partitions. |
5.1 | | The color dissimilarity between the adjacent partitions. |
5.1 | | A weight constant for the color dissimilarity. |
5.1 | | 75-bin color histogram measured from the mean image. |
5.1 | | The texture dissimilarity between the adjacent partitions. |
5.1 | | A weight constant for the texture dissimilarity. |
5.1 | | 240-bin SIFT histogram of original image I. |
5.1 | | A threshold value for grouping adjacent partitions. |
5.1 | S and | The ground-truth segmentation and the segmentation inferred by the proposed method. |
5.1 | | The number of training images to find . |
5.1 | | The structural loss between the ground truth and the inferred segmented partition. |
6.2 | | The classification results of each bounding box provided from the image. |
6.2 | | The classification results of each bounding box provided from the 3D point clouds. |
6.2 | | The association component between and . |
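The Section 5.1 entries above describe grouping adjacent partitions with a dissimilarity function that combines a weighted color term (75-bin color histograms) and a weighted texture term (240-bin SIFT histograms), together with a threshold that decides whether two neighbors are merged. A minimal Python sketch of that idea follows. The chi-square histogram distance, the function names, the equal default weights, and the single-pass union-find merge are all assumptions made for illustration; the table does not specify them.

```python
import numpy as np


def chi_square(h1, h2, eps=1e-12):
    """Chi-square distance between two histograms (assumed metric).

    Both histograms are L1-normalised first, so the result lies in [0, 1].
    """
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    h1 = h1 / (h1.sum() + eps)
    h2 = h2 / (h2.sum() + eps)
    return 0.5 * float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))


def dissimilarity(color_a, color_b, texture_a, texture_b,
                  w_color=0.5, w_texture=0.5):
    """Weighted sum of color and texture dissimilarity (weights are hypothetical)."""
    return (w_color * chi_square(color_a, color_b)
            + w_texture * chi_square(texture_a, texture_b))


def group_partitions(color_hists, texture_hists, neighbors, threshold,
                     w_color=0.5, w_texture=0.5):
    """Merge adjacent partitions whose dissimilarity is below `threshold`.

    `neighbors` maps a partition index to the indices of its spatially
    connected partitions. Histograms are not recomputed after a merge;
    this is a single pass over the original partitions.
    Returns one group label per partition.
    """
    n = len(color_hists)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, adjacent in neighbors.items():
        for j in adjacent:
            d = dissimilarity(color_hists[i], color_hists[j],
                              texture_hists[i], texture_hists[j],
                              w_color, w_texture)
            if d < threshold:
                parent[find(i)] = find(j)

    return [find(i) for i in range(n)]
```

With this sketch, two neighboring partitions that share identical histograms receive the same label, while a neighbor whose histograms do not overlap at all has dissimilarity 1.0 under the default weights and stays separate for any threshold below that.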
Model | Proposal Generator | Representation | Representation Usage | Modality | Fusion Scheme |
$model_{1}$ | Sliding Window | VGG16 | ConvCube | CCD + LiDAR | CNN |
$model_{2}$ | CIOP | VGG16 | ConvCube | CCD + LiDAR | CNN |
$model_{3}$ | Objectness | VGG16 | ConvCube | CCD + LiDAR | CNN |
$model_{4}$ | Selective Search | VGG16 | ConvCube | CCD + LiDAR | CNN |
$model_{5}$ | CPMC | VGG16 | ConvCube | CCD + LiDAR | CNN |
$model_{6}$ | MCG | VGG16 | ConvCube | CCD + LiDAR | CNN |
$model_{7}$ | EdgeBox | VGG16 | ConvCube | CCD + LiDAR | CNN |
$model_{8}$ | Proposed Generator | AlexNet | ConvCube | CCD + LiDAR | CNN |
$model_{9}$ | Proposed Generator | VGG16 | conv1 | CCD + LiDAR | CNN |
$model_{10}$ | Proposed Generator | VGG16 | conv5 | CCD + LiDAR | CNN |
$model_{11}$ | Proposed Generator | VGG16 | fc7 | CCD + LiDAR | CNN |
$model_{12}$ | Proposed Generator | VGG16 | conv5 + fc7 | CCD + LiDAR | CNN |
$model_{13}$ | Proposed Generator | VGG16 | ConvCube | CCD | × |
$model_{14}$ | Proposed Generator | VGG16 | ConvCube | LiDAR | × |
$model_{15}$ | Proposed Generator | VGG16 | ConvCube | CCD + LiDAR | Decision-TBM |
$model_{16}$ | Proposed Generator | VGG16 | ConvCube | CCD + LiDAR | Decision-CRF |
$model_{17}$ | 3DOP | 3DOP | 3DOP | CCD + LiDAR | Feature-3DOP |
$model_{18}$ | Proposed Generator | VGG16 | ConvCube | CCD + LiDAR | CNN |
Method | Recall (Cars) | Recall (Pedestrians) | Recall (Cyclists) | # of Bbox |
Sliding window | 100 | 100 | 100 | |
CIOP | 64.4 | 59.8 | 59.9 | |
Objectness | 66.9 | 60.4 | 60.1 | |
Selective search | 70.4 | 66.8 | 68.7 | |
CPMC | 71.7 | 67.4 | 68.6 | |
MCG | 76.6 | 78.9 | 74.8 | |
EdgeBox | 85.2 | 84.3 | 82.5 | |
Ours (CCD) | 88.4 | 85.4 | 84.8 | |
Ours (LiDAR) | 71.8 | 63.3 | 64.2 | 70 |
Ours | 90.8 | 88.7 | 86.5 | 500 |
Model | Cars (Easy) | Cars (Moderate) | Cars (Hard) | Pedestrians (Easy) | Pedestrians (Moderate) | Pedestrians (Hard) | Cyclists (Easy) | Cyclists (Moderate) | Cyclists (Hard) |
$model_{1}$ | 90.98 | 88.64 | 79.88 | 82.84 | 69.55 | 66.42 | 82.12 | 71.48 | 64.55 |
$model_{2}$ | 90.7 | 83.67 | 79.78 | 80.54 | 68.07 | 65.23 | 80.86 | 68.59 | 63.54 |
$model_{3}$ | 91.34 | 85.28 | 77.42 | 81.71 | 68.54 | 61.19 | 78.21 | 68.77 | 63.77 |
$model_{4}$ | 85.88 | 87.74 | 79.01 | 79.59 | 68.45 | 62.66 | 82.65 | 65.12 | 61.38 |
$model_{5}$ | 91.39 | 87.78 | 75.7 | 75.25 | 66.35 | 61.27 | 76.24 | 66.93 | 63.39 |
$model_{6}$ | 89.42 | 82.94 | 77.1 | 80.94 | 67.93 | 61.58 | 79.07 | 66.67 | 63.27 |
$model_{7}$ | 85.68 | 87.82 | 79.57 | 81.53 | 65.02 | 65.94 | 78.67 | 67.89 | 60.81 |
$model_{8}$ | 87.43 | 84.44 | 75.42 | 73.2 | 65.28 | 64.55 | 77.51 | 66.74 | 60.15 |
$model_{9}$ | 86.29 | 81.26 | 73.52 | 72.86 | 63.04 | 60.31 | 74.69 | 61.30 | 56.16 |
$model_{10}$ | 74.87 | 80.98 | 75.85 | 77.57 | 60.61 | 62.79 | 70.12 | 62.49 | 59.21 |
$model_{11}$ | 77.00 | 82.37 | 75.50 | 77.54 | 60.43 | 56.30 | 73.37 | 64.23 | 56.84 |
$model_{12}$ | 88.59 | 83.08 | 77.30 | 79.17 | 64.54 | 64.34 | 75.69 | 66.35 | 59.58 |
$model_{13}$ | 88.84 | 84.77 | 73.81 | 77.92 | 68.81 | 59.33 | 72.60 | 67.32 | 57.21 |
$model_{14}$ | 70.32 | 67.97 | 59.62 | 64.96 | 59.29 | 37.28 | 63.45 | 58.34 | 30.22 |
$model_{15}$ | 84.25 | 81.66 | 74.48 | 69.49 | 67.81 | 62.14 | 70.81 | 68.11 | 60.25 |
$model_{16}$ | 83.48 | 82.71 | 70.55 | 78.34 | 68.97 | 60.38 | 72.84 | 68.42 | 61.01 |
$model_{17}$ | 93.04 | 88.64 | 79.1 | 81.78 | 67.47 | 64.7 | 78.39 | 68.94 | 61.37 |
$model_{18}$ | 94.88 | 89.34 | 81.42 | 83.71 | 70.84 | 68.67 | 83.95 | 72.98 | 66.47 |
Method | Fusion | Sensor | Cars (Easy) | Cars (Moderate) | Cars (Hard) | Pedestrians (Easy) | Pedestrians (Moderate) | Pedestrians (Hard) | Cyclists (Easy) | Cyclists (Moderate) | Cyclists (Hard) |
Vote3D [59] | × | L | 56.80 | 47.99 | 42.57 | 44.48 | 35.74 | 33.72 | 41.43 | 31.24 | 28.62 |
LSVM-MDPM [60] | × | C | 68.02 | 56.48 | 44.18 | 47.74 | 39.36 | 35.95 | 35.04 | 27.50 | 26.21 |
SquaresICF [61] | × | C | - | - | - | 57.33 | 44.42 | 40.08 | - | - | - |
MDPM-un-BB [62] | × | C | 71.19 | 62.16 | 48.48 | - | - | - | - | - | - |
DPM-C8B1 [63] | × | S | 74.33 | 60.99 | 47.16 | 38.96 | 29.03 | 25.61 | 43.49 | 29.04 | 26.20 |
DPM-VOC+VP [64] | × | C | 74.95 | 64.71 | 48.76 | 59.48 | 44.86 | 40.37 | 42.43 | 31.08 | 28.23 |
OC-DPM [65] | × | C | 74.94 | 65.95 | 53.86 | - | - | - | - | - | - |
AOG [66] | × | C | 84.36 | 71.88 | 59.27 | - | - | - | - | - | - |
SubCat [67] | × | C | 84.14 | 75.46 | 59.71 | 54.67 | 42.34 | 37.95 | - | - | - |
DA-DPM [68] | × | C | - | - | - | 56.36 | 45.51 | 41.08 | - | - | - |
Faster R-CNN [34] | × | C | 86.71 | 81.84 | 71.12 | 78.86 | 65.90 | 61.18 | 72.26 | 63.35 | 55.90 |
FilteredICF [69] | × | C | - | - | - | 61.14 | 53.98 | 49.29 | - | - | - |
pAUCEnsT [70] | × | C | - | - | - | 65.26 | 54.49 | 48.60 | 51.62 | 38.03 | 33.38 |
3DVP [71] | × | C | 87.46 | 75.77 | 65.38 | - | - | - | - | - | - |
Regionlets [72] | × | C | 84.75 | 76.45 | 59.70 | 73.14 | 61.15 | 55.21 | 70.41 | 58.72 | 51.83 |
uickitti | × | C | 90.83 | 89.23 | 79.46 | 83.49 | 71.84 | 67.00 | 78.40 | 70.90 | 62.54 |
Fusion-DPM [73] | Decision | L + C | - | - | - | 59.51 | 46.67 | 42.05 | - | - | - |
MV-RGBD-RF [74] | Early | L + C | 76.40 | 69.92 | 57.47 | 73.30 | 56.59 | 49.63 | 52.97 | 42.61 | 37.42 |
3DOP [58] | Early | S + C | 93.04 | 88.64 | 79.10 | 81.78 | 67.47 | 64.70 | 78.39 | 68.94 | 61.37 |
Ours (CCD) | × | C | 88.84 | 84.77 | 73.81 | 77.92 | 68.81 | 59.33 | 72.60 | 67.32 | 57.21 |
Ours (LiDAR) | × | L | 70.32 | 67.97 | 59.62 | 64.96 | 59.29 | 37.28 | 63.45 | 58.34 | 30.22 |
Ours (TBM) | Decision | L + C | 84.25 | 81.66 | 74.48 | 69.49 | 67.81 | 62.14 | 70.81 | 68.11 | 60.25 |
Ours (CRF) | Decision | L + C | 83.48 | 82.71 | 70.55 | 78.34 | 68.97 | 60.38 | 72.84 | 68.42 | 61.01 |
Ours | Decision | L + C | 94.88 | 89.34 | 81.42 | 83.71 | 70.84 | 68.67 | 83.95 | 72.98 | 66.47 |
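As a reading aid, the rows for our unary and fused classifiers in the table above can be compared column by column. The sketch below copies those five rows verbatim and computes the difference between the fused result (Ours) and a chosen baseline, in the same units as the table. The dictionary layout, the column labels, and the helper name `gain_over` are hypothetical and only serve this illustration.

```python
# Column order follows the table: class, then difficulty level.
COLUMNS = [
    "Cars/Easy", "Cars/Moderate", "Cars/Hard",
    "Pedestrians/Easy", "Pedestrians/Moderate", "Pedestrians/Hard",
    "Cyclists/Easy", "Cyclists/Moderate", "Cyclists/Hard",
]

# Values copied from the "Ours" rows of the comparison table.
RESULTS = {
    "Ours (CCD)":   [88.84, 84.77, 73.81, 77.92, 68.81, 59.33, 72.60, 67.32, 57.21],
    "Ours (LiDAR)": [70.32, 67.97, 59.62, 64.96, 59.29, 37.28, 63.45, 58.34, 30.22],
    "Ours (TBM)":   [84.25, 81.66, 74.48, 69.49, 67.81, 62.14, 70.81, 68.11, 60.25],
    "Ours (CRF)":   [83.48, 82.71, 70.55, 78.34, 68.97, 60.38, 72.84, 68.42, 61.01],
    "Ours":         [94.88, 89.34, 81.42, 83.71, 70.84, 68.67, 83.95, 72.98, 66.47],
}


def gain_over(baseline, target="Ours"):
    """Per-column difference (target minus baseline), rounded to two decimals."""
    return {
        column: round(t - b, 2)
        for column, t, b in zip(COLUMNS, RESULTS[target], RESULTS[baseline])
    }


if __name__ == "__main__":
    for name in ("Ours (CCD)", "Ours (LiDAR)", "Ours (TBM)", "Ours (CRF)"):
        print(name, gain_over(name))
```

For example, `gain_over("Ours (CCD)")["Cars/Moderate"]` evaluates to 4.57 (89.34 minus 84.77), and every difference against the four baselines is positive, which matches the table: the fused classifier scores highest among our variants in each column.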
