
CN110674829B - A 3D Object Detection Method Based on Graph Convolutional Attention Network - Google Patents


Info

Publication number
CN110674829B
CN110674829B (application CN201910918980.6A)
Authority
CN
China
Prior art keywords
convolution
layer
point
feature map
voxel
Prior art date
Legal status
Active
Application number
CN201910918980.6A
Other languages
Chinese (zh)
Other versions
CN110674829A (en)
Inventor
夏桂华
何芸倩
苏丽
朱齐丹
张智
Current Assignee
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date
Filing date
Publication date
Application filed by Harbin Engineering University filed Critical Harbin Engineering University
Priority to CN201910918980.6A priority Critical patent/CN110674829B/en
Publication of CN110674829A publication Critical patent/CN110674829A/en
Application granted granted Critical
Publication of CN110674829B publication Critical patent/CN110674829B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2136 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on sparsity criteria, e.g. with an overcomplete basis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a three-dimensional object detection method based on a graph convolutional attention network: (1) the point cloud is voxelized and randomly downsampled; (2) local features are extracted within each grid voxel; (3) intermediate convolution layers extract a high-order feature map; (4) a region proposal network predicts the target's bounding box, category, and orientation. To strengthen the relationships between each point and its neighbors, the invention provides a feature extraction module that is based on edge convolution and introduces an attention mechanism; an attention module built on the same principle is also added after the intermediate convolution layers, so that the features of each channel of the feature map are reselected and a more reasonable high-order feature map is obtained. The invention improves point cloud object detection accuracy and performs well even under severe occlusion.

Description

A 3D Object Detection Method Based on Graph Convolutional Attention Network

Technical Field

The present invention relates to a computer-vision method for processing three-dimensional point clouds, and specifically to a three-dimensional object detection method.

Background Art

Object detection is a classical vision task that simultaneously recognizes and localizes objects, a prerequisite for intelligent scene understanding. Two-dimensional detection has reached unprecedented maturity, but in fields such as mapping, indoor robotics, and augmented reality, three-dimensional detection is clearly superior: it provides richer position and pose information, and it is one of the fundamental tasks of environment perception for autonomous driving. RGB images used to be the mainstream data form for object detection, but with the development of 3D sensors, LiDAR has become an increasingly popular detection tool in recent years.

Some methods based on LiDAR and cameras now fuse point cloud data and image data to achieve higher accuracy. Fusion, however, carries a high computational cost, so single-sensor methods remain competitive. Many studies have shown that the point cloud is a more appropriate data form for describing object shape: it directly represents Euclidean distances and does not suffer from multi-scale problems. However, point clouds are sparse, which makes two-dimensional methods difficult to apply directly.

When extracting features, most methods process points one by one and use a symmetric function to extract global features, an approach that ignores the connections and relationships between points. Compared with image data, a point cloud is a natural graph structure in which links are easy to build. Some studies have drawn on the idea of graph networks, observing that the relationships between adjacent points and edges help strengthen the expression of local features, and have proposed edge convolution methods. For three-dimensional convolution, many voxels within the defined voxel range are empty because of point sparsity; sparse convolution therefore increases computation speed and reduces GPU memory consumption without affecting the convolution result.

Summary of the Invention

The purpose of the present invention is to provide a three-dimensional object detection method based on a graph convolutional attention network that improves point cloud object detection accuracy and maintains good performance under severe occlusion.

The object of the present invention is achieved as follows:

(1) Voxelize the point cloud and randomly downsample it;

(2) extract local features within each grid voxel;

(3) extract a high-order feature map with intermediate convolution layers;

(4) predict the target's bounding box, category, and orientation with a region proposal network.

The present invention may further include:

1. The voxelized partitioning and random downsampling of the point cloud specifically includes: partitioning the original point cloud with a voxel grid structure, discarding outliers outside the specified range, assigning the points to grid cells, randomly downsampling within each voxel cell, and then numbering and storing each cell.

The storage uses a hash table.

2. The local feature extraction in each grid voxel specifically includes: within each voxel cell, extracting features of the corresponding points with a graph attention network module.

The feature extraction with the graph attention network module is specifically as follows: first, each point is connected by edges to its surrounding neighbors, forming a graph structure judged by Euclidean distance, and each point is also connected by an edge to itself; information such as the coordinates of the two endpoints of each edge is extracted as the edge's initial features; a convolution operation is then performed on the edges; finally, voxel-level features are obtained through a symmetric selection function.

Before the edge convolution operation, an attention mechanism is used to select the initial features.

3. The extraction of the high-order feature map by intermediate convolution specifically includes: using sparse convolution, the feature map is compressed into a dense structure, convolved, and then mapped back to its original sparse spatial representation; after convolutional abstraction, an attention mechanism redistributes the weights of the different channels to obtain an attention map corresponding to the feature map, and the attention map is superimposed on the convolved high-order feature map to obtain the final three-dimensional feature map.

4. The region proposal network predicting the target's bounding box, category, and orientation specifically includes: after feature extraction from the high-order feature map produced by the multi-layer convolution, three separate fully connected layers compute the predicted bounding box, category, and orientation for each anchor.

The three-dimensional object detection method based on a graph convolutional attention network of the present invention is characterized by strengthening the expression of local relationships in the point cloud and optimizing the feature selection process. The invention applies the edge convolution method, which can express the relationships between adjacent points, to feature extraction for object detection; in the feature selection stage for the initial points, an attention mechanism selects the initial physical features that matter most for feature expression, yielding better extracted features. The intermediate convolution layers likewise produce multi-channel feature data; the invention uses the idea of the attention mechanism to optimize the convolution results, strengthening the weight of the most influential channels and obtaining a more expressive feature map.

The point cloud of a typical scene contains more than 100k points, so a specific data structure, voxelization, is used to preprocess it. The original points are first divided into voxels and point-wise features are extracted; the downsampled voxel signal then passes through the convolution and region proposal stages to obtain three-dimensional bounding boxes.

The present invention strengthens the representation of relationships between the underlying original points during feature extraction by drawing on the idea of graph networks. At the same time, to further strengthen feature expression, an attention mechanism imitating human cognitive acuity is considered, making the multi-channel selection of features more intelligent. The invention applies the attention mechanism both before the initial feature selection of the graph-network edge convolution and after the sparse-convolution feature map processing, improving the expressive power of the neural network modules while making the feature expression at each stage more interpretable.

The present invention has the following advantages:

1. The present invention uses a graph convolution method with an attention mechanism in the feature representation of each voxel, which better describes the relationships between the points of the point cloud and extracts more expressive features.

2. After the intermediate convolution layers, the present invention uses an attention mechanism to redistribute the weights of the resulting high-order feature map, obtaining a more reasonable high-order feature map.

3. With the two improvements above acting together, the present invention improves the accuracy of three-dimensional object detection for vehicle detection.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1: Feature extraction module based on the graph network attention structure, where e denotes an edge, x a point, and i and j point indices;

Figure 2: Voxel feature extraction;

Figure 3: Intermediate-layer sparse convolution with the attention mechanism;

Figure 4: Overall pipeline.

DETAILED DESCRIPTION

The present invention is described in more detail below with reference to examples.

Step 1: Voxelized partitioning and clustering of the point cloud

The original point cloud data of more than 100k points is structured and downsampled by voxelization. First, points outside a certain range are cropped away, keeping only points within D, H, W along the x, y, z axes. Because a point cloud contains too many points, the whole cloud within the extraction range is partitioned with small voxel grids of size $v_d$, $v_h$, $v_w$.

To address the uneven distribution of points across voxels, this embodiment uses random downsampling so that each voxel contains at most T points. Finally, the processed voxel structures are numbered and stored in a hash table, which eliminates voxels with no interior points.
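As a concrete illustration of this step, the following Python sketch crops, voxelizes, randomly downsamples, and hash-stores a point cloud. The detection range, voxel size, and the cap T = 35 are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

def voxelize(points, pc_range=(0, -40, -3, 70.4, 40, 1),
             voxel_size=(0.2, 0.2, 0.4), max_points_T=35):
    """Crop points to pc_range, bucket them into voxels, and randomly
    downsample each voxel to at most max_points_T points.
    points: (N, 3+) array of x, y, z (+ extras such as intensity)."""
    x0, y0, z0, x1, y1, z1 = pc_range
    mask = ((points[:, 0] >= x0) & (points[:, 0] < x1) &
            (points[:, 1] >= y0) & (points[:, 1] < y1) &
            (points[:, 2] >= z0) & (points[:, 2] < z1))
    points = points[mask]                       # discard out-of-range outliers

    # Integer voxel coordinates; a Python dict plays the role of the hash
    # table, so voxels with no interior points are simply never stored.
    coords = ((points[:, :3] - np.array([x0, y0, z0])) /
              np.array(voxel_size)).astype(np.int32)
    voxels = {}
    for pt, c in zip(points, coords):
        voxels.setdefault(tuple(c), []).append(pt)

    # Random downsampling: keep at most T points per voxel.
    rng = np.random.default_rng()
    for key, pts in voxels.items():
        if len(pts) > max_points_T:
            keep = rng.choice(len(pts), max_points_T, replace=False)
            pts = [pts[i] for i in keep]
        voxels[key] = np.stack(pts)
    return voxels  # {voxel index -> (<=T, 3+) point array}
```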

Step 2: Point cloud feature extraction within voxels

After voxelizing the original point cloud, this embodiment extracts features from each voxel with a graph attention network module in order to obtain voxel-level features.

A point cloud is a natural graph structure. In conventional point-cloud feature extraction, each point is considered separately and the connections between points are ignored. Define

$$\mathcal{G} = (\mathcal{V}, \mathcal{E})$$

as a graph consisting of a point set $\mathcal{V} = \{x_1, \dots, x_n\}$ of $n$ points and the edge set $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ between the points. For example, the present invention defines a d-dimensional neighborhood graph in which, for each point $x_i$, $\mathcal{E}$ contains edges of the form $(i, j_{i1}), \dots, (i, j_{ik})$, where $i$ and $j$ are both point indices. The edge feature is then defined as

$$e_{ij} = h_\theta(x_i, x_j),$$

where $h_\theta$, like $H$ in the following formula, is a symmetric function:

$$x'_i = H_{j:(i,j)\in\mathcal{E}}\, h_\theta(x_i, x_j).$$

Generally, a point cloud has three dimensions representing its real-world coordinates. In this embodiment, when describing the edge between two points, the information of the center point $x_i$ and of the neighboring point $x_j$ connected to it by the $h$ operation is combined as the initial feature selection. Each channel of the edge feature contributes differently to the overall feature representation, so an attention mechanism is added. After the multi-layer perceptron operation of the edge convolution, a symmetric operation $H$ extracts the edge-level features into the corresponding point-level features. The final voxel-level feature is then obtained by applying another symmetric operation to the point-level features $X = \{x'_1, \dots, x'_n\}$.
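The module described above can be sketched as follows in PyTorch. This is a minimal reading of the text, not the patented implementation: the k-nearest-neighbour graph construction, the use of max as both symmetric operations, the concatenated [x_i, x_j] edge features, and all layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class EdgeConvAttention(nn.Module):
    """Edge convolution over the points of one voxel, with a channel
    attention that reweights the initial edge features before the MLP."""
    def __init__(self, in_dim=3, out_dim=64, k=8):
        super().__init__()
        self.k = k
        edge_dim = 2 * in_dim               # [x_i, x_j] concatenated
        self.attn = nn.Sequential(          # per-channel attention weights
            nn.Linear(edge_dim, edge_dim), nn.Sigmoid())
        self.mlp = nn.Sequential(           # h_theta
            nn.Linear(edge_dim, out_dim), nn.ReLU())

    def forward(self, x):                   # x: (n, in_dim) points of a voxel
        n = x.size(0)
        # kNN graph by Euclidean distance; distance 0 to itself means each
        # point is automatically one of its own neighbours (the self edge).
        dist = torch.cdist(x, x)
        idx = dist.topk(min(self.k, n), largest=False).indices   # (n, k)
        neighbours = x[idx]                                      # (n, k, d)
        centers = x.unsqueeze(1).expand_as(neighbours)
        edges = torch.cat([centers, neighbours], dim=-1)  # initial edge features
        edges = edges * self.attn(edges)    # attention-based feature selection
        edges = self.mlp(edges)             # edge-level features
        point_feat = edges.max(dim=1).values        # symmetric op H -> point level
        voxel_feat = point_feat.max(dim=0).values   # second symmetric op -> voxel level
        return voxel_feat                   # (out_dim,)
```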

Step 3: Intermediate-layer sparse convolution

This embodiment uses three-dimensional sparse convolution as the intermediate convolution layers. Let ConvMD(c_in, c_out, k, s, p) be a convolution operator, where c_in and c_out are the numbers of input and output channels and k, s, p are the kernel size, stride, and padding, respectively. Each convolution operation consists of a 3D convolution, a BatchNorm layer, and a ReLU layer. Finally, after the sparse map is converted to a dense map, a high-level feature map is obtained, and an attention module is added here.
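Read concretely, a ConvMD block could look like the sketch below. nn.Conv3d is used as a dense stand-in for the sparse convolution of the embodiment (a real implementation would use a sparse-convolution library); the example channel counts and strides are assumptions:

```python
import torch.nn as nn

def ConvMD(c_in, c_out, k, s, p):
    """One middle-layer block: 3D convolution + BatchNorm + ReLU.
    Note: the embodiment uses *sparse* 3D convolution; nn.Conv3d is a
    dense stand-in with the same interface (kernel k, stride s, padding p)."""
    return nn.Sequential(
        nn.Conv3d(c_in, c_out, kernel_size=k, stride=s, padding=p),
        nn.BatchNorm3d(c_out),
        nn.ReLU(inplace=True),
    )

# Example middle layer: two blocks, the first downsampling along depth.
middle = nn.Sequential(
    ConvMD(64, 64, 3, (2, 1, 1), (1, 1, 1)),
    ConvMD(64, 64, 3, (1, 1, 1), (0, 1, 1)),
)
```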

The convolution operation produces feature maps at many different scales, and the features of each channel clearly contribute to the overall feature with different importance. To improve the description of the feature map and make it more reasonable, the present invention adds an attention map onto the original feature map:

$$\tilde{U} = U + F_{scale}(U, s),$$

with the dense feature map $U$, the attention map $s$, and the scaling function $F_{scale}$ as defined below.

This embodiment uses an SE (squeeze-and-excitation) attention module to generate the attention feature map. First, let the dense feature map input be

$$U \in \mathbb{R}^{H \times W \times C},$$

where $H$ is the feature map height, $W$ the feature map width, and $C$ the number of channels. An avg-pooling operation then extracts each channel into a single statistic, giving the channel weights

$$z \in \mathbb{R}^{C}.$$

A multi-layer perceptron is then used to obtain higher-level features for each dimension, so the final attention map is $s_c = F_e(z_c, W)$, where $F_e$ is the extraction function.

$$\tilde{x}_c = F_{scale}(u_c, s_c) = s_c \cdot u_c.$$

After the scaling function $F_{scale}$, the attention feature map is added to the original map to obtain the final output comprehensive feature map $\tilde{U} = U + \tilde{X}$, where $\tilde{X}$ collects the channel-scaled maps $\tilde{x}_c$.

This attention operation, added after the intermediate layers, aggregates high-level information into the final intermediate-layer feature map, providing more information for the subsequent region proposal.
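A compact sketch of such an SE-style module, with the channel-scaled map added back onto the input as described above (the reduction ratio r and the layer sizes are illustrative assumptions):

```python
import torch.nn as nn

class SEAttention3D(nn.Module):
    """Channel attention on the dense feature map U of shape (B, C, D, H, W):
    squeeze by average pooling, excite with a two-layer MLP, scale the
    channels, then add the result back onto the original map."""
    def __init__(self, channels, r=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)          # z_c: one statistic per channel
        self.fc = nn.Sequential(                     # s_c = F_e(z_c, W)
            nn.Linear(channels, channels // r), nn.ReLU(),
            nn.Linear(channels // r, channels), nn.Sigmoid())

    def forward(self, u):
        b, c = u.shape[:2]
        z = self.pool(u).view(b, c)                  # squeeze
        s = self.fc(z).view(b, c, 1, 1, 1)           # attention map
        return u + u * s                             # F_scale, then add to original
```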

Step 4: Region proposal network

The region proposal network (RPN) has become a typical embedded module in many detection frameworks. This embodiment uses an end-to-end, SSD-like form as the region proposal architecture. The input to the region proposal layers is the feature map extracted by the intermediate layers; each region proposal layer consists of a convolution layer, a BatchNorm layer, and a ReLU layer. After each individual RPN layer, the feature maps are upsampled to the same fixed size and concatenated together. Finally, three 1×1 convolutions generate the predictions for the bounding box, class, and direction.
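A simplified sketch of such a head operating on a 2D (bird's-eye-view) feature map follows; the block depths, channel widths, number of anchors, and the 7-parameter box encoding (x, y, z, w, l, h, yaw, common practice in 3D detection) are illustrative assumptions rather than values given in the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def rpn_block(c_in, c_out, stride):
    """One RPN layer: convolution + BatchNorm + ReLU."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=stride, padding=1),
        nn.BatchNorm2d(c_out), nn.ReLU())

class RPNHead(nn.Module):
    """SSD-like region proposal head: multi-scale conv blocks, upsample to
    a common size, concatenate, then three 1x1 convs for box / class / dir."""
    def __init__(self, c_in=128, anchors=2, num_classes=1):
        super().__init__()
        self.block1 = rpn_block(c_in, 128, 1)
        self.block2 = rpn_block(128, 128, 2)
        self.block3 = rpn_block(128, 256, 2)
        fused = 128 + 128 + 256
        self.box = nn.Conv2d(fused, anchors * 7, 1)           # 7 box parameters
        self.cls = nn.Conv2d(fused, anchors * num_classes, 1)
        self.dir = nn.Conv2d(fused, anchors * 2, 1)           # direction bins

    def forward(self, x):                     # x: BEV feature map (B, c_in, H, W)
        f1 = self.block1(x)
        f2 = self.block2(f1)
        f3 = self.block3(f2)
        size = f1.shape[-2:]                  # upsample all maps to a fixed size
        f2 = F.interpolate(f2, size=size, mode='bilinear', align_corners=False)
        f3 = F.interpolate(f3, size=size, mode='bilinear', align_corners=False)
        fused = torch.cat([f1, f2, f3], dim=1)
        return self.box(fused), self.cls(fused), self.dir(fused)
```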

Claims (1)

1. A three-dimensional target detection method based on a graph convolution attention network is characterized by comprising the following steps of:
step one: voxelized partitioning and clustering of the point cloud
Structuring and downsampling original point cloud data in a voxelized mode, discarding outliers outside a specified range, dividing the point cloud into grids, randomly downsampling in each voxel grid, numbering each grid, and storing; using a method of random downsampling, so that the number of points in each voxel is not more than T; finally numbering the processed voxel structure, and storing the voxel structure in a hash table mode, so that voxels with empty internal points are eliminated;
step two: point cloud feature extraction in voxels
After voxelization of the original point cloud, extracting features of each voxel by using a graph attention network module in order to obtain voxel-level features;
the point cloud is a natural graph structure, and in conventional point-cloud feature extraction each point is considered independently and the point-to-point relationships are neglected; define $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ as a graph comprising a set of $n$ points $\mathcal{V} = \{x_1, \dots, x_n\}$ and the edge set $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ between the points; the point cloud has three dimensions to represent its real-world coordinates, and after the multi-layer perceptron operation of the edge convolution, a symmetric operation $H$ is used to extract the edge-level features into the corresponding point-level features; the final voxel-level features are obtained by performing another symmetric operation on the point-level features $X = \{x'_1, \dots, x'_n\}$;
step three: middle layer sparse convolution
Using three-dimensional sparse convolution as the intermediate convolution layers; let ConvMD(c_in, c_out, k, s, p) be a convolution operator, where c_in and c_out are the numbers of input and output channels and k, s, p correspond to the kernel size, stride, and padding, respectively; each convolution operation includes a 3D convolution, a BatchNorm layer and a ReLU layer; after the sparse map is converted into a dense map, a high-level feature map is obtained, and an attention module is added;
using an SE attention module to generate the attention feature map: first, let the dense feature map input be $U \in \mathbb{R}^{H \times W \times C}$, wherein $H$ is the height of the feature map, $W$ is the width of the feature map, and $C$ is the number of channels; each channel is then extracted using an avg-pooling operation to obtain an extracted feature, thus obtaining the statistically derived channel weights $z \in \mathbb{R}^{C}$; a multi-layer perceptron is then used to obtain higher-level features for each dimension, the final attention map being $s_c = F_e(z_c, W)$, where $F_e$ is an extraction function;
$$\tilde{x}_c = F_{scale}(u_c, s_c) = s_c \cdot u_c;$$
after the scaling function $F_{scale}$, the attention feature map is added to the original map to obtain the final output comprehensive feature map $\tilde{U} = U + \tilde{X}$;
Step four: regional advice network
Using an end-to-end, SSD-like form as the region proposal architecture, wherein the input of a region proposal layer is the feature map extracted by the intermediate layers, and each region proposal layer comprises a convolution layer, a BatchNorm layer and a ReLU layer; after each individual RPN layer, upsampling the feature maps to the same fixed size and concatenating the maps together; finally, three 1×1 convolutions are used to generate the predicted values for the bounding boxes, classes and directions.
CN201910918980.6A 2019-09-26 2019-09-26 A 3D Object Detection Method Based on Graph Convolutional Attention Network Active CN110674829B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910918980.6A CN110674829B (en) 2019-09-26 2019-09-26 A 3D Object Detection Method Based on Graph Convolutional Attention Network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910918980.6A CN110674829B (en) 2019-09-26 2019-09-26 A 3D Object Detection Method Based on Graph Convolutional Attention Network

Publications (2)

Publication Number Publication Date
CN110674829A CN110674829A (en) 2020-01-10
CN110674829B true CN110674829B (en) 2023-06-02

Family

ID=69079355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910918980.6A Active CN110674829B (en) 2019-09-26 2019-09-26 A 3D Object Detection Method Based on Graph Convolutional Attention Network

Country Status (1)

Country Link
CN (1) CN110674829B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340189B (en) * 2020-02-21 2023-11-24 之江实验室 An implementation method of spatial pyramid graph convolution network
CN111401190A (en) * 2020-03-10 2020-07-10 上海眼控科技股份有限公司 Vehicle detection method, device, computer equipment and storage medium
CN111583263B (en) * 2020-04-30 2022-09-23 北京工业大学 A point cloud segmentation method based on joint dynamic graph convolution
CN111476843B (en) * 2020-05-08 2023-03-24 中国科学院合肥物质科学研究院 Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network
CN111539949B (en) * 2020-05-12 2022-05-13 河北工业大学 Point cloud data-based lithium battery pole piece surface defect detection method
EP4145338A4 (en) 2020-05-13 2023-06-21 Huawei Technologies Co., Ltd. Target detection method and apparatus
CN113766228B (en) * 2020-06-05 2023-01-13 Oppo广东移动通信有限公司 Point cloud compression method, encoder, decoder, and storage medium
CN113971694A (en) * 2020-07-22 2022-01-25 商汤集团有限公司 Method and device for processing point cloud data
CN113971712A (en) * 2020-07-22 2022-01-25 上海商汤临港智能科技有限公司 A method, device, electronic device and storage medium for processing point cloud data
CN112270289A (en) * 2020-07-31 2021-01-26 广西科学院 An intelligent monitoring method based on graph convolutional attention network
CN112184867B (en) * 2020-09-23 2024-09-17 中国第一汽车股份有限公司 Point cloud feature extraction method, device, equipment and storage medium
CN112115954B (en) * 2020-09-30 2022-03-29 广州云从人工智能技术有限公司 Feature extraction method and device, machine readable medium and equipment
CN112257852B (en) * 2020-11-04 2023-05-19 清华大学深圳国际研究生院 Method for classifying and dividing point cloud
CN112633376A (en) * 2020-12-24 2021-04-09 南京信息工程大学 Point cloud data ground feature classification method and system based on deep learning and storage medium
CN112446385B (en) * 2021-01-29 2021-04-30 清华大学 A scene semantic segmentation method, device and electronic device
CN112862719B (en) * 2021-02-23 2022-02-22 清华大学 Laser radar point cloud cell feature enhancement method based on graph convolution
CN113900119B (en) * 2021-09-29 2024-01-30 苏州浪潮智能科技有限公司 Method, system, storage medium and equipment for laser radar vehicle detection
CN114266992B (en) * 2021-12-13 2024-10-15 北京超星未来科技有限公司 Target detection method and device and electronic equipment
CN115273645B (en) * 2022-08-09 2024-04-09 南京大学 Map making method for automatically clustering indoor surface elements

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573731A (en) * 2015-02-06 2015-04-29 厦门大学 Rapid target detection method based on convolutional neural network
CN109685813A (en) * 2018-12-27 2019-04-26 江西理工大学 A kind of U-shaped Segmentation Method of Retinal Blood Vessels of adaptive scale information

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108520275A (en) * 2017-06-28 2018-09-11 浙江大学 A connection information regularization system, graph feature extraction system, graph classification system and method based on adjacency matrix
US11556777B2 (en) * 2017-11-15 2023-01-17 Uatc, Llc Continuous convolution and fusion in neural networks
CN109255791A (en) * 2018-07-19 2019-01-22 杭州电子科技大学 A kind of shape collaboration dividing method based on figure convolutional neural networks
CN109934826B (en) * 2019-02-28 2023-05-12 东南大学 An Image Feature Segmentation Method Based on Graph Convolutional Network
CN110222653B (en) * 2019-06-11 2020-06-16 中国矿业大学(北京) A Behavior Recognition Method of Skeleton Data Based on Graph Convolutional Neural Network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573731A (en) * 2015-02-06 2015-04-29 厦门大学 Rapid target detection method based on convolutional neural network
CN109685813A (en) * 2018-12-27 2019-04-26 江西理工大学 A kind of U-shaped Segmentation Method of Retinal Blood Vessels of adaptive scale information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Can Chen et al. "GAPNet: Graph Attention based Point Neural Network for Exploiting Local Feature of Point Cloud." arXiv, 2019, pp. 1-11. *
Zongji Wang et al. "VoxSegNet: Volumetric CNNs for Semantic Part Segmentation of 3D Shapes." IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 9, 2019, pp. 2919-2930. *

Also Published As

Publication number Publication date
CN110674829A (en) 2020-01-10

Similar Documents

Publication Publication Date Title
CN110674829B (en) A 3D Object Detection Method Based on Graph Convolutional Attention Network
CN110458939B (en) Indoor scene modeling method based on visual angle generation
CN111160214B (en) 3D target detection method based on data fusion
CN110543858A (en) 3D Object Detection Method Based on Multimodal Adaptive Fusion
WO2021174904A1 (en) Image processing method, path planning method, apparatus, device, and storage medium
CN106713923A (en) Compression of a three-dimensional modeled object
CN111898172A (en) Experiential Learning in the Virtual World
CN113034581B (en) Relative pose estimation method of space targets based on deep learning
CN111898173A (en) Experiential Learning in the Virtual World
CN110659664B (en) A method for recognizing small objects with high precision based on SSD
CN110910452B (en) A Pose Estimation Method for Low-Texture Industrial Parts Based on Deep Learning
CN110827295A (en) 3D Semantic Segmentation Method Based on Coupling of Voxel Model and Color Information
EP3872761A2 (en) Analysing objects in a set of frames
CN110909615B (en) Target detection method based on multi-scale input mixed perception neural network
CN112750201A (en) Three-dimensional reconstruction method and related device and equipment
CN115115805A (en) Three-dimensional reconstruction model training method, device, equipment and storage medium
CN116563488A (en) A 3D Object Detection Method Based on Point Cloud Columnarization
CN113160382B (en) Single-view vehicle reconstruction method and device based on implicit template mapping
CN104796624B (en) A kind of light field editor transmission method
CN116310368A (en) A LiDAR 3D target detection method
CN116363329B (en) Three-dimensional image generation method and system based on CGAN and LeNet-5
CN116433904A (en) Cross-modal RGB-D semantic segmentation method based on shape perception and pixel convolution
TWI728791B (en) Image semantic segmentation method, device and storage medium thereof
CN114820344A (en) Depth map enhancement method and device
US20230177722A1 (en) Apparatus and method with object posture estimating

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant