CN118505909A - Sketch-assisted incomplete point cloud completion method and system - Google Patents
Sketch-assisted incomplete point cloud completion method and system
- Publication number: CN118505909A
- Application number: CN202410957907.0A
- Authority: CN (China)
- Prior art keywords: point cloud, sketch, features, point, incomplete
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06N3/045—Combinations of networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/09—Supervised learning
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T9/002—Image coding using neural networks
- G06T2207/10028—Range image; Depth image; 3D point clouds
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
Abstract
The invention discloses a sketch-assisted method and system for completing incomplete point clouds. The method uses an auxiliary sketch as guidance, combines scanned point cloud data with an interactively drawn sketch, effectively fuses the information of the two modalities (point cloud and sketch) in a local latent space, and outputs three-dimensional point cloud data with more complete geometric information. The invention adopts a weakly supervised setting: the auxiliary sketch provides a supervision signal for the training process by using a differentiable renderer on the complete point cloud to measure fidelity in image space. By fusing the sketch information with the information of the incomplete point cloud, multi-modal information fusion is realized, and a complete point cloud that is more reliable and better matches the user's intention can be generated.
Description
Technical Field
The invention relates to the technical fields of computer vision and three-dimensional point cloud model completion, in particular to a method and a system for sketch-assisted incomplete point cloud completion.
Background
3D data is used in many different fields, including autonomous driving, robotics, and others. Point clouds have a very uniform structure that avoids the irregularity and complexity of mesh composition. In practical applications, however, the collected point cloud data are often incomplete due to occlusion between objects, differences in the reflectivity of target surface materials, and limitations in the resolution and viewing angle of vision sensors. The resulting loss of geometric and semantic information harms subsequent 3D tasks, so completing the full point cloud from incomplete data and restoring the original shape is of great significance for downstream tasks.
Over the years, researchers have tried many deep-learning approaches to this problem. Early attempts at point cloud completion tried to migrate mature methods from 2D completion tasks to 3D point clouds through voxelization and three-dimensional convolution, at high computational cost. Since PointNet and PointNet++ appeared, directly processing three-dimensional coordinates has become the mainstream of point-cloud-based 3D analysis, and encoder-decoder architectures have gradually become the standard way to complete incomplete point clouds.
However, most existing point cloud completion methods rely on single-modality information and infer the missing part directly from a shape prior. Because the information in a single-modality incomplete point cloud is limited, completion carries large uncertainty, and the inherent sparsity of point cloud data makes it hard to distinguish intentionally empty regions of the model from missing ones. Humans, by contrast, are very good at understanding two-dimensional and three-dimensional models and can judge the incomplete part of a point cloud from visual concepts; a sketch is a convenient, quick, and easily acquired medium for expressing interactive intent, and a sketch drawn by the user can supply the missing information well. We therefore designed a sketch-assisted point cloud completion method that accepts multi-modal input.
Disclosure of Invention
The invention aims to provide a sketch-assisted incomplete point cloud completion method that addresses the defects of the prior art, namely the limitations of current single-modality point cloud models that infer the missing part directly from a shape prior. The user expresses the completion intent with a sketch as the medium, the network acquires key information about the missing point cloud from the sketch, and completion of the missing point cloud is realized through an effective cross-modal, cross-layer fusion framework.
The aim of the invention is realized by the following technical scheme: a sketch-assisted incomplete point cloud completion method comprises the following steps:
(1) Based on the existing incomplete point cloud, obtaining sketch input from the user, which interactively supplements the outline information of the incomplete point cloud;
(2) Inputting the incomplete point cloud and the sketch into encoders of their respective modalities, and extracting the encoded features of the two modalities, namely sketch features and point cloud features;
(3) Fusing the encoded features of the two modalities obtained in step (2);
(4) Learning and reconstructing a complete point cloud: decoding the fused features with a decoder that simultaneously maintains global and local features to complete the point cloud.
Further, step (1) specifically consists of obtaining the sketch input of a user, where the sketch can be collected through the user's hand drawing; the view of the current incomplete point cloud model is displayed in real time, and the user directly sketches the outline of the missing part to be completed on the view.
Further, in step (2), according to their different data forms, the incomplete point cloud is input into a DGCNN encoder and the hand-drawn sketch is input into a ResNet encoder for feature extraction.
Specifically, the feature extraction uses two modality-specific feature extractors, one capturing local features of the sketch, summarized over $N_s$ pixels, and one capturing local features of the point cloud, summarized over $N_x$ points; ResNet is used as the sketch encoder, which gives the network a higher convergence speed and ensures the feature extraction. The partial point cloud input is denoted $X \in \mathbb{R}^{N_x \times 3}$, the sketch input is denoted $S$, and the complete point cloud is denoted $Y$; the point cloud completion task is to predict a complete point cloud given the incomplete point cloud and the sketch. The point cloud encoder extracts features from the partial shape $X$ while maintaining their locality; it adopts the DGCNN framework, which consists of a series of graph convolution layers interleaved with pooling operations that reduce the cardinality of the point cloud:

$$x_i' = \mathop{\square}\limits_{j \in \mathcal{N}(i)} h_\Theta(x_i, x_j)$$

where $x_i'$ is the encoded feature of point $x_i$: a local graph is constructed from each point and its surrounding neighbors, a convolution is applied to every edge of the graph, and the feature of the center point is obtained by weighted averaging; $\square$ denotes a channel-wise symmetric aggregation operation, and $h_\Theta$ is a nonlinear learnable function whose result is taken as the feature of the center point $x_i$. The pooling operation enlarges the receptive field so that more global information is contained, while reducing the complexity of the cross-attention operation in the subsequent fusion of the two modalities; the sketch is encoded as $H_S$ and the incomplete point cloud is encoded as $H_X$.
Further, step (3) specifically consists of fusing the sketch features and the point cloud features produced by the encoders through a cross-attention mechanism and a self-attention mechanism.
Further, the fusion through a cross-attention mechanism and a self-attention mechanism is specifically as follows: local information collected from the two modalities is fused through an attention mechanism, which is well suited to finding the correspondence between features of point cloud regions and sketch regions; the attention layers of the framework use the multi-head attention mechanism of the Transformer;
(6.1) In the cross-attention mechanism, the incomplete point cloud features are projected to form a query vector, and the sketch features are projected to form a key vector and a value vector; with these three vectors, the attention mechanism fuses the incomplete point cloud features with the sketch features extracted from the associated regions by the feature extractor, realizing feature fusion between the inputs of different modalities;
The query vector of the point cloud is obtained as the product of the point cloud features and a weight matrix, and the key and value vectors of the sketch are obtained as the products of the sketch features and weight matrices; softmax normalization is then applied on the basis of these three vectors, and the features are fused;
(6.2) A self-attention layer is added after the cross-attention fusion in the framework, realizing a permutation-invariant transformation of the features with a global receptive field so as to correct data that were not correctly captured in the sketch; the principle of the self-attention layer is the same as in (6.1), except that the input features differ: the self-attention layer computes the query, key, and value vectors from the same mixed features;
(6.3) The framework completes the fusion of the overall features by combining cross-attention layers and self-attention layers, realizing feature fusion of the two modalities' data; at the end of the whole fusion module, a special cross-attention layer is used that combines information from the end and the beginning of the fusion module, so that high-level features participate across layers in the fusion of low-level features.
Specifically, feature fusion between different modal inputs is realized in the step (6.1); the characteristic fusion expression is as follows:
;
;
wherein H X and H S are respectively the coding feature vectors of the point cloud, W is a weight matrix, Respectively a query vector, a key vector and a value vector; is the transpose of the key vector and, In order to query the dimensions of the vector,Encoding a query vector of data for the point cloud; Encoding key vectors of the data for the point cloud; Encoding a value vector of data for the point cloud; And The weight matrix of the query vector, the key vector and the value vector, respectively.
Specifically, the decoder estimates the positions of the points to be completed and fuses them with the points acquired by farthest point sampling, so that the framework focuses on the missing part of the point cloud; farthest point sampling samples the shape uniformly, its initial point is chosen at random, which ensures different sampling results each time, and the distance used by farthest point sampling is the Euclidean distance, which measures the absolute distance between two points in a multidimensional space;
The feature domain is upsampled by the decoder, and feature fusion is performed so that higher-level features can be fused; the specific operation is realized by an attention-based mechanism. Given the features $H$ provided by the encoder, each of the $K_n$ branches computes:

$$H_n = \mathrm{MLP}_n(H)$$

$$Y_n = \mathrm{softmax}\!\left(\frac{H_n H_n^{\top}}{\sqrt{d}}\right) H_n W_P$$

where $\mathrm{MLP}_n$ is a multi-layer perceptron with different weights for each branch, which projects the features into the $n$-th of the $K_n$ subspaces and generates the self-attention weights for the resampling process, and $W_P \in \mathbb{R}^{d \times 3}$ is a projection matrix into three-dimensional space; finally, the outputs of all decoder branches are concatenated with the farthest-point-sampled part to generate a complete point cloud:

$$\hat{Y} = \mathrm{concat}\big(Y_1, \ldots, Y_{K_n}, \mathrm{FPS}(X)\big)$$

where $\hat{Y}$ is the predicted complete point cloud;
Farthest point sampling is adopted, and the FPS-sampled points are fused with the points estimated by the decoder, which preserves the fidelity of the existing partial point cloud and lets the framework attend only to the missing part of the point cloud; the whole system completes flexibly by adjusting the mixing ratio of sampled points to estimated points as required; a loss function based on the L1 chamfer distance between the generated shape and the ground-truth shape, $\mathcal{L}_{CD}$, is used for supervised training:

$$\mathcal{L}_{CD}(\hat{Y}, Y) = \frac{1}{|\hat{Y}|}\sum_{y \in \hat{Y}} \min_{y' \in Y} \lVert y - y' \rVert + \frac{1}{|Y|}\sum_{y' \in Y} \min_{y \in \hat{Y}} \lVert y - y' \rVert$$

where the first summation is the sum of the minimum distances from each point $y$ in $\hat{Y}$ to $Y$, and the second summation is the sum of the minimum distances from each point $y'$ in $Y$ to $\hat{Y}$; a large distance means a large difference between the two point clouds, and the distance is inversely related to the completion quality; the actually input sketch contains supplementary information about the point cloud, so completion of the missing point cloud can be accomplished.
Specifically, the distance used by farthest point sampling is the Euclidean distance:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$

The Euclidean distance is a basic distance measure that quantifies the absolute distance between two points in a multidimensional space; it computes the distance between points $p$ and $q$, where $p_i$ is the $i$-th coordinate of $p$ and $q_i$ is the $i$-th coordinate of $q$.
The invention also provides a sketch-assisted incomplete point cloud completion system, which comprises the following modules:
Information acquisition module: based on the existing incomplete point cloud, obtains the sketch input of the user, which interactively supplements the outline information of the incomplete point cloud;
Feature acquisition module: inputs the incomplete point cloud and the sketch into encoders of their respective modalities and extracts the encoded features of the two modalities, namely sketch features and point cloud features;
Feature fusion module: fuses the encoded features of the two modalities obtained by the feature acquisition module;
Point cloud completion module: learns and reconstructs a complete point cloud, decoding the fused features with a decoder that simultaneously maintains global and local features to complete the point cloud.
The beneficial effects of the invention are as follows:
Existing single-modality incomplete point cloud completion methods can only complete by means of a shape prior and cannot realize targeted, accurate completion according to the user's intent. By contrast, the invention obtains global structure information by analyzing the user's sketch and realizes multi-modal information fusion between the sketch information and the information of the incomplete point cloud, thereby generating a more reliable complete point cloud that better matches the user's intention.
Drawings
FIG. 1 is a flow chart of a point cloud completion method provided by the invention;
FIG. 2 is a network block diagram of the point cloud completion method provided by the invention;
FIG. 3 is a block diagram of a point cloud completion method provided by the invention;
FIG. 4 is a cross-attention schematic of the present invention;
FIG. 5 is a point cloud completion result of the invention for the airplane category;
FIG. 6 is a point cloud completion result of the invention for the car category.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the invention, are not intended to limit the invention.
The invention provides a sketch-assisted incomplete point cloud completion method.
As shown in FIG. 1, the invention comprises the following steps:
(1) Based on the existing incomplete point cloud, obtaining sketch input from the user, which interactively supplements the outline information of the incomplete point cloud;
(2) Inputting the incomplete point cloud and the sketch into encoders of their respective modalities, and extracting the encoded features of the two modalities, namely sketch features and point cloud features;
(3) Fusing the encoded features of the two modalities obtained in step (2);
(4) Learning and reconstructing a complete point cloud: decoding the fused features with a decoder that simultaneously maintains global and local features to complete the point cloud.
S1: as shown in FIG. 2, the sketch input of the user is acquired through hand drawing; the view of the current incomplete point cloud model is displayed in real time, and the user directly draws the sketch outline of the missing part to be completed on the view.
S2: as shown in FIG. 3, according to their different data forms, the incomplete point cloud is input into the encoder of the DGCNN network and the hand-drawn sketch is input into the encoder of the ResNet network for feature extraction.
S21: two modality-specific feature extractors are used, one capturing local features of the sketch, summarized as a small number of pixels, and the other capturing local features of the point cloud, summarized as a small number of points; ResNet is used as the sketch encoder, which gives the network a higher convergence speed and ensures the feature extraction effect.
S22: the partial point cloud input is denoted $X \in \mathbb{R}^{N_x \times 3}$, the sketch input is denoted $S$, and the complete point cloud is denoted $Y$; the point cloud completion task is to predict a complete point cloud given the incomplete point cloud and the sketch. The point cloud encoder extracts features from the partial shape $X$ while maintaining a certain degree of locality; it adopts the DGCNN framework, which consists of a series of graph convolution layers interleaved with pooling operations that reduce the cardinality of the point cloud:

$$x_i' = \mathop{\square}\limits_{j \in \mathcal{N}(i)} h_\Theta(x_i, x_j) \tag{1}$$

where $x_i'$ is the encoded feature of point $x_i$, obtained by constructing a local graph from each point and its surrounding neighbors, applying a convolution to every edge of the graph, and computing the feature of the center point by weighted averaging; $\square$ denotes a channel-wise symmetric aggregation operation, and $h_\Theta$ is a nonlinear learnable function.

S23: the result is taken as the feature of the center point $x_i$. The pooling operation enlarges the receptive field to contain more global information while reducing the complexity of the cross-attention operation in the subsequent fusion of the two modalities; the sketch is encoded as $H_S$ and the incomplete point cloud is encoded as $H_X$.
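To make the encoder concrete, the following is a minimal sketch of one edge-convolution layer in the sense of equation (1), written in PyTorch. It assumes the standard DGCNN choices (edge features built from the center point and the offset to each of its k nearest neighbors, with max as the symmetric aggregation); the layer width and neighborhood size k are illustrative, not values given by the patent.

```python
import torch

def knn(x, k):
    # x: (B, N, 3) point coordinates; returns indices of the k nearest neighbors
    dist = torch.cdist(x, x)                                  # (B, N, N) pairwise distances
    return dist.topk(k + 1, largest=False).indices[..., 1:]   # drop the self-match

class EdgeConv(torch.nn.Module):
    """One DGCNN-style graph convolution: h_Theta(x_i, x_j) then max-aggregation."""
    def __init__(self, in_dim, out_dim, k=16):
        super().__init__()
        self.k = k
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(2 * in_dim, out_dim), torch.nn.ReLU())

    def forward(self, feats, coords):
        # feats: (B, N, C) per-point features; coords: (B, N, 3) used to build the graph
        idx = knn(coords, self.k)                              # (B, N, k)
        B, N, C = feats.shape
        nbrs = torch.gather(
            feats.unsqueeze(1).expand(B, N, N, C), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, C))         # (B, N, k, C) neighbor feats
        center = feats.unsqueeze(2).expand_as(nbrs)
        edge = torch.cat([center, nbrs - center], dim=-1)      # edge feature (x_i, x_j - x_i)
        return self.mlp(edge).max(dim=2).values                # symmetric max-aggregation
```

Stacking a few such layers with interleaved pooling, as the patent describes, progressively reduces the point cardinality while enlarging the receptive field.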
S3: as shown in FIG. 4, the acquired features are fused, that is, the sketch features and the point cloud features produced by the encoders are fused through a cross-attention mechanism and a self-attention mechanism.
The attention-based fusion module collects local information from the two modalities and fuses it; the attention mechanism is well suited to finding the correspondence between features of point cloud regions and sketch regions, so it is used in the feature fusion module, and the attention layers of the framework use the multi-head attention mechanism of the Transformer.
S31: in the cross-attention mechanism, the incomplete point cloud features are projected to form a query vector, and the sketch features are projected to form a key vector and a value vector; with these three vectors, the attention mechanism fuses the incomplete point cloud features with the sketch features extracted from the associated regions by the feature extractor, realizing feature fusion between the inputs of different modalities:
$$Q = H_X W_Q,\quad K = H_S W_K,\quad V = H_S W_V \tag{2}$$

$$\mathrm{CrossAttention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V \tag{3}$$

where $H_X$ and $H_S$ are the encoded feature vectors of the point cloud and the sketch respectively, $W$ denotes a weight matrix, and $Q$, $K$, $V$ are the query, key, and value vectors respectively; $K^{\top}$ is the transpose of the key vector, $d_k$ is the dimension of the query vector, $Q$ is the query vector of the encoded point cloud data, $K$ is the key vector of the encoded sketch data, $V$ is the value vector of the encoded sketch data, and $W_Q$, $W_K$, $W_V$ are the weight matrices of the query, key, and value vectors respectively.
S32: the query vector of the point cloud is obtained as the product of the point cloud features and a weight matrix, and the key and value vectors of the sketch are obtained as the products of the sketch features and weight matrices; softmax normalization is then applied on the basis of these three vectors, and the features are fused.
S33: a self-attention layer is added after the cross-attention fusion in the framework, realizing a permutation-invariant transformation of the features with a global receptive field so as to correct data that were not correctly captured in the sketch; the principle of the self-attention layer is the same as in equations (2) and (3), except that the input features differ: the self-attention layer computes the query, key, and value vectors from the same mixed features.
S34: the framework completes the fusion of the overall features by combining cross-attention layers and self-attention layers, realizing feature fusion of the two modalities' data; at the end of the whole sequence, a special cross-attention layer is used that combines information from the end and the beginning of the sequence, so that high-level features can participate across layers in the fusion of low-level features, giving better flexibility in choosing the required level of abstraction.
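As an illustration of S31 through S34, the sketch below composes one cross-attention layer (point cloud queries against sketch keys and values, as in equations (2) and (3)) with one self-attention layer over the mixed features, using PyTorch's built-in multi-head attention. The model width, head count, and residual/LayerNorm wiring are assumptions rather than details taken from the patent.

```python
import torch

class CrossSelfFusion(torch.nn.Module):
    """One fusion stage: cross-attention (point cloud -> sketch), then self-attention."""
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.cross = torch.nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.self_attn = torch.nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = torch.nn.LayerNorm(d_model)
        self.norm2 = torch.nn.LayerNorm(d_model)

    def forward(self, h_x, h_s):
        # h_x: (B, Nx, d) encoded point cloud; h_s: (B, Ns, d) encoded sketch
        # Cross-attention: queries from the point cloud, keys/values from the sketch.
        mixed, _ = self.cross(query=h_x, key=h_s, value=h_s)
        mixed = self.norm1(h_x + mixed)
        # Self-attention over the mixed features (the same tensor serves as Q, K and V).
        refined, _ = self.self_attn(mixed, mixed, mixed)
        return self.norm2(mixed + refined)
```

Several such stages can be chained, with the final cross-attention layer fed from both the last mixed features and the original encoder outputs, matching the end-to-beginning combination described in S34.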
S4: decoding the features. The joint feature embedding is decoded to learn and reconstruct a complete point cloud, using a decoder that simultaneously maintains global and local features.
S41: the decoder estimates the positions of some points, which are concatenated with a sampled version of the input partial point cloud obtained by farthest point sampling, so that only the missing part of the point cloud is estimated. Farthest point sampling guarantees that the sample is drawn uniformly over the shape; its initial point is chosen at random, which ensures different sampling results each time, and the distance used is the Euclidean distance, which measures the absolute distance between two points in a multidimensional space:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2} \tag{4}$$

The Euclidean distance is a basic distance measure that quantifies the absolute distance between two points in a multidimensional space; it computes the distance between points $p$ and $q$, where $p_i$ is the $i$-th coordinate of $p$ and $q_i$ is the $i$-th coordinate of $q$.
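A plain-PyTorch farthest point sampling routine matching S41, with a random initial point and the Euclidean distance of equation (4), is sketched below; it favors clarity over speed (production code usually batches this or uses a CUDA kernel), and the sample size in the usage comment is illustrative.

```python
import torch

def farthest_point_sampling(points, m):
    """points: (N, 3) tensor; returns indices of m points spread over the shape."""
    n = points.shape[0]
    chosen = torch.empty(m, dtype=torch.long)
    chosen[0] = torch.randint(n, (1,)).item()     # random initial point, as in S41
    dist = torch.full((n,), float("inf"))
    for i in range(1, m):
        # Euclidean distance from every point to the most recently chosen point
        d = torch.norm(points - points[chosen[i - 1]], dim=1)
        dist = torch.minimum(dist, d)             # distance to the chosen set so far
        chosen[i] = torch.argmax(dist)            # pick the farthest remaining point
    return chosen

# Usage sketch: keep, say, 1024 trusted input points alongside the decoder's estimates.
# sampled = points[farthest_point_sampling(points, 1024)]
```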
S42: the feature domain is upsampled by the decoder; the latent space in which feature fusion is performed is more local, which reduces complexity and allows higher-level features to be fused. The specific operation is realized by an attention-based mechanism: given the features $H$ provided by the encoder, each of the $K_n$ branches computes

$$H_n = \mathrm{MLP}_n(H) \tag{5}$$

$$Y_n = \mathrm{softmax}\!\left(\frac{H_n H_n^{\top}}{\sqrt{d}}\right) H_n W_P \tag{6}$$

where $\mathrm{MLP}_n$ is a multi-layer perceptron with different weights for each branch, which projects the features into the $n$-th of the $K_n$ subspaces and generates the self-attention weights for the resampling process, and $W_P \in \mathbb{R}^{d \times 3}$ is a projection matrix into three-dimensional space; finally, the outputs of all decoder branches are concatenated with the farthest-point-sampled part to generate a complete point cloud.
S43: finally, the outputs of all decoder branches are concatenated with the farthest-point-sampled part to generate the complete point cloud; the completion results are shown in FIG. 5 and FIG. 6:

$$\hat{Y} = \mathrm{concat}\big(Y_1, \ldots, Y_{K_n}, \mathrm{FPS}(X)\big) \tag{7}$$

where $\hat{Y}$ is the predicted complete point cloud.
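Since the exact form of equations (5) to (7) is only partly recoverable from the text, the following sketch is one plausible reading: a per-branch MLP projects the fused features, self-attention weights resample them, a shared matrix W_P projects to 3-D, and the branch outputs are concatenated with the FPS points. All layer sizes and the branch count are assumptions.

```python
import torch

class BranchDecoder(torch.nn.Module):
    """K branches, each projecting fused features and resampling them into 3-D points."""
    def __init__(self, d_model=256, k_branches=4):
        super().__init__()
        self.branches = torch.nn.ModuleList(
            torch.nn.Sequential(torch.nn.Linear(d_model, d_model), torch.nn.ReLU(),
                                torch.nn.Linear(d_model, d_model))
            for _ in range(k_branches))
        self.w_p = torch.nn.Linear(d_model, 3, bias=False)    # projection into 3-D space

    def forward(self, h, fps_points):
        # h: (B, N, d) fused features; fps_points: (B, M, 3) trusted input samples
        outs = []
        for mlp in self.branches:
            h_n = mlp(h)                                      # eq. (5): subspace projection
            attn = torch.softmax(
                h_n @ h_n.transpose(1, 2) / h_n.shape[-1] ** 0.5, dim=-1)
            outs.append(self.w_p(attn @ h_n))                 # eq. (6): resample, map to 3-D
        return torch.cat(outs + [fps_points], dim=1)          # eq. (7): concat with FPS part
```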
FPS denotes farthest point sampling; concatenating the FPS-sampled points with the points estimated by the decoder preserves the fidelity of the existing partial point cloud and realizes the scheme of estimating only the missing part. The mixing ratio of sampled points to estimated points can be adjusted as required, which improves the flexibility of the whole system. A loss function based on the L1 chamfer distance between the generated shape and the ground-truth shape, $\mathcal{L}_{CD}$, is used for supervised training:

$$\mathcal{L}_{CD}(\hat{Y}, Y) = \frac{1}{|\hat{Y}|}\sum_{y \in \hat{Y}} \min_{y' \in Y} \lVert y - y' \rVert + \frac{1}{|Y|}\sum_{y' \in Y} \min_{y \in \hat{Y}} \lVert y - y' \rVert \tag{8}$$

where the first summation is the sum of the minimum distances from each point $y$ in $\hat{Y}$ to $Y$, and the second summation is the sum of the minimum distances from each point $y'$ in $Y$ to $\hat{Y}$; a larger distance means a larger difference between the two point clouds, and a smaller distance means a better completion. The multi-modal completion problem is addressed as weakly supervised learning: the sketch given as input actually contains supplementary information about the point cloud, so it can effectively assist the completion.
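Equation (8) can be evaluated directly from pairwise distances; below is a minimal PyTorch sketch of the L1 chamfer distance, adequate for small point clouds (large-scale training typically uses a fused CUDA implementation). The variable names in the usage comment are illustrative.

```python
import torch

def chamfer_l1(pred, gt):
    """L1 chamfer distance of eq. (8): pred (B, N, 3), gt (B, M, 3)."""
    d = torch.cdist(pred, gt)                     # (B, N, M) pairwise Euclidean distances
    return (d.min(dim=2).values.mean(dim=1).mean()    # pred -> gt term
            + d.min(dim=1).values.mean(dim=1).mean())  # gt -> pred term

# Usage sketch: loss between a predicted and a ground-truth cloud.
# loss = chamfer_l1(y_hat, y_gt); loss.backward()
```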
The invention also provides a sketch-assisted incomplete point cloud completion system, which comprises the following modules:
Information acquisition module: based on the existing incomplete point cloud, obtains the sketch input of the user, which interactively supplements the outline information of the incomplete point cloud;
Feature acquisition module: inputs the incomplete point cloud and the sketch into encoders of their respective modalities and extracts the encoded features of the two modalities, namely sketch features and point cloud features;
Feature fusion module: fuses the encoded features of the two modalities obtained by the feature acquisition module;
Point cloud completion module: learns and reconstructs a complete point cloud, decoding the fused features with a decoder that simultaneously maintains global and local features to complete the point cloud completion.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof.
Claims (10)
1. A sketch-assisted incomplete point cloud completion method, characterized by comprising the following steps:
(1) Based on the existing incomplete point cloud, obtaining sketch input from the user, which interactively supplements the outline information of the incomplete point cloud;
(2) Inputting the incomplete point cloud and the sketch into encoders of their respective modalities, and extracting the encoded features of the two modalities, namely sketch features and point cloud features;
(3) Fusing the encoded features of the two modalities obtained in step (2);
(4) Learning and reconstructing a complete point cloud: decoding the fused features with a decoder that simultaneously maintains global and local features to complete the point cloud.
2. The sketch-assisted incomplete point cloud completion method according to claim 1, characterized in that step (1) specifically consists of obtaining the sketch input of a user, where the sketch can be collected through the user's hand drawing; the view of the current incomplete point cloud model is displayed in real time, and the user directly sketches the outline of the missing part to be completed on the view.
3. The sketch-assisted incomplete point cloud completion method according to claim 1, characterized in that in step (2), according to their different data forms, the incomplete point cloud is input into a DGCNN encoder and the sketch is input into a ResNet encoder for feature extraction.
4. The sketch-assisted incomplete point cloud completion method according to claim 3, characterized in that the feature extraction uses two modality-specific feature extractors, one capturing local features of the sketch, summarized over $N_s$ pixels, and one capturing local features of the point cloud, summarized over $N_x$ points; ResNet is used as the sketch encoder, which gives the network a higher convergence speed and ensures the feature extraction; the partial point cloud input is denoted $X \in \mathbb{R}^{N_x \times 3}$, the sketch input is denoted $S$, and the complete point cloud is denoted $Y$; the point cloud completion task is to predict a complete point cloud given the incomplete point cloud and the sketch; the point cloud encoder extracts features from the partial shape $X$ while maintaining their locality, adopting the DGCNN framework, which consists of a series of graph convolution layers interleaved with pooling operations that reduce the cardinality of the point cloud:

$$x_i' = \mathop{\square}\limits_{j \in \mathcal{N}(i)} h_\Theta(x_i, x_j)$$

where $x_i'$ is the encoded feature of point $x_i$: a local graph is constructed from each point and its surrounding neighbors, a convolution is applied to every edge of the graph, and the feature of the center point is obtained by weighted averaging; $\square$ denotes a channel-wise symmetric aggregation operation, and $h_\Theta$ is a nonlinear learnable function whose result is taken as the feature of the center point $x_i$; the pooling operation enlarges the receptive field so that more global information is contained, while reducing the complexity of the cross-attention operation in the subsequent fusion of the two modalities; the sketch is encoded as $H_S$ and the incomplete point cloud is encoded as $H_X$.
5. The sketch-assisted incomplete point cloud completion method according to claim 1, characterized in that step (3) specifically consists of fusing the sketch features and the point cloud features produced by the encoders through a cross-attention mechanism and a self-attention mechanism.
6. The sketch-assisted incomplete point cloud completion method according to claim 5, characterized in that the fusion through a cross-attention mechanism and a self-attention mechanism is specifically as follows: local information collected from the two modalities is fused through an attention mechanism, which is well suited to finding the correspondence between features of point cloud regions and sketch regions; the attention layers of the framework use the multi-head attention mechanism of the Transformer;
(6.1) In the cross-attention mechanism, the incomplete point cloud features are projected to form a query vector, and the sketch features are projected to form a key vector and a value vector; with these three vectors, the attention mechanism fuses the incomplete point cloud features with the sketch features extracted from the associated regions by the feature extractor, realizing feature fusion between the inputs of different modalities;
The query vector of the point cloud is obtained as the product of the point cloud features and a weight matrix, and the key and value vectors of the sketch are obtained as the products of the sketch features and weight matrices; softmax normalization is then applied on the basis of these three vectors, and the features are fused;
(6.2) A self-attention layer is added after the cross-attention fusion in the framework, realizing a permutation-invariant transformation of the features with a global receptive field so as to correct data that were not correctly captured in the sketch; the principle of the self-attention layer is the same as in (6.1), except that the input features differ: the self-attention layer computes the query, key, and value vectors from the same mixed features;
(6.3) The framework completes the fusion of the overall features by combining cross-attention layers and self-attention layers, realizing feature fusion of the two modalities' data; at the end of the whole fusion module, a special cross-attention layer is used that combines information from the end and the beginning of the fusion module, so that high-level features participate across layers in the fusion of low-level features.
7. The sketch-assisted incomplete point cloud completion method according to claim 6, characterized in that feature fusion between the inputs of different modalities is realized in step (6.1); the feature fusion expressions are:

$$Q = H_X W_Q,\quad K = H_S W_K,\quad V = H_S W_V$$

$$\mathrm{CrossAttention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$$

where $H_X$ and $H_S$ are the encoded feature vectors of the point cloud and the sketch respectively, $W$ denotes a weight matrix, and $Q$, $K$, $V$ are the query, key, and value vectors respectively; $K^{\top}$ is the transpose of the key vector, $d_k$ is the dimension of the query vector, $Q$ is the query vector of the encoded point cloud data, $K$ is the key vector of the encoded sketch data, $V$ is the value vector of the encoded sketch data, and $W_Q$, $W_K$, $W_V$ are the weight matrices of the query, key, and value vectors respectively.
8. The sketch-assisted incomplete point cloud completion method according to claim 7, characterized in that the decoder estimates the positions of the points to be completed and fuses them with the points acquired by farthest point sampling, so that it focuses on the missing part of the point cloud; farthest point sampling samples the shape uniformly, its initial point is chosen at random, which ensures different sampling results each time, and the distance used by farthest point sampling is the Euclidean distance, which measures the absolute distance between two points in a multidimensional space;
The feature domain is upsampled by the decoder, and feature fusion is performed so that higher-level features can be fused; the specific operation is realized by an attention-based mechanism. Given the features $H$ provided by the encoder, each of the $K_n$ branches computes:

$$H_n = \mathrm{MLP}_n(H)$$

$$Y_n = \mathrm{softmax}\!\left(\frac{H_n H_n^{\top}}{\sqrt{d}}\right) H_n W_P$$

where $\mathrm{MLP}_n$ is a multi-layer perceptron with different weights for each branch, which projects the features into the $n$-th of the $K_n$ subspaces and generates the self-attention weights for the resampling process, and $W_P \in \mathbb{R}^{d \times 3}$ is a projection matrix into three-dimensional space; finally, the outputs of all decoder branches are concatenated with the farthest-point-sampled part to generate a complete point cloud:

$$\hat{Y} = \mathrm{concat}\big(Y_1, \ldots, Y_{K_n}, \mathrm{FPS}(X)\big)$$

where $\hat{Y}$ is the predicted complete point cloud;
Farthest point sampling is adopted, and the FPS-sampled points are fused with the points estimated by the decoder, which preserves the fidelity of the existing partial point cloud and lets the framework attend only to the missing part of the point cloud; the whole system completes flexibly by adjusting the mixing ratio of sampled points to estimated points as required; a loss function based on the L1 chamfer distance between the generated shape and the ground-truth shape, $\mathcal{L}_{CD}$, is used for supervised training:

$$\mathcal{L}_{CD}(\hat{Y}, Y) = \frac{1}{|\hat{Y}|}\sum_{y \in \hat{Y}} \min_{y' \in Y} \lVert y - y' \rVert + \frac{1}{|Y|}\sum_{y' \in Y} \min_{y \in \hat{Y}} \lVert y - y' \rVert$$

where the first summation is the sum of the minimum distances from each point $y$ in $\hat{Y}$ to $Y$, and the second summation is the sum of the minimum distances from each point $y'$ in $Y$ to $\hat{Y}$; a large distance means a large difference between the two point clouds, and the distance is inversely related to the completion quality; the actually input sketch contains supplementary information about the point cloud, so completion of the missing point cloud can be accomplished.
9. The sketch-assisted incomplete point cloud completion method according to claim 8, characterized in that the distance used by farthest point sampling is the Euclidean distance:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$

The Euclidean distance is a basic distance measure that quantifies the absolute distance between two points in a multidimensional space; it computes the distance between points $p$ and $q$, where $p_i$ is the $i$-th coordinate of $p$ and $q_i$ is the $i$-th coordinate of $q$.
10. A sketch-assisted incomplete point cloud completion system, characterized in that the system comprises the following modules:
Information acquisition module: based on the existing incomplete point cloud, obtains the sketch input of the user, which interactively supplements the outline information of the incomplete point cloud;
Feature acquisition module: inputs the incomplete point cloud and the sketch into encoders of their respective modalities and extracts the encoded features of the two modalities, namely sketch features and point cloud features;
Feature fusion module: fuses the encoded features of the two modalities obtained by the feature acquisition module;
Point cloud completion module: learns and reconstructs a complete point cloud, decoding the fused features with a decoder that simultaneously maintains global and local features to complete the point cloud.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410957907.0A | 2024-07-17 | 2024-07-17 | Sketch-assisted incomplete point cloud completion method and system |

Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN118505909A | 2024-08-16 |
| CN118505909B | 2024-10-11 |
Family

ID=92246876

Family Applications (1)

| Application Number | Status | Priority Date | Filing Date |
|---|---|---|---|
| CN202410957907.0A (CN118505909B) | Active | 2024-07-17 | 2024-07-17 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN118505909B (en) |
Citations (7)

Patent Citations:

| Publication number | Priority date | Publication date | Title |
|---|---|---|---|
| CN113160068A | 2021-02-23 | 2021-07-23 | Point cloud completion method and system based on image |
| CN113160327A | 2021-04-09 | 2021-07-23 | Method and system for realizing point cloud completion |
| WO2023241097A1 | 2022-06-16 | 2023-12-21 | Semantic instance reconstruction method and apparatus, device, and medium |
| CN115131245A | 2022-06-30 | 2022-09-30 | Point cloud completion method based on attention mechanism |
| CN115619685A | 2022-11-08 | 2023-01-17 | Transformer method for tracking structure for image restoration |
| CN116503825A | 2023-04-07 | 2023-07-28 | Semantic scene completion method based on fusion of image and point cloud in automatic driving scene |
| CN117274764A | 2023-11-22 | 2023-12-22 | Multi-mode feature fusion three-dimensional point cloud completion method |

Non-Patent Citations (3):
- LONG YANG: "Shape-controllable geometric completion of point cloud models" (点云模型的形状可控几何补全), The Visual Computer, 6 February 2016.
- 孙嘉徽: "Three-dimensional image reconstruction system based on virtual reality technology" (虚拟现实技术的三维图像重建系统), Modern Electronics Technique, no. 09, 1 May 2020.
- 贝子勒, 赵杰煜: "A point cloud repair model based on deep learning" (一种基于深度学习的点云修复模型), Wireless Communication Technology, no. 02, 15 June 2020.
Also Published As

| Publication Number | Publication Date |
|---|---|
| CN118505909B | 2024-10-11 |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |