CN110490917A - Three-dimensional reconstruction method and device - Google Patents
Three-dimensional reconstruction method and device
- Publication number
- CN110490917A CN110490917A CN201910741648.7A CN201910741648A CN110490917A CN 110490917 A CN110490917 A CN 110490917A CN 201910741648 A CN201910741648 A CN 201910741648A CN 110490917 A CN110490917 A CN 110490917A
- Authority
- CN
- China
- Prior art keywords
- target object
- interest
- area
- shape
- camera
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
This application discloses a three-dimensional reconstruction method and device. The method comprises: rendering the input image data to obtain original point cloud data; storing the point cloud data in an octree data structure; inputting the point cloud data into a region convolutional neural network to obtain feature data of a region of interest; pooling the region of interest; identifying the three-dimensional reconstruction loss and the coordinate transform loss of the target object in the region of interest; performing three-dimensional pose prediction and three-dimensional shape prediction on the target object to obtain its pose angle and three-dimensional shape; and combining the three-dimensional reconstruction loss, the coordinate transform loss, the pose angle and the three-dimensional shape of the target object to obtain the three-dimensional feature data of the target object, thereby realizing its three-dimensional reconstruction.
Description
Technical field
This application relates to the field of computer image processing, and in particular to a three-dimensional reconstruction method and device.
Background art
3D, which is rebuild, establishes corresponding computer representation to three dimensional object using suitable mathematical model, is in computer environment
The basis of lower processing, operation and analysis destination properties is the key technology using computer expression objective world virtual reality.Figure
The three-dimensional reconstruction of picture is a critically important scientific research field in CAD (CAD) and computer graphics, its reality
Now rely on the Rendering based on image.Rendering based on image can be in no any three-dimensional geometric information or a small amount of
Under the scene of geological information, the three-dimensional only can be drawn out by the original image for being directed to some three dimensional object or scene on a small quantity
The viewpoint figure of object or scene.
A three-dimensional reconstruction pipeline roughly comprises image acquisition, preprocessing, point cloud computation, point cloud registration, data fusion and surface generation. Preprocessing covers image enhancement tasks such as denoising, inpainting and recovering the depth of the subject. Point cloud computation usually finds matches with feature points such as SIFT and SURF, then repeatedly estimates the fundamental matrix between two views with the eight-point method and RANSAC (Random Sample Consensus), selects the best estimate, and computes the transformation between the world coordinate system and the image pixel coordinate system. Point cloud registration mostly uses fine registration, for example minimizing an error function by least squares; the ICP (Iterative Closest Point) algorithm yields an accurate registration result, and there are also the SAA (simulated annealing) and GA (genetic algorithm) approaches. Data fusion then fuses the point cloud data that is still scattered and unordered in space after registration. For example, the TSDF (truncated signed distance function) algorithm represents three-dimensional space with a grid of cubes, each cell storing its distance to the object surface; the sign of the TSDF value distinguishes the occluded side from the visible side, and points on the surface lie at the zero crossing. In addition, the SDF (Signed Distance Field) method represents the surface implicitly. The purpose of surface generation is to construct a visible iso-surface of the object; Lorensen proposed the classic voxel-level reconstruction algorithm MC (Marching Cubes), which generates a complete three-dimensional surface by merging the iso-surfaces of all cubes.
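To make the TSDF fusion step concrete, here is a minimal sketch in Python; the function name, grid layout, truncation value and camera convention are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def integrate_depth(tsdf, weight, depth, K, T_wc, origin, voxel, trunc=0.05):
    """Fuse one depth map into a TSDF voxel grid by a running weighted average."""
    nx, ny, nz = tsdf.shape
    ii, jj, kk = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz), indexing="ij")
    pts = origin + voxel * np.stack([ii, jj, kk], -1).reshape(-1, 3)  # voxel centers (world)
    cam = (T_wc[:3, :3] @ pts.T + T_wc[:3, 3:4]).T                    # world -> camera frame
    z = cam[:, 2]
    front = z > 1e-6
    u = np.full(len(z), -1)
    v = np.full(len(z), -1)
    u[front] = np.round(K[0, 0] * cam[front, 0] / z[front] + K[0, 2]).astype(int)
    v[front] = np.round(K[1, 1] * cam[front, 1] / z[front] + K[1, 2]).astype(int)
    h, w = depth.shape
    ok = front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    sdf = np.zeros(len(z))
    sdf[ok] = depth[v[ok], u[ok]] - z[ok]   # signed distance along the viewing ray
    ok &= sdf > -trunc                      # keep the visible side and the near-surface band
    t_new = np.clip(sdf / trunc, -1.0, 1.0)
    ft, fw = tsdf.reshape(-1), weight.reshape(-1)   # views into the grids, updated in place
    ft[ok] = (fw[ok] * ft[ok] + t_new[ok]) / (fw[ok] + 1.0)
    fw[ok] += 1.0
```

Each depth map nudges the stored values toward the new truncated distances; the reconstructed surface is then the zero crossing of the grid, exactly as described above.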
With the continuing development of deep learning, object three-dimensional reconstruction based on deep learning has made remarkable breakthroughs. For example, the 3D-R2N2 (3D Recurrent Reconstruction Neural Network) proposed by Choy et al. is a network structure extended from the standard LSTM (Long Short-Term Memory) network. It learns the mapping between two-dimensional images and three-dimensional shapes, taking images of one or more object instances as a sequence, end to end. A standard CNN (Convolutional Neural Network) first encodes the source images and is linked to the proposed 3D-LSTM, whose neurons are arranged in a three-dimensional grid; each unit receives a feature vector from the encoder and passes it on to the decoder. As a result, each 3D-LSTM neuron reconstructs a part of the output voxels, which are then decoded by a standard deconvolution network; such a network structure establishes the mapping between two-dimensional images and three-dimensional models. In addition, the Ψ-CNN proposed by Huan Lei introduces a spherical convolution kernel with translation invariance in convolution and asymmetry.
In the above object reconstruction techniques, data processing is mostly done on point cloud data, a key data source in the three-dimensional field. However, since a point cloud is mainly a massive set of points characterizing the object surface, it carries no topological information; the discrete operations produced when converting between images and point cloud grids therefore hinder back-propagation and degrade the resolution and precision after reconstruction.
Summary of the invention
The application aims to overcome the above problems, or at least to partially solve or mitigate them.
According to one aspect of the application, a three-dimensional reconstruction method is provided, comprising: rendering the input image data to obtain original point cloud data of the dense regions of the image; selecting one or more image points from the original point cloud data as current root nodes, scanning and partitioning the original point cloud within a predetermined radius around each current root node, storing the image points found by the scan as leaf nodes of that root node, then taking each leaf node in turn as a current root node and continuing to scan and partition the original point cloud within the predetermined radius, storing the image points found as leaf nodes of each such root node, and recursing in this way until point cloud data stored in an octree data structure is obtained; inputting the point cloud data stored in the octree data structure into a trained region convolutional neural network to perform the region-of-interest (RoI) convolution operations, and outputting the feature data of the regions of interest in the point cloud data whose features resemble the trained model in the region convolutional neural network, wherein the feature data of a region of interest output by the network takes the center of the target object in the region of interest as the coordinate origin; inputting the feature data of the region of interest output by the region convolutional neural network into a pooling layer for pooling, so as to reduce the feature data characterizing the target object in the region of interest, obtaining the pooled feature data of the region of interest; identifying the three-dimensional reconstruction loss of the target object in the region of interest using a loss function predicted in advance for the trained model; identifying, using a predefined loss function of the coordinate transform from camera-centered to object-centered coordinates, the coordinate transform loss caused when the feature data of the target object in the region of interest is coordinate-transformed in the region convolutional neural network; obtaining the perspective distortion caused by the camera not directly observing the center of the target object in the region of interest when capturing the input image data; recovering, according to a predetermined algorithm, the camera-centered pose of the target object in the region of interest; performing three-dimensional pose prediction on the target object from its camera-centered pose, the coordinate transform loss and the perspective distortion, obtaining the pose angle of the target object; performing three-dimensional shape prediction on the target object from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape of the target object; and combining the three-dimensional reconstruction loss, the coordinate transform loss, the pose angle of the target object and the three-dimensional shape of the target object to obtain the three-dimensional feature data of the target object.
Optionally, after the feature data of the region of interest output by the region convolutional neural network is input into the pooling layer for pooling, and before the three-dimensional reconstruction loss of the target object in the region of interest is identified, the method further comprises: inputting the pooled feature data of the region of interest into a deep convolutional neural network and obtaining the feature data of the region of interest output by the deep convolutional neural network.
Optionally, after the three-dimensional feature data of the target object is obtained, the method further comprises: comparing the three-dimensional feature data of the target object with a pre-stored ground-truth value of the target object, and optimizing the three-dimensional feature data of the target object.
Optionally, performing three-dimensional shape prediction on the target object from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape of the target object, comprises: performing three-dimensional shape prediction from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape parameters of the target object; and comparing the three-dimensional shape parameters of the target object with the shape parameters corresponding to each shape of the preset trained model, taking the shape whose parameters are closest to the three-dimensional shape parameters of the target object as the three-dimensional shape of the target object.
Optionally, recovering, according to a predetermined algorithm, the camera-centered pose of the target object in the region of interest comprises: choosing the center point C of the target object in the region of interest and obtaining the camera-centered pose P_E of the target object as

P_E = [R | t], where R = R_C × R_V and t = R_C × [0, 0, d]^T,

where d is the distance of the target object from the camera, R_C is the rotation matrix that takes the camera principal axis to pass through the center point C of the target object, R_V is the rotation-matrix form of the viewpoint, and K_C is the intrinsic matrix of the camera.
According to a further aspect of the application, a three-dimensional reconstruction device is provided, comprising: a preprocessing module configured to render the input image data and obtain original point cloud data of the dense regions of the image; an octree construction module configured to select one or more image points from the original point cloud data as current root nodes, scan and partition the original point cloud within a predetermined radius around each current root node, store the image points found by the scan as leaf nodes of that root node, then take each leaf node in turn as a current root node and continue scanning and partitioning the original point cloud within the predetermined radius, storing the image points found as leaf nodes of each such root node, and recurse until point cloud data stored in an octree data structure is obtained; a region convolution module configured to use a trained region convolutional neural network to perform the region-of-interest (RoI) convolution operations on the point cloud data stored in the octree data structure and obtain the feature data of the regions of interest in the point cloud data whose features resemble the trained model in the region convolutional neural network, wherein the feature data of a region of interest output by the network takes the center of the target object in the region of interest as the coordinate origin; a pooling layer module configured to pool the feature data of the region of interest obtained by the region convolution module, so as to reduce the feature data characterizing the target object in the region of interest, and output the pooled feature data of the region of interest; a first loss computation module configured to identify the three-dimensional reconstruction loss of the target object in the region of interest using a loss function predicted in advance for the trained model; a second loss computation module configured to identify, using a predefined loss function of the coordinate transform from camera-centered to object-centered coordinates, the coordinate transform loss caused when the feature data of the target object in the region of interest is coordinate-transformed in the region convolutional neural network; a pose prediction module configured to: obtain the perspective distortion caused by the camera not directly observing the center of the target object in the region of interest when capturing the input image data; recover, according to a predetermined algorithm, the camera-centered pose of the target object in the region of interest; and perform three-dimensional pose prediction on the target object from its camera-centered pose, the coordinate transform loss and the perspective distortion, obtaining the pose angle of the target object; a shape prediction module configured to perform three-dimensional shape prediction on the target object from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape of the target object; and a three-dimensional reconstruction module configured to combine the three-dimensional reconstruction loss, the coordinate transform loss, the pose angle of the target object and the three-dimensional shape of the target object to obtain the three-dimensional feature data of the target object.
Optionally, the device further comprises a deep convolution module configured to apply deep convolution, with a deep convolutional neural network, to the feature data of the region of interest output by the pooling layer module, and to output the feature data of the region of interest after deep convolution.
Optionally, the device further comprises an optimization module configured to compare the three-dimensional feature data of the target object with the pre-stored ground-truth value of the target object and optimize the three-dimensional feature data of the target object.
Optionally, the shape prediction module comprises: a shape parameter acquisition unit configured to perform three-dimensional shape prediction from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape parameters of the target object; and a three-dimensional shape acquisition unit configured to compare the three-dimensional shape parameters of the target object with the shape parameters corresponding to each shape of the preset trained model, taking the shape whose parameters are closest to the three-dimensional shape parameters of the target object as the three-dimensional shape of the target object.
Optionally, the pose prediction module obtains the camera-centered pose of the target object as follows: choose the center point C of the target object in the region of interest and obtain the camera-centered pose P_E of the target object as

P_E = [R | t], where R = R_C × R_V and t = R_C × [0, 0, d]^T,

where d is the distance of the target object from the camera, R_C is the rotation matrix that takes the camera principal axis to pass through the center point C of the target object, R_V is the rotation-matrix form of the viewpoint, and K_C is the intrinsic matrix of the camera.
According to another aspect of the application, a computing device is provided, comprising a memory, a processor and a computer program stored in the memory and runnable by the processor, wherein the processor implements the above method when executing the computer program.
According to another aspect of the application, a computer-readable storage medium, preferably a non-volatile readable storage medium, is provided, storing a computer program that implements the above method when executed by a processor.
According to another aspect of the application, a computer program product is provided, comprising computer-readable code that, when executed by a computer device, causes the computer device to perform the above method.
In the three-dimensional reconstruction scheme provided by the application, point cloud data is stored on the basis of an octree. This reduces computation and memory cost, improves processing speed, makes high-resolution three-dimensional data tractable, and widens the practical range of application.
Further, the three-dimensional reconstruction scheme of the application is realized with a region convolutional neural network, re-parameterized with the viewpoint of the object, and predicts pose and shape from the feature maps obtained from the regions of interest; it can thus obtain a target object with effective forward propagation, improving the resolution and precision of the reconstructed target object.
From the following detailed description of specific embodiments of the application, taken with the accompanying drawings, the above and other objects, advantages and features of the application will become clearer to those skilled in the art.
Description of the drawings
Some specific embodiments of the application are described in detail below, by way of example and not limitation, with reference to the accompanying drawings. The same reference numerals denote the same or similar parts throughout the drawings. Those skilled in the art should appreciate that the drawings are not necessarily drawn to scale. In the drawings:
Fig. 1 is a flow chart of the three-dimensional reconstruction method according to an embodiment of the application;
Fig. 2 is a schematic diagram of the information flow of the three-dimensional reconstruction method according to an embodiment of the application;
Fig. 3 is a schematic structural diagram of the three-dimensional reconstruction device according to an embodiment of the application;
Fig. 4 is a schematic structural diagram of the computing device according to an embodiment of the application;
Fig. 5 is a schematic structural diagram of the computer-readable storage medium according to an embodiment of the application.
Specific embodiment
Fig. 1 is a flow chart of the three-dimensional reconstruction method according to an embodiment of the application. Fig. 2 is a schematic diagram of the information flow of the method. As shown in Figs. 1 and 2, the method may generally comprise the following steps:
Step 101: render the input image data and obtain the three-dimensional parameters of the original point cloud in the dense regions of the image.
In a particular application, an RGBD camera may be used to capture the target object and build the three-dimensional scene model; after modeling is completed, the lower-quality objects in the scene are segmented out, and the scanned objects and the object images in the database serve as the input image data.
In a particular application, in step 101 the dense regions of the point cloud may be sampled with a virtual scanning technique. Among the sampled points, the points whose normal direction changes most are chosen as feature points, and the normal vectors and curvature information of these feature points serve as the low-level features of the point cloud region. For example, several virtual cameras may be placed at the center of the truncation sphere of the point cloud, facing different directions, each emitting a bundle of parallel rays in its direction; when a ray intersects the sphere surface, the sampling of a surface point of the point cloud is completed, yielding the original point cloud data.
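A rough sketch of such a virtual scan, reading it as a z-buffer pass per view direction (the six axis-aligned directions and the grid resolution are assumptions; the patent's cameras may face arbitrary directions):

```python
import numpy as np

def virtual_scan(points, grid=64):
    """Keep, for each view direction, the nearest point in every parallel-ray
    cell, approximating the surface points visible from that virtual camera."""
    dirs = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                     [0, -1, 0], [0, 0, 1], [0, 0, -1]], float)
    p = points - points.mean(axis=0)          # center the cloud
    sampled = set()
    for d in dirs:
        u = np.cross(d, [0.0, 0.0, 1.0])      # build a basis (u, v, d)
        if np.linalg.norm(u) < 1e-6:
            u = np.cross(d, [0.0, 1.0, 0.0])
        u /= np.linalg.norm(u)
        v = np.cross(d, u)
        pu, pv, pd = p @ u, p @ v, p @ d
        extent = np.abs(np.stack([pu, pv])).max() + 1e-9
        iu = ((pu / extent + 1) / 2 * (grid - 1)).astype(int)
        iv = ((pv / extent + 1) / 2 * (grid - 1)).astype(int)
        best = {}
        for idx, (a, b, depth) in enumerate(zip(iu, iv, pd)):
            if (a, b) not in best or depth < best[(a, b)][0]:
                best[(a, b)] = (depth, idx)   # nearest point along this ray
        sampled.update(i for _, i in best.values())
    return points[sorted(sampled)]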
Step 102: select one or more image points from the original point cloud data as current root nodes; scan and partition the original point cloud within a predetermined radius around each current root node; store the image points found by the scan as leaf nodes of that root node; then take each leaf node in turn as a current root node and continue scanning and partitioning the original point cloud within the predetermined radius, storing the image points found as leaf nodes of each such root node; recurse in this way until point cloud data stored in an octree data structure is obtained.
In a particular application, in step 102 the original point cloud data may be fed into an existing packaged application interface to obtain the point cloud data stored in an octree data structure.
Alternatively, the original point cloud data may be divided directly according to an octree. For example, the original point cloud model may be placed in a cubic bounding box of unit length and the box subdivided recursively in breadth-first order. The recursion proceeds as follows: when the traversal reaches level l of the octree, every node containing part of the model boundary is visited and divided into eight child nodes forming level l+1; a node that contains no part of the model is not divided further. In addition, after the octree has been created, to reduce the time required by convolution and down-sampling operations performed directly on the octree, several hash tables are built for each level of the octree, storing the node positions and label information of that level; through the key values in the hash tables, a child node can quickly find the positions of its parent and sibling nodes.
The hash tables fall into the following two categories by type:

Hash table: a hash table S_l is built for the nodes of each level l of the octree. A key value in the table encodes the position of a level-l node relative to its level l-1 parent, and entries are stored in ascending key order. The key value key(O) can be obtained by formula (1), where x_i y_i z_i denotes the relative position between each child node and its parent:

key(O) := x_1 y_1 z_1 x_2 y_2 z_2 ... x_l y_l z_l   (1)

Label hash table: in the table L, the key value p at position L[j] indicates that node S[j] of this level is the p-th non-empty node of level l; the key value is 0 if the node is empty. Using the label hash table L, the child nodes of a parent node can be obtained quickly. Convolution over the hash tables is computed as

Φ_conv = Σ_n Σ_i Σ_j Σ_k W_ijk^(n) · T^(n)(O_ijk)

where O_ijk are the neighboring nodes being convolved, T^(n)(·) is the n-th channel of the feature vector stored in node O_ijk, W_ijk^(n) is the weight of the convolution layer, and T(O_ijk) is set to 0 if O_ijk does not exist.
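A minimal sketch of these per-level keys and tables, assuming a Morton-style interleaving of the child-index bits of formula (1); the normalization into a unit cube follows the bounding-box construction above:

```python
import numpy as np

def build_octree_tables(points, depth):
    """Hash tables S_l per level, with key(O) := x1y1z1 x2y2z2 ... xl yl zl.
    Three bits are appended per level, so parent(key) == key >> 3 and siblings
    share key >> 3 -- the constant-time lookups the patent describes."""
    mn = points.min(axis=0)
    span = (points.max(axis=0) - mn).max() + 1e-9
    cells = np.minimum(((points - mn) / span * (1 << depth)).astype(int),
                       (1 << depth) - 1)          # integer cell coords at max depth
    tables = []
    for level in range(1, depth + 1):
        keys = set()
        for x, y, z in cells >> (depth - level):  # cell coords at this level
            key = 0
            for s in range(level - 1, -1, -1):    # one x,y,z bit per ancestor level
                key = (key << 3) | (((x >> s) & 1) << 2) \
                                 | (((y >> s) & 1) << 1) | ((z >> s) & 1)
            keys.add(key)
        # key -> index of the p-th non-empty node, i.e. the label table L
        tables.append({k: i for i, k in enumerate(sorted(keys))})
    return tables

# parent lookup: key >> 3; siblings: ((key >> 3) << 3) | c for c in range(8)
```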
Step 103: input the point cloud data stored in the octree data structure into a trained region convolutional neural network (RCNN) to perform the region-of-interest (RoI) convolution operations, and output the feature data of the regions of interest in the point cloud data whose features resemble the trained model in the region convolutional neural network, wherein the feature data of a region of interest output by the network takes the center of the target object in the region of interest as the coordinate origin.
In a particular application, a model matched to the target object may be trained in advance and its feature data stored in the RCNN according to the type of target object, giving a trained RCNN. In use, the point cloud data stored in the octree data structure is fed into the RCNN; the convolution layers of the RCNN extract features from the input point cloud data and find the regions whose features resemble those of the trained model, i.e. the regions of interest, obtaining their feature data.
In a particular application, after performing the RoI convolution operations on the input point cloud data, the convolution layers of the RCNN find the regions of interest in the point cloud data; the feature data obtained here is described from the viewpoint of the camera (i.e. with the camera as coordinate origin). The fully connected layer of the RCNN then transforms the coordinates of the point cloud data, converting the feature data of the region of interest from the camera-coordinate characterization into the object-coordinate characterization with the center of the target object in the region of interest as the coordinate origin.
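The coordinate change itself can be sketched in a couple of lines; taking the centroid of the RoI's points as the object center is an assumption for illustration:

```python
import numpy as np

def camera_to_object_frame(roi_points):
    """Re-express an RoI point set so the target object's center is the origin:
    camera-coordinate characterization -> object-coordinate characterization."""
    center = roi_points.mean(axis=0)   # object center, in camera coordinates
    return roi_points - center, center

# e.g. pts_obj, c = camera_to_object_frame(np.random.rand(100, 3))
```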
Step 104: input the feature data of the region of interest output by the region convolutional neural network into a pooling layer for pooling, so as to reduce the feature data characterizing the target object in the region of interest, obtaining the pooled feature data of the region of interest.
In a particular application, max pooling or average pooling may be used; this embodiment places no specific restriction on the choice.
In a particular application, in order to obtain deeper feature data of the target object in the region of interest, in an optional embodiment of this embodiment, as shown in Fig. 2, the pooled feature data of the region of interest may be fed into a deep convolutional neural network for deep convolution operations, obtaining feature data of the region of interest with greater depth.
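A minimal sketch of the pooling operation on an RoI feature volume (max pooling here; the tensor layout and pooling factor are illustrative assumptions):

```python
import numpy as np

def roi_max_pool(features, factor=2):
    """Max-pool a (C, D, H, W) RoI feature volume by `factor` along each
    spatial axis, reducing the data that characterizes the target object."""
    c, d, h, w = features.shape
    f = features[:, :d - d % factor, :h - h % factor, :w - w % factor]
    f = f.reshape(c, d // factor, factor, h // factor, factor, w // factor, factor)
    return f.max(axis=(2, 4, 6))
```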
Step 105: identify the three-dimensional reconstruction loss of the target object in the region of interest using the loss function predicted in advance for the trained model.
In a particular application, the three-dimensional reconstruction loss of the target object includes, but is not limited to, a regularization loss part and a data loss part. The loss function of the data loss part can be obtained by training in advance; for example, the data loss L of the target object is obtained by averaging the data losses of the individual examples:

L = (1/N) Σ_{i=1..N} L_i

where N is the number of training examples of the target object and L_i may be the cross-entropy loss of the i-th training object. The regularization loss part is the bias loss incurred during regularization of the neural networks (the RCNN and the deep convolutional neural network); it depends on the regularization method used. For example, if L1 regularization is used in the networks, step 105 considers the L1 bias loss; if another regularization (such as L2) is used, step 105 considers the L2 bias loss.
Therefore, in step 105, the three-dimensional reconstruction loss of the target object can be obtained from its data loss part and the regularization bias loss. In a particular application, taking L1 regularization as an example, as shown in Fig. 2, the L1 bias loss can be added on top of the fully connected data of the entire object inside the bounding box of the RoI, giving the feature data of the entire object framed by the bounding box; this feature data serves as the RoI feature data compensated for the data loss part and the L1 bias loss, i.e. the RoI feature data compensated for the three-dimensional reconstruction loss.
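A sketch of this combined loss, assuming cross-entropy data terms L_i and an L1 bias term over the network weights (the coefficient is illustrative):

```python
import numpy as np

def reconstruction_loss(probs, targets, weights, l1_coeff=1e-4):
    """Data part: L = (1/N) * sum_i L_i with L_i the cross-entropy of example i.
    Regularization part: L1 bias loss over the network weight tensors."""
    eps = 1e-9
    data = -np.mean([np.log(p[t] + eps) for p, t in zip(probs, targets)])
    reg = l1_coeff * sum(np.abs(w).sum() for w in weights)
    return data + reg
```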
Step 106: identify, using the predefined loss function of the coordinate transform from camera-centered to object-centered coordinates, the coordinate transform loss caused when the feature data of the target object in the region of interest is coordinate-transformed in the region convolutional neural network.
As shown in Fig. 2, in a particular application, when the coordinate transform loss is considered, the loss can be compensated by adding a regularization bias loss to the object-centered feature representation in the RoI.
Step 107: obtain the perspective distortion caused by the camera not directly observing the center of the target object in the region of interest when capturing the input image data; recover, according to a predetermined algorithm, the camera-centered pose of the target object in the region of interest; and perform three-dimensional pose prediction on the target object from its camera-centered pose, the coordinate transform loss and the perspective distortion, obtaining the pose angle of the target object.
Perspective distortion belongs to the category of three-dimensional distortion. In a particular application, it can be handled with a projective transformation, i.e. the process of mapping a three-dimensional image to an image; the most widely used affine transformation (two-dimensional image to two-dimensional image) can be regarded as a special form of projective transformation. The image is corrected after the projective transformation: for example, a projective transformation matrix can be generated and then applied to correct the image to be processed.
In an optional embodiment of this embodiment, when the camera-centered pose of the target object in the RoI is recovered, the center point C of the target object in the region of interest may be chosen, and the camera-centered pose P_E of the target object obtained as

P_E = [R | t], where R = R_C × R_V and t = R_C × [0, 0, d]^T,

where d is the distance of the target object from the camera, R_C is the rotation matrix that takes the camera principal axis to pass through the center point C of the target object, R_V is the rotation-matrix form of the viewpoint, and K_C is the intrinsic matrix of the camera.
After the camera-centered pose of the target object is obtained, the coordinate transform loss and the perspective distortion can be added on top of it to perform pose prediction and obtain the pose angle of the target object.
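The recovery of P_E can be sketched as follows; deriving R_C from the projected center point via the intrinsics K_C and Rodrigues' formula is an assumption about the "predetermined algorithm":

```python
import numpy as np

def camera_centered_pose(c_px, d, R_V, K_C):
    """P_E = [R | t] with R = R_C @ R_V and t = R_C @ [0, 0, d]^T, where R_C
    rotates the camera principal axis onto the ray through the object center."""
    ray = np.linalg.inv(K_C) @ np.array([c_px[0], c_px[1], 1.0])
    ray /= np.linalg.norm(ray)                # viewing ray through the center C
    z = np.array([0.0, 0.0, 1.0])             # camera principal axis
    v = np.cross(z, ray)
    s, c = np.linalg.norm(v), float(z @ ray)
    if s < 1e-9:
        R_C = np.eye(3)
    else:
        vx = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
        R_C = np.eye(3) + vx + vx @ vx * ((1 - c) / s**2)   # Rodrigues' formula
    R = R_C @ R_V
    t = R_C @ np.array([0.0, 0.0, d])
    return np.hstack([R, t[:, None]])          # 3x4 pose matrix P_E
```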
Step 108: perform three-dimensional shape prediction on the target object from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape of the target object.
In an optional embodiment of this embodiment, as shown in Fig. 2, three-dimensional shape prediction may be performed from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape parameters of the target object; the three-dimensional shape parameters of the target object are then compared with the shape parameters corresponding to each shape of the preset trained model, and the shape whose parameters are closest to those of the target object is taken as the three-dimensional shape of the target object.
The three-dimensional shape of an object instance can be regarded as a classification problem: the output of an object class detector can be understood as encoding the object shapes of a certain class within a given low-dimensional "shape space" constructed from a set of 3D CAD models. This representation uses a set of parameters to encode the three-dimensional shape of the object class, so shape estimation can be confined to predicting an appropriate set of low-dimensional shape parameters for a particular object instance. Hence, in the above optional embodiment, when the shape of the target object is predicted, the predicted shape parameters are compared with the corresponding parameters in the preset "shape space", and the three-dimensional shape of the object is thereby determined.
In a particular application, as shown in Fig. 2, shape prediction can be performed by combining the RoI feature data output by the deep convolutional neural network with the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape parameters of the target object.
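Picking the closest shape in the preset "shape space" can be sketched as a nearest-neighbor lookup; the Euclidean metric and the example shape names are assumptions:

```python
import numpy as np

def nearest_shape(pred_params, shape_space):
    """Return the trained-model shape whose parameter vector is closest to the
    predicted parameters -- the classification view of shape estimation above.
    shape_space: dict mapping shape name -> low-dimensional parameter vector."""
    names = list(shape_space)
    dists = [np.linalg.norm(pred_params - shape_space[n]) for n in names]
    return names[int(np.argmin(dists))]

# e.g. nearest_shape(theta_hat, {"sedan": v1, "suv": v2, "hatchback": v3})
```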
Step 109: combine the three-dimensional reconstruction loss, the coordinate transform loss, the pose angle of the target object and the three-dimensional shape of the target object to obtain the three-dimensional feature data of the target object.
In a particular application, as shown in Fig. 2, the feature data of the entire object framed by the bounding box obtained in step 105, the coordinate transform loss obtained in step 106, the pose angle of the target object obtained in step 107 and the shape of the target object obtained in step 108 can be deconvolved together to obtain the three-dimensional feature data of the target object, realizing the three-dimensional reconstruction of the target object.
To further optimize the reconstructed target object, in an optional embodiment of this embodiment, after the three-dimensional feature data of the target object is obtained, as shown in Fig. 2, the three-dimensional feature data of the target object can also be compared with the pre-stored ground-truth value of the target object, and the three-dimensional feature data optimized. In a particular application, in this optional embodiment, instance segmentation can be performed on the target object and compared with two-dimensional annotations such as depth maps, minimizing the sum of the losses of all predicted targets, optimizing the target object after three-dimensional reconstruction, and rendering the optimized target object as output.
With the three-dimensional reconstruction method provided by the application, the octree storage structure reduces computation and memory cost; the method is no longer limited by the memory requirements of volumetric input data, can handle high-resolution 3D data, and widens the range of practical application. In pose estimation, the median angular error is reduced, improving the generalization level of the model.
Fig. 3 is a schematic structural diagram of the three-dimensional reconstruction device according to an embodiment of the application. The device corresponds to the three-dimensional reconstruction method described above and implements it. As shown in Fig. 3, the device mainly comprises: a preprocessing module 301, an octree construction module 302, a region convolution module 303, a pooling layer module 304, a first loss computation module 305, a second loss computation module 306, a pose prediction module 307, a shape prediction module 308 and a three-dimensional reconstruction module 309. The functions of the modules are described below; for anything not covered, refer to the above description of the three-dimensional reconstruction method, which is not repeated here.
The preprocessing module 301 is configured to render the input image data and obtain the original point cloud data of the dense regions of the image.

The octree construction module 302 is configured to select one or more image points from the original point cloud data as current root nodes, scan and partition the original point cloud within a predetermined radius around each current root node, store the image points found by the scan as leaf nodes of that root node, then take each leaf node in turn as a current root node and continue scanning and partitioning the original point cloud within the predetermined radius, storing the image points found as leaf nodes of each such root node, recursing until point cloud data stored in an octree data structure is obtained.

The region convolution module 303 is configured to use a trained region convolutional neural network to perform the region-of-interest (RoI) convolution operations on the point cloud data stored in the octree data structure and obtain the feature data of the regions of interest in the point cloud data whose features resemble the trained model in the region convolutional neural network, wherein the feature data of a region of interest output by the network takes the center of the target object in the region of interest as the coordinate origin.

The pooling layer module 304 is configured to pool the feature data of the region of interest obtained by the region convolution module 303, so as to reduce the feature data characterizing the target object in the region of interest, and output the pooled feature data of the region of interest.

The first loss computation module 305, connected to the output interface of the pooling layer module 304, is configured to identify the three-dimensional reconstruction loss of the target object in the region of interest using the loss function predicted in advance for the trained model.

The second loss computation module 306, connected to the output interface of the pooling layer module 304, is configured to identify, using the predefined loss function of the coordinate transform from camera-centered to object-centered coordinates, the coordinate transform loss caused when the feature data of the target object in the region of interest is coordinate-transformed in the region convolutional neural network.

The pose prediction module 307, connected to the output interface of the pooling layer module 304, is configured to: obtain the perspective distortion caused by the camera not directly observing the center of the target object in the region of interest when capturing the input image data; recover, according to a predetermined algorithm, the camera-centered pose of the target object in the region of interest; and perform three-dimensional pose prediction on the target object from its camera-centered pose, the coordinate transform loss and the perspective distortion, obtaining the pose angle of the target object.

The shape prediction module 308, connected to the output interface of the pooling layer module 304, is configured to perform three-dimensional shape prediction on the target object from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape of the target object.

The three-dimensional reconstruction module 309, connected to the first loss computation module 305, the second loss computation module 306, the pose prediction module 307 and the shape prediction module 308, is configured to combine the three-dimensional reconstruction loss, the coordinate transform loss, the pose angle of the target object and the three-dimensional shape of the target object to obtain the three-dimensional feature data of the target object.
With the three-dimensional reconstruction device provided by this embodiment, point cloud data is stored on the basis of an octree; this reduces computation and memory cost, improves processing speed, makes high-resolution three-dimensional data tractable, and widens the practical range of application. Further, the device is realized with a region convolutional neural network, re-parameterized with the viewpoint of the object, and predicts pose and shape from the feature maps obtained from the regions of interest; it can thus obtain a target object with effective forward propagation, improving the resolution and precision of the reconstructed target object.
In an optional embodiment of this embodiment, the device may further include a deep convolution module, connected between the pooling layer module 304 on the one hand and the first loss computation module 305, the second loss computation module 306, the pose prediction module 307 and the shape prediction module 308 on the other, configured to apply deep convolution with a deep convolutional neural network to the feature data of the region of interest output by the pooling layer module 304 and output the feature data of the region of interest after deep convolution.
In an optional embodiment of this embodiment, the device further includes an optimization module, connected to the three-dimensional reconstruction module, configured to compare the three-dimensional feature data of the target object with the pre-stored ground-truth value of the target object and optimize the three-dimensional feature data of the target object. In a particular application, the optimization module may render the three-dimensional target object and perform similar operations to optimize it.
In an optional embodiment of this embodiment, the shape prediction module 308 may include: a shape parameter acquisition unit configured to perform three-dimensional shape prediction from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape parameters of the target object; and a three-dimensional shape acquisition unit configured to compare the three-dimensional shape parameters of the target object with the shape parameters corresponding to each shape of the preset trained model, taking the shape whose parameters are closest to the three-dimensional shape parameters of the target object as the three-dimensional shape of the target object.
In an optional embodiment of this embodiment, the pose prediction module 307 may obtain the camera-centered pose of the target object as follows: choose the center point C of the target object in the region of interest, and obtain the camera-centered pose P_E of the target object as

P_E = [R | t], where R = R_C × R_V and t = R_C × [0, 0, d]^T,

where d is the distance of the target object from the camera, R_C is the rotation matrix that takes the camera principal axis to pass through the center point C of the target object, R_V is the rotation-matrix form of the viewpoint, and K_C is the intrinsic matrix of the camera.
An embodiment of the application also provides a computing device. Referring to Fig. 4, the computing device comprises a memory 1120, a processor 1110 and a computer program stored in the memory 1120 and runnable by the processor 1110; the computer program is stored in the space 1130 for program code in the memory 1120, and when executed by the processor 1110 implements any of the method steps 1131 according to the invention.
An embodiment of the application also provides a computer-readable storage medium. Referring to Fig. 5, the computer-readable storage medium comprises a storage unit for program code, provided with a program 1131' for executing the method steps according to the invention, the program being executed by a processor.
An embodiment of the application also provides a computer program product comprising instructions which, when the product runs on a computer, cause the computer to execute the method steps according to the invention.
In the above embodiments, the implementation may be wholly or partly by software, hardware, firmware or any combination thereof. When implemented in software, it may be realized wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When a computer loads and executes the computer program instructions, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server or data center to another by wired means (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wireless means (such as infrared, radio or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (e.g. floppy disk, hard disk, magnetic tape), an optical medium (e.g. DVD), or a semiconductor medium (e.g. a Solid State Disk (SSD)), etc.
Those skilled in the art should further appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the application.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments can be completed by a program instructing a processor; the program may be stored in a computer-readable storage medium, the storage medium being a non-transitory medium such as random access memory, read-only memory, flash memory, hard disk, solid-state disk, magnetic tape, floppy disk, optical disc, and any combination thereof.
The above is only the preferred specific embodiment of the application, and the scope of protection of the application is not limited thereto. Any change or substitution readily conceivable by any person skilled in the art within the technical scope of the application should be covered by the scope of protection of the application. Therefore, the scope of protection of the application shall be subject to the scope of protection of the claims.
Claims (10)
1. A three-dimensional reconstruction method, comprising:
rendering the input image data to obtain original point cloud data of the dense regions of the image;
selecting one or more image points from the original point cloud data as current root nodes, scanning and partitioning the original point cloud within a predetermined radius around each current root node, storing the image points found by the scan as leaf nodes of that root node, then taking each leaf node in turn as a current root node and continuing to scan and partition the original point cloud within the predetermined radius, storing the image points found as leaf nodes of each such root node, and recursing in this way until point cloud data stored in an octree data structure is obtained;
inputting the point cloud data stored in the octree data structure into a trained region convolutional neural network to perform the region-of-interest (RoI) convolution operations, and outputting the feature data of the regions of interest in the point cloud data whose features resemble the trained model in the region convolutional neural network, wherein the feature data of a region of interest output by the network takes the center of the target object in the region of interest as the coordinate origin;
inputting the feature data of the region of interest output by the region convolutional neural network into a pooling layer for pooling, so as to reduce the feature data characterizing the target object in the region of interest, obtaining the pooled feature data of the region of interest;
identifying the three-dimensional reconstruction loss of the target object in the region of interest using a loss function predicted in advance for the trained model;
identifying, using a predefined loss function of the coordinate transform from camera-centered to object-centered coordinates, the coordinate transform loss caused when the feature data of the target object in the region of interest is coordinate-transformed in the region convolutional neural network;
obtaining the perspective distortion caused by the camera not directly observing the center of the target object in the region of interest when capturing the input image data;
recovering, according to a predetermined algorithm, the camera-centered pose of the target object in the region of interest;
performing three-dimensional pose prediction on the target object from its camera-centered pose, the coordinate transform loss and the perspective distortion, obtaining the pose angle of the target object;
performing three-dimensional shape prediction on the target object from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape of the target object; and
combining the three-dimensional reconstruction loss, the coordinate transform loss, the pose angle of the target object and the three-dimensional shape of the target object to obtain the three-dimensional feature data of the target object.
2. The method according to claim 1, wherein after the feature data of the region of interest output by the region convolutional neural network is input into the pooling layer for pooling, and before the three-dimensional reconstruction loss of the target object in the region of interest is identified, the method further comprises:
inputting the pooled feature data of the region of interest into a deep convolutional neural network, and obtaining the feature data of the region of interest output by the deep convolutional neural network.
3. The method according to claim 1, wherein after the three-dimensional feature data of the target object is obtained, the method further comprises:
comparing the three-dimensional feature data of the target object with a pre-stored ground-truth value of the target object, and optimizing the three-dimensional feature data of the target object.
4. The method according to any one of claims 1-3, characterized in that performing three-dimensional shape prediction on the target object from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape of the target object, comprises:
performing three-dimensional shape prediction from the image formed by the region of interest, the coordinate transform loss and the perspective distortion, obtaining the three-dimensional shape parameters of the target object;
comparing the three-dimensional shape parameters of the target object with the shape parameters corresponding to each shape of the preset trained model, and taking the shape whose parameters are closest to the three-dimensional shape parameters of the target object as the three-dimensional shape of the target object.
5. The method according to any one of claims 1-4, characterized in that recovering, according to a predetermined algorithm, the camera-centered pose of the target object in the region of interest comprises:
choosing the center point C of the target object in the region of interest, and obtaining the camera-centered pose P_E of the target object as
P_E = [R | t], where R = R_C × R_V and t = R_C × [0, 0, d]^T,
where d is the distance of the target object from the camera, R_C is the rotation matrix that takes the camera principal axis to pass through the center point C of the target object, R_V is the rotation-matrix form of the viewpoint, and K_C is the intrinsic matrix of the camera.
6. A three-dimensional reconstruction apparatus, comprising:
a preprocessing module configured to render input image data to obtain original point cloud data of dense regions of the image;
an octree construction module configured to select one or more image points from the original point cloud data as a current root node, scan and partition the original point cloud within a predetermined radius centered on the current root node, store the image points found by the scan as leaf nodes of the current root node, then take each leaf node in turn as the current root node and continue scanning and partitioning the original point cloud within the predetermined radius, storing the image points found as leaf nodes of that current root node, and recurse in this way until the point cloud data is stored in an octree data structure;
a region convolution module configured to perform a region-of-interest (RoI) convolution operation, using a trained region convolutional neural network, on the point cloud data stored in the octree data structure, to obtain characteristic data of a region of interest of the point cloud data whose features are similar to those of the trained model in the region convolutional neural network, wherein the characteristic data of the region of interest output by the region convolutional neural network takes the center of the target object in the region of interest as the coordinate origin;
a pooling layer module configured to pool the characteristic data of the region of interest obtained by the region convolution module, so as to reduce the characteristic data characterizing the target object in the region of interest, and to output the pooled characteristic data of the region of interest;
a first loss computation module configured to identify the three-dimensional reconstruction loss of the target object in the region of interest from the loss function predicted in advance for the trained model;
a second loss computation module configured to identify, using a predetermined loss function of the coordinate transform from camera-centered to object-centered coordinates, the coordinate transform loss incurred when the characteristic data of the target object in the region of interest is coordinate-transformed in the region convolutional neural network;
a pose prediction module configured to: obtain the perspective distortion caused by the camera not directly facing the center of the target object in the region of interest when capturing the input image data; recover, according to a predetermined algorithm, the camera-centered pose of the target object in the region of interest; and perform three-dimensional pose prediction on the target object according to the camera-centered pose of the target object, the coordinate transform loss and the perspective distortion, to obtain a pose angle of the target object;
a shape prediction module configured to perform three-dimensional shape prediction on the target object according to the image formed by the region of interest, the coordinate transform loss and the perspective distortion, to obtain the three-dimensional shape of the target object; and
a three-dimensional reconstruction module configured to combine the three-dimensional reconstruction loss, the coordinate transform loss, the pose angle of the target object and the three-dimensional shape of the target object, to obtain three-dimensional feature data of the target object.
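For illustration only: read literally, the octree construction module of claim 6 describes a recursive radius scan, where the leaves of the current root become the next roots, rather than a textbook eight-octant split. The sketch below implements that literal reading; `Node`, `build_radius_tree` and the radius value are all illustrative, and a production octree would instead subdivide space into eight octants.

```python
import numpy as np

class Node:
    """One stored image point plus the points scanned within the radius."""
    def __init__(self, point):
        self.point = point
        self.children = []

def build_radius_tree(points, root_idx, radius, assigned=None):
    if assigned is None:
        assigned = {root_idx}
    root = Node(points[root_idx])
    # Scan the predetermined radius around the current root node.
    dists = np.linalg.norm(points - points[root_idx], axis=1)
    hits = [i for i in np.flatnonzero(dists <= radius) if i not in assigned]
    assigned.update(hits)
    # Each scanned point becomes a leaf, then serves as the next root.
    for i in hits:
        root.children.append(build_radius_tree(points, i, radius, assigned))
    return root

# Usage: 200 random 3-D points, scan radius 0.2, first point as root.
cloud = np.random.rand(200, 3)
tree = build_radius_tree(cloud, root_idx=0, radius=0.2)
```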
7. The apparatus according to claim 6, further comprising:
a depth convolution module configured to apply depth convolution, with a deep convolutional neural network, to the characteristic data of the region of interest output by the pooling layer module, and to output the characteristic data of the region of interest after the depth convolution.
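As a stand-in for the depth convolution module of claim 7, which names only "a deep convolutional neural network", the sketch below runs a small stack of 3-D convolutions over a pooled RoI feature volume; the layer count, kernel size and channel widths are assumptions.

```python
import torch
import torch.nn as nn

# Illustrative deep convolution over pooled RoI features.
depth_conv = nn.Sequential(
    nn.Conv3d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv3d(128, 128, kernel_size=3, padding=1), nn.ReLU(),
)
pooled_roi = torch.randn(1, 64, 8, 8, 8)   # pooled RoI feature volume
deep_feats = depth_conv(pooled_roi)         # -> (1, 128, 8, 8, 8)
```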
8. The apparatus according to claim 6, further comprising:
an optimization module configured to compare the three-dimensional feature data of the target object with a pre-stored ground-truth value of the target object, and to optimize the three-dimensional feature data of the target object.
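The optimization module of claim 8 compares the predicted three-dimensional feature data against a stored ground truth and optimizes, but the claim names no loss or optimizer; the sketch below assumes an L1 loss refined with Adam, purely for illustration.

```python
import torch

def optimize_features(pred_feats, gt_feats, steps=100, lr=1e-2):
    """Refine predicted 3D feature data toward a stored ground-truth
    value (assumed L1 loss; the claim leaves the objective open)."""
    feats = pred_feats.clone().requires_grad_(True)
    opt = torch.optim.Adam([feats], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.l1_loss(feats, gt_feats)
        loss.backward()
        opt.step()
    return feats.detach()

# Usage: refine a toy 128-D feature vector toward its stored ground truth.
refined = optimize_features(torch.randn(128), torch.randn(128))
```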
9. The apparatus according to any one of claims 6 to 8, wherein the shape prediction module comprises:
a shape parameter acquisition unit configured to perform three-dimensional shape prediction according to the image formed by the region of interest, the coordinate transform loss and the perspective distortion, to obtain the three-dimensional shape parameter of the target object; and
a three-dimensional shape acquisition unit configured to compare the three-dimensional shape parameter of the target object with the shape parameter corresponding to each shape of the preset trained model, and to take the shape whose shape parameter is closest to the three-dimensional shape parameter of the target object as the three-dimensional shape of the target object.
10. The apparatus according to any one of claims 6 to 9, wherein the pose prediction module obtains the camera-centered pose of the target object in the following way:
selecting a center point C of the target object in the region of interest, and obtaining the camera-centered pose P_E of the target object as:
P_E = [R | t], where R = R_C × R_V and t = R_C × [0, 0, d]^T, d is the distance of the target object from the camera, R_C is the rotation matrix that takes the principal axis of the camera onto the ray passing through the center point C of the target object, R_V is the rotation-matrix form of the viewpoint, and K_C is the intrinsic parameter matrix of the camera.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910741648.7A CN110490917A (en) | 2019-08-12 | 2019-08-12 | Three-dimensional rebuilding method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110490917A (en) | 2019-11-22 |
Family
ID=68550609
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910741648.7A Pending CN110490917A (en) | 2019-08-12 | 2019-08-12 | Three-dimensional rebuilding method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490917A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10366508B1 (en) * | 2016-08-29 | 2019-07-30 | Perceptin Shenzhen Limited | Visual-inertial positional awareness for autonomous and non-autonomous device |
CN108597009A (en) * | 2018-04-10 | 2018-09-28 | 上海工程技术大学 | A method of objective detection is carried out based on direction angle information |
CN109410321A (en) * | 2018-10-17 | 2019-03-01 | 大连理工大学 | Three-dimensional rebuilding method based on convolutional neural networks |
CN109784333A (en) * | 2019-01-22 | 2019-05-21 | 中国科学院自动化研究所 | Based on an objective detection method and system for cloud bar power channel characteristics |
CN110070595A (en) * | 2019-04-04 | 2019-07-30 | 东南大学 | A kind of single image 3D object reconstruction method based on deep learning |
Non-Patent Citations (2)
Title |
---|
Abhijit Kundu et al., "3D-RCNN: Instance-level 3D Object Reconstruction via Render-and-Compare", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition |
Huan Lei et al., "Octree guided CNN with Spherical Kernels for 3D Point Clouds", arXiv:1903.00343v1 [cs.CV] |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210166418A1 (en) * | 2019-02-23 | 2021-06-03 | Shenzhen Sensetime Technology Co., Ltd. | Object posture estimation method and apparatus |
CN110910483A (en) * | 2019-11-29 | 2020-03-24 | 广州极飞科技有限公司 | Three-dimensional reconstruction method and device and electronic equipment |
CN111133477A (en) * | 2019-12-20 | 2020-05-08 | 驭势科技(南京)有限公司 | Three-dimensional reconstruction method, device, system and storage medium |
US12026827B2 (en) | 2019-12-20 | 2024-07-02 | Uisee Technologies (Zhejiang) Ltd. | Method, apparatus, system, and storage medium for 3D reconstruction |
CN111047703A (en) * | 2019-12-23 | 2020-04-21 | 杭州电力设备制造有限公司 | User high-voltage distribution equipment identification and space reconstruction method |
CN111047703B (en) * | 2019-12-23 | 2023-09-26 | 杭州电力设备制造有限公司 | User high-voltage distribution equipment identification and space reconstruction method |
CN114902294A (en) * | 2020-01-02 | 2022-08-12 | 国际商业机器公司 | Fine-grained visual recognition in mobile augmented reality |
CN114902294B (en) * | 2020-01-02 | 2023-10-20 | 国际商业机器公司 | Fine-grained visual recognition in mobile augmented reality |
CN111583392A (en) * | 2020-04-29 | 2020-08-25 | 北京深测科技有限公司 | Object three-dimensional reconstruction method and system |
CN111951307A (en) * | 2020-07-23 | 2020-11-17 | 西北大学 | Three-dimensional point cloud affine registration method and system based on pseudo Huber loss function |
CN111951307B (en) * | 2020-07-23 | 2023-09-19 | 西北大学 | Three-dimensional point cloud affine registration method and system based on pseudo Huber loss function |
CN112114317A (en) * | 2020-08-19 | 2020-12-22 | 中交第三航务工程局有限公司江苏分公司 | Pile body shape restoration method based on point cloud data |
CN112652016B (en) * | 2020-12-30 | 2023-07-28 | 北京百度网讯科技有限公司 | Point cloud prediction model generation method, pose estimation method and pose estimation device |
CN112652016A (en) * | 2020-12-30 | 2021-04-13 | 北京百度网讯科技有限公司 | Point cloud prediction model generation method, pose estimation method and device |
CN112801282A (en) * | 2021-03-24 | 2021-05-14 | 东莞中国科学院云计算产业技术创新与育成中心 | Three-dimensional image processing method, three-dimensional image processing device, computer equipment and storage medium |
CN113160275B (en) * | 2021-04-21 | 2022-11-08 | 河南大学 | Automatic target tracking and track calculating method based on multiple videos |
CN113160275A (en) * | 2021-04-21 | 2021-07-23 | 河南大学 | Automatic target tracking and track calculating method based on multiple videos |
CN114140510A (en) * | 2021-12-03 | 2022-03-04 | 北京影谱科技股份有限公司 | Incremental three-dimensional reconstruction method and device and computer equipment |
CN114140510B (en) * | 2021-12-03 | 2024-09-13 | 北京影谱科技股份有限公司 | Incremental three-dimensional reconstruction method and device and computer equipment |
CN115393526A (en) * | 2022-09-09 | 2022-11-25 | 中国电信股份有限公司 | Three-dimensional object reconstruction method, device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110490917A (en) | Three-dimensional rebuilding method and device | |
CN106803267B (en) | Kinect-based indoor scene three-dimensional reconstruction method | |
CN111783525B (en) | Aerial photographic image target sample generation method based on style migration | |
CN110288695B (en) | Single-frame image three-dimensional model surface reconstruction method based on deep learning | |
CN105378796B (en) | Scalable volume 3D reconstruct | |
CN109410321A (en) | Three-dimensional rebuilding method based on convolutional neural networks | |
CN110084304B (en) | Target detection method based on synthetic data set | |
CN109118582A (en) | A kind of commodity three-dimensional reconstruction system and method for reconstructing | |
CN117036612A (en) | Three-dimensional reconstruction method based on nerve radiation field | |
CN103530907B (en) | Complicated three-dimensional model drawing method based on images | |
CN111583408B (en) | Human body three-dimensional modeling system based on hand-drawn sketch | |
Rabby et al. | BeyondPixels: A comprehensive review of the evolution of neural radiance fields | |
Condorelli et al. | A comparison between 3D reconstruction using nerf neural networks and mvs algorithms on cultural heritage images | |
Liu et al. | High-quality textured 3D shape reconstruction with cascaded fully convolutional networks | |
CN117274515A (en) | Visual SLAM method and system based on ORB and NeRF mapping | |
CN110097599A (en) | A kind of workpiece position and orientation estimation method based on partial model expression | |
Liu et al. | Can Synthetic Data Improve Object Detection Results for Remote Sensing Images? | |
Bao et al. | 3d gaussian splatting: Survey, technologies, challenges, and opportunities | |
Chen et al. | Manipulating, deforming and animating sampled object representations | |
CN118262034A (en) | System and method for reconstructing an animated three-dimensional human head model from an image | |
CN118154770A (en) | Single tree image three-dimensional reconstruction method and device based on nerve radiation field | |
Song | [Retracted] 3D Virtual Reality Implementation of Tourist Attractions Based on the Deep Belief Neural Network | |
Li et al. | [Retracted] 3D Real Scene Data Collection of Cultural Relics and Historical Sites Based on Digital Image Processing | |
Chen et al. | Deforming and animating discretely sampled object representations. | |
CN117853682B (en) | Pavement three-dimensional reconstruction method, device, equipment and medium based on implicit characteristics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20191122 |