CN113766229A - Encoding method, decoding method, device, equipment and readable storage medium - Google Patents
- Publication number
- CN113766229A (application CN202111160289.XA)
- Authority
- CN
- China
- Prior art keywords
- point
- matrix
- point cloud
- target
- sub
- Prior art date
- Legal status
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Abstract
The application discloses an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, which relate to the technical field of image processing and aim to improve processing performance. The method comprises the following steps: clustering point cloud data to be processed of a current frame to obtain a plurality of sub-point clouds; for any target sub-point cloud among the plurality of sub-point clouds, generating a generalized Laplacian matrix according to Euclidean distances between a plurality of point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and the corresponding points of the target points; performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud by using the generalized Laplacian matrix; and quantizing and coding the transformed sub-point clouds respectively to obtain a coded code stream. The corresponding points are located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame. The embodiment of the application can improve processing performance.
Description
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an encoding method, a decoding method, an apparatus, a device, and a readable storage medium.
Background
With the development of computer hardware and algorithms, three-dimensional point cloud data has become increasingly convenient to acquire, and its data volume keeps growing. Point cloud data consists of a large number of unordered three-dimensional points, each of which includes position information (X, Y, Z) and several kinds of attribute information (color, normal vector, etc.).
To facilitate the storage and transmission of point cloud data, point cloud compression technology has become a focus of attention. The prior art provides a scheme that selectively encodes one or more 3D point cloud blocks using inter-coding techniques (e.g., motion compensation) based on previously encoded/decoded frames. However, this scheme suffers from poor processing performance, such as poor encoding performance.
Disclosure of Invention
The embodiments of the present application provide an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, so as to improve processing performance.
In a first aspect, an embodiment of the present application provides an encoding method, which is applied to an encoding device, and includes:
clustering point cloud data to be processed of a current frame to obtain a plurality of sub-point clouds;
generating a generalized Laplace matrix for any target sub-point cloud in the plurality of sub-point clouds according to Euclidean distances between a plurality of point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points;
performing inter-frame prediction and graph Fourier residual transformation on the target sub-point cloud by using the generalized Laplace matrix;
quantizing and coding the transformed multiple sub-point clouds respectively to obtain a coded code stream;
and the corresponding point is positioned in a reference point cloud of the target sub-point cloud, and the reference point cloud is positioned in a reference frame of the current frame.
In a second aspect, an embodiment of the present application further provides a decoding method, which is applied to a decoding device, and the method includes:
acquiring a coded code stream;
performing an inverse graph Fourier transform based on Euclidean distance weights on the coded code stream to obtain a transform result;
obtaining a decoded code stream based on the transform result;
wherein the coded code stream is obtained by the encoding device by coding a result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds.
In a third aspect, an embodiment of the present application further provides an encoding apparatus, including:
the first acquisition module is used for clustering point cloud data to be processed of a current frame to obtain a plurality of sub-point clouds;
the first generation module is used for generating a generalized Laplace matrix for any target sub-point cloud in the plurality of sub-point clouds according to Euclidean distances between a plurality of point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points;
the first transformation module is used for performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud by using the generalized Laplace matrix;
the first coding module is used for quantizing and coding the transformed sub-point clouds respectively to obtain a coded code stream;
and the corresponding point is positioned in a reference point cloud of the target sub-point cloud, and the reference point cloud is positioned in a reference frame of the current frame.
In a fourth aspect, an embodiment of the present application further provides a decoding apparatus, including:
the first acquisition module is used for acquiring a coded code stream;
the first transformation module is used for performing an inverse graph Fourier transform based on Euclidean distance weights on the coded code stream to obtain a transform result;
the first decoding module is used for obtaining a decoded code stream based on the transform result;
wherein the coded code stream is obtained by the encoding device by coding a result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds.
In a fifth aspect, an embodiment of the present application further provides an electronic device, including: a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps in the encoding method or the decoding method as described above when executing the program.
In a sixth aspect, the present application further provides a readable storage medium, on which a program is stored, where the program, when executed by a processor, implements the steps in the encoding method or the decoding method as described above.
In the embodiment of the present application, the point cloud data to be processed of the current frame is clustered to obtain a plurality of sub-point clouds; for any target sub-point cloud, a generalized Laplacian matrix is generated according to the Euclidean distances between a plurality of point pairs in the target sub-point cloud and the Euclidean distances between target points in the target sub-point cloud and their corresponding points; and inter-frame prediction and graph Fourier residual transform are performed on the sub-point clouds using the generalized Laplacian matrix, so that a coded code stream is obtained based on the transform results. Because the generalized Laplacian matrix is generated from the Euclidean distances between points, the embodiment of the present application can exploit global correlation characteristics to express the correlation between points more fully, remove the similarity between point cloud data as far as possible, and thus improve encoding performance.
With the performance of the encoding end improved, the data to be decoded is correspondingly optimized, so the decoding efficiency and performance can be improved accordingly.
Drawings
Fig. 1 is a flowchart of an encoding method provided in an embodiment of the present application;
FIGS. 2 and 3 are schematic diagrams comparing the effect of the method of the embodiment of the present application and the method of the prior art;
fig. 4 is a flowchart of a decoding method provided in an embodiment of the present application;
fig. 5 is a block diagram of an encoding apparatus according to an embodiment of the present application;
fig. 6 is a block diagram of a decoding device according to an embodiment of the present application.
Detailed Description
In the embodiment of the present application, the term "and/or" describes an association relationship between associated objects and means that there may be three relationships; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
In the embodiments of the present application, the term "plurality" means two or more, and other terms are similar thereto.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart of an encoding method provided in an embodiment of the present application, and is applied to an encoding apparatus. As shown in fig. 1, the method comprises the following steps:
Step 101: clustering the point cloud data to be processed of the current frame to obtain a plurality of sub-point clouds.
In this step, the point cloud data to be processed is voxelized to obtain point cloud voxels, and the voxelized point cloud data is then clustered to obtain the plurality of sub-point clouds.
Specifically, a three-dimensional grid of a preset size is constructed, and the point cloud data to be processed is placed into the grid to obtain the coordinates of each point; each grid cell that contains points is taken as a point cloud voxel, yielding a plurality of point cloud voxels. In addition, the coordinate and attribute information of each point cloud voxel can be obtained, where the attribute information includes intensity, color, and the like. In the embodiment of the present application, the coordinate of a point cloud voxel is specifically the center of the points within that voxel, and the color information of a point cloud voxel is specifically the average of the color information of the points within that voxel. In practical applications, the point cloud data to be processed can be voxelized by means such as an octree to obtain the plurality of point cloud voxels.
The voxelized point cloud data is clustered using a spatially uniform division method; for example, K-means clustering may be employed.
In the embodiment of the present application, the point cloud data to be processed is divided into a plurality of sub-point clouds based on position information, so that the space is divided uniformly. Each sub-point cloud may be independently encoded.
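By way of illustration, the following Python sketch shows one way this step could be implemented: voxelization on a uniform grid (the voxel coordinate is the centroid of its points and the voxel color is the mean of its points' colors) followed by K-means clustering of the voxel coordinates. The grid size, cluster count, and function names are illustrative assumptions, not values from the patent.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def voxelize(points, colors, voxel_size=1.0):
    """Snap points into a uniform 3D grid; each occupied cell becomes one
    point cloud voxel whose coordinate is the centroid of its points and
    whose color is the mean of its points' colors."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    counts = np.bincount(inverse).astype(float)
    vox_xyz = np.stack(
        [np.bincount(inverse, weights=points[:, d]) / counts for d in range(3)],
        axis=1)
    vox_rgb = np.stack(
        [np.bincount(inverse, weights=colors[:, c]) / counts
         for c in range(colors.shape[1])], axis=1)
    return vox_xyz, vox_rgb


def cluster_sub_point_clouds(vox_xyz, n_clusters=16):
    """Spatially uniform division of the voxelized cloud: K-means on the
    voxel coordinates; each label identifies one sub-point cloud."""
    _, labels = kmeans2(vox_xyz, n_clusters, minit='++')
    return labels
```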
Step 102: for any target sub-point cloud among the plurality of sub-point clouds, generating a generalized Laplacian matrix according to the Euclidean distances between a plurality of point pairs in the target sub-point cloud and the Euclidean distances between target points in the target sub-point cloud and the corresponding points of the target points. The corresponding points are located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
Any sub-point cloud in the plurality of sub-point clouds can be used as the target sub-point cloud. In practical application, the processing mode of each target sub-point cloud is the same.
Specifically, the following contents may be included in this step:
and S1021, obtaining a weight matrix according to Euclidean distances among a plurality of point pairs in the target sub-point cloud.
The target sub-point cloud may include a plurality of points, and in this embodiment every two points constitute one point pair. The Euclidean distance between the two points of each point pair is calculated; for example, for the $i$-th point and the $j$-th point in the target sub-point cloud, the Euclidean distance between them is calculated. Specifically, for point $i = (x_1, x_2, \ldots, x_n)$ and point $j = (y_1, y_2, \ldots, y_n)$, the Euclidean distance $d(i,j)$ between them can in practical applications be calculated according to the following formula:

$$d(i,j) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2}$$

where $1 \le i \le M$ and $1 \le j \le M$; $i$, $j$, and $M$ are integers, and $M$ is the total number of points included in the target sub-point cloud.
Then, a weight is calculated for each point pair from its Euclidean distance, and the weight matrix $W$ is formed from these weights, where $W_{ij}$ denotes the weight corresponding to the edge from the $i$-th point to the $j$-th point in the target sub-point cloud, $d(i,j)$ denotes the Euclidean distance from the $i$-th point to the $j$-th point, and $\sigma$ is a non-zero constant serving as a tuning parameter.
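A minimal sketch of S1021, with one stated assumption: the source text does not reproduce the patent's weight formula, so the Gaussian kernel $W_{ij} = e^{-d(i,j)^2/\sigma^2}$ used below is an assumed, commonly used form; `sigma` plays the role of the tuning parameter σ.

```python
import numpy as np


def weight_matrix(points, sigma=1.0):
    """Pairwise Euclidean distances d(i, j) over one sub-point cloud and the
    edge weights W_ij; the Gaussian kernel is an ASSUMED form of the weight
    formula, with sigma as the tuning parameter."""
    diff = points[:, None, :] - points[None, :, :]   # shape (M, M, 3)
    dist = np.sqrt((diff ** 2).sum(axis=-1))         # d(i, j)
    W = np.exp(-dist ** 2 / sigma ** 2)              # assumed Gaussian kernel
    np.fill_diagonal(W, 0.0)                         # no self-loop weights
    return W
```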
S1022: obtaining a Laplacian matrix according to the degree matrix and the weight matrix.
In this step, the difference between the degree matrix and the weight matrix is used as the Laplacian matrix. Specifically, $L = D - W$, where $L$ denotes the Laplacian matrix, $D$ denotes the degree matrix, and $W$ denotes the weight matrix.
The diagonal elements of the degree matrix are $d_i = \sum_j W_{ij}$ and its other elements are 0, where $d_i$ denotes the $i$-th diagonal element of the degree matrix and $W_{ij}$ denotes the weight corresponding to the edge from the $i$-th point to the $j$-th point in the target sub-point cloud.
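The degree matrix and the Laplacian $L = D - W$ translate directly into a few lines; a sketch:

```python
import numpy as np


def laplacian(W):
    """L = D - W, where the degree matrix D is diagonal with d_i = sum_j W_ij."""
    D = np.diag(W.sum(axis=1))
    return D - W
```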
S1023: generating a diagonal matrix.
The diagonal matrix is generated according to the Euclidean distances between target points in the target sub-point cloud and the corresponding points of the target points.
Specifically, the reference point cloud of the target sub-point cloud may first be determined in the reference frame, for example by performing motion estimation in the reference frame to find a matching reference point cloud; the points of the target sub-point cloud and the reference point cloud are in one-to-one correspondence. For example, using an iterative closest point algorithm, the reference point cloud of the target sub-point cloud may be determined in the reference frame based on Euclidean distances. Then, the diagonal matrix $D_w$ is generated based on the Euclidean distance between each point in the target sub-point cloud and its corresponding point in the reference point cloud: the $i$-th diagonal element of the diagonal matrix is the reciprocal of the Euclidean distance between the $i$-th point and point $p$, and the other elements are 0, where point $p$ is the corresponding point of the $i$-th point in the reference point cloud.
S1024: obtaining the generalized Laplacian matrix according to the diagonal matrix and the Laplacian matrix.
In this step, the sum of the diagonal matrix and the Laplacian matrix is used as the generalized Laplacian matrix. Specifically, $L_g = L + D_w$, where $L_g$ denotes the generalized Laplacian matrix, $L$ denotes the Laplacian matrix, and $D_w$ denotes the diagonal matrix.
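A sketch of S1023 and S1024 together: build the diagonal matrix $D_w$ from the reciprocal point-to-corresponding-point distances and form $L_g = L + D_w$. A plain nearest-neighbor query stands in here for the iterative closest point correspondence; that substitution, like the function names, is an assumption made for brevity.

```python
import numpy as np
from scipy.spatial import cKDTree


def generalized_laplacian(L, points_cur, points_ref):
    """Lg = L + Dw, where the i-th diagonal entry of Dw is the reciprocal of
    the Euclidean distance between point i and its corresponding point in the
    reference point cloud (nearest neighbor used as the correspondence)."""
    dist, _ = cKDTree(points_ref).query(points_cur)
    Dw = np.diag(1.0 / np.maximum(dist, 1e-12))  # guard against zero distance
    return L + Dw, Dw
```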
Step 103: performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud by using the generalized Laplacian matrix.
In this step, the inter-frame prediction and graph Fourier residual transform may be understood as being based on Euclidean distance weights, and may include the following:
S1031: obtaining an attribute prediction value of the target attribute of the current frame from the reference frame.
In the embodiment of the present application, an inter-frame prediction method is adopted, and the attribute values of the current frame are predicted using the reference frame. The attributes may include color, intensity, normal vector, and the like, and the target attribute may be any of these attributes.
Specifically, in this step, the attribute prediction value $\hat{x}_t$ of the target attribute of the current frame is obtained from the reference frame using the generalized Laplacian matrix, where $\hat{x}_t$ denotes the attribute prediction value of the target attribute of the current frame from the reference frame, $x_{t-1}$ denotes the attribute value of the target attribute of the reference frame, and $L_g$ denotes the generalized Laplacian matrix.
S1032: generating a residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame.
Specifically, the difference between the attribute value of the target attribute of the current frame and the attribute prediction value may be used as the residual:

$$\delta = x_t - \hat{x}_t$$

where $\delta$ denotes the residual of the target attribute of the current frame, $\hat{x}_t$ denotes the attribute prediction value of the target attribute of the current frame from the reference frame, and $x_t$ denotes the attribute value of the target attribute of the current frame.
Obtaining the residual through inter-frame prediction captures the difference between the two frames as fully as possible; since the parts shared by the two frames need no additional processing, computing the residual saves code rate.
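A sketch of S1031 and S1032 under one explicit assumption: the source text does not reproduce the patent's prediction formula, so the predictor below uses $\hat{x}_t = L_g^{-1} D_w x_{t-1}$, a natural choice for a generalized Laplacian of the form $L_g = L + D_w$; the residual $\delta = x_t - \hat{x}_t$ follows the text directly.

```python
import numpy as np


def predict_and_residual(Lg, Dw, x_ref, x_cur):
    """Inter-frame prediction of the current frame's attribute from the
    reference frame, followed by the residual delta = x_t - x_hat.
    ASSUMPTION: x_hat = Lg^{-1} Dw x_ref; the patent's exact prediction
    formula is not reproduced in the source text."""
    x_hat = np.linalg.solve(Lg, Dw @ x_ref)  # Lg is positive definite here
    delta = x_cur - x_hat
    return x_hat, delta
```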
S1033: transforming the residual of the target attribute of the current frame based on the generalized Laplacian matrix.
In this step, a transformation matrix is obtained from the generalized Laplacian matrix, and the residual of the target attribute of the current frame is then transformed using the transformation matrix.
Specifically, the transformation matrix $\Phi$ is obtained by solving the eigendecomposition of the generalized Laplacian matrix,

$$L_g = \Phi^{\mathsf{T}} \Lambda \Phi,$$

and the transform result is obtained as

$$\theta = \Phi\,\delta,$$

where $\theta$ denotes the transform result, $\Phi$ denotes the transformation matrix, and $\delta$ denotes the residual of the target attribute of the current frame.
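A sketch of S1033: the transformation matrix comes from the eigendecomposition of the generalized Laplacian, and the residual is projected onto the eigenvector basis. Note that numpy's `eigh` returns eigenvectors as columns, so `Phi.T` plays the role of the transformation matrix $\Phi$ in the text.

```python
import numpy as np


def gft(Lg, delta):
    """Graph Fourier residual transform: eigendecompose the (symmetric)
    generalized Laplacian and project the residual onto its eigenvectors."""
    eigvals, Phi = np.linalg.eigh(Lg)  # columns of Phi are eigenvectors
    theta = Phi.T @ delta              # transform result
    return theta, Phi
```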
In the embodiment of the present application, the concept of a generalized graph Fourier transform is introduced on the basis of the conventional graph Fourier transform, and prediction and residual transformation are performed on the inter-frame attributes of the point cloud data, which further removes redundancy between data and improves coding efficiency.
The other sub-point clouds are processed in the same way as the target sub-point cloud.
Step 104: quantizing and coding the transformed plurality of sub-point clouds respectively to obtain a coded code stream.
In this step, the transformed sub-point clouds are uniformly quantized and arithmetically coded to obtain the coded code stream.
Taking the target attribute as color as an example, the color can be decomposed into three component vectors (YUV or RGB). Taking the Y component as an example, the attribute value of the current frame is predicted according to the procedure in S1031, a residual is generated according to S1032, and the residual is then transformed using S1033. The transformed Y component is uniformly quantized and arithmetically coded to obtain the code stream; the other components are processed in the same way as the Y component.
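Putting step 104 together with the helpers sketched above: per-component prediction, residual, transform, and uniform quantization. The arithmetic-coding stage is omitted, and the quantization step size is an illustrative parameter.

```python
import numpy as np


def quantize(theta, step=1.0):
    """Uniform quantization of the transform coefficients; the integers would
    then be entropy-coded (arithmetic coding, omitted here)."""
    return np.round(theta / step).astype(np.int32)


def encode_attributes(Lg, Dw, attrs_ref, attrs_cur, step=1.0):
    """Run predict -> residual -> transform -> quantize on each attribute
    component (e.g. Y, U, V), mirroring the per-component example above."""
    coeffs = []
    for c in range(attrs_cur.shape[1]):
        _, delta = predict_and_residual(Lg, Dw, attrs_ref[:, c], attrs_cur[:, c])
        theta, _ = gft(Lg, delta)
        coeffs.append(quantize(theta, step))
    return coeffs
```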
In the embodiment of the present application, because the generalized Laplacian matrix is generated using the Euclidean distances between points, global correlation characteristics can be exploited to express the correlation between points more fully, so that the similarity between point cloud data is removed as far as possible and encoding performance is improved.
In practical applications, tests were performed on an actual point cloud sequence. The tests were conducted on 16 frames of dynamic point clouds, and fig. 2 shows a comparison between the performance of the method of the embodiment of the present application and that of the RAHT (Region-Adaptive Hierarchical Transform) and NWGFT (main-direction-weighted graph Fourier transform) methods. To quantify the gain, a comparison of the data with the RAHT method was performed in the experiment, as shown in fig. 3.
As can be seen from fig. 2 and 3, the embodiment of the present application introduces the concept of a generalized graph Fourier transform on the basis of the conventional graph Fourier transform and performs prediction and residual transformation on the inter-frame attributes of the point cloud, which further removes redundancy between data and improves coding efficiency. The experimental results show that the method improves both subjective and objective performance and can be applied to practical point cloud compression, transmission, and storage systems.
Referring to fig. 4, fig. 4 is a flowchart of a decoding method provided in an embodiment of the present application, which is applied to a decoding device. As shown in fig. 4, the method comprises the following steps:
Step 401: acquiring a coded code stream.
The coded code stream is obtained by coding the result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
Step 402: performing an inverse graph Fourier transform based on Euclidean distance weights on the coded code stream to obtain a transform result.
At the decoding end, the coded code stream is entropy-decoded and then inverse-quantized. Afterwards, the inverse graph Fourier transform based on Euclidean distance weights is performed on the inverse-quantized coded code stream to obtain the transform result.
Specifically, the inverse graph Fourier transform based on Euclidean distance weights may be applied to the inverse-quantized coded code stream, where $\hat{\delta}$ denotes the inverse-transformed residual value, $\Phi$ denotes the transformation matrix, $\hat{\theta}$ denotes the quantized residual value of the target attribute of the current frame, and $\varepsilon$ denotes the inverse quantization coefficient.
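A decoder-side sketch matching the encoder sketches above: dequantize with the step size (playing the role of the inverse quantization coefficient ε) and invert the graph Fourier transform. Treating the inverse as multiplication by the eigenvector matrix is an assumption consistent with the forward transform sketched earlier, since the source text does not reproduce the patent's inverse-transform formula.

```python
import numpy as np


def inverse_gft(Phi, theta_q, step=1.0):
    """Dequantize the coefficients and apply the inverse graph Fourier
    transform; Phi holds the eigenvectors of Lg as columns, so this inverts
    theta = Phi.T @ delta."""
    return Phi @ (step * theta_q.astype(np.float64))
```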
Step 403: obtaining a decoded code stream based on the transform result.
In the embodiment of the present application, because the generalized Laplacian matrix is generated using the Euclidean distances between points, global correlation characteristics can be exploited to express the correlation between points more fully, so that the similarity between point cloud data is removed as far as possible and encoding performance is improved. With the performance of the encoding end improved, the data to be decoded is correspondingly optimized, so the decoding efficiency and performance improve accordingly.
The embodiment of the application also provides a coding device. Referring to fig. 5, fig. 5 is a structural diagram of an encoding apparatus according to an embodiment of the present application. Because the principle of the coding device for solving the problem is similar to the coding method in the embodiment of the present application, the implementation of the coding device can refer to the implementation of the method, and repeated details are not repeated.
As shown in fig. 5, the encoding apparatus 500 includes:
a first obtaining module 501, configured to cluster the point cloud data to be processed of a current frame to obtain a plurality of sub-point clouds; a first generating module 502, configured to generate a generalized Laplacian matrix for any target sub-point cloud of the plurality of sub-point clouds according to the Euclidean distances between a plurality of point pairs in the target sub-point cloud and the Euclidean distances between target points in the target sub-point cloud and the corresponding points of the target points; a first transformation module 503, configured to perform inter-frame prediction and graph Fourier residual transform on the target sub-point cloud by using the generalized Laplacian matrix; a first encoding module 504, configured to quantize and encode the transformed sub-point clouds respectively to obtain a coded code stream; wherein the corresponding points are located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
Optionally, the first obtaining module includes: a first processing submodule, configured to voxelize the point cloud data to be processed to obtain point cloud voxels; and a first obtaining submodule, configured to cluster the voxelized point cloud data to obtain the plurality of sub-point clouds.
Optionally, the first generating module includes:
the first obtaining submodule is used for obtaining a weight matrix according to Euclidean distances among a plurality of point pairs in the target sub-point cloud; the second obtaining submodule is used for obtaining a Laplace matrix according to the degree matrix and the weight matrix; a first generation submodule for generating a diagonal matrix; and the second generation submodule is used for obtaining the generalized Laplace matrix according to the diagonal matrix and the Laplace matrix.
Specifically, the second obtaining sub-module is configured to use a difference between the degree matrix and the weight matrix as a laplacian matrix; the second generation submodule is configured to use a sum of the diagonal matrix and the laplacian matrix as the generalized laplacian matrix.
The diagonal elements of the degree matrix are $d_i = \sum_j W_{ij}$, where $d_i$ denotes the $i$-th diagonal element of the degree matrix and $W_{ij}$ denotes the weight corresponding to the edge from the $i$-th point to the $j$-th point in the target sub-point cloud; $1 \le i \le M$ and $1 \le j \le M$, where $i$, $j$, and $M$ are integers and $M$ is the total number of points included in the target sub-point cloud.
The diagonal matrix is generated according to the Euclidean distances between target points in the target sub-point cloud and the corresponding points of the target points.
Optionally, the first obtaining sub-module includes:
the first calculating unit is used for calculating the Euclidean distance between the ith point and the jth point in the target sub-point cloud;
a first obtaining unit, configured to calculate the weights from the Euclidean distances and form the weight matrix from the weights,
where $W_{ij}$ denotes the weight corresponding to the edge from the $i$-th point to the $j$-th point in the target sub-point cloud, $d(i,j)$ denotes the Euclidean distance from the $i$-th point to the $j$-th point, and $\sigma$ is a non-zero constant serving as a tuning parameter; $1 \le i \le M$ and $1 \le j \le M$, where $i$, $j$, and $M$ are integers and $M$ is the total number of points included in the target sub-point cloud.
Optionally, the first generation submodule includes:
a first determination unit, configured to determine a reference point cloud of the target sub-point cloud in the reference frame; and a first generating unit, configured to generate the diagonal matrix based on the Euclidean distance between each point in the target sub-point cloud and its corresponding point in the reference point cloud, wherein the $i$-th diagonal element of the diagonal matrix is the reciprocal of the Euclidean distance between the $i$-th point and point $p$, and point $p$ is the corresponding point of the $i$-th point in the reference point cloud.
Optionally, the first determination unit is configured to determine the reference point cloud of the target sub-point cloud in the reference frame by using an iterative closest point algorithm.
Optionally, the first transformation module includes:
the first obtaining submodule is used for obtaining an attribute predicted value of the target attribute of the current frame from the reference frame; the first generation submodule is used for generating a residual error of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the reference frame on the target attribute of the current frame; and the first transformation submodule is used for transforming the residual error of the target attribute of the current frame based on the generalized Laplace matrix.
Optionally, the first obtaining sub-module is configured to obtain the attribute prediction value of the target attribute of the current frame from the reference frame, where $\hat{x}_t$ denotes the attribute prediction value of the target attribute of the current frame from the reference frame, $x_{t-1}$ denotes the attribute value of the target attribute of the reference frame, and $L_g$ denotes the generalized Laplacian matrix.
Optionally, the first generating sub-module is configured to use a difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame with respect to the reference frame as the residual error.
Optionally, the first transformation submodule includes:
a first obtaining unit, configured to obtain a transformation matrix by using the generalized laplacian matrix; and the first transformation unit is used for transforming the residual error of the target attribute of the current frame by using the transformation matrix.
Optionally, the first obtaining unit is configured to solve the eigendecomposition of the generalized Laplacian matrix to obtain the transformation matrix.
Optionally, the first transforming unit is configured to obtain the transform result as $\theta = \Phi\,\delta$, where $\theta$ denotes the transform result, $\Phi$ denotes the transformation matrix, and $\delta$ denotes the residual of the target attribute of the current frame.
The apparatus provided in the embodiment of the present application may implement the method embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
The embodiment of the application also provides a decoding device. Referring to fig. 6, fig. 6 is a structural diagram of a decoding apparatus according to an embodiment of the present application. Because the principle of the decoding apparatus for solving the problem is similar to the decoding method in the embodiment of the present application, the implementation of the decoding apparatus can refer to the implementation of the method, and repeated details are not repeated.
As shown in fig. 6, the decoding apparatus 600 includes:
a first obtaining module 601, configured to obtain a coded code stream; a first transform module 602, configured to perform an inverse graph Fourier transform based on Euclidean distance weights on the coded code stream to obtain a transform result; a first decoding module 603, configured to obtain a decoded code stream based on the transform result; wherein the coded code stream is obtained by the encoding device by coding the result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds.
Optionally, the first transform module includes: a first processing submodule, configured to perform inverse quantization on the coded code stream; and a first transform submodule, configured to perform the inverse graph Fourier transform based on Euclidean distance weights on the inverse-quantized coded code stream to obtain the transform result.
Optionally, the first transform submodule is configured to perform the inverse graph Fourier transform based on Euclidean distance weights on the inverse-quantized coded code stream, where $\hat{\delta}$ denotes the inverse-transformed residual value, $\Phi$ denotes the transformation matrix, $\hat{\theta}$ denotes the quantized residual value of the target attribute of the current frame, and $\varepsilon$ denotes the inverse quantization coefficient.
The apparatus provided in the embodiment of the present application may implement the method embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
It should be noted that the division of the unit in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation. In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a processor readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
An embodiment of the present application further provides an electronic device, including: a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps in the encoding method or the decoding method as described above when executing the program.
The embodiment of the present application further provides a readable storage medium, where a program is stored on the readable storage medium, and when the program is executed by a processor, the program implements each process of the above encoding or decoding method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the detailed description is omitted here. The readable storage medium may be any available medium or data storage device that can be accessed by a processor, including but not limited to magnetic memory (e.g., floppy disk, hard disk, magnetic tape, magneto-optical disk (MO), etc.), optical memory (e.g., CD, DVD, BD, HVD, etc.), and semiconductor memory (e.g., ROM, EPROM, EEPROM, nonvolatile memory (NAND FLASH), Solid State Disk (SSD)), etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. With such an understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the methods according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (20)
1. An encoding method applied to an encoding device, the method comprising:
clustering point cloud data to be processed of a current frame to obtain a plurality of sub-point clouds;
generating a generalized Laplace matrix for any target sub-point cloud in the plurality of sub-point clouds according to Euclidean distances between a plurality of point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points;
performing inter-frame prediction and graph Fourier residual transformation on the target sub-point cloud by using the generalized Laplace matrix;
quantizing and coding the transformed multiple sub-point clouds respectively to obtain a coded code stream;
and the corresponding point is positioned in a reference point cloud of the target sub-point cloud, and the reference point cloud is positioned in a reference frame of the current frame.
2. The method of claim 1, wherein clustering point cloud data to be processed of a current frame to obtain a plurality of sub-point clouds comprises:
voxelizing the point cloud data to be processed to obtain point cloud voxels;
and clustering the voxelized point cloud data to obtain the plurality of sub-point clouds.
3. The method of claim 1, wherein generating a generalized Laplace matrix according to Euclidean distances between a plurality of point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points comprises:
obtaining a weight matrix according to Euclidean distances between a plurality of point pairs in the target sub-point cloud;
obtaining a Laplace matrix according to the degree matrix and the weight matrix;
generating a diagonal matrix;
and obtaining the generalized Laplace matrix according to the diagonal matrix and the Laplace matrix.
4. The method of claim 3,
obtaining a Laplace matrix according to the degree matrix and the weight matrix, wherein the obtaining of the Laplace matrix comprises the following steps:
using the difference between the degree matrix and the weight matrix as a Laplace matrix;
obtaining the generalized Laplace matrix according to the diagonal matrix and the Laplace matrix, including:
using a sum of the diagonal matrix and the laplacian matrix as the generalized laplacian matrix;
wherein the diagonal elements of the degree matrix are $d_i = \sum_j W_{ij}$, where $d_i$ denotes the $i$-th diagonal element of the degree matrix and $W_{ij}$ denotes the weight corresponding to the edge from the $i$-th point to the $j$-th point in the target sub-point cloud; $1 \le i \le M$ and $1 \le j \le M$, where $i$, $j$, and $M$ are integers and $M$ is the total number of points included in the target sub-point cloud;
and the diagonal matrix is generated according to the Euclidean distances between target points in the target sub-point cloud and the corresponding points of the target points.
5. The method of claim 3, wherein obtaining a weight matrix from Euclidean distances between pairs of points in the target sub-point cloud comprises:
calculating the Euclidean distance between the ith point and the jth point in the target sub-point cloud;
calculating weights according to the Euclidean distances, and forming the weight matrix from the weights;
wherein $W_{ij}$ denotes the weight corresponding to the edge from the $i$-th point to the $j$-th point in the target sub-point cloud, $d(i,j)$ denotes the Euclidean distance from the $i$-th point to the $j$-th point, and $\sigma$ is a non-zero constant serving as a tuning parameter; $1 \le i \le M$ and $1 \le j \le M$, where $i$, $j$, and $M$ are integers and $M$ is the total number of points included in the target sub-point cloud.
6. The method of claim 3, wherein generating the diagonal matrix comprises:
determining a reference point cloud of the target sub-point cloud in the reference frame;
generating the diagonal matrix based on Euclidean distances between each point in the target sub-point cloud and the corresponding point of each point in the reference point cloud;
wherein the $i$-th diagonal element of the diagonal matrix is the reciprocal of the Euclidean distance between the $i$-th point and point $p$, and point $p$ is the corresponding point of the $i$-th point in the reference point cloud.
7. The method of claim 6, wherein determining the reference point cloud of the target sub-point cloud in the reference frame comprises:
determining the reference point cloud of the target sub-point cloud in the reference frame by using an iterative closest point algorithm.
8. The method of claim 1, wherein using the generalized Laplace matrix to respectively perform inter-frame prediction and graph Fourier residual transform on the plurality of sub-point clouds comprises:
obtaining an attribute prediction value of the target attribute of the current frame from the reference frame;
generating a residual error of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the reference frame to the target attribute of the current frame;
transforming a residual of a target attribute of the current frame based on the generalized Laplace matrix.
9. The method of claim 8, wherein obtaining the attribute prediction value of the target attribute of the current frame from the reference frame comprises:
obtaining the attribute prediction value of the target attribute of the current frame from the reference frame according to the attribute value of the target attribute of the reference frame and the generalized Laplace matrix.
10. The method according to claim 8, wherein generating the residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame comprises:
using the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame as the residual.
11. The method of claim 8, wherein transforming the residual of the target property of the current frame based on the generalized Laplacian matrix comprises:
obtaining a transformation matrix by utilizing the generalized Laplace matrix;
and transforming the residual error of the target attribute of the current frame by using the transformation matrix.
13. The method of claim 11, wherein transforming the residual of the target attribute of the current frame using the transformation matrix comprises:
obtaining the transform result by applying the transformation matrix to the residual of the target attribute of the current frame.
14. A decoding method applied to a decoding device, the method comprising:
acquiring a coded code stream;
performing an inverse graph Fourier transform based on Euclidean distance weights on the coded code stream to obtain a transform result;
obtaining a decoded code stream based on the transform result;
wherein the coded code stream is obtained by coding a result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds by using a generalized Laplacian matrix.
15. The method according to claim 14, wherein performing the inverse graph Fourier transform based on Euclidean distance weights on the coded code stream to obtain a transform result comprises:
performing inverse quantization on the coded code stream;
and performing the inverse graph Fourier transform based on Euclidean distance weights on the inverse-quantized coded code stream to obtain the transform result.
16. The method according to claim 15, wherein performing the inverse graph Fourier transform based on Euclidean distance weights on the inverse-quantized coded code stream comprises:
performing the inverse graph Fourier transform based on Euclidean distance weights on the inverse-quantized coded code stream by using the transformation matrix, the quantized residual value of the target attribute of the current frame, and the inverse quantization coefficient.
17. An encoding apparatus, comprising:
the first acquisition module is used for clustering point cloud data to be processed of a current frame to obtain a plurality of sub-point clouds;
the first generation module is used for generating a generalized Laplace matrix for any target sub-point cloud in the plurality of sub-point clouds according to Euclidean distances between a plurality of point pairs in the target sub-point cloud and Euclidean distances between target points in the target sub-point cloud and corresponding points of the target points;
the first transformation module is used for performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud by using the generalized Laplace matrix;
the first coding module is used for quantizing and coding the transformed sub-point clouds respectively to obtain a coded code stream;
and the corresponding point is positioned in a reference point cloud of the target sub-point cloud, and the reference point cloud is positioned in a reference frame of the current frame.
18. A decoding apparatus, comprising:
the first acquisition module is used for acquiring a coded code stream;
the first transformation module is used for performing an inverse graph Fourier transform based on Euclidean distance weights on the coded code stream to obtain a transform result;
the first decoding module is used for obtaining a decoded code stream based on the transform result;
wherein the coded code stream is obtained by the encoding device by coding a result of performing inter-frame prediction and graph Fourier residual transform on sub-point clouds.
19. An electronic device, comprising: a memory, a processor, and a program stored on the memory and executable on the processor; characterized in that
the processor is configured to read the program in the memory to implement the steps in the encoding method according to any one of claims 1 to 13, or the steps in the decoding method according to any one of claims 14 to 16.
20. A readable storage medium for storing a program, wherein the program, when executed by a processor, implements the steps in the encoding method of any one of claims 1 to 13; or implementing the steps in the decoding method of any of claims 14 to 16.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111160289.XA CN113766229B (en) | 2021-09-30 | 2021-09-30 | Encoding method, decoding method, device, equipment and readable storage medium |
PCT/CN2022/123245 WO2023051783A1 (en) | 2021-09-30 | 2022-09-30 | Encoding method, decoding method, apparatus, device, and readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111160289.XA CN113766229B (en) | 2021-09-30 | 2021-09-30 | Encoding method, decoding method, device, equipment and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113766229A true CN113766229A (en) | 2021-12-07 |
CN113766229B CN113766229B (en) | 2023-04-28 |
Family
ID=78798550
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111160289.XA Active CN113766229B (en) | 2021-09-30 | 2021-09-30 | Encoding method, decoding method, device, equipment and readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113766229B (en) |
WO (1) | WO2023051783A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114785998A (en) * | 2022-06-20 | 2022-07-22 | 北京大学深圳研究生院 | Point cloud compression method and device, electronic equipment and storage medium |
WO2023051783A1 (en) * | 2021-09-30 | 2023-04-06 | 咪咕文化科技有限公司 | Encoding method, decoding method, apparatus, device, and readable storage medium |
WO2023173238A1 (en) * | 2022-03-12 | 2023-09-21 | Oppo广东移动通信有限公司 | Encoding method, decoding method, code stream, encoder, decoder, and storage medium |
CN116797625A (en) * | 2023-07-20 | 2023-09-22 | 无锡埃姆维工业控制设备有限公司 | Monocular three-dimensional workpiece pose estimation method |
WO2023245981A1 (en) * | 2022-06-20 | 2023-12-28 | 北京大学深圳研究生院 | Point cloud compression method and apparatus, and electronic device and storage medium |
WO2024077911A1 (en) * | 2022-10-13 | 2024-04-18 | Beijing Bytedance Network Technology Co., Ltd. | Method, apparatus, and medium for point cloud coding |
CN118433395A (en) * | 2024-07-02 | 2024-08-02 | 北京数原数字化城市研究中心 | Encoding method and device and electronic equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108171761A (en) * | 2017-12-13 | 2018-06-15 | 北京大学 | A kind of point cloud inner frame coding method and device that transformation is schemed based on Fourier |
WO2020197086A1 (en) * | 2019-03-25 | 2020-10-01 | 엘지전자 주식회사 | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
CN112385238A (en) * | 2019-07-10 | 2021-02-19 | 深圳市大疆创新科技有限公司 | Data encoding method, data decoding method, equipment and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110418135B (en) * | 2019-08-05 | 2022-05-27 | 北京大学深圳研究生院 | Point cloud intra-frame prediction method and device based on neighbor weight optimization |
CN110572655B (en) * | 2019-09-30 | 2023-01-10 | 北京大学深圳研究生院 | Method and equipment for encoding and decoding point cloud attribute based on neighbor weight parameter selection and transmission |
CN113766229B (en) * | 2021-09-30 | 2023-04-28 | 咪咕文化科技有限公司 | Encoding method, decoding method, device, equipment and readable storage medium |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108171761A (en) * | 2017-12-13 | 2018-06-15 | 北京大学 | A kind of point cloud inner frame coding method and device that transformation is schemed based on Fourier |
WO2020197086A1 (en) * | 2019-03-25 | 2020-10-01 | 엘지전자 주식회사 | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
CN112385238A (en) * | 2019-07-10 | 2021-02-19 | 深圳市大疆创新科技有限公司 | Data encoding method, data decoding method, equipment and storage medium |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023051783A1 (en) * | 2021-09-30 | 2023-04-06 | 咪咕文化科技有限公司 | Encoding method, decoding method, apparatus, device, and readable storage medium |
WO2023173238A1 (en) * | 2022-03-12 | 2023-09-21 | Oppo广东移动通信有限公司 | Encoding method, decoding method, code stream, encoder, decoder, and storage medium |
CN114785998A (en) * | 2022-06-20 | 2022-07-22 | 北京大学深圳研究生院 | Point cloud compression method and device, electronic equipment and storage medium |
WO2023245981A1 (en) * | 2022-06-20 | 2023-12-28 | 北京大学深圳研究生院 | Point cloud compression method and apparatus, and electronic device and storage medium |
WO2024077911A1 (en) * | 2022-10-13 | 2024-04-18 | Beijing Bytedance Network Technology Co., Ltd. | Method, apparatus, and medium for point cloud coding |
CN116797625A (en) * | 2023-07-20 | 2023-09-22 | 无锡埃姆维工业控制设备有限公司 | Monocular three-dimensional workpiece pose estimation method |
CN116797625B (en) * | 2023-07-20 | 2024-04-19 | 无锡埃姆维工业控制设备有限公司 | Monocular three-dimensional workpiece pose estimation method |
CN118433395A (en) * | 2024-07-02 | 2024-08-02 | 北京数原数字化城市研究中心 | Encoding method and device and electronic equipment |
CN118433395B (en) * | 2024-07-02 | 2024-09-03 | 北京数原数字化城市研究中心 | Encoding method and device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN113766229B (en) | 2023-04-28 |
WO2023051783A1 (en) | 2023-04-06 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |