WO2024216412A1 - 一种编码方法、编码器以及存储介质 - Google Patents
一种编码方法、编码器以及存储介质 Download PDFInfo
- Publication number
- WO2024216412A1 WO2024216412A1 PCT/CN2023/088555 CN2023088555W WO2024216412A1 WO 2024216412 A1 WO2024216412 A1 WO 2024216412A1 CN 2023088555 W CN2023088555 W CN 2023088555W WO 2024216412 A1 WO2024216412 A1 WO 2024216412A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- search
- pixel
- motion vector
- point
- macro
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 121
- 239000013598 vector Substances 0.000 claims abstract description 232
- 230000008569 process Effects 0.000 claims description 53
- 230000015654 memory Effects 0.000 claims description 28
- 238000004590 computer program Methods 0.000 claims description 13
- 229910003460 diamond Inorganic materials 0.000 claims description 8
- 239000010432 diamond Substances 0.000 claims description 8
- 238000012937 correction Methods 0.000 abstract description 5
- 238000010586 diagram Methods 0.000 description 28
- 238000001914 filtration Methods 0.000 description 10
- 238000010845 search algorithm Methods 0.000 description 9
- 238000013139 quantization Methods 0.000 description 8
- 230000006835 compression Effects 0.000 description 7
- 238000007906 compression Methods 0.000 description 7
- 238000003384 imaging method Methods 0.000 description 7
- 238000012545 processing Methods 0.000 description 7
- 230000000694 effects Effects 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- 238000004891 communication Methods 0.000 description 5
- 230000006870 function Effects 0.000 description 5
- 230000003044 adaptive effect Effects 0.000 description 4
- 230000001360 synchronised effect Effects 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 238000003491 array Methods 0.000 description 2
- 230000005540 biological transmission Effects 0.000 description 2
- 230000000007 visual effect Effects 0.000 description 2
- 230000003993 interaction Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
Definitions
- the embodiments of the present application relate to the field of video coding and decoding technology, and specifically to a coding method, an encoder, and a storage medium.
- the imaging model of the light field camera adds a set of microlens arrays in front of the imaging plane, so that the light of the same point on the object plane can be captured by multiple microlenses at the same time, which is equivalent to shooting the same point from multiple angles at the same time.
- the embodiments of the present application provide a coding method, an encoder, and a storage medium, which can improve the motion estimation effect of light field images and improve the robustness of the encoder.
- an embodiment of the present application provides an encoding method, which is applied to an encoder, and the method includes:
- A first motion vector of a current block is determined;
- the pixel point pointed to by the first motion vector is used as a first starting point, a macro-pixel level search is performed with a macro-pixel level search step length, and a second motion vector of the current block is determined;
- the pixel point pointed to by the second motion vector is used as a second starting point, a regular pixel-level search is performed with a regular pixel-level search step length, and a third motion vector of the current block is determined;
- An optimal motion vector for the current block is determined according to the third motion vector.
- an embodiment of the present application provides an encoder, the encoder comprising a determination unit and a search unit; wherein,
- the determining unit is configured to determine a first motion vector of the current block
- the search unit is configured to use the pixel point pointed to by the first motion vector as a first starting point, perform a macro-pixel level search with a macro-pixel level search step length, and determine a second motion vector of the current block;
- the search unit is further configured to use the pixel point pointed to by the second motion vector as a second starting point, perform a regular pixel-level search with a regular pixel-level search step length, and determine a third motion vector of the current block;
- the determining unit is further configured to determine an optimal motion vector for the current block according to the third motion vector.
- an encoder comprising a first memory and a first processor; wherein:
- a first memory for storing a computer program that can be run on the first processor
- the first processor is configured to execute the method according to the first aspect when running a computer program.
- an embodiment of the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed, it implements the method described in the first aspect.
- the embodiment of the present application provides a coding method, an encoder and a storage medium, the method comprising: determining a first motion vector of a current block; using the pixel point pointed to by the first motion vector as a first starting point, performing a macro-pixel-level search with a macro-pixel-level search step to determine the second motion vector of the current block; using the pixel point pointed to by the second motion vector as a second starting point, performing a conventional pixel-level search with a conventional pixel-level search step to determine the third motion vector of the current block; and determining the best motion vector of the current block based on the third motion vector.
- a conventional pixel-level fine search is added on the basis of the macro-pixel-level search, and a pixel-level offset correction is performed on the position of the candidate reference block searched at the macro-pixel level to increase its correlation with the current block to a greater extent, thereby further optimizing the motion estimation effect of the encoded light field image and improving the robustness of the encoder.
- FIG1 is a schematic diagram of a light field image according to an embodiment of the present application.
- FIG2 is a schematic diagram of an application scenario of an embodiment of the present application.
- FIG3 is a schematic block diagram of a composition of an encoder provided in an embodiment of the present application.
- FIG4 is a schematic block diagram of a decoder provided in an embodiment of the present application.
- FIG5 is a schematic diagram of a network architecture of a coding and decoding system provided in an embodiment of the present application.
- FIG6 is a schematic diagram of a flow chart of an encoding method provided in an embodiment of the present application.
- FIG7 is a schematic diagram of a diamond template in a macro-pixel level fast search process in an embodiment of the present application.
- FIG8 is a schematic diagram of a square template in a macro-pixel level fast search process in an embodiment of the present application.
- FIG9 is a first schematic diagram of a macro-pixel level fine search process in an embodiment of the present application.
- FIG10 is a second schematic diagram of a macro-pixel level fine search process in an embodiment of the present application.
- FIG11 is a schematic diagram of a conventional pixel-level search process in an embodiment of the present application.
- FIG12 is a schematic diagram of the structure of an encoder provided in an embodiment of the present application.
- FIG13 is a schematic diagram of a specific hardware structure of an encoder provided in an embodiment of the present application.
- FIG. 14 is a schematic diagram of the composition structure of a coding and decoding system provided in an embodiment of the present application.
- the terms "first/second/third" involved in the embodiments of the present application are only used to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that "first/second/third" can be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here.
- a first image component, a second image component, and a third image component are generally used to represent a coding block (CB).
- the three image components are a brightness component, a blue chrominance component, and a red chrominance component.
- the brightness component is usually represented by the symbol Y
- the blue chrominance component is usually represented by the symbol Cb or U
- the red chrominance component is usually represented by the symbol Cr or V; in this way, the video image can be represented in YCbCr format or YUV format.
- MPEG Moving Picture Experts Group
- JVET Joint Video Experts Team
- LVC Lenslet video coding
- VTM VVC Test Model
- MV Motion Vector, also called motion vector
- CAVLC Context-based Adaptive Variable-Length code
- CABAC Context-based Adaptive Binary Arithmetic Coding
- SRCC Scan Region based Coefficient Coding
- digital video compression technology is mainly used to compress huge digital video data for transmission and storage.
- the existing digital video compression standards can save a lot of video data, there is still a need to pursue better digital video compression technology to reduce the bandwidth and traffic pressure of digital video transmission.
- the light field camera adds a set of microlens arrays in front of the imaging plane. This allows the light from the same point on the object plane to be captured by multiple microlenses at the same time, which is equivalent to shooting the same point from multiple angles at the same time. Due to its special imaging model, the visual effect of light field images is very different from that of traditional images. This also leads to the fact that the compression method for general images or videos is not effective when processing light field images or videos.
- the purpose of the MPEG LVC working group is to study compression methods that are more suitable for light field videos.
- Light field images are composed of a series of regularly arranged macro pixels, as shown in Figure 1.
- the search method based on macro pixel units can make more full use of the correlation of light field images, thereby bringing more efficient compression performance.
- searching only in macro pixel units may lose some local optimal points.
- since the matching blocks may move across macro pixels between different frames, and there are also differences such as parallax and size between adjacent macro pixels, the best candidates for matching blocks may not be strictly arranged according to the macro pixel spacing.
- Fig. 2 is a schematic diagram of an application scenario of an embodiment of the present application, where a multi-camera or a multi-camera array captures a light field video, outputs light field video data, and the encoder compresses and transmits it to the decoding end, and the decoder decompresses the light field video data and displays it.
- the codec part can use existing video codec tools (for example, AVC, HEVC, or VVC, etc.).
- the encoder 100 may include a transform and quantization unit 101, an intra-frame estimation unit 102, an intra-frame prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, and a decoded image buffer unit 110.
- the filtering unit 108 can implement deblocking filtering and sample adaptive offset (SAO) filtering, and the encoding unit 109 can implement header information encoding and context-based adaptive binary arithmetic coding (CABAC).
- a video coding block can be obtained by dividing a coding tree unit (CTU), and then the residual pixel information obtained after intra-frame or inter-frame prediction is transformed by the transform and quantization unit 101, including transforming the residual information from the pixel domain to the transform domain, and quantizing the obtained transform coefficients to further reduce the bit rate;
- the intra-frame estimation unit 102 and the intra-frame prediction unit 103 are used to perform intra-frame prediction on the video coding block; specifically, the intra-frame estimation unit 102 and the intra-frame prediction unit 103 are used to determine the intra-frame prediction mode to be used to encode the video coding block;
- the motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-frame prediction coding of the received video coding block relative to one or more blocks in one or more reference frames, so as to provide temporal prediction information; the motion estimation unit 105 is used to estimate the motion vector, and the motion compensation unit 104 is used to perform motion compensation based on the motion vector determined by the motion estimation unit 105; after determining the intra prediction mode, the intra prediction unit 103 is also used to provide the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also sends the calculated motion vector data to the encoding unit 109; in addition, the inverse transform and inverse quantization unit 106 is used to reconstruct the residual block in the pixel domain, and block effect artifacts in the reconstructed residual block are removed by the filter control analysis unit 107 and the filtering unit 108.
- the encoding unit 109 is used to encode various coding parameters and quantized transform coefficients.
- the context content can be based on adjacent coding blocks and can be used to encode information indicating the determined intra prediction mode and output the code stream of the video signal; and the decoded image buffer unit 110 is used to store the reconstructed video coding block for prediction reference. As the video image encoding proceeds, new reconstructed video encoding blocks are continuously generated, and these reconstructed video encoding blocks are stored in the decoded image buffer unit 110 .
- the decoder 200 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra-frame prediction unit 203, a motion compensation unit 204, a filtering unit 205, and a decoded image cache unit 206, etc., wherein the decoding unit 201 can implement header information decoding and CABAC decoding, and the filtering unit 205 can implement deblocking filtering and SAO filtering.
- the decoding unit 201 can implement header information decoding and CABAC decoding
- the filtering unit 205 can implement deblocking filtering and SAO filtering.
- a code stream of the video signal is output; the code stream is input to the decoder 200, and first passes through the decoding unit 201 to obtain the decoded transform coefficients; the transform coefficients are processed by the inverse transform and inverse quantization unit 202 to generate residual blocks in the pixel domain; the intra-frame prediction unit 203 can be used to generate prediction data for the current video decoding block based on the determined intra-frame prediction mode and the data from the previously decoded blocks of the current frame or picture; the motion compensation unit 204 determines the prediction information for the video decoding block by analyzing the motion vector and other associated syntax elements, and uses the prediction information to generate a predictive block of the video decoding block being decoded; a decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 and the corresponding predictive block generated by the intra-frame prediction unit 203 or the motion compensation unit 204; the decoded video signal passes through the filtering unit 205 to remove block effect artifacts, and is then stored in the decoded image cache unit 206.
- the embodiment of the present application also provides a network architecture of a codec system including an encoder and a decoder, wherein FIG5 shows a schematic diagram of a network architecture of a codec system provided by the embodiment of the present application.
- the network architecture includes one or more electronic devices 13 to 1N and a communication network 01, wherein the electronic devices 13 to 1N can perform video interaction through the communication network 01.
- the electronic device can be various types of devices with video codec functions; for example, the electronic device can include a smart phone, a tablet computer, a personal computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensor device, a server, etc., which are not specifically limited here.
- the decoder or encoder described in the embodiments of the present application can be the above electronic devices.
- the method of the embodiment of the present application is mainly applied to the motion estimation unit 105 part of the inter-frame prediction as shown in Figure 3.
- the motion estimation unit 105 obtains the motion vector
- the motion compensation unit 104 performs motion compensation based on the motion vector determined by the motion estimation unit 105
- the encoding unit 109 encodes the motion vector data
- the motion compensation unit 204 of the decoder decodes the motion vector data for motion compensation.
- the "current block” specifically refers to the coding block currently to be inter-frame predicted.
- FIG6 a schematic diagram of a flow chart of an encoding method provided by an embodiment of the present application is shown. As shown in FIG6 , the method may include:
- S601 Determine a first motion vector of a current block
- the pixel point pointed to by the first motion vector is the starting point of a macro-pixel level search, which is used to locate the candidate reference block of the current block.
- the method further includes: constructing a motion vector candidate list, and determining a first motion vector.
- the first motion vector may be a motion vector in the motion vector candidate list.
- the first motion vector may be a motion vector corresponding to a minimum rate-distortion cost value in a motion vector candidate list.
- determining the first motion vector of the current block includes: constructing a motion vector candidate list; searching the motion vector candidate list to determine a rate-distortion cost value corresponding to each motion vector; and determining the motion vector corresponding to the minimum rate-distortion cost value as the first motion vector.
- the method further includes: before performing the first macro-pixel level search, constructing a motion vector candidate list and determining a first motion vector.
- the initial search point is determined based on the AMVP technology of HEVC: among the candidate MVs given by AMVP, the encoder selects the MV with the smallest rate-distortion cost and uses the position pointed to by the MV as the initial search point
- the first motion vector may be the second motion vector obtained in the previous macro-pixel level search.
- the first motion vector may be the third motion vector obtained in the last regular pixel level search.
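- As a rough illustration (not the patent's reference implementation), the following Python sketch shows how a first motion vector could be selected from a motion vector candidate list by rate-distortion cost; the rd_cost callable and the toy candidate list are assumptions made for the example.

```python
def select_first_mv(candidate_mvs, rd_cost):
    """Pick the candidate motion vector with the minimum rate-distortion cost.

    candidate_mvs: list of (dx, dy) tuples, e.g. built from AMVP candidates.
    rd_cost: assumed callable taking a (dx, dy) motion vector and returning
             the rate-distortion cost of predicting the current block with
             the candidate reference block it points to.
    """
    best_mv, best_cost = None, float("inf")
    for mv in candidate_mvs:
        cost = rd_cost(mv)
        if cost < best_cost:
            best_mv, best_cost = mv, cost
    return best_mv, best_cost


if __name__ == "__main__":
    # Toy cost function for illustration only.
    toy_cost = lambda mv: abs(mv[0] - 28) + abs(mv[1])
    print(select_first_mv([(0, 0), (28, 0), (56, 28)], toy_cost))  # ((28, 0), 0)
```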
- the macro-pixel level search step size includes a macro-pixel level horizontal step size and a macro-pixel level vertical step size.
- the macro-pixel level horizontal step size is greater than or equal to 0
- the macro-pixel level vertical step size is greater than or equal to 0.
- the macro-pixel level horizontal step size may be an integer multiple of the macro-pixel horizontal size
- the macro-pixel level vertical step size may be an integer multiple of the macro-pixel vertical size.
- the macro-pixel level horizontal step size may also be expressed as an integer multiple of the macro-pixel horizontal spacing
- the macro-pixel level vertical step size may also be expressed as an integer multiple of the macro-pixel vertical spacing.
- the macro-pixel level search process with a macro-pixel level search step can be to start from a first starting point, search with a macro-pixel level search step, determine the rate-distortion cost value when predicting using the candidate reference block pointed to by each search point, determine the search point corresponding to the minimum rate-distortion cost value, and use the motion vector of the point as the second motion vector.
- the macro pixel level search process can be implemented based on any one or more search algorithms, for example, a full search algorithm, a three-step search algorithm, a four-step search algorithm, a diamond search algorithm, a square search algorithm, a regular hexagon search algorithm, a regular octagon search algorithm, etc.
- the macro-pixel level search process may include a fast search, which may also be understood as a coarse search.
- the macro-pixel level search process may include a fine search.
- the macro-pixel level search process may include a fast search and a fine search.
- the macro-pixel level search process includes: starting from a first starting point, performing a macro-pixel level fast search with a first macro-pixel level search step length, and determining a first search point; starting from the first search point, performing a macro-pixel level fine search with a second macro-pixel level search step length, and determining a second search point; wherein the first macro-pixel level search step length is greater than or equal to the second macro-pixel level search step length; and using the motion vector corresponding to the second search point as the second motion vector.
- the first macro-pixel level search step size remains unchanged or changes.
- the second macro-pixel level search step size remains unchanged or changes. In other words, the first macro-pixel level search step size and the second macro-pixel level search step size are not used to limit a specific macro-pixel level search step size.
- the motion vector search step size is set to a multiple of the light field macro pixel spacing each time, as shown in formula (1): dx = dx′ + Lx1 · ΔstepX, dy = dy′ + Ly1 · ΔstepY (1)
- dx and dy are the motion vector offsets in this iteration, where dx is the horizontal motion vector offset and dy is the vertical motion vector offset
- dx′ and dy′ are the motion vector offsets in the previous iteration
- ΔstepX and ΔstepY are the search steps added in this iteration
- Lx1 and Ly1 are the step lengths of each step, which can also be understood as step length units.
- the step length unit is set to the light field macro pixel size, including the size in the horizontal direction and the vertical direction.
- the step length unit can also be set to the spacing of a light field macro pixel, including the spacing in the horizontal direction and the vertical direction.
- Lx1 · ΔstepX and Ly1 · ΔstepY are the increase in the motion vector offset in this iteration, which can also be understood as the search step length added in this iteration.
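- A minimal sketch of the step update in formula (1); with a regular-pixel step unit the same update corresponds to formula (2) below. The 28-pixel macro-pixel spacing used in the example is a hypothetical value, not taken from the patent.

```python
def advance_offset(prev_dx, prev_dy, delta_step_x, delta_step_y, unit_x, unit_y):
    """One iteration of the step update: dx = dx' + Lx*dStepX, dy = dy' + Ly*dStepY.

    With unit_x/unit_y set to the macro-pixel spacing the offsets stay on the
    macro-pixel grid (formula (1)); with unit_x = unit_y = 1 regular pixel the
    same update becomes the regular pixel-level case (formula (2)).
    """
    dx = prev_dx + unit_x * delta_step_x
    dy = prev_dy + unit_y * delta_step_y
    return dx, dy


# Example with a hypothetical macro-pixel spacing of 28 regular pixels:
print(advance_offset(28, 0, 2, 2, 28, 28))  # -> (84, 56)
```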
- the fast search process may include: starting the first macro-pixel level search step from one macro pixel and increasing it by integer powers of 2, searching according to a search template, and determining a rate-distortion cost value for each search point; and determining the search point corresponding to the minimum rate-distortion cost value as the first search point.
- the first macro-pixel level search step starts with a macro-pixel unit (or a macro-pixel spacing), increases by an integer power of 2, and searches are performed within a specified search range according to the search template.
- the search template includes one of the following: a diamond template, a square template, a regular hexagonal template, a regular octagonal template, etc.
- Figure 7 is a schematic diagram of a diamond template for a macro-pixel-level fast search process in an embodiment of the present application.
- the initial search point is the starting point
- the search step starts from 1 unit, and increases in the form of an integer power of 2, and searches are performed within a specified search range according to the diamond template, and the search point with the smallest rate-distortion cost is selected as the result of this step, where the step unit is set to the macro-pixel spacing of the light field image.
- each search point in Figure 7 can be understood as a candidate reference block of the current block, the search point is the upper left corner pixel of the candidate reference block, and the box is only for schematically indicating the position of the search point and representing the size of the candidate reference block.
- Figure 8 is a schematic diagram of a square template for a macro-pixel-level fast search process in an embodiment of the present application.
- the initial search point is the starting point
- the search step starts from 1 unit, and increases in the form of an integer power of 2, and searches are performed within a specified search range according to the square template, and the search point with the smallest rate-distortion cost is selected as the result of this step, where the step unit is set to the macro-pixel spacing of the light field image.
- the shape of the macro pixel is not limited to a circle, but may also be a square, a regular hexagon, a regular octagon, etc.
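- The following sketch illustrates a macro-pixel-level fast search of the kind described above, using a simplified four-point diamond template and a step that doubles each round; the rd_cost callable, the 28-pixel macro-pixel spacing, and the maximum step are illustrative assumptions rather than values taken from the patent.

```python
def macro_fast_search(start, rd_cost, macro_w=28, macro_h=28, max_step=16):
    """Macro-pixel-level fast (coarse) search with a simplified diamond template.

    start:     (dx, dy) motion vector of the first starting point, in regular pixels.
    rd_cost:   assumed callable returning the rate-distortion cost of a motion vector.
    macro_w/h: assumed macro-pixel spacing in regular pixels.
    The step starts at 1 macro-pixel unit and doubles each round (1, 2, 4, ...).
    Returns the best point, its cost, and the step (in macro-pixel units) at
    which that point was found.
    """
    diamond = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # simplified 4-point diamond
    best_pt, best_cost, best_step = start, rd_cost(start), 0
    step = 1
    while step <= max_step:
        for ux, uy in diamond:
            pt = (start[0] + ux * step * macro_w, start[1] + uy * step * macro_h)
            cost = rd_cost(pt)
            if cost < best_cost:
                best_pt, best_cost, best_step = pt, cost, step
        step *= 2  # search step grows as an integer power of 2
    return best_pt, best_cost, best_step
```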
- the fine search process includes: in the fast search process, determining the search step length corresponding to the first search point; when the search step length corresponding to the first search point is less than or equal to the first step length threshold, searching, with the second macro-pixel level search step length, the two unsearched points in the search template that are closest to the first search point, and determining the point corresponding to the minimum rate distortion cost value as the second search point; when the search step length corresponding to the first search point is greater than the second step length threshold, performing a full search in units of macro pixels within a first search range with the first search point as the center point, and determining the search point corresponding to the minimum rate distortion cost value as the second search point; wherein the second step length threshold is greater than or equal to the first step length threshold.
- the first search range includes the first search point and a plurality of adjacent points around it.
- the first search range is an M × M search range, and a full search is performed on the first search range to determine the search point corresponding to the minimum rate distortion cost value as the second search point.
- the first step length threshold and the second step length threshold are also at the macro-pixel level. In some embodiments, the first step length threshold and the second step length threshold are equal, and both are one macro-pixel unit. In some embodiments, the first step length threshold and the second step length threshold are not equal; for example, the first step length threshold can be one macro-pixel unit, and the second step length threshold can be two macro-pixel units.
- the second macro-pixel level search step size is one macro-pixel unit.
- the first search range includes an M × M search range with the first search point as the center point, wherein the value of M is greater than or equal to 3. It should be noted that the first search range is also determined in units of macropixels, and the first search range can also be understood as M × M macropixels with the macropixel where the first search point is located as the center point.
- FIG9 is a first schematic diagram of the macro-pixel-level fine search process in an embodiment of the present application.
- when the optimal point of the fast search corresponds to a step size of 1, a two-point search is performed around that point, and the search point with the smallest rate-distortion cost is selected as the result of the fine search.
- the search step sizes are all in macro-pixels.
- for example, as shown in FIG9 , the optimal point of the fast search may be one of the pixel points corresponding to the 4 boxes; assuming the upper point is the optimal point, the search continues at points a and b.
- alternatively, the optimal point may be one of the 8 adjacent points around the center point; assuming that the point in the upper left corner is the optimal point, the search continues at points a and c.
- Figure 10 is a second schematic diagram of the macro-pixel level fine search process in an embodiment of the present application.
- when the search step corresponding to the optimal point of the fast search is greater than a certain threshold (for example, 1, 2, 4, etc.), a full search is performed within a certain search range (for example, an M × M search range) with that point as the center, and the search point with the minimum rate distortion cost is selected as the result of the fine search; the search step size here is also in macro-pixels.
- the macro-pixel level search process further includes: when the search step corresponding to the first search point is greater than the first step threshold and less than or equal to the second step threshold, no macro-pixel level fine search is performed, and the motion vector corresponding to the first search point is used as the second motion vector.
- the first step threshold is less than the second step threshold, and when the search step corresponding to the first search point is greater than the first step threshold and less than or equal to the second step threshold, the first search point is directly used as the result of the macro-pixel level search without performing a fine search.
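- A hedged sketch of the macro-pixel-level fine search decision described above: a two-point refinement when the fast-search step is small, an M × M full search in macro-pixel units when it is large, and no fine search in between. The specific choice of the two refinement points, the rd_cost callable, and the 28-pixel macro-pixel spacing are assumptions for illustration only.

```python
def macro_fine_search(coarse_pt, coarse_step, rd_cost,
                      macro_w=28, macro_h=28, thr1=1, thr2=1, m=3):
    """Macro-pixel-level fine search around the fast-search result.

    coarse_step is the step (in macro-pixel units) at which the fast search
    found its best point:
      - coarse_step <= thr1: probe two nearby unsearched points at a
        one-macro-pixel step (the two points chosen here are illustrative);
      - coarse_step >  thr2: full search over an m x m window of macro pixels;
      - otherwise:           skip the fine search and keep the coarse result.
    """
    best_pt, best_cost = coarse_pt, rd_cost(coarse_pt)
    if coarse_step <= thr1:
        candidates = [(coarse_pt[0] + macro_w, coarse_pt[1] + macro_h),
                      (coarse_pt[0] - macro_w, coarse_pt[1] + macro_h)]
    elif coarse_step > thr2:
        half = m // 2
        candidates = [(coarse_pt[0] + i * macro_w, coarse_pt[1] + j * macro_h)
                      for i in range(-half, half + 1)
                      for j in range(-half, half + 1) if (i, j) != (0, 0)]
    else:
        candidates = []  # thr1 < step <= thr2: no macro-pixel level fine search
    for pt in candidates:
        cost = rd_cost(pt)
        if cost < best_cost:
            best_pt, best_cost = pt, cost
    return best_pt, best_cost
```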
- the macro-pixel level search may also be based on a conventional TZsearch search algorithm.
- a conventional pixel-level search is performed with the pixel point determined by the macro-pixel-level search as the second starting point, and a pixel-level offset correction is performed on the search point obtained based on the macro-pixel-level search.
- a regular pixel can be understood as a single pixel constituting a pixel array, and a macro pixel includes multiple regular pixels.
- the pixel point pointed to by the second motion vector is the starting point of a regular pixel-level search, which is used to locate the candidate reference block of the current block. Since a macro pixel includes multiple regular pixels in the horizontal and vertical dimensions, the regular pixel-level search step size is smaller than the macro pixel-level search step size, and the regular pixel-level search is a fine-grained search compared to the macro pixel-level search.
- the conventional pixel-level search step size includes a conventional pixel-level horizontal step size and a conventional pixel-level vertical step size.
- the conventional pixel level horizontal step size is an integer multiple of the conventional pixel horizontal size
- the conventional pixel level vertical step size is an integer multiple of the conventional pixel vertical size.
- the conventional pixel level horizontal step size can also be expressed as an integer multiple of the conventional pixel horizontal spacing
- the conventional pixel level vertical step size can also be expressed as an integer multiple of the conventional pixel vertical spacing.
- the regular pixel level search process with a regular pixel level search step size can be that starting from the second starting point, searching with a regular pixel level search step size, determining the rate-distortion cost value when predicting using the candidate reference block pointed to by each search point, determining the search point corresponding to the minimum rate-distortion cost value, and using the motion vector of the point as the third motion vector.
- the motion vector search step size is set to a multiple of the conventional pixel spacing each time, as shown in formula (2): dx = dx′ + Lx2 · ΔstepX, dy = dy′ + Ly2 · ΔstepY (2)
- dx and dy are the motion vector offsets in this iteration, where dx is the horizontal motion vector offset and dy is the vertical motion vector offset
- dx′ and dy′ are the motion vector offsets in the previous iteration
- ΔstepX and ΔstepY are the search steps added in this iteration
- Lx2 and Ly2 are the step lengths of each step, which can also be understood as step length units.
- the step length unit is set to a conventional pixel size, including the size in the horizontal and vertical directions. It should be noted that when searching at the conventional pixel level, the step length unit can also be set to a regular pixel spacing, including the spacing in the horizontal and vertical directions.
- the conventional pixel-level search process may include a fast search, which may also be understood as a coarse search.
- the conventional pixel-level search process may include a fine search.
- the conventional pixel-level search process may include a fast search and a fine search.
- the conventional pixel level search step length may be a fixed value or a variable.
- the conventional pixel level search step length may be a conventional pixel unit.
- the conventional pixel-level search process includes: searching for adjacent points within a second search range with the second starting point as the center point, and determining a third search point corresponding to the minimum rate distortion cost value; searching for adjacent points within the second search range with the third search point as the center point, and determining a fourth search point corresponding to the minimum rate distortion cost value; when the second starting point and the fourth search point are the same, determining the motion vector of the second starting point as the third motion vector; when the second starting point and the fourth search point are not the same, searching for adjacent points within the second search range with the fourth search point as the center point, until the search point determined for the i-th time is the same as the search point determined for the (i+2)th time, and determining the motion vector of the search point determined for the i-th time as the third motion vector.
- the second search range includes each search point and multiple adjacent points around it.
- the second search range is an N × N search range, and adjacent points within the second search range are searched to determine the search point corresponding to the minimum rate distortion cost value as the second search point.
- the conventional pixel-level search process also includes: initializing the first search times; after each search, the first search times is incremented by 1; when the first search times is equal to the first search times threshold, the motion vector of the search point corresponding to the minimum rate-distortion cost value in the conventional pixel-level search process is used as the third motion vector.
- a search times threshold can also be set for the conventional pixel-level search. When the search times reaches a certain threshold, the search is terminated in advance, and the optimal point with the minimum rate-distortion cost is used as the search result of the conventional pixel-level search. It should be noted that the search performed each time the center point is moved during the conventional pixel-level search process can be understood as a search process.
- eight pixel points in the neighborhood of the point are searched, the optimal point with the smallest rate-distortion cost is selected, and the optimal point is used as the new center point to perform the same pixel-level fine search in its neighborhood again.
- when the optimal point selected in a certain search is consistent with the previously selected point, the optimal point is used as the result of this step; when the number of searches reaches a certain threshold, the search is terminated in advance, and the optimal point with the smallest rate-distortion cost is used as the result of this step.
- FIG 11 is a schematic diagram of a conventional pixel-level search process in an embodiment of the present application.
- pixel point "0" is the initial pixel point, and eight pixel points around it are searched with it as the center to obtain the optimal pixel point "1"; when "1" is selected, the optimal point is selected.
- the same eight-point search is performed with “2" as the center, and "3” is obtained by analogy.
- "2" is obtained with "3" as the center, which is consistent with the previous selection. Therefore, the motion vector corresponding to pixel "2" is the result of this step.
- the conventional pixel-level search process includes: performing a full search in units of conventional pixels within a second search range with the second starting point as the center point, and determining a third search point corresponding to the minimum rate distortion cost value; when the second starting point and the third search point are the same, determining the motion vector of the second starting point as the third motion vector; when the second starting point and the third search point are not the same, performing a full search in units of conventional pixels within the second search range with the third search point as the center point, until the search point determined for the i-th time is the same as the search point determined for the (i+1)th time, and determining the motion vector of the search point determined for the i-th time as the third motion vector.
- the second search range includes each search point and multiple adjacent points around it.
- the second search range is an N × N search range, and the second search range is fully searched to determine the search point corresponding to the minimum rate distortion cost value as the second search point. It should be noted that the second search range is determined in units of regular pixels.
- a full search is performed within a 3 × 3 search range, and the optimal point with the smallest rate-distortion cost is selected, and the optimal point is used as the new center point to perform a pixel-level fine search in the same neighborhood again.
- when the optimal point selected in a certain search is consistent with the previously selected point, the optimal point is used as the result of this step; when the number of searches reaches a certain threshold, the search is terminated in advance, and the optimal point with the smallest rate-distortion cost is used as the result of this step.
- pixel point "0" is the initial pixel point, and a full search is performed with it as the center to obtain the optimal pixel point "1"; the same search is performed with "1” as the center to obtain "2”; the same search is performed with "2" as the center to obtain “2”; it is consistent with the previously selected point, so the motion vector corresponding to pixel point "2" is the result of this step.
- S604 Determine the best motion vector of the current block according to the third motion vector.
- the method further includes: after the current regular pixel-level search is completed, using the third motion vector obtained by the current regular pixel-level search as a new first motion vector, performing a next macro-pixel-level search and a next regular pixel-level search, and determining a third motion vector obtained by the next regular pixel-level search;
- Determining the best motion vector of the current block according to the third motion vector includes: when the third motion vector obtained by the current regular pixel level search is the same as the third motion vector obtained by the next regular pixel level search, the search ends and the best motion vector of the current block is determined.
- S601 to S603 can be understood as a search process, and each search process determines a third motion vector.
- S601 to S603 are repeated until the points searched twice are the same, that is, the third motion vectors are the same, and the search ends, and the third motion vector is used as the best motion vector.
- the method further includes: before performing the first macro-pixel level search, initializing the second search number; after each macro-pixel level search and the regular pixel level search are completed, the second search number is incremented by 1;
- Determining the best motion vector of the current block according to the third motion vector includes: when the second search times is equal to the second search times threshold, the search ends, and the third motion vector obtained by the last conventional pixel-level search is used as the best motion vector of the current block.
- a second search times threshold value may be set for the entire search process. When the second search times reaches a certain threshold value, the search is terminated in advance, and the last search result is used as the final search result.
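- Putting the stages together, a sketch of the outer loop of S601 to S604: the third motion vector of each round is fed back as the new first motion vector until the result repeats or a round-count threshold is reached. The macro_search and regular_search callables stand for the macro-pixel-level and regular pixel-level searches of the previous sketches and are assumptions of this example.

```python
def motion_estimation(first_mv, macro_search, regular_search, max_rounds=8):
    """Overall search loop sketched from S601 to S604.

    macro_search and regular_search are assumed callables; each takes a
    starting motion vector and returns a refined one.  The third motion
    vector of each round is fed back as the new first motion vector until
    the result repeats (convergence) or the round count reaches max_rounds.
    """
    mv = first_mv
    for _ in range(max_rounds):
        second_mv = macro_search(mv)           # macro-pixel level search
        third_mv = regular_search(second_mv)   # regular pixel-level search
        if third_mv == mv:                     # same result as the last round
            return third_mv                    # -> best motion vector
        mv = third_mv                          # feed back as the new first MV
    return mv                                  # search-count threshold reached
```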
- the method further includes: determining a motion vector prediction value of the current block; determining a motion vector residual value of the current block according to the motion vector prediction value and the best motion vector of the current block; encoding the motion vector residual value of the current block, and writing the obtained coded bits into the bitstream.
- at the decoding end, the bitstream is decoded to determine the motion vector residual value of the current block, a motion vector prediction value is determined, the motion vector of the current block is determined according to the motion vector prediction value and the motion vector residual value of the current block, and motion compensation is performed.
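- For completeness, the residual written to the bitstream is simply the difference between the best motion vector and its prediction value; a trivial sketch:

```python
def motion_vector_residual(best_mv, mvp):
    """Motion vector residual (MVD): best motion vector minus its prediction value.

    The encoder writes this residual into the bitstream; the decoder adds it
    back to the same prediction value to recover the motion vector.
    """
    return (best_mv[0] - mvp[0], best_mv[1] - mvp[1])
```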
- the above encoding method adds a conventional pixel-level fine search on the basis of the macro-pixel-level search, so that local optima are taken into account more comprehensively, the search results are further refined, and the motion estimation performance is improved; moreover, since the spacing between macro pixels in a light field image is difficult to keep strictly consistent, the introduction of the pixel-level fine search also improves robustness.
- FIG12 shows a schematic diagram of the composition structure of an encoder provided by an embodiment of the present application.
- the encoder 120 may include a determination unit 1201 and a search unit 1202; wherein,
- a determining unit 1201 is configured to determine a first motion vector of a current block
- the search unit 1202 is configured to use the pixel point pointed to by the first motion vector as a first starting point, perform a macro-pixel level search with a macro-pixel level search step, and determine a second motion vector of the current block;
- the search unit 1202 is further configured to use the pixel point pointed to by the second motion vector as a second starting point, perform a regular pixel-level search with a regular pixel-level search step length, and determine a third motion vector of the current block;
- the determining unit 1201 is further configured to determine an optimal motion vector of the current block according to the third motion vector.
- the search unit 1202 is configured to start from a first starting point, perform a macro-pixel level fast search with a first macro-pixel level search step to determine a first search point; start from the first search point, perform a macro-pixel level fine search with a second macro-pixel level search step to determine a second search point; wherein the first macro-pixel level search step is greater than or equal to the second macro-pixel level search step; and use the motion vector corresponding to the second search point as the second motion vector.
- the search unit 1202 is configured to start a first macro-pixel-level search step from one macro-pixel, and the first macro-pixel-level search step increases by an integer power of 2, search according to the search template, and determine the rate-distortion cost value of each search point; determine the search point corresponding to the minimum rate-distortion cost value as the first search point.
- the search unit 1202 is configured to determine the search step corresponding to the first search point during the fast search process; when the search step corresponding to the first search point is less than or equal to the first step threshold, search the two unsearched points closest to the first search point in the search template with the second macro-pixel level search step, and determine the point corresponding to the minimum rate distortion cost value as the second search point; when the search step corresponding to the first search point is greater than the second step threshold, perform a full search in units of macro-pixels within the first search range with the first search point as the center point, and determine the search point corresponding to the minimum rate distortion cost value as the second search point; wherein the second step threshold is greater than or equal to the first step threshold.
- the search unit 1202 is further configured to not perform macro-pixel level fine search when the search step corresponding to the first search point is greater than the first step threshold and less than or equal to the second step threshold, and to use the motion vector corresponding to the first search point as the second motion vector.
- the second macro-pixel level search step size is one macro-pixel unit.
- the first search range includes an M × M search range with the first search point as the center point; wherein the value of M is greater than or equal to 3.
- the search template includes one of the following: a diamond template, a square template, a regular hexagonal template, and a regular octagonal template.
- the macropixel level search step includes a macropixel level horizontal step and a macropixel level vertical step; the macropixel level horizontal step is an integer multiple of the macropixel horizontal size, and the macropixel level vertical step is an integer multiple of the macropixel vertical size.
- the search unit 1202 is configured to search for adjacent points within a second search range with the second starting point as the center point, and determine a third search point corresponding to the minimum rate distortion cost value; search for adjacent points within the second search range with the third search point as the center point, and determine a fourth search point corresponding to the minimum rate distortion cost value; when the second starting point and the fourth search point are the same, determine the motion vector of the second starting point as the third motion vector; when the second starting point and the fourth search point are not the same, search for adjacent points within the second search range with the fourth search point as the center point, until the search point determined for the i-th time is the same as the search point determined for the (i+2)th time, and determine the motion vector of the search point determined for the i-th time as the third motion vector.
- the search unit 1202 is configured to perform a full search in a second search range with the second starting point as the center point in units of regular pixels to determine a third search point corresponding to the minimum rate distortion cost value; when the second starting point and the third search point are the same, determine the motion vector of the second starting point as the third motion vector; when the second starting point and the third search point are not the same, perform a full search in units of regular pixels within the second search range with the third search point as the center point, until the search point determined for the i-th time is the same as the search point determined for the (i+1)th time, and determine the motion vector of the search point determined for the i-th time as the third motion vector.
- the search unit 1202 is configured to initialize a first search number; after each search is completed, the first search number is incremented by 1; when the first search number is equal to the first number threshold, the motion vector of the search point corresponding to the minimum rate distortion cost value in the conventional pixel-level search process is set as the third motion vector.
- the regular pixel level search step size includes a regular pixel level horizontal step size and a regular pixel level vertical step size; the regular pixel level horizontal step size is an integer multiple of the regular pixel horizontal size, and the regular pixel level vertical step size is an integer multiple of the regular pixel vertical size.
- the determination unit 1201 is configured to construct a motion vector candidate list; search the motion vector candidate list to determine the rate-distortion cost value corresponding to each motion vector; and determine the motion vector corresponding to the minimum rate-distortion cost value as the first motion vector.
- the determination unit 1201 is configured to, after the current regular pixel-level search is completed, use the third motion vector obtained by the current regular pixel-level search as the new first motion vector; perform the next macro pixel-level search and the next regular pixel-level search, and determine the third motion vector obtained by the next regular pixel-level search;
- the determination unit 1201 is configured to terminate the search and determine the best motion vector of the current block when the third motion vector obtained by the current regular pixel-level search is the same as the third motion vector obtained by the next regular pixel-level search.
- the determination unit 1201 is configured to initialize the second search times before performing the first macro-pixel level search; after each macro-pixel level search and regular pixel level search, the second search times is incremented by 1; when the second search times is equal to the second search times threshold, the search ends, and the third motion vector obtained from the last regular pixel level search is used as the optimal motion vector of the current block.
- the determination unit 1201 is configured to determine the motion vector prediction value of the current block; determine the motion vector residual value of the current block based on the motion vector prediction value and the optimal motion vector of the current block; the encoder may also include an encoding unit 1203, configured to encode the motion vector residual value of the current block and write the obtained encoded bits into the bitstream.
- each functional unit of the encoder is used to execute the encoding method described in any one of the aforementioned embodiments.
- a "unit” may be a part of a circuit, a part of a processor, a part of a program or software, etc., and of course, it may be a module, or it may be non-modular.
- the components in the present embodiment may be integrated into a processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit may be implemented in the form of hardware or in the form of a software functional module.
- the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product.
- the computer software product is stored in a storage medium, including several instructions for a computer device (which can be a personal computer, server, or network device, etc.) or a processor to perform all or part of the steps of the method of this embodiment.
- the aforementioned storage medium includes: a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and other media that can store program codes.
- the embodiment of the present application provides a computer-readable storage medium, which is applied to the encoder 120.
- the machine-readable storage medium stores a computer program, and when the computer program is executed by the first processor, the method of any one of the aforementioned embodiments is implemented.
- the encoder 120 may include: a first communication interface 1301, a first memory 1302 and a first processor 1303; each component is coupled together through a first bus system 1304. It can be understood that the first bus system 1304 is used to achieve connection and communication between these components. In addition to the data bus, the first bus system 1304 also includes a power bus, a control bus and a status signal bus. However, for the sake of clarity, various buses are labeled as the first bus system 1304 in Figure 13. Among them,
- the first communication interface 1301 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
- the first processor 1303 is configured to execute, when running the computer program:
- determining a first motion vector of a current block;
- using the pixel point pointed to by the first motion vector as a first starting point, performing a macro-pixel level search with a macro-pixel level search step length, and determining a second motion vector of the current block;
- using the pixel point pointed to by the second motion vector as a second starting point, performing a regular pixel-level search with a regular pixel-level search step length, and determining a third motion vector of the current block;
- determining an optimal motion vector for the current block according to the third motion vector.
- the first memory 1302 in the embodiment of the present application can be a volatile memory or a non-volatile memory, or can include both volatile and non-volatile memories.
- the non-volatile memory can be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
- the volatile memory can be a random access memory (RAM), which is used as an external cache.
- SRAM static RAM
- DRAM dynamic RAM
- SDRAM synchronous DRAM
- DDRSDRAM double data rate synchronous DRAM
- ESDRAM enhanced SDRAM
- SLDRAM synchronous link DRAM
- DRRAM direct Rambus RAM
- the first processor 1303 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method can be completed by the hardware integrated logic circuit or software instructions in the first processor 1303.
- the above-mentioned first processor 1303 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components.
- DSP Digital Signal Processor
- ASIC Application Specific Integrated Circuit
- FPGA Field Programmable Gate Array
- the methods, steps and logic block diagrams disclosed in the embodiments of the present application can be implemented or executed.
- the general-purpose processor can be a microprocessor or the processor can also be any conventional processor, etc.
- the steps of the method disclosed in the embodiments of the present application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a mature storage medium in the field such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory or an electrically erasable programmable memory, a register, etc.
- the storage medium is located in the first memory 1302, and the first processor 1303 reads the information in the first memory 1302 and completes the steps of the above method in combination with its hardware.
- for a hardware implementation, the processing unit may be implemented in one or more application specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in the present application, or a combination thereof.
- for a software implementation, the techniques of the present application may be implemented by modules (for example, procedures, functions and so on) that perform the functions described in the present application.
- the software code may be stored in a memory and executed by a processor.
- the memory may be implemented within the processor or external to the processor.
- the first processor 1303 is further configured to execute any method in the foregoing embodiments when running a computer program.
- the present embodiment provides an encoder, in which, when performing motion estimation, a conventional pixel-level fine search is added on the basis of a macro-pixel-level search, and a pixel-level offset correction is performed on the candidate reference block position searched at the macro-pixel level to increase the correlation with the current block to a greater extent, thereby further optimizing the motion estimation effect of the encoded light field image and improving the robustness of the encoder.
- the coding and decoding system 140 may include an encoder 1401 and a decoder 1402 .
- the encoder 1401 may be the encoder described in any one of the aforementioned embodiments, and the decoder 1402 may be the decoder described in any one of the aforementioned embodiments.
- the methods disclosed in the several method embodiments provided in this application can be arbitrarily combined without conflict to obtain a new method embodiment.
- the features disclosed in the several product embodiments provided in this application can be arbitrarily combined without conflict to obtain a new product embodiment.
- the features disclosed in the several method or device embodiments provided in this application can be arbitrarily combined without conflict to obtain a new method embodiment or device embodiment.
- a coding method, an encoder and a storage medium comprising: determining a first motion vector of a current block; using the pixel point pointed to by the first motion vector as a first starting point, performing a macro-pixel-level search with a macro-pixel-level search step, and determining a second motion vector of the current block; using the pixel point pointed to by the second motion vector as a second starting point, performing a conventional pixel-level search with a conventional pixel-level search step, and determining a third motion vector of the current block; and determining the best motion vector of the current block according to the third motion vector.
- a conventional pixel-level fine search is added on the basis of the macro-pixel-level search, and a pixel-level offset correction is performed on the candidate reference block position searched at the macro-pixel level, so that it increases its correlation with the current block to a greater extent, thereby further optimizing the motion estimation effect of the encoded light field image and improving the robustness of the encoder.
Abstract
本申请实施例公开了一种编码方法、编码器以及存储介质,该方法包括:确定当前块的第一运动矢量;将第一运动矢量指向的像素点作为第一起始点,以宏像素级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;将第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;根据第三运动矢量,确定当前块的最佳运动矢量。如此,在宏像素级搜索的基础上增加了常规像素级精细搜索,对宏像素级搜索到的候选参考块位置进行像素级的偏移修正,使其更大程度地增加与当前块的相关性,从而进一步优化编码光场图像的运动估计效果,提高编码器鲁棒性。
Description
本申请实施例涉及视频编解码技术领域,具体涉及一种编码方法、编码器以及存储介质。
光场相机的成像模型在成像平面前增加了一组微透镜阵列。使得物体平面的同一个点的光线可以同时被多个微透镜捕获,相当于同时对同一点从多个角度拍摄。
由于光场相机特殊的成像模型,光场图像的视觉效果与常规图像差距很大。这也导致对于传统图像或视频的压缩方法,在处理光场图像或视频时效果不佳。
发明内容
本申请实施例提供一种编码方法、编码器以及存储介质,能够提高光场图像的运动估计效果,提高编码器鲁棒性。
本申请实施例的技术方案可以如下实现:
第一方面,本申请实施例提供了一种编码方法,应用于编码器,该方法包括:
确定当前块的第一运动矢量;
将所述第一运动矢量指向的像素点作为第一起始点,以宏像素级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;
将所述第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;
根据所述第三运动矢量,确定当前块的最佳运动矢量。
第二方面,本申请实施例提供了一种编码器,该编码器包括确定单元和搜索单元;其中,
所述确定单元,配置为确定当前块的第一运动矢量;
所述搜索单元,配置为将所述第一运动矢量指向的像素点作为第一起始点,以宏像素级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;
所述搜索单元,还配置为将所述第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;
所述确定单元,还配置为根据所述第三运动矢量,确定当前块的最佳运动矢量。
第三方面,本申请实施例提供了一种编码器,该编码器包括第一存储器和第一处理器;其中,
第一存储器,用于存储能够在第一处理器上运行的计算机程序;
第一处理器,用于在运行计算机程序时,执行如第一方面所述的方法。
第四方面,本申请实施例提供了一种计算机可读存储介质,该计算机可读存储介质存储有计算机程序,所述计算机程序被执行时实现如第一方面所述的方法。
本申请实施例提供了一种编码方法、编码器以及存储介质,该方法包括:确定当前块的第一运动矢量;将所述第一运动矢量指向的像素点作为第一起始点,以宏像素
级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;将所述第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;根据所述第三运动矢量,确定当前块的最佳运动矢量。如此,在宏像素级搜索的基础上增加了常规像素级精细搜索,对宏像素级搜索到的候选参考块位置进行像素级的偏移修正,使其更大程度地增加与当前块的相关性,从而进一步优化编码光场图像的运动估计效果,提高编码器鲁棒性。
此处所说明的附图用来提供对本申请的进一步理解,构成本申请的一部分,本申请的示意性实施例及其说明用于解释本申请,并不构成对本申请的不当限定。在附图中:
图1是本申请实施例的一种光场图像的示意图;
图2是本申请实施例的一个应用场景的示意图;
图3为本申请实施例提供的一种编码器的组成框图示意图;
图4为本申请实施例提供的一种解码器的组成框图示意图;
图5为本申请实施例提供的一种编解码系统的网络架构示意图;
图6为本申请实施例提供的一种编码方法的流程示意图;
图7为本申请实施例中宏像素级快速搜索过程的菱形模板示意图;
图8为本申请实施例中宏像素级快速搜索过程的正方形模板示意图;
图9为本申请实施例中宏像素级精细搜索过程的第一示意图;
图10为本申请实施例中宏像素级精细搜索过程的第二示意图;
图11为本申请实施例中常规像素级搜索过程的示意图;
图12为本申请实施例提供的一种编码器的组成结构示意图;
图13为本申请实施例提供的一种编码器的具体硬件结构示意图;
图14为本申请实施例提供的一种编解码系统的组成结构示意图。
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
除非另有定义,本文所使用的所有的技术和科学术语与属于本申请的技术领域的技术人员通常理解的含义相同。本文中所使用的术语只是为了描述本申请实施例的目的,不是旨在限制本申请。
在以下的描述中,涉及到“一些实施例”,其描述了所有可能实施例的子集,但是可以理解,“一些实施例”可以是所有可能实施例的相同子集或不同子集,并且可以在不冲突的情况下相互结合。
还需要指出,本申请实施例所涉及的术语“第一\第二\第三”仅是用于区别类似的对象,不代表针对对象的特定排序,可以理解地,“第一\第二\第三”在允许的情况下可以互换特定的顺序或先后次序,以使这里描述的本申请实施例能够以除了在这里图示或描述的以外的顺序实施。
在视频图像中,一般采用第一图像分量、第二图像分量和第三图像分量来表征编码块(Coding Block,CB)。其中,这三个图像分量分别为一个亮度分量、一个蓝色色度分量和一个红色色度分量,具体地,亮度分量通常使用符号Y表示,蓝色色度分量通常使
用符号Cb或者U表示,红色色度分量通常使用符号Cr或者V表示;这样,视频图像可以用YCbCr格式表示,也可以用YUV格式表示。
对本申请实施例进行进一步详细说明之前,先对本申请实施例中涉及的名词和术语进行说明,本申请实施例中涉及的名词和术语适用于如下的解释:
动态图像专家组(Moving Picture Experts Group,MPEG)
国际标准化组织(International Standardization Organization,ISO)
国际电工委员会(International Electrotechnical Commission,IEC)
联合视频专家组(Joint Video Experts Team,JVET)
开放媒体联盟(Alliance for Open Media,AOM)
光场视频编码(Lenslet video coding,LVC)
下一代视频编码标准H.266/多功能视频编码(Versatile Video Coding,VVC)
VVC的参考软件测试平台(VVC Test Model,VTM)
音视频编码标准(Audio Video Standard,AVS)
AVS的高性能测试模型(High-Performance Model,HPM)
变换系数(Transform coefficients)
量化参数(Quantization Parameter)
运动向量(Motion Vector,MV)也称运动矢量
基于上下文的自适应变长编码(Context-based Adaptive Variable-Length code,CAVLC)
基于上下文的自适应二进制算术编码(Context-based Adaptive Binary Arithmetic Coding,CABAC)
基于扫描区域的系数编码(Scan Region based Coefficient Coding,SRCC)
可以理解,数字视频压缩技术主要是将庞大的数字影像视频数据进行压缩,以便于传输以及存储等。随着互联网视频的激增以及人们对视频清晰度的要求越来越高,尽管已有的数字视频压缩标准能够节省不少视频数据,但目前仍然需要追求更好的数字视频压缩技术,以减少数字视频传输的带宽和流量压力。
与一般的摄像机成像模型不同,光场相机在成像平面前增加了一组微透镜阵列。使得物体平面的同一个点的光线可以同时被多个微透镜捕获,相当于同时对同一点从多个角度拍摄。由于其特殊的成像模型,光场图像的视觉效果与传统图片差距很大。这也导致对于一般图像或视频的压缩方法,在处理光场图像或视频时效果不佳。MPEG LVC工作组就是为了研究更加适应于光场视频的压缩方法。
光场图像是由一系列规则排列的宏像素组成的,如图1所示。根据光场相机的成像原理,相邻宏像素之间存在很强的相关性。因此在运动估计的搜索过程中,相比于常规的基于像素单位的方式,基于宏像素单位的搜索方式能更加充分地利用光场图像的相关性,从而带来更加高效的压缩性能。然而仅以宏像素为单位进行搜索可能会丢失一些局部最优点。考虑到匹配块在不同帧之间可能会跨宏像素运动,同时相邻宏像素之间也存有视差和尺寸等差异,匹配块的最佳候选未必严格按照宏像素间距排列。
图2是本申请实施例的一个应用场景的示意图,多目摄像头或多目摄像头阵列捕获光场视频,输出光场视频数据,编码器压缩后传输给解码端,解码器解压缩得到光场视频数据,并显示。其中编解码器的部分可以使用现有视频编解码工具(例如,AVC、HEVC或VVC等)。
参见图3,其示出了本申请实施例提供的一种编码器的组成框图示意图。如图3所示,编码器(具体为“视频编码器”)100可以包括变换与量化单元101、帧内估计单元102、帧内预测单元103、运动补偿单元104、运动估计单元105、反变换与反量化单元106、滤波器控制分析单元107、滤波单元108、编码单元109和解码图像缓存单元110
等,其中,滤波单元108可以实现去方块滤波及样本自适应缩进(Sample Adaptive 0ffset,SAO)滤波,编码单元109可以实现头信息编码及基于上下文的自适应二进制算术编码(Context-based Adaptive Binary Arithmetic Coding,CABAC)。针对输入的原始视频信号,通过编码树单元(Coding Tree Unit,CTU)的划分可以得到一个视频编码块,然后对经过帧内或帧间预测后得到的残差像素信息通过变换与量化单元101对该视频编码块进行变换,包括将残差信息从像素域变换到变换域,并对所得的变换系数进行量化,用以进一步减少比特率;帧内估计单元102和帧内预测单元103是用于对该视频编码块进行帧内预测;明确地说,帧内估计单元102和帧内预测单元103用于确定待用以编码该视频编码块的帧内预测模式;运动补偿单元104和运动估计单元105用于执行所接收的视频编码块相对于一或多个参考帧中的一或多个块的帧间预测编码以提供时间预测信息;由运动估计单元105执行的运动估计为产生运动向量的过程,所述运动向量可以估计该视频编码块的运动,然后由运动补偿单元104基于由运动估计单元105所确定的运动向量执行运动补偿;在确定帧内预测模式之后,帧内预测单元103还用于将所选择的帧内预测数据提供到编码单元109,而且运动估计单元105将所计算确定的运动向量数据也发送到编码单元109;此外,反变换与反量化单元106是用于该视频编码块的重构建,在像素域中重构建残差块,该重构建残差块通过滤波器控制分析单元107和滤波单元108去除方块效应伪影,然后将该重构残差块添加到解码图像缓存单元110的帧中的一个预测性块,用以产生经重构建的视频编码块;编码单元109是用于编码各种编码参数及量化后的变换系数,在基于CABAC的编码算法中,上下文内容可基于相邻编码块,可用于编码指示所确定的帧内预测模式的信息,输出该视频信号的码流;而解码图像缓存单元110是用于存放重构建的视频编码块,用于预测参考。随着视频图像编码的进行,会不断生成新的重构建的视频编码块,这些重构建的视频编码块都会被存放在解码图像缓存单元110中。
参见图4,其示出了本申请实施例提供的一种解码器的组成框图示意图。如图4所示,解码器(具体为“视频解码器”)200包括解码单元201、反变换与反量化单元202、帧内预测单元203、运动补偿单元204、滤波单元205和解码图像缓存单元206等,其中,解码单元201可以实现头信息解码以及CABAC解码,滤波单元205可以实现去方块滤波以及SAO滤波。输入的视频信号经过图3的编码处理之后,输出该视频信号的码流;该码流输入解码器200中,首先经过解码单元201,用于得到解码后的变换系数;针对该变换系数通过反变换与反量化单元202进行处理,以便在像素域中产生残差块;帧内预测单元203可用于基于所确定的帧内预测模式和来自当前帧或图片的先前经解码块的数据而产生当前视频解码块的预测数据;运动补偿单元204是通过剖析运动向量和其他关联语法元素来确定用于视频解码块的预测信息,并使用该预测信息以产生正被解码的视频解码块的预测性块;通过对来自反变换与反量化单元202的残差块与由帧内预测单元203或运动补偿单元204产生的对应预测性块进行求和,而形成解码的视频块;该解码的视频信号通过滤波单元205以便去除方块效应伪影,可以改善视频质量;然后将经解码的视频块存储于解码图像缓存单元206中,解码图像缓存单元206存储用于后续帧内预测或运动补偿的参考图像,同时也用于视频信号的输出,即得到了所恢复的原始视频信号。
进一步地,本申请实施例还提供了一种包含编码器和解码器的编解码系统的网络架构,其中,图5示出了本申请实施例提供的一种编解码系统的网络架构示意图。如图5所示,该网络架构包括一个或多个电子设备13至1N和通信网络01,其中,电子设备13至1N可以通过通信网络01进行视频交互。电子设备在实施的过程中可以为各种类型的具有视频编解码功能的设备,例如,所述电子设备可以包括智能手机、平板电脑、
个人计算机、个人数字助理、导航仪、数字电话、视频电话、电视机、传感设备、服务器等,这里不作具体限定。另外,本申请实施例所述的解码器或编码器就可以为上述电子设备。
需要说明的是，本申请实施例的方法主要应用在如图3所示的帧间预测的运动估计单元105部分：运动估计单元105得到运动矢量；运动补偿单元104基于由运动估计单元105所确定的运动矢量执行运动补偿；编码单元109编码运动向量数据；解码器的运动补偿单元204解码运动向量数据，同样用来进行运动补偿。
还需要说明的是,当应用于运动估计单元105部分时,“当前块”具体是指当前待进行帧间预测的编码块。
为便于理解本申请实施例的技术方案,以下通过具体实施例详述本申请的技术方案。以上相关技术作为可选方案与本申请实施例的技术方案可以进行任意结合,其均属于本申请实施例的保护范围。本申请实施例包括以下内容中的至少部分内容。
在本申请的一实施例中,参见图6,其示出了本申请实施例提供的一种编码方法的流程示意图。如图6所示,该方法可以包括:
S601:确定当前块的第一运动矢量;
需要说明的是,第一运动矢量指向的像素点为一次宏像素级搜索的起始点,用于定位当前块的候选参考块。
在一些实施例中,该方法还包括:构建运动矢量候选列表,确定第一运动矢量。第一运动矢量可以为运动矢量候选列表中的一个运动矢量。
示例性的,第一运动矢量可以为运动矢量候选列表中最小率失真代价值对应的运动矢量。具体地,确定当前块的第一运动矢量,包括:构建运动矢量候选列表;对运动矢量候选列表进行搜索,确定每个运动矢量对应的率失真代价值;确定最小率失真代价值对应的运动矢量为第一运动矢量。
在一些实施例中,该方法还包括:在进行第一次宏像素级搜索之前,构建运动矢量候选列表,确定第一运动矢量。
示例性的，基于HEVC的AMVP技术确定初始搜索点：在AMVP给出的候选MV中，编码器选出率失真代价最小的MV，并将该MV指向的位置作为初始搜索点。
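为便于理解上述初始搜索点的选取过程，下面给出一段示意性的Python草图（其中的函数名、参数均为说明用的假设，并非AMVP或本申请实施例的具体实现）：

```python
def select_initial_mv(candidate_mvs, rd_cost):
    """从运动矢量候选列表中选出率失真代价最小的MV作为第一运动矢量（示意）。

    candidate_mvs: [(dy, dx), ...] 候选运动矢量列表（例如由AMVP给出，此处为假设输入）
    rd_cost: 代价函数，输入一个运动矢量，返回以其指向的候选参考块进行预测时的率失真代价
    """
    best_mv, best_cost = None, float("inf")
    for mv in candidate_mvs:
        cost = rd_cost(mv)
        if cost < best_cost:
            best_mv, best_cost = mv, cost
    # 第一运动矢量指向的像素点即作为后续宏像素级搜索的第一起始点
    return best_mv, best_cost
```

其中 rd_cost 表示以某一运动矢量指向的候选参考块进行预测时的率失真代价，其一种可能的计算方式可参考后文的示例。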
在一些实施例中,若执行两次以上的宏像素级搜索时,第一运动矢量可以为上一次宏像素级搜索到的第二运动矢量。
在一些实施例中,若执行两次以上的宏像素级搜索和常规像素级搜索时,第一运动矢量可以为上一次常规像素级搜索到的第三运动矢量。
S602:将第一运动矢量指向的像素点作为第一起始点,以宏像素级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;
在一些实施例中,宏像素级搜索步长包括宏像素级水平步长和宏像素级垂直步长。在实际应用中,宏像素级水平步长大于等于0,宏像素级垂直步长大于等于0。宏像素级水平步长可以为宏像素水平尺寸的整数倍,宏像素级垂直步长可以为宏像素垂直尺寸的整数倍。在一些实施例中,宏像素级水平步长也可以表示为宏像素水平间距的整数倍,宏像素级垂直步长也可以表示为宏像素垂直间距的整数倍。
需要说明的是,以宏像素级搜索步长进行宏像素级搜索过程可以是,从第一起始点开始,以宏像素级搜索步长进行搜索,确定利用每个搜索点指向的候选参考块进行预测时的率失真代价值,确定最小率失真代价值对应的搜索点,将该点的运动矢量作为第二运动矢量。
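率失真代价值的一种常见构成是失真（例如SAD）与运动矢量编码比特数的加权和。下面给出一段示意性的Python草图（SAD、拉格朗日乘子的取值等均为说明用的假设，并非本申请实施例限定的代价计算方式）：

```python
import numpy as np

def sad(cur_block, ref_block):
    """绝对误差和（SAD），作为失真度量的一个示例。"""
    return int(np.abs(cur_block.astype(np.int64) - ref_block.astype(np.int64)).sum())

def rd_cost_at(ref_frame, cur_block, top_left, mv, mv_pred, lam=4.0):
    """计算以 (top_left + mv) 处的候选参考块进行预测时的率失真代价（示意）。

    ref_frame: 参考帧，二维 numpy 数组
    cur_block: 当前块像素，二维 numpy 数组
    top_left:  当前块在帧内的左上角坐标 (y, x)
    mv:        待评估的运动矢量 (dy, dx)
    mv_pred:   运动矢量预测值，用于粗略估计编码MV残差所需比特数
    lam:       拉格朗日乘子，此处取值仅为举例
    """
    h, w = cur_block.shape
    y, x = top_left[0] + mv[0], top_left[1] + mv[1]
    if y < 0 or x < 0 or y + h > ref_frame.shape[0] or x + w > ref_frame.shape[1]:
        return float("inf")          # 超出参考帧范围的候选位置直接排除
    ref_block = ref_frame[y:y + h, x:x + w]
    mv_bits = abs(mv[0] - mv_pred[0]) + abs(mv[1] - mv_pred[1])   # 比特数的粗略近似
    return sad(cur_block, ref_block) + lam * mv_bits
```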
宏像素级搜索过程可以基于任一种或多种搜索算法实现。示例性的,全搜索算法、三步搜索算法、四步搜索算法、菱形搜索算法、正方形搜索算法、正六边形搜索算法、
正八边形搜索算法等。
在一些实施例中,宏像素级搜索过程可以包括快速搜索,也可以理解为粗搜索。宏像素级搜索过程可以包括精细搜索。
在一些实施例中，宏像素级搜索过程可以包括快速搜索和精细搜索。示例性的，宏像素级搜索过程包括：从第一起始点开始，以第一宏像素级搜索步长进行宏像素级快速搜索，确定第一搜索点；从第一搜索点开始，以第二宏像素级搜索步长进行宏像素级精细搜索，确定第二搜索点；其中，第一宏像素级搜索步长大于或者等于第二宏像素级搜索步长；将第二搜索点对应的运动矢量作为第二运动矢量。
需要说明的是,快速搜索过程中,第一宏像素级搜索步长不变或者变化。精细搜索过程中,第二宏像素级搜索步长不变或者变化。也就是说,第一宏像素级搜索步长和第二宏像素级搜索步长并不是用于限制某一特定宏像素级搜索步长。
示例性的，将每次的运动矢量搜索步长设置为光场宏像素间距的倍数，如式(1)所示：
dx = dx′ + Lx1 × ΔstepX，dy = dy′ + Ly1 × ΔstepY    (1)
式中，dx和dy为本次迭代中的运动矢量偏移量，dx为水平方向运动矢量偏移量，dy为垂直方向运动矢量偏移量；dx′和dy′为上一次迭代中的运动矢量偏移量；ΔstepX和ΔstepY为本次迭代中增加的搜索步数；Lx1和Ly1为每一步的步长，也可以理解为步长单位，在宏像素级搜索时步长单位设置为一个光场宏像素尺寸，分为水平方向和垂直方向上的尺寸。需要说明的是，在宏像素级搜索时步长单位也可以设置为一个光场宏像素的间距，分为水平方向和垂直方向上的间距。Lx1×ΔstepX和Ly1×ΔstepY为本次迭代中的运动矢量偏移量的增加量，也可以理解为本次迭代中增加的搜索步长。
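下面给出式(1)（以及后文式(2)）所示步长更新的一段示意性Python草图（函数名与示例取值均为说明用的假设）：

```python
def update_offset(prev_offset, delta_steps, step_unit):
    """按式(1)/式(2)的形式更新运动矢量偏移量（示意）。

    prev_offset: (dx', dy') 上一次迭代中的运动矢量偏移量（以像素为单位）
    delta_steps: (ΔstepX, ΔstepY) 本次迭代中增加的搜索步数
    step_unit:   (Lx, Ly) 步长单位；宏像素级搜索时取一个宏像素的尺寸/间距，
                 常规像素级搜索时取一个常规像素（即 1×1）
    """
    dx = prev_offset[0] + step_unit[0] * delta_steps[0]
    dy = prev_offset[1] + step_unit[1] * delta_steps[1]
    return dx, dy

# 例如，假设宏像素间距为 15×15 像素：
# update_offset((0, 0), (2, 1), (15, 15))  -> (30, 15)   宏像素级搜索
# update_offset((30, 15), (1, 0), (1, 1))  -> (31, 15)   常规像素级搜索
```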
在一些实施例中,快速搜索过程可以包括:第一宏像素级搜索步长从一个宏像素开始,且第一宏像素级搜索步长以2的整数次幂递增,按照搜索模板进行搜索,确定每个搜索点的率失真代价值;确定最小率失真代价值对应的搜索点为第一搜索点。
示例性的,第一宏像素级搜索步长以一个宏像素单位(或者一个宏像素间距)开始,以2的整数次幂递增,按照搜索模板在规定的搜索范围内进行搜索。
在一些实施例中,搜索模板包括以下之一:菱形模板、正方形模板、正六边形模板、正八边形模板等。
图7为本申请实施例中宏像素级快速搜索过程的菱形模板示意图。如图7所示,初始搜索点为起始点,搜索步长从1个单位开始,以2的整数次幂形式递增,按照菱形模板在规定的搜索范围内进行搜索,从中选出率失真代价最小的搜索点作为该步骤的结果,这里的步长单位设置为光场图像的宏像素间距。
需要说明的是，图7中每个搜索点对应的方框可以理解为当前块的一个候选参考块，搜索点为候选参考块的左上角像素点，方框只是为了示意性地表示搜索点的位置，并不代表候选参考块的尺寸。
图8为本申请实施例中宏像素级快速搜索过程的正方形模板示意图。如图8所示,初始搜索点为起始点,搜索步长从1个单位开始,以2的整数次幂形式递增,按照正方形模板在规定的搜索范围内进行搜索,从中选出率失真代价最小的搜索点作为该步骤的结果,这里的步长单位设置为光场图像的宏像素间距。
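结合图7和图8，下面给出宏像素级快速搜索的一段示意性Python草图（模板、终止条件与接口均为说明用的假设，并非本申请实施例的具体实现）：

```python
DIAMOND = [(0, -1), (-1, 0), (1, 0), (0, 1)]                                           # 菱形模板
SQUARE = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]    # 正方形模板

def macro_fast_search(start, rd_cost, macro_pitch, search_range, template=DIAMOND):
    """宏像素级快速搜索（示意）：步长从1个宏像素单位开始，以2的整数次幂递增。

    start:        第一起始点对应的运动矢量 (dy, dx)，以像素为单位
    rd_cost:      代价函数，输入运动矢量返回率失真代价（假设接口）
    macro_pitch:  (Ly1, Lx1) 宏像素垂直/水平间距，以像素为单位
    search_range: 允许的最大偏移（以像素为单位）
    返回: (最优运动矢量, 最优点对应的步长(以宏像素为单位), 最小代价)
    """
    best_mv, best_cost, best_step = start, rd_cost(start), 0
    step = 1
    while step * max(macro_pitch) <= search_range:
        for dy, dx in template:
            mv = (start[0] + dy * step * macro_pitch[0],
                  start[1] + dx * step * macro_pitch[1])
            cost = rd_cost(mv)
            if cost < best_cost:
                best_mv, best_cost, best_step = mv, cost, step
        step *= 2                     # 搜索步长以2的整数次幂递增
    return best_mv, best_step, best_cost
```

其中记录最优点对应的步长，是为了供后文的宏像素级精细搜索判断采用两点搜索还是全搜索。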
需要说明的是,宏像素形状不限于圆形,还可为正方形、正六边形、正八边形等。
在一些实施例中,精细搜索过程包括:在快速搜索过程中,确定第一搜索点对应的搜索步长;第一搜索点对应的搜索步长小于或者等于第一步长阈值时,在搜索模
板内以第二宏像素级搜索步长搜索距离第一搜索点最近的两个未搜索过的点,确定最小率失真代价值对应的点为第二搜索点;第一搜索点对应的搜索步长大于第二步长阈值时,对以第一搜索点为中心点的第一搜索范围内以宏像素为单位进行全搜索,确定最小率失真代价值对应的搜索点为第二搜索点;其中,第二步长阈值大于或者等于第一步长阈值。
在一些实施例中,第一搜索范围包括第一搜索点及其周围多个相邻点。示例性的,第一搜索范围为M×M的搜索范围,对第一搜索范围进行全搜索,确定最小率失真代价值对应的搜索点为第二搜索点。
需要说明的是，第一步长阈值和第二步长阈值也为宏像素级的。在一些实施例中，第一步长阈值和第二步长阈值相等，均为一个宏像素单位。在一些实施例中，第一步长阈值和第二步长阈值不相等，例如，第一步长阈值可以为一个宏像素单位，第二步长阈值可以为两个宏像素单位。
在一些实施例中,第二宏像素级搜索步长为一个宏像素单位。
在一些实施例中,第一搜索范围包括以第一搜索点为中心点的M×M的搜索范围;其中,M的取值大于或者等于3。需要说明的是,第一搜索范围也是以宏像素为单位确定,第一搜索范围还可以理解为以第一搜索点所在宏像素为中心点的M×M个宏像素。
图9为本申请实施例中宏像素级精细搜索过程的第一示意图。如图9所示，若快速搜索的最优点对应步长为1，则在该点周围做两点搜索，选出率失真代价最小的搜索点作为精细搜索的结果，这里的搜索步长都是以宏像素为单位。若使用的是菱形模板，则最优点可能是图9中4个方框对应的像素点，假设上方点为最优点，则继续搜索点a和b。若使用的是正方形模板，则最优点可能为中心点周围的8个相邻点，假设左上角的点为最优点，则继续搜索点a和c。
图10为本申请实施例中宏像素级精细搜索过程的第二示意图。如图10所示,若快速搜索的最优点对应步长大于某个阈值(例如,1,2,4等),则以该点为中心在一定搜索范围(例如M×M的搜索范围)内做全搜索,选出率失真代价最小的搜索点作为精细搜索的结果,这里的搜索步长也是以宏像素为单位。
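结合图9和图10，下面给出宏像素级精细搜索的一段示意性Python草图（两点搜索的具体取点、阈值取值等均为说明用的假设）：

```python
def macro_fine_search(fast_best_mv, fast_best_step, rd_cost, macro_pitch,
                      step_thr1=1, step_thr2=1, m=3):
    """宏像素级精细搜索（示意）：根据快速搜索最优点对应的步长选择两点搜索或全搜索。

    fast_best_step:      快速搜索最优点对应的步长（以宏像素为单位）
    step_thr1/step_thr2: 第一/第二步长阈值（以宏像素为单位），默认取值仅为举例
    m:                   全搜索范围 M×M（以宏像素为单位），M>=3
    """
    best_mv, best_cost = fast_best_mv, rd_cost(fast_best_mv)
    if fast_best_step <= step_thr1:
        # 在模板内搜索距离最优点最近的两个未搜索过的点，此处以水平相邻两点为例（假设）
        for dy, dx in [(0, -1), (0, 1)]:
            mv = (fast_best_mv[0] + dy * macro_pitch[0],
                  fast_best_mv[1] + dx * macro_pitch[1])
            cost = rd_cost(mv)
            if cost < best_cost:
                best_mv, best_cost = mv, cost
    elif fast_best_step > step_thr2:
        # 以最优点为中心，在 M×M 个宏像素范围内做全搜索
        r = m // 2
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                mv = (fast_best_mv[0] + dy * macro_pitch[0],
                      fast_best_mv[1] + dx * macro_pitch[1])
                cost = rd_cost(mv)
                if cost < best_cost:
                    best_mv, best_cost = mv, cost
    # 步长介于两个阈值之间时不做精细搜索，直接以快速搜索结果作为第二运动矢量
    return best_mv, best_cost
```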
在一些实施例中,宏像素级搜索过程还包括:第一搜索点对应的搜索步长大于第一步长阈值且小于或者等于第二步长阈值时,不进行宏像素级精细搜索,将第一搜索点对应的运动矢量作为第二运动矢量。需要说明的是,此时第一步长阈值小于第二步长阈值,当第一搜索点对应的搜索步长大于第一步长阈值且小于或者等于第二步长阈值,直接将第一搜索点作为宏像素级搜索的结果,无需执行精细搜索。
在一些实施例中，宏像素级搜索还可以在常规TZsearch搜索算法的基础上实现。
进一步地,以宏像素级搜索确定的像素点作为第二起始点,执行常规像素级搜索,对基于宏像素级搜索得出的搜索点进行像素级的偏移修正。
S603:将第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;
需要说明的是,常规像素可以理解为组成像素阵列的单个像素,宏像素包括多个常规像素。第二运动矢量指向的像素点为一次常规像素级搜索的起始点,用于定位当前块的候选参考块。由于宏像素在水平和垂直维度上包括多个常规像素,因此,常规像素级搜索步长小于宏像素级搜索步长,常规像素级搜索相比于宏像素级搜索是一种细粒度的搜索。
在一些实施例中,常规像素级搜索步长包括常规像素级水平步长和常规像素级垂直步长。在实际应用中,常规像素级水平步长大于等于0,常规像素级垂直步长大于
等于0。常规像素级水平步长为常规像素水平尺寸的整数倍，常规像素级垂直步长为常规像素垂直尺寸的整数倍。在一些实施例中，常规像素级水平步长也可以表示为常规像素水平间距的整数倍，常规像素级垂直步长也可以表示为常规像素垂直间距的整数倍。
需要说明的是,以常规像素级搜索步长进行常规像素级搜索过程可以是,从第二起始点开始,以常规像素级搜索步长进行搜索,确定利用每个搜索点指向的候选参考块进行预测时的率失真代价值,确定最小率失真代价值对应的搜索点,将该点的运动矢量作为第三运动矢量。
示例性的，将每次的运动矢量搜索步长设置为常规像素间距的倍数，如式(2)所示：
dx = dx′ + Lx2 × ΔstepX，dy = dy′ + Ly2 × ΔstepY    (2)
式中，dx和dy为本次迭代中的运动矢量偏移量，dx为水平方向运动矢量偏移量，dy为垂直方向运动矢量偏移量；dx′和dy′为上一次迭代中的运动矢量偏移量；ΔstepX和ΔstepY为本次迭代中增加的搜索步数；Lx2和Ly2为每一步的步长，也可以理解为步长单位，在常规像素级搜索时步长单位设置为一个常规像素尺寸，分为水平方向和垂直方向上的尺寸。需要说明的是，在常规像素级搜索时步长单位也可以设置为一个常规像素的间距，分为水平方向和垂直方向上的间距。
在一些实施例中,常规像素级搜索过程可以包括快速搜索,也可以理解为粗搜索。常规像素级搜索过程可以包括精细搜索。在一些实施例中,常规像素级搜索过程可以包括快速搜索和精细搜索。
也就是说,常规像素级搜索过程中常规像素级搜索步长可以为某一固定值或者变量。示例性的,常规像素级搜索步长可以为一个常规像素单位。
在一些实施例中,常规像素级搜索过程包括:在以第二起始点为中心点的第二搜索范围内搜索相邻点,确定最小率失真代价值对应的第三搜索点;在以第三搜索点为中心点的第二搜索范围内搜索相邻点,确定最小率失真代价值对应的第四搜索点;第二起始点和第四搜索点相同时,确定第二起始点的运动矢量为第三运动矢量;第二起始点和第四搜索点不相同时,在以第四搜索点为中心点的第二搜索范围内搜索相邻点,直到第i次确定的搜索点和第i+2次确定的搜索点相同,将第i次确定的搜索点的运动矢量确定为第三运动矢量。
可以理解的是,第二搜索范围包括每个搜索点及其周围多个相邻点。示例性的,第二搜索范围为N×N的搜索范围,对第二搜索范围内的相邻点进行搜索,确定最小率失真代价值对应的搜索点为第二搜索点。
在一些实施例中,常规像素级搜索过程还包括:初始化第一搜索次数;每次搜索结束后,第一搜索次数执行自加1操作;第一搜索次数等于第一次数阈值时,将常规像素级搜索过程中,最小率失真代价值对应的搜索点的运动矢量为第三运动矢量。需要说明的是,还可以为常规像素级搜索设置搜索次数阈值,当搜索次数达到某一阈值,则提前终止搜索,并以率失真代价最小的最优点作为常规像素级搜索结果。需要说明的是,常规像素级搜索过程中每移动一次中心点执行的搜索可以理解为一次搜索过程。
以S602得到的第二运动矢量指向的像素点作为中心点，在该点周围邻域内做多步像素级精细搜索，示例性的，搜索该点邻域内八个像素点，从中选出率失真代价最小的最优点，并把该最优点作为新的中心点，再次进行相同的邻域内像素级精细搜索。当某次搜索选出的最优点与先前所选一致，则该最优点作为本步骤的结果；当搜索次数达到某一阈值，则提前终止搜索，并以率失真代价最小的最优点作为本步骤的结果。图11为本申请实施例中常规像素级搜索过程的示意图，如图11所示，像素点"0"为初始像素点，以其为中心搜索周围八个像素点，得出最优像素点"1"；在以"1"
为中心做同样的八点搜索得出“2”;以此类推得出“3”;最后以“3”为中心得出“2”,与先前所选一致,故像素点“2”对应的运动矢量便是本步骤的结果。
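结合图11，下面给出常规像素级搜索（八邻域迭代方式）的一段示意性Python草图（次数阈值的取值等均为说明用的假设）：

```python
def regular_pixel_search(start, rd_cost, max_rounds=16):
    """常规像素级搜索（示意）：以八邻域反复搜索，直到最优点不再变化或达到次数阈值。

    start: 第二运动矢量（宏像素级搜索结果），其指向的像素点作为第二起始点
    max_rounds: 第一次数阈值（假设取值），达到后提前终止
    """
    neighbors = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    center, center_cost = start, rd_cost(start)
    for _ in range(max_rounds):
        best_mv, best_cost = center, center_cost
        for dy, dx in neighbors:
            mv = (center[0] + dy, center[1] + dx)
            cost = rd_cost(mv)
            if cost < best_cost:
                best_mv, best_cost = mv, cost
        if best_mv == center:         # 最优点与先前所选一致，搜索结束
            break
        center, center_cost = best_mv, best_cost
    return center, center_cost        # 该点对应的运动矢量即为第三运动矢量
```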
在另一些实施例中,常规像素级搜索过程包括:在以第二起始点为中心点的第二搜索范围以常规像素为单位进行全搜索,确定最小率失真代价值对应的第三搜索点;第二起始点和第三搜索点相同时,确定第二起始点的运动矢量为第三运动矢量;第二起始点和第三搜索点不相同时,在以第三搜索点为中心点的第二搜索范围内以常规像素为单位进行全搜索,直到第i次确定的搜索点和第i+1次确定的搜索点相同,将第i次确定的搜索点的运动矢量确定为第三运动矢量。
可以理解的是,第二搜索范围包括每个搜索点及其周围多个相邻点。示例性的,第二搜索范围为N×N的搜索范围,对第二搜索范围进行全搜索,确定最小率失真代价值对应的搜索点为第二搜索点。需要说明的是,第二搜索范围是以常规像素为单位确定。
以S602得到的第二运动矢量指向的像素点作为中心点，在该点周围邻域内做全搜索。示例性的，在3×3的搜索范围内进行全搜索，从中选出率失真代价最小的最优点，并把该最优点作为新的中心点，再次进行相同的邻域内像素级精细搜索。当某次搜索选出的最优点与先前所选一致，则该最优点作为本步骤的结果；当搜索次数达到某一阈值，则提前终止搜索，并以率失真代价最小的最优点作为本步骤的结果。以图11为例，像素点"0"为初始像素点，以其为中心进行全搜索，得出最优像素点"1"；在以"1"为中心做同样搜索得出"2"；在以"2"为中心做同样搜索得出"2"；与先前所选一致，故像素点"2"对应的运动矢量便是本步骤的结果。
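对应地，下面给出常规像素级搜索（N×N全搜索方式）的一段示意性Python草图（N与次数阈值的取值均为说明用的假设），与上一示例的区别在于每次迭代是在以中心点为中心的N×N范围（包含中心点本身）内做全搜索：

```python
def regular_pixel_full_search(start, rd_cost, n=3, max_rounds=16):
    """常规像素级搜索的另一种实现（示意）：在 N×N 范围内做全搜索并迭代移动中心点。"""
    r = n // 2
    center, center_cost = start, rd_cost(start)
    for _ in range(max_rounds):
        best_mv, best_cost = center, center_cost
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                mv = (center[0] + dy, center[1] + dx)
                cost = rd_cost(mv)
                if cost < best_cost:
                    best_mv, best_cost = mv, cost
        if best_mv == center:         # 第i次与第i+1次确定的搜索点相同，搜索结束
            return center, center_cost
        center, center_cost = best_mv, best_cost
    return center, center_cost
```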
S604:根据第三运动矢量,确定当前块的最佳运动矢量。
在一些实施例中,方法还包括:本次常规像素级搜索结束后,将本次常规像素级搜索得到的第三运动矢量作为新的第一运动矢量,执行下一次宏像素级搜索和下一次常规像素级搜索,确定下一次常规像素级搜索得到的第三运动矢量;
根据第三运动矢量,确定当前块的最佳运动矢量,包括:在本次常规像素级搜索得到的第三运动矢量和下一次常规像素级搜索得到的第三运动矢量相同的情况下,搜索结束,确定当前块的最佳运动矢量。
需要说明的是，S601至S603可以理解为一次搜索过程，每次搜索过程确定一个第三运动矢量，在连续两次搜索到的点不相同的情况下，继续重复S601至S603，直到连续两次搜索到的点相同，即第三运动矢量相同，搜索结束，将该第三运动矢量作为最佳运动矢量。
在一些实施例中,方法还包括:在进行第一次宏像素级搜索之前,初始化第二搜索次数;每次宏像素级搜索和常规像素级搜索结束后,第二搜索次数执行自加1操作;
根据第三运动矢量,确定当前块的最佳运动矢量,包括:第二搜索次数等于第二次数阈值时,搜索结束,将最后一次常规像素级搜索得到的第三运动矢量作为当前块的最佳运动矢量。
需要说明的是,还可以为整个搜索过程设置第二搜索次数阈值,当第二搜索次数达到某一阈值,则提前终止搜索,并以最后一次搜索结果作为最终的搜索结果。
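结合上述说明，下面给出整体搜索流程的一段示意性Python草图（其中 macro_search、regular_search 为假设的接口，分别完成一次宏像素级搜索和一次常规像素级搜索并返回搜索到的运动矢量；第二次数阈值的取值亦为举例）：

```python
def estimate_best_mv(initial_mv, macro_search, regular_search, max_passes=4):
    """整体搜索流程（示意）：交替执行宏像素级搜索与常规像素级搜索，直到第三运动矢量不再变化。"""
    first_mv = initial_mv
    third_mv = None
    for _ in range(max_passes):                   # max_passes 对应第二次数阈值（假设取值）
        second_mv = macro_search(first_mv)        # S602：宏像素级搜索，得到第二运动矢量
        new_third_mv = regular_search(second_mv)  # S603：常规像素级搜索，得到第三运动矢量
        if new_third_mv == third_mv:              # 相邻两次得到的第三运动矢量相同，搜索结束
            return new_third_mv
        third_mv = new_third_mv
        first_mv = third_mv                       # 将本次第三运动矢量作为新的第一运动矢量
    return third_mv                               # 达到次数阈值时，以最后一次结果作为最佳运动矢量
```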
在一些实施例中，该方法还包括：确定当前块的运动矢量预测值；根据当前块的运动矢量预测值和最佳运动矢量，确定当前块的运动矢量残差值；编码当前块的运动矢量残差值，将得到的编码比特写入码流。在解码端，解码码流确定当前块的运动矢量残差值，并确定运动矢量预测值，根据当前块的运动矢量预测值和运动矢量残差值确定当前块的运动矢量，进而执行运动补偿。
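下面给出运动矢量残差值的计算与恢复的一段示意性Python草图（熵编码过程从略，函数名为说明用的假设）：

```python
def encode_mv(best_mv, mv_pred):
    """编码端（示意）：根据最佳运动矢量与运动矢量预测值得到运动矢量残差值（MVD）。"""
    mvd = (best_mv[0] - mv_pred[0], best_mv[1] - mv_pred[1])
    return mvd   # 实际编码时对 mvd 做熵编码后写入码流，此处省略

def decode_mv(mvd, mv_pred):
    """解码端（示意）：由码流解析出的运动矢量残差值与运动矢量预测值恢复当前块的运动矢量。"""
    return (mv_pred[0] + mvd[0], mv_pred[1] + mvd[1])
```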
采用上述编码方法,在宏像素级搜索基础上增加了常规像素级精细搜索,更加充分
地考虑了局部最优点,进一步精细搜索结果,提高了运动估计性能;光场图像上各宏像素间的间距难以保证严格一致,引入像素级的精细搜索提高了对此的鲁棒性。
在本申请的再一实施例中,基于前述实施例相同的发明构思,参见图12,其示出了本申请实施例提供的一种编码器的组成结构示意图。如图12所示,该编码器120可以包括确定单元1201和搜索单元1202;其中,
确定单元1201,配置为确定当前块的第一运动矢量;
搜索单元1202,配置为将第一运动矢量指向的像素点作为第一起始点,以宏像素级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;
搜索单元1202,还配置为将第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;
确定单元1201,还配置为根据第三运动矢量,确定当前块的最佳运动矢量。
在一些实施例中,搜索单元1202,配置为从第一起始点开始,以第一宏像素级搜索步长进行宏像素级快速搜索,确定第一搜索点;从第一搜索点开始,以第二宏像素级搜索步长进行宏像素级精细搜索,确定第二搜索点;其中,第一宏像素级搜索步长大于或者等于第二宏像素级搜索步长;将第二搜索点对应的运动矢量作为第二运动矢量。
在一些实施例中,搜索单元1202,配置为第一宏像素级搜索步长从一个宏像素开始,且第一宏像素级搜索步长以2的整数次幂递增,按照搜索模板进行搜索,确定每个搜索点的率失真代价值;确定最小率失真代价值对应的搜索点为第一搜索点。
在一些实施例中,搜索单元1202,配置为在快速搜索过程中,确定第一搜索点对应的搜索步长;第一搜索点对应的搜索步长小于或者等于第一步长阈值时,在搜索模板内以第二宏像素级搜索步长搜索距离第一搜索点最近的两个未搜索过的点,确定最小率失真代价值对应的点为第二搜索点;第一搜索点对应的搜索步长大于第二步长阈值时,对以第一搜索点为中心点的第一搜索范围内以宏像素为单位进行全搜索,确定最小率失真代价值对应的搜索点为第二搜索点;其中,第二步长阈值大于或者等于第一步长阈值。
在一些实施例中,搜索单元1202,还配置为第一搜索点对应的搜索步长大于第一步长阈值且小于或者等于第二步长阈值时,不进行宏像素级精细搜索,将第一搜索点对应的运动矢量作为第二运动矢量。
在一些实施例中,第二宏像素级搜索步长为一个宏像素单位。
在一些实施例中,第一搜索范围包括以第一搜索点为中心点的M×M的搜索范围;其中,M的取值大于或者等于3。
在一些实施例中,搜索模板包括以下之一:菱形模板、正方形模板、正六边形模板、正八边形模板。
在一些实施例中,宏像素级搜索步长包括宏像素级水平步长和宏像素级垂直步长;宏像素级水平步长为宏像素水平尺寸的整数倍,宏像素级垂直步长为宏像素垂直尺寸的整数倍。
在一些实施例中,搜索单元1202,配置为在以第二起始点为中心点的第二搜索范围内搜索相邻点,确定最小率失真代价值对应的第三搜索点;在以第三搜索点为中心点的第二搜索范围内搜索相邻点,确定最小率失真代价值对应的第四搜索点;第二起始点和第四搜索点相同时,确定第二起始点的运动矢量为第三运动矢量;第二起始点和第四搜索点不相同时,在以第四搜索点为中心点的第二搜索范围内搜索相邻点,直到第i次确定的搜索点和第i+2次确定的搜索点相同,将第i次确定的搜索点的运动矢量确定为第三运动矢量。
在一些实施例中,搜索单元1202,配置为在以第二起始点为中心点的第二搜索范围以常规像素为单位进行全搜索,确定最小率失真代价值对应的第三搜索点;第二起始点和第三搜索点相同时,确定第二起始点的运动矢量为第三运动矢量;第二起始点和第三搜索点不相同时,在以第三搜索点为中心点的第二搜索范围内以常规像素为单位进行全搜索,直到第i次确定的搜索点和第i+1次确定的搜索点相同,将第i次确定的搜索点的运动矢量确定为第三运动矢量。
在一些实施例中,搜索单元1202,配置为初始化第一搜索次数;每次搜索结束后,第一搜索次数执行自加1操作;第一搜索次数等于第一次数阈值时,将常规像素级搜索过程中,最小率失真代价值对应的搜索点的运动矢量为第三运动矢量。
在一些实施例中,常规像素级搜索步长包括常规像素级水平步长和常规像素级垂直步长;常规像素级水平步长为常规像素水平尺寸的整数倍,常规像素级垂直步长为常规像素垂直尺寸的整数倍。
在一些实施例中,确定单元1201,配置为构建运动矢量候选列表;对运动矢量候选列表进行搜索,确定每个运动矢量对应的率失真代价值;确定最小率失真代价值对应的运动矢量为第一运动矢量。
在一些实施例中,确定单元1201,配置为本次常规像素级搜索结束后,将本次常规像素级搜索得到的第三运动矢量作为新的第一运动矢量;执行下一次宏像素级搜索和下一次常规像素级搜索,确定下一次常规像素级搜索得到的第三运动矢量;
确定单元1201,配置为在本次常规像素级搜索得到的第三运动矢量和下一次常规像素级搜索得到的第三运动矢量相同的情况下,搜索结束,确定当前块的最佳运动矢量。
在一些实施例中,确定单元1201,配置为在进行第一次宏像素级搜索之前,初始化第二搜索次数;每次宏像素级搜索和常规像素级搜索结束后,第二搜索次数执行自加1操作;第二搜索次数等于第二次数阈值时,搜索结束,将最后一次常规像素级搜索得到的第三运动矢量作为当前块的最佳运动矢量。
在一些实施例中,确定单元1201,配置为确定当前块的运动矢量预测值;根据当前块的运动矢量预测值和最佳运动矢量,确定当前块的运动矢量残差值;该编码器还可以包括编码单元1203,配置为编码当前块的运动矢量残差值,将得到的编码比特写入码流。
可以理解地,编码器各个功能单元用于执行前述实施例中任一项所述的编码方法。
可以理解地,在本申请实施例中,“单元”可以是部分电路、部分处理器、部分程序或软件等等,当然也可以是模块,还可以是非模块化的。而且在本实施例中的各组成部分可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
集成的单元如果以软件功能模块的形式实现并非作为独立的产品进行销售或使用时,可以存储在一个计算机可读取存储介质中,基于这样的理解,本实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或processor(处理器)执行本实施例方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
因此,本申请实施例提供了一种计算机可读存储介质,应用于编码器120,该计算
机可读存储介质存储有计算机程序,计算机程序被第一处理器执行时实现前述实施例中任一项的方法。
基于编码器120的组成以及计算机可读存储介质,参见图13,其示出了本申请实施例提供的编码器120的具体硬件结构示意图。如图13所示,编码器120可以包括:第一通信接口1301、第一存储器1302和第一处理器1303;各个组件通过第一总线系统1304耦合在一起。可理解,第一总线系统1304用于实现这些组件之间的连接通信。第一总线系统1304除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图13中将各种总线都标为第一总线系统1304。其中,
第一通信接口1301,用于在与其他外部网元之间进行收发信息过程中,信号的接收和发送;
第一存储器1302,用于存储能够在第一处理器1303上运行的计算机程序;
第一处理器1303,用于在运行计算机程序时,执行:
确定当前块的第一运动矢量;
将所述第一运动矢量指向的像素点作为第一起始点,以宏像素级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;
将所述第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;
根据所述第三运动矢量,确定当前块的最佳运动矢量。
可以理解,本申请实施例中的第一存储器1302可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDRSDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DRRAM)。本申请描述的系统和方法的第一存储器1302旨在包括但不限于这些和任意其它适合类型的存储器。
而第一处理器1303可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过第一处理器1303中的硬件的集成逻辑电路或者软件形式的指令完成。上述的第一处理器1303可以是通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于第一存储器1302,第一处理器1303读取第一存储器1302中的信息,结合其硬件完成上述方法的步骤。
可以理解的是,本申请描述的这些实施例可以用硬件、软件、固件、中间件、微码
或其组合来实现。对于硬件实现,处理单元可以实现在一个或多个专用集成电路(Application Specific Integrated Circuits,ASIC)、数字信号处理器(Digital Signal Processing,DSP)、数字信号处理设备(DSP Device,DSPD)、可编程逻辑设备(Programmable Logic Device,PLD)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)、通用处理器、控制器、微控制器、微处理器、用于执行本申请功能的其它电子单元或其组合中。对于软件实现,可通过执行本申请功能的模块(例如过程、函数等)来实现本申请的技术。软件代码可存储在存储器中并通过处理器执行。存储器可以在处理器中或在处理器外部实现。
可选地,作为另一个实施例,第一处理器1303还配置为在运行计算机程序时,执行前述实施例中任一项的方法。
本实施例提供了一种编码器,在该编码器中,进行运动估计时,在宏像素级搜索的基础上增加了常规像素级精细搜索,对宏像素级搜索到的候选参考块位置进行像素级的偏移修正,使其更大程度地增加与当前块的相关性,从而进一步优化编码光场图像的运动估计效果,提高编码器鲁棒性。
在本申请的再一实施例中,参见图14,其示出了本申请实施例提供的一种编解码系统的组成结构示意图。如图14所示,编解码系统140可以包括编码器1401和解码器1402。
在本申请实施例中,编码器1401可以为前述实施例中任一项所述的编码器,解码器1402可以为前述实施例中任一项所述的解码器。
需要说明的是,在本申请中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
本申请所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。本申请所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。本申请所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。
本申请实施例中提供了一种编码方法、编码器以及存储介质,该方法包括:确定当前块的第一运动矢量;将第一运动矢量指向的像素点作为第一起始点,以宏像素级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;将第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;根据第三运动矢量,确定当前块的最佳运动矢量。如此,在宏像素级搜索的基础上增加了常规像素级精细搜索,对宏像素级搜索到的候选参考块位置进行像素级的偏移修正,使其更大程度地增加与当前块的相关性,从而进一步优化编码光场图像的运动估计效果,提高编码器鲁棒性。
Claims (20)
- 一种编码方法,应用于编码器,所述方法包括:确定当前块的第一运动矢量;将所述第一运动矢量指向的像素点作为第一起始点,以宏像素级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;将所述第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;根据所述第三运动矢量,确定当前块的最佳运动矢量。
- 根据权利要求1所述的方法,其中,所述宏像素级搜索过程包括:从所述第一起始点开始,以第一宏像素级搜索步长进行宏像素级快速搜索,确定第一搜索点;从所述第一搜索点开始,以第二宏像素级搜索步长进行宏像素级精细搜索,确定第二搜索点;其中,所述第一宏像素级搜索步长大于或者等于所述第二宏像素级搜索步长;将所述第二搜索点对应的运动矢量作为所述第二运动矢量。
- 根据权利要求2所述的方法,其中,所述宏像素级快速搜索过程包括:所述第一宏像素级搜索步长从一个宏像素开始,且所述第一宏像素级搜索步长以2的整数次幂递增,按照搜索模板进行搜索,确定每个搜索点的率失真代价值;确定最小率失真代价值对应的搜索点为所述第一搜索点。
- 根据权利要求2所述的方法,其中,所述宏像素级精细搜索过程包括:在所述快速搜索过程中,确定所述第一搜索点对应的搜索步长;所述第一搜索点对应的搜索步长小于或者等于第一步长阈值时,在搜索模板内以所述第二宏像素级搜索步长搜索距离所述第一搜索点最近的两个未搜索过的点,确定最小率失真代价值对应的点为所述第二搜索点;所述第一搜索点对应的搜索步长大于所述第二步长阈值时,对以所述第一搜索点为中心点的第一搜索范围内以宏像素为单位进行全搜索,确定最小率失真代价值对应的搜索点为所述第二搜索点;其中,所述第二步长阈值大于或者等于所述第一步长阈值。
- 根据权利要求4所述的方法,其中,所述宏像素级搜索过程还包括:所述第一搜索点对应的搜索步长大于所述第一步长阈值且小于或者等于所述第二步长阈值时,不进行所述宏像素级精细搜索,将所述第一搜索点对应的运动矢量作为所述第二运动矢量。
- 根据权利要求4所述的方法,其中,所述第二宏像素级搜索步长为一个宏像素单位。
- 根据权利要求4所述的方法,其中,所述第一搜索范围包括以所述第一搜索点为中心点的M×M的搜索范围;其中,M的取值大于或者等于3。
- 根据权利要求3或4所述的方法,其中,所述搜索模板包括以下之一:菱形模板、正方形模板、正六边形模板、正八边形模板。
- 根据权利要求1-6任一项所述的方法,其中,所述宏像素级搜索步长包括宏像素级水平步长和宏像素级垂直步长;所述宏像素级水平步长为宏像素水平尺寸的整数倍,所述宏像素级垂直步长为宏像素垂直尺寸的整数倍。
- 根据权利要求1所述的方法,其中,所述常规像素级搜索过程包括:在以所述第二起始点为中心点的第二搜索范围内搜索相邻点,确定最小率失真代价值对应的第三搜索点;在以所述第三搜索点为中心点的第二搜索范围内搜索相邻点,确定最小率失真代价值对应的第四搜索点;所述第二起始点和所述第四搜索点相同时,确定所述第二起始点的运动矢量为所述第三运动矢量;所述第二起始点和所述第四搜索点不相同时,在以所述第四搜索点为中心点的第二搜索范围内搜索相邻点,直到第i次确定的搜索点和第i+2次确定的搜索点相同,将第i次确定的搜索点的运动矢量确定为所述第三运动矢量。
- 根据权利要求1所述的方法,其中,所述常规像素级搜索过程包括:在以所述第二起始点为中心点的第二搜索范围以常规像素为单位进行全搜索,确定最小率失真代价值对应的第三搜索点;所述第二起始点和所述第三搜索点相同时,确定所述第二起始点的运动矢量为所述第三运动矢量;所述第二起始点和所述第三搜索点不相同时,在以所述第三搜索点为中心点的第二搜索范围内以常规像素为单位进行全搜索,直到第i次确定的搜索点和第i+1次确定的搜索点相同,将第i次确定的搜索点的运动矢量确定为所述第三运动矢量。
- 根据权利要求10或11所述的方法,其中,所述常规像素级搜索过程还包括:初始化第一搜索次数;每次搜索结束后,所述第一搜索次数执行自加1操作;所述第一搜索次数等于第一次数阈值时,将常规像素级搜索过程中,最小率失真代价值对应的搜索点的运动矢量为所述第三运动矢量。
- 根据权利要求10-12任一项所述的方法,其中,所述常规像素级搜索步长包括常规像素级水平步长和常规像素级垂直步长;所述常规像素级水平步长为常规像素水平尺寸的整数倍,所述常规像素级垂直步长为常规像素垂直尺寸的整数倍。
- 根据权利要求1所述的方法,其中,所述确定当前块的第一运动矢量,包括:构建运动矢量候选列表;对所述运动矢量候选列表进行搜索,确定每个运动矢量对应的率失真代价值;确定最小率失真代价值对应的运动矢量为所述第一运动矢量。
- 根据权利要求1-14任一项所述的方法,其中,所述方法还包括:本次常规像素级搜索结束后,将本次常规像素级搜索得到的第三运动矢量作为新的第一运动矢量;执行下一次宏像素级搜索和下一次常规像素级搜索,确定下一次常规像素级搜索得到的第三运动矢量;所述根据所述第三运动矢量,确定当前块的最佳运动矢量,包括:在本次常规像素级搜索得到的第三运动矢量和下一次常规像素级搜索得到的第三运动矢量相同的情况下,搜索结束,确定当前块的最佳运动矢量。
- 根据权利要求15所述的方法,其中,所述方法还包括:在进行第一次宏像素级搜索之前,初始化第二搜索次数;每次宏像素级搜索和常规像素级搜索结束后,所述第二搜索次数执行自加1操作;所述根据所述第三运动矢量,确定当前块的最佳运动矢量,包括:所述第二搜索次数等于第二次数阈值时,搜索结束,将最后一次常规像素级搜索 得到的第三运动矢量作为当前块的所述最佳运动矢量。
- 根据权利要求1所述的方法,其中,所述方法还包括:确定当前块的运动矢量预测值;根据当前块的所述运动矢量预测值和所述最佳运动矢量,确定当前块的运动矢量残差值;编码当前块的所述运动矢量残差值,将得到的编码比特写入码流。
- 一种编码器,包括确定单元和搜索单元;其中:所述确定单元,配置为确定当前块的第一运动矢量;所述搜索单元,配置为将所述第一运动矢量指向的像素点作为第一起始点,以宏像素级搜索步长进行宏像素级搜索,确定当前块的第二运动矢量;所述搜索单元,还配置为将所述第二运动矢量指向的像素点作为第二起始点,以常规像素级搜索步长进行常规像素级搜索,确定当前块的第三运动矢量;所述确定单元,还配置为根据所述第三运动矢量,确定当前块的最佳运动矢量。
- 一种编码器,包括第一存储器和第一处理器;其中:所述第一存储器,用于存储能够在所述第一处理器上运行的计算机程序;所述第一处理器,用于在运行所述计算机程序时,执行如权利要求1至17中任一项所述的方法。
- 一种计算机可读存储介质,其中,所述计算机可读存储介质存储有计算机程序,所述计算机程序被执行时实现如权利要求1至17中任一项所述的方法。