CN117596399B

CN117596399B - Transformation parameter determining method and device, electronic equipment and storage medium

Info

Publication number: CN117596399B
Application number: CN202311305558.6A
Authority: CN
Inventors: 樊星星
Original assignee: Xiaohongshu Technology Co ltd
Current assignee: Xiaohongshu Technology Co ltd
Priority date: 2023-10-09
Filing date: 2023-10-09
Publication date: 2024-10-22
Anticipated expiration: 2043-10-09
Also published as: CN117596399A

Abstract

The application discloses a transformation parameter determining method, a transformation parameter determining device, electronic equipment and a storage medium, wherein the transformation parameter determining method comprises the following steps: determining a maximum depth of the first coding block, the maximum depth being used to identify a maximum number of partitions of the first coding block; determining a plurality of candidate segmentation sizes of the first coding block according to the maximum depth; dividing the first coding block according to each candidate division size in the plurality of candidate division sizes to obtain a plurality of second coding blocks; determining a plurality of transformation cores according to the attribute information of each second coding block in the plurality of second coding blocks, wherein the plurality of transformation cores are in one-to-one correspondence with the plurality of second coding blocks; according to each transformation core in the plurality of transformation cores, performing transformation processing on a second coding block corresponding to each transformation core to obtain a plurality of rate-distortion costs, wherein the plurality of rate-distortion costs are in one-to-one correspondence with the plurality of transformation cores; and taking the transformation kernel corresponding to the minimum value and the division size of the second coding block corresponding to the minimum value in the multiple rate distortion costs as transformation parameters of the first coding block.

Description

Transformation parameter determining method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of video coding technologies, and in particular, to a method and apparatus for determining a transformation parameter, an electronic device, and a storage medium.

Background

In AV1 (AOMedia Video 1), transform coding is performed in units of blocks, which are called transform blocks. In general, a coding block may be divided into a plurality of transform blocks in a plurality of ways, and each transform block may be further divided to obtain a plurality of smaller transform blocks. In order to find the best transform block partitioning, the encoder needs to calculate the rate-distortion cost (RD cost) for each possible transform block size and transform type, and then select the size and type with the minimum RD cost as the final transform block size and transform type. Because the transform blocks in AV1 support multiple partition sizes, and each size supports 16 possible transform kernel types, existing methods require traversing all combinations of partition sizes and transform kernels, which results in relatively high computational complexity and low coding efficiency for the AV1 encoder.

Disclosure of Invention

In order to solve the above-mentioned problems in the prior art, embodiments of the present application provide a transformation parameter determining method, apparatus, electronic device, and storage medium, which can quickly determine an optimal segmentation size and transformation kernel of a current coding block without reducing compression performance of an AV1 video encoder, thereby improving coding speed while reducing computation complexity of the AV1 video encoder.

In a first aspect, an embodiment of the present application provides a transformation parameter determining method, including:

determining a maximum depth of the first coding block, wherein the maximum depth is used for identifying a maximum number of divisions of the first coding block;

determining a plurality of candidate segmentation sizes of the first coding block according to the maximum depth;

Dividing the first coding block according to each candidate division size in the plurality of candidate division sizes to obtain a plurality of second coding blocks;

determining a plurality of transformation cores according to attribute information of each second coding block in the plurality of second coding blocks, wherein the plurality of transformation cores are in one-to-one correspondence with the plurality of second coding blocks;

According to each transformation core in the plurality of transformation cores, performing transformation processing on a second coding block corresponding to each transformation core to obtain a plurality of rate-distortion costs, wherein the plurality of rate-distortion costs are in one-to-one correspondence with the plurality of transformation cores;

and taking the transformation kernel corresponding to the minimum value and the division size of the second coding block corresponding to the minimum value in the multiple rate distortion costs as transformation parameters of the first coding block.
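The final selection step above can be sketched as follows. This is a minimal illustration only, assuming each evaluated second coding block is summarized by a hypothetical `Candidate` record; the record, the function name and the cost values are invented for the example and are not taken from the application.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    """Hypothetical record for one evaluated second coding block."""
    partition_size: Tuple[int, int]  # (width, height) of the second coding block
    kernel: str                      # transform kernel chosen for that block
    rd_cost: float                   # rate-distortion cost obtained with that kernel


def select_transform_parameters(candidates: List[Candidate]) -> Tuple[Tuple[int, int], str]:
    """Return the (partition size, kernel) pair with the minimum RD cost."""
    best = min(candidates, key=lambda c: c.rd_cost)
    return best.partition_size, best.kernel


# Illustrative values only; real costs come from the encoder.
example = [
    Candidate((32, 32), "DCT_DCT", 1520.0),
    Candidate((16, 16), "ADST_DCT", 1385.5),
    Candidate((8, 8), "DCT_DCT", 1410.25),
]
```

Because each second coding block already carries a single kernel, the search reduces to one minimum over the candidate list rather than a search over every size and kernel pair.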

In one possible implementation, determining a plurality of transform kernels according to attribute information of each of the plurality of second coding blocks includes:

determining a plurality of candidate transformation kernels according to the attribute information of each second coding block;

performing multiple transformation processing on each second coding block according to the multiple candidate transformation cores, and determining a transformation core corresponding to each second coding block;

Taking a set of transformation cores corresponding to each second coding block as a plurality of transformation cores;

wherein the ith one of the plurality of transform processes includes:

transforming each second coding block according to a candidate transformation core Ai to determine a Hadamard transform absolute difference sum Bi, wherein the candidate transformation core Ai is the ith transformation core in the plurality of candidate transformation cores, and i is an integer greater than or equal to 1;

if the Hadamard transform absolute difference sum Bi is in a first range, determining a candidate transform kernel Ai as a transform kernel corresponding to each second coding block;

If the Hadamard transform absolute difference sum Bi is in a second range, determining a coeff value Ci, and when the coeff value Ci is in a third range, determining a candidate transformation core Ai as a transformation core corresponding to each second coding block, otherwise, ending the ith transformation process, and performing the (i+1) th transformation process until the transformation core corresponding to each second coding block is determined or a plurality of candidate transformation cores are traversed;

If the Hadamard transform absolute difference sum Bi is in the fourth range, directly ending the ith transform processing, and performing the (i+1) th transform processing until the transform core corresponding to each second coding block is determined or a plurality of candidate transform cores are traversed.
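The staged decision described above can be sketched as follows. The application does not state the boundaries of the first to fourth ranges, so this sketch assumes, purely for illustration, that they are ordered thresholds on Bi and Ci. `pick_kernel`, `accept_below`, `check_below` and `coeff_limit` are hypothetical names, and the two callables stand in for the encoder's actual measurements.

```python
from typing import Callable, Optional, Sequence


def pick_kernel(
    candidates: Sequence[str],
    satd_of: Callable[[str], float],   # returns Bi for candidate Ai
    coeff_of: Callable[[str], float],  # returns Ci for candidate Ai
    accept_below: float,               # assumed upper bound of the "first range"
    check_below: float,                # assumed upper bound of the "second range"
    coeff_limit: float,                # assumed upper bound of the "third range"
) -> Optional[str]:
    """Try the candidates in order and stop at the first one that is accepted."""
    for kernel in candidates:
        satd = satd_of(kernel)
        if satd < accept_below:
            # First range: accept immediately.
            return kernel
        if satd < check_below:
            # Second range: accept only if the coeff value is in the third range.
            if coeff_of(kernel) <= coeff_limit:
                return kernel
            continue
        # Fourth range: end this pass and move on to the next candidate.
    # All candidates traversed without a decision; the text leaves this case open.
    return None
```

The coeff value is only computed for candidates whose Bi falls in the middle band, which is where the saving over a full evaluation of every candidate would come from.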

In one possible implementation, the attribute information includes: picture content information and block type information, determining a plurality of candidate transform kernels from attribute information of each second encoded block, comprising:

If the picture content information of each second coding block meets the first condition, determining a plurality of first transformation cores as a plurality of candidate transformation cores of each second coding block, wherein the plurality of first transformation cores are transformation cores comprising IDTX transformation in all transformation cores;

If the picture content information of each second coding block does not meet the first condition and the block type information of each second coding block meets the second condition, determining a plurality of second transformation cores as a plurality of candidate transformation cores of each second coding block, wherein the plurality of second transformation cores are transformation cores which do not comprise FLIPADST transformation in all transformation cores;

Otherwise, all transform kernels are determined as a plurality of candidate transform kernels for each second encoding block.
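The three cases above can be sketched as a simple filter over the 16 kernel names of Table 1. The sketch assumes that the first condition means "the block is screen content" and the second condition means "the block is an intra coded block", which are the examples given in the detailed description; the function and parameter names are hypothetical.

```python
from typing import List

# Kernel names in the order of Table 1 (vertical_horizontal).
ALL_KERNELS: List[str] = [
    "DCT_DCT", "ADST_DCT", "DCT_ADST", "ADST_ADST",
    "FLIPADST_DCT", "DCT_FLIPADST", "FLIPADST_FLIPADST", "ADST_FLIPADST",
    "FLIPADST_ADST", "IDTX_IDTX", "DCT_IDTX", "IDTX_DCT",
    "ADST_IDTX", "IDTX_ADST", "FLIPADST_IDTX", "IDTX_FLIPADST",
]


def candidate_kernels(is_screen_content: bool, is_intra: bool) -> List[str]:
    """Return the candidate kernels for one second coding block."""
    if is_screen_content:
        # Assumed first condition: keep only kernels that involve IDTX.
        return [k for k in ALL_KERNELS if "IDTX" in k]
    if is_intra:
        # Assumed second condition: drop every kernel that involves FLIPADST
        # and make sure DCT_DCT is tried first.
        kept = [k for k in ALL_KERNELS if "FLIPADST" not in k]
        return ["DCT_DCT"] + [k for k in kept if k != "DCT_DCT"]
    # Otherwise: all 16 kernels remain candidates.
    return list(ALL_KERNELS)
```

With the names of Table 1, the first case leaves 7 candidates and the second leaves 9, compared with 16 in the unfiltered case.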

In one possible implementation, when the plurality of candidate transform kernels are the transform kernels that do not include FLIPADST transforms among all the transform kernels, in the 1st transform process, the candidate transform kernel A1 is the DCT_DCT transform kernel.

In one possible implementation, before determining the plurality of candidate transform kernels according to the attribute information of each second coding block, the method further includes:

determining mode information of neighboring coding blocks of each second coding block;

If the mode information meets the third condition and the size information of each second coding block meets the fourth condition, ending the processing of each second coding block;

And if the mode information meets the third condition and the prediction mode of each second coding block meets the fifth condition, carrying out subsequent processing on a designated area in each second coding block, wherein the designated area is a 1/4 area at the upper right corner of each second coding block.

In one possible implementation, determining the maximum depth of the first encoded block includes:

Determining a block type of the first encoded block;

Determining a current depth of the first coding block according to the block type and the size information of the first coding block;

If the block type meets the sixth condition, the maximum depth of the first coding block is the current depth of the first coding block plus 1;

if the block type satisfies the seventh condition, the maximum depth of the first encoded block is the current depth of the first encoded block plus 2.

In one possible implementation, after determining the block type of the first encoded block, the method further comprises:

determining whether a previous depth exists in the first coding block according to the current depth;

if so, acquiring a quantized coefficient after the coding of the coding block corresponding to the previous depth;

when the quantized coefficient satisfies the eighth condition, the determination processing of the transform parameter of the first encoded block is ended.

In a second aspect, an embodiment of the present application provides a transformation parameter determining device, including:

The analysis module is used for determining the maximum depth of the first coding block, wherein the maximum depth is used for identifying the maximum segmentation times of the first coding block, determining a plurality of candidate segmentation sizes of the first coding block according to the maximum depth, and segmenting the first coding block according to each candidate segmentation size in the plurality of candidate segmentation sizes to obtain a plurality of second coding blocks;

the traversal module is used for determining a plurality of transformation cores according to the attribute information of each second coding block in the plurality of second coding blocks, wherein the plurality of transformation cores are in one-to-one correspondence with the plurality of second coding blocks;

and the processing module is used for carrying out transformation processing on the second coding block corresponding to each transformation core according to each transformation core in the plurality of transformation cores to obtain a plurality of rate distortion costs, wherein the plurality of rate distortion costs are in one-to-one correspondence with the plurality of transformation cores, and the transformation core corresponding to the minimum value and the segmentation size of the second coding block corresponding to the minimum value in the plurality of rate distortion costs are used as transformation parameters of the first coding block.

In a third aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory, the processor being coupled to the memory, the memory being used for storing a computer program, and the processor being used for executing the computer program stored in the memory to cause the electronic device to perform the method as in the first aspect.

In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program, the computer program causing a computer to perform the method as in the first aspect.

In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method as in the first aspect.

The implementation of the embodiment of the application has the following beneficial effects:

In the embodiment of the application, the maximum depth of the first coding block is determined, so that a plurality of candidate division sizes of the first coding block are further determined, and then the division sizes are screened, so that the number of the division sizes needing to be traversed is reduced. Then, the first code block is divided according to each candidate division size in the plurality of candidate division sizes, and a plurality of second code blocks are obtained. And then determining the optimal transformation core corresponding to each second coding block according to the attribute information of each second coding block in the plurality of second coding blocks to obtain a plurality of transformation cores. And finally, according to each transformation core in the plurality of transformation cores, carrying out transformation processing on the second coding block corresponding to each transformation core to obtain a plurality of rate distortion costs. And then, taking the transformation kernel corresponding to the minimum value and the division size of the second coding block corresponding to the minimum value in the multiple rate distortion costs as transformation parameters of the first coding block. Therefore, the optimal transformation parameters of the first coding block can be determined by only traversing the combination of a plurality of candidate division sizes determined according to the maximum depth and the second coding block and the corresponding optimal transformation core under the corresponding division sizes. 
Compared with the scheme of traversing all possible combinations of the segmentation size and the transformation kernel in the prior art, the scheme greatly reduces the number of combinations to be traversed by screening the segmentation size and the transformation kernel, and then can quickly determine the optimal segmentation size and the transformation kernel of the current coding block on the premise of not reducing the compression performance of the AV1 video encoder, thereby realizing the improvement of the coding speed while reducing the calculation complexity of the AV1 video encoder.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic hardware structure of a transformation parameter determining device according to an embodiment of the present application;

Fig. 2 is a schematic flow chart of a method for determining transformation parameters according to an embodiment of the present application;

FIG. 3 is a schematic diagram of the rules for recursive partitioning of NxN, N/2xN and N/4xN coding blocks according to an embodiment of the present application;

FIG. 4 is a flowchart of a method for determining an optimal transform kernel for each second encoding block according to an embodiment of the present application;

FIG. 5 is a functional block diagram of a transformation parameter determining device according to an embodiment of the present application;

fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the present application. All other embodiments, based on the embodiments of the application, which are apparent to those of ordinary skill in the art without inventive faculty, are intended to be within the scope of the application.

The terms "first," "second," "third," and "fourth" and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, result, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art will explicitly and implicitly understand that the embodiments described herein may be combined with other embodiments.

Referring to fig. 1, fig. 1 is a schematic hardware structure of a transformation parameter determining device according to an embodiment of the present application. The transformation parameter determination device 100 comprises at least one processor 101, a communication line 102, a memory 103 and at least one communication interface 104.

In this embodiment, the processor 101 may be a general-purpose central processing unit (central processing unit, CPU), microprocessor, application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the program according to the present application.

Communication line 102 may include a pathway to transfer information between the above-described components.

The communication interface 104, which may be any transceiver-like device (e.g., antenna, etc.), is used to communicate with other devices or communication networks, such as ethernet, RAN, wireless local area network (wireless local area networks, WLAN), etc.

The memory 103 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.

In this embodiment, the memory 103 may be independently provided and connected to the processor 101 via the communication line 102. Memory 103 may also be integrated with processor 101. The memory 103 provided by embodiments of the present application may generally have non-volatility. The memory 103 is used for storing computer-executable instructions for executing the scheme of the present application, and is controlled by the processor 101 to execute the instructions. The processor 101 is configured to execute computer-executable instructions stored in the memory 103 to implement the methods provided in the embodiments of the present application described below.

In alternative embodiments, computer-executable instructions may also be referred to as application code, which is not particularly limited in the present application.

In alternative embodiments, processor 101 may include one or more CPUs, such as CPU0 and CPU1 in fig. 1.

In alternative embodiments, the transformation parameter determination device 100 may include multiple processors, such as the processor 101 and the processor 107 in fig. 1. Each of these processors may be a single-core (single-CPU) processor or may be a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).

In an alternative embodiment, if the transformation parameter determining device 100 is a server, it may be, for example, a stand-alone server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN), and big data and artificial intelligence platforms. The transformation parameter determining device 100 may further comprise an output device 105 and an input device 106. The output device 105 communicates with the processor 101 and may display information in a variety of ways. For example, the output device 105 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector. The input device 106 is in communication with the processor 101 and may receive user input in a variety of ways. For example, the input device 106 may be a mouse, a keyboard, a touch screen device, a sensing device, or the like.

The transformation parameter determining device 100 may be a general-purpose device or a special-purpose device. The embodiment of the present application does not limit the type of the transformation parameter determining device 100.

Hereinafter, a transformation parameter determining method disclosed in the present application will be described.

Referring to fig. 2, fig. 2 is a flow chart of a method for determining transformation parameters according to an embodiment of the present application. The transformation parameter determining method comprises the following steps:

201: a maximum depth of the first encoded block is determined.

In this embodiment, the maximum depth is used to identify the maximum number of partitions of the first encoded block. In video coding standards based on hybrid coding frameworks, transform coding is typically applied to the prediction residual pixels to remove possible spatial correlation. In the AV1 video coding standard, the largest transform block size is 64x64 and the smallest transform block size is 4x4. In addition to the NxN square block partition mode, rectangular transform block partitions of NxN/2, N/2xN, NxN/4 and N/4xN are supported. A total of 19 transform block sizes are supported: 64x64, 64x32, 32x64, 64x16, 16x64, 32x32, 32x16, 16x32, 32x8, 8x32, 16x16, 16x8, 8x16, 16x4, 4x16, 8x8, 8x4, 4x8 and 4x4. The size of a given transform block is determined by the transform partition mode and the size of the coding block.

For inter coded blocks, all transform blocks may be recursively partitioned in size. Typically, the initial size of the recursive transform block partition is equal to the size of the coded block; when the coded block is larger than 64x64, the initial size does not exceed 64x64. For the luminance component, the transform blocks support at most 2 recursive divisions (i.e. 1 coding block can be divided into at most 21 transform blocks), and the rules for recursive division of NxN, N/2xN and N/4xN coding blocks are shown in fig. 3.

In brief, a 1:1 square block may either be left unsplit or be split once into four 1:1 square blocks, each of which may then be split a second time or left unsplit. A 1:2 or 2:1 rectangular block may either be left unsplit or be split once into two 1:1 square blocks, each of which may then be split a second time or left unsplit. A 1:4 or 4:1 rectangular block may either be left unsplit or be split once into two 1:2 or 2:1 rectangular blocks, each of which may then be split a second time or left unsplit.
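The single-split rule for the three block shapes can be sketched as follows. This is an illustrative helper, not the encoder's actual routine; `split_once` is a hypothetical name, and the guard against splitting a 4x4 block is an assumption that follows from 4x4 being the smallest transform size.

```python
from typing import List, Tuple


def split_once(width: int, height: int) -> List[Tuple[int, int]]:
    """Return the (width, height) of each sub-block produced by one split."""
    if width == height:
        if width <= 4:
            raise ValueError("a 4x4 block cannot be split further")
        # 1:1 square block -> four 1:1 square blocks of half the side length.
        return [(width // 2, height // 2)] * 4
    if width == 2 * height:
        # 2:1 rectangular block -> two 1:1 square blocks.
        return [(height, height)] * 2
    if height == 2 * width:
        # 1:2 rectangular block -> two 1:1 square blocks.
        return [(width, width)] * 2
    if width == 4 * height:
        # 4:1 rectangular block -> two 2:1 rectangular blocks.
        return [(width // 2, height)] * 2
    if height == 4 * width:
        # 1:4 rectangular block -> two 1:2 rectangular blocks.
        return [(width, height // 2)] * 2
    raise ValueError("unsupported aspect ratio")
```

For instance, `split_once(32, 16)` yields two 16x16 blocks, and applying the function again to each 16x16 block yields 8x8 blocks.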

For an intra coded block, all transform blocks partitioned from it must have the same size. For example, for a 32x16 coded block, the first partition yields two 16x16 transform blocks, and a second partition on that basis yields 8x8 transform blocks. That is, the transform blocks of an intra coded block may also be obtained by recursive partitioning; the initial size of the recursive partition is generally equal to the size of the coded block, and for the luminance component at most 2 recursive divisions are supported.

In this embodiment, the transform block sizes of the chrominance components of either inter-coded or intra-coded blocks are generally equal to the size of the chrominance components of the encoded block, and if the width or height of the chrominance components of the encoded block is greater than 32, the corresponding chrominance transform block width or height will be 32.

In this embodiment, the depth information is used to identify the depth of the current coding block in the partition tree established by the above-described partition rules. Illustratively, taking the above-mentioned partition rules for the 3 shapes of inter-coded blocks as an example, the largest size of each shape is taken as depth 1, and the depth is increased by 1 each time a partition is performed. Specifically, taking the NxN shape as an example, the corresponding maximum size is 64x64, and the depth of a transform block having a size of 64x64 is set to 1. The next partition yields 4 transform blocks of 32x32, so the depth of a transform block of size 32x32 is 2, and so on.

Based on this, in the present embodiment, the block type of the first encoded block may be determined, and then the current depth of the first encoded block may be determined according to the block type and the size information of the first encoded block. If the block type satisfies the sixth condition, the maximum depth of the first encoded block is the current depth of the first encoded block plus 1; if the block type satisfies the seventh condition, the maximum depth of the first encoded block is the current depth of the first encoded block plus 2. Specifically, if the current block is an inter-coded block, its maximum depth is 1 greater than the current depth, i.e., the transform block is partitioned at most once more; if the current block is an intra-coded block, its maximum depth is 2 greater than the current depth.
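The depth bookkeeping can be sketched as follows, restricted to square blocks for brevity. The sketch assumes that the sixth condition corresponds to an inter-coded block and the seventh to an intra-coded block, as in the example given in the text; both function names are hypothetical.

```python
def square_block_depth(size: int) -> int:
    """Depth of an NxN transform block, counting 64x64 as depth 1."""
    if size not in (64, 32, 16, 8, 4):
        raise ValueError("unsupported square transform size")
    # 64 -> 1, 32 -> 2, 16 -> 3, 8 -> 4, 4 -> 5
    return (64 // size).bit_length()


def max_depth(current_depth: int, is_intra: bool) -> int:
    """Maximum depth: current depth plus 2 for intra blocks, plus 1 for inter blocks."""
    return current_depth + (2 if is_intra else 1)
```

A 32x32 block therefore starts at depth 2 and may reach depth 4 when intra coded, or depth 3 when inter coded.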

In the present embodiment, after determining the block type of the first encoded block, it may also be determined whether the first encoded block has a previous depth, that is, whether the current depth of the first encoded block is 1. If the current depth is not 1, the first encoded block is a lower-level encoded block obtained by dividing an upper-level encoded block, and a previous depth exists. In this case, the quantized coefficient obtained after encoding the encoded block corresponding to the previous depth, that is, the upper-level encoded block of the first encoded block, may be acquired. Then, when the quantized coefficient satisfies the eighth condition, for example, the quantized coefficient is 0, the determination processing of the transform parameter of the first encoded block may be ended.
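This early exit can be sketched as follows. The sketch reads "the quantized coefficient is 0" as "every quantized coefficient of the upper-level block is zero", which is an interpretation rather than something the text states; `should_end_search` is a hypothetical name.

```python
from typing import Sequence


def should_end_search(current_depth: int, parent_quantized: Sequence[int]) -> bool:
    """Return True when the transform parameter search for this block can stop."""
    if current_depth == 1:
        # No previous depth exists, so there is nothing to inspect.
        return False
    # Assumed eighth condition: the upper-level block quantized to all zeros.
    return all(c == 0 for c in parent_quantized)
```

The intuition is that if the larger block already quantizes to zeros, splitting it further is unlikely to lower the rate-distortion cost.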

202: A plurality of candidate partition sizes for the first encoded block are determined based on the maximum depth.

For example, for an intra-coded block of size 32x32, the maximum depth is 2 greater than the current depth, i.e. a maximum of 2 more partitions of transform blocks may be performed. Based on the segmentation rule, if the segmentation is performed for 0 times, 1 coding block of 32x32 can be obtained; if 1 division is performed, 4 coding blocks of 16x16 can be obtained; if the division is performed 2 times, the following combination can be obtained:

(1) 16 8x8 code blocks;

(2) 1 16x16 code block and 12 8x8 code blocks;

(3) 2 16x16 code blocks and 8 8x8 code blocks;

(4) 3 16x16 code blocks and 4 8x8 code blocks.

Thus, for the intra-coded block of size 32x32, its corresponding plurality of candidate partition sizes are 32x32, 16x16, and 8x8, respectively.
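The enumeration for this example can be reproduced with a short sketch. It assumes a square block whose first split produces four half-size blocks, each of which may or may not be split again; the function names are hypothetical.

```python
from typing import Dict, List


def candidate_sizes(size: int, extra_splits: int) -> List[int]:
    """Side lengths reachable with 0..extra_splits further splits of a square block."""
    return [size >> depth for depth in range(extra_splits + 1)]


def second_level_mixes(size: int) -> List[Dict[int, int]]:
    """Block counts per side length when k of the four half-size blocks are split again."""
    half, quarter = size // 2, size // 4
    mixes = []
    for k in range(1, 5):
        mix = {quarter: 4 * k}      # each split half-size block gives 4 quarter-size blocks
        if k < 4:
            mix[half] = 4 - k       # half-size blocks that were left unsplit
        mixes.append(mix)
    return mixes
```

For `size = 32`, `candidate_sizes(32, 2)` gives the side lengths 32, 16 and 8, and `second_level_mixes(32)` gives the four combinations listed above, from 3 16x16 blocks plus 4 8x8 blocks up to 16 8x8 blocks.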

203: And dividing the first coding block according to each candidate division size in the plurality of candidate division sizes to obtain a plurality of second coding blocks.

Continuing with the example of the intra-coded block of size 32x32, the block may be partitioned according to the partition combinations listed above to obtain the second coded blocks of each candidate partition size.

204: A plurality of transform kernels are determined based on attribute information of each of the plurality of second encoded blocks.

In this embodiment, the plurality of transform cores are in one-to-one correspondence with the plurality of second code blocks, and are optimal transform cores corresponding to each of the second code blocks.

In AV1, a total of 4 different one-dimensional transform kernels are supported: the discrete cosine transform (DCT), the asymmetric discrete sine transform (ADST), the flipped asymmetric discrete sine transform (FLIPADST), and the identity transform (IDTX). Since the two-dimensional transform of a transform block can be performed independently in two one-dimensional directions, i.e., the horizontal and vertical directions, there are 16 possible combined transform kernels, as shown in Table 1:

TABLE 1

Transform kernel index	Transform kernel name	Vertical direction	Horizontal direction
0	DCT_DCT	DCT	DCT
1	ADST_DCT	ADST	DCT
2	DCT_ADST	DCT	ADST
3	ADST_ADST	ADST	ADST
4	FLIPADST_DCT	FLIPADST	DCT
5	DCT_FLIPADST	DCT	FLIPADST
6	FLIPADST_FLIPADST	FLIPADST	FLIPADST
7	ADST_FLIPADST	ADST	FLIPADST
8	FLIPADST_ADST	FLIPADST	ADST
9	IDTX_IDTX	IDTX	IDTX
10	DCT_IDTX	DCT	IDTX
11	IDTX_DCT	IDTX	DCT
12	ADST_IDTX	ADST	IDTX
13	IDTX_ADST	IDTX	ADST
14	FLIPADST_IDTX	FLIPADST	IDTX
15	IDTX_FLIPADST	IDTX	FLIPADST
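The count of 16 follows from pairing each of the 4 one-dimensional transforms in the vertical direction with each of the 4 in the horizontal direction. A minimal sketch, using the vertical_horizontal naming of Table 1 (the iteration order here is not the index order of the table, and `combined_kernels` is a hypothetical name):

```python
from itertools import product
from typing import Dict, Tuple

ONE_D_TRANSFORMS = ("DCT", "ADST", "FLIPADST", "IDTX")


def combined_kernels() -> Dict[str, Tuple[str, str]]:
    """Map each combined kernel name to its (vertical, horizontal) 1-D transforms."""
    return {
        f"{vertical}_{horizontal}": (vertical, horizontal)
        for vertical, horizontal in product(ONE_D_TRANSFORMS, repeat=2)
    }
```

Looking up a name such as `FLIPADST_DCT` returns the pair `("FLIPADST", "DCT")`, matching the vertical and horizontal columns of the table.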

In the current solution, in order to obtain the best coding performance, the AV1 encoder needs to traverse all the above transform kernels under all possible partition sizes for coding evaluation, respectively, and then select the combination of partition size and transform kernel that can produce the best performance. The computational complexity of an exhaustive search can be high and therefore a method needs to be found to evaluate as few partition sizes and transform kernels as possible without significant loss of video compression performance.

In the present embodiment, steps 201 to 203 screen the partition sizes and thereby reduce the number of partition sizes to evaluate. The following describes in detail how the transform kernels are screened, that is, how the optimal transform kernel is determined for each second coding block obtained by dividing at the screened partition sizes.

Specifically, as shown in fig. 4, the method of determining the optimal transform kernel for each second coding block includes:

401: A plurality of candidate transform kernels are determined according to the attribute information of each second coding block.

In this embodiment, the attribute information may include picture content information and block type information. If the picture content information of a second coding block satisfies a first condition, for example, the picture content information indicates that the picture content of the second coding block is screen content, then, because the IDTX transform is well suited to screen content, the transform kernels that include the IDTX transform (the first transform kernels) may be determined as the candidate transform kernels of that second coding block. Screen content means that the current picture originates from a screen, for example video obtained by screen recording.

If the picture content information of a second coding block does not satisfy the first condition and its block type information satisfies a second condition, for example, the block type information indicates that the second coding block is an intra coding block, the transform kernels that do not include the FLIPADST transform (the second transform kernels) may be determined as the candidate transform kernels of that second coding block. That is, the candidates are the transform kernels remaining in Table 1 after FLIPADST_FLIPADST, FLIPADST_DCT, DCT_FLIPADST, FLIPADST_IDTX, IDTX_FLIPADST, FLIPADST_ADST, and ADST_FLIPADST are removed. Under this condition, the DCT_DCT transform kernel may also be given the highest priority, so that it is processed first in the subsequent transform processing.

If the picture content information of a second coding block does not satisfy the first condition and its block type information does not satisfy the second condition, all transform kernels are determined as the candidate transform kernels of that second coding block.
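The three cases of step 401 can be sketched as follows. This is an illustrative sketch only: the function name `candidate_kernels` and its boolean parameters are hypothetical, and the first and second conditions are assumed to be the examples given above (screen content, intra coding block).

```python
# The 16 combined kernels of Table 1, in index order.
ALL_KERNELS = [
    "DCT_DCT", "ADST_DCT", "DCT_ADST", "ADST_ADST",
    "FLIPADST_DCT", "DCT_FLIPADST", "FLIPADST_FLIPADST", "ADST_FLIPADST",
    "FLIPADST_ADST", "IDTX_IDTX", "DCT_IDTX", "IDTX_DCT",
    "ADST_IDTX", "IDTX_ADST", "FLIPADST_IDTX", "IDTX_FLIPADST",
]


def candidate_kernels(is_screen_content, is_intra_block):
    """Return the candidate kernel list for one second coding block.

    Assumes the first condition is "picture content is screen content"
    and the second condition is "block is an intra coding block".
    """
    if is_screen_content:
        # First condition: keep only kernels that include IDTX.
        return [k for k in ALL_KERNELS if "IDTX" in k.split("_")]
    if is_intra_block:
        # Second condition: drop every kernel that includes FLIPADST,
        # and move DCT_DCT to the front so it is evaluated first.
        kept = [k for k in ALL_KERNELS if "FLIPADST" not in k.split("_")]
        return ["DCT_DCT"] + [k for k in kept if k != "DCT_DCT"]
    # Neither condition holds: all kernels remain candidates.
    return list(ALL_KERNELS)
```

With Table 1 as given, the screen-content case yields 7 candidates, the intra case yields 9 candidates with DCT_DCT first, and the remaining case yields all 16.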

402: Multiple transform processes are performed on each second coding block according to the plurality of candidate transform kernels, and the transform kernel corresponding to each second coding block is determined.

In this embodiment, each of the plurality of candidate transform kernels may be sequentially selected, and the transform processing may be performed on the second encoded block. Specifically, the ith transform process may include:

First, each second coding block is transformed according to a candidate transform kernel Ai, and the Hadamard-transform sum of absolute transformed differences (SATD) Bi is determined, where the candidate transform kernel Ai is the i-th of the plurality of candidate transform kernels and i is an integer greater than or equal to 1. In the special case mentioned in step 401, where the candidate transform kernels are the transform kernels that do not include the FLIPADST transform, the candidate transform kernel A1 in the 1st transform process is fixed as the DCT_DCT transform kernel.

Because the SATD is produced during the transform processing, it is available before the entire transform process completes. The SATD value can therefore be examined first to decide how to proceed. Specifically, if the SATD Bi is in a first range, for example SATD is less than 1, the current candidate transform kernel Ai may be directly determined as the transform kernel corresponding to the second coding block. In that case the current transform kernel is taken as the optimal transform kernel of that second coding block without completing the remaining transform processing and without processing the candidate transform kernels that have not yet been tried, which saves a large amount of computing resources and improves the overall transform efficiency.

If the SATD Bi is in a second range, for example SATD is greater than or equal to 1 and less than or equal to 2, the coeff value Ci of the transform process is determined. When the coeff value Ci is in a third range, for example coeff is less than 1, the candidate transform kernel Ai may be directly determined as the transform kernel corresponding to the second coding block. Otherwise, the current transform kernel is not the optimal transform kernel for that second coding block; the i-th transform process ends and the (i+1)-th transform process is performed, until the transform kernel corresponding to the second coding block is determined or all candidate transform kernels have been traversed.

If the SATD Bi is in a fourth range, for example SATD is greater than 2, the transform result under the current transform kernel is too poor for it to be the optimal transform kernel of the second coding block. The i-th transform process can be ended directly and the (i+1)-th transform process performed, until the transform kernel corresponding to the second coding block is determined or all candidate transform kernels have been traversed.

In this embodiment, if all candidate transform kernels have been traversed and none satisfies the above conditions, the transform kernel with the smallest SATD and coeff values among the candidate transform kernels is selected as the optimal transform kernel of the second coding block.
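The early-termination loop of step 402 can be sketched as below. The thresholds are the example values from the text (SATD below 1, SATD between 1 and 2, coeff below 1). The callables `satd_of` and `coeff_of` are hypothetical stand-ins for the encoder's measurements. The fallback ordering (smallest SATD first, then smallest coeff) is an assumption, since the text only says the kernel with the relatively smallest SATD and coeff is chosen.

```python
import math

# Example range boundaries taken from the text; real values may differ.
SATD_LOW = 1.0      # first range:  SATD < 1
SATD_HIGH = 2.0     # second range: 1 <= SATD <= 2; fourth range: SATD > 2
COEFF_LIMIT = 1.0   # third range:  coeff < 1


def select_kernel(candidates, satd_of, coeff_of):
    """Pick the transform kernel for one second coding block.

    candidates: non-empty list of kernel names, in evaluation order.
    satd_of / coeff_of: hypothetical callables mapping a kernel to its
    SATD value and its coeff value for this block.
    """
    scored = []
    for kernel in candidates:
        satd = satd_of(kernel)
        if satd < SATD_LOW:
            return kernel              # first range: accept immediately
        coeff = math.inf               # coeff is not computed for SATD > 2
        if satd <= SATD_HIGH:
            coeff = coeff_of(kernel)   # second range: consult coeff
            if coeff < COEFF_LIMIT:
                return kernel          # third range: accept
        scored.append((satd, coeff, kernel))
    # No kernel met the conditions: fall back to the smallest
    # (SATD, coeff) pair. The ordering is an assumption of this sketch.
    return min(scored, key=lambda s: (s[0], s[1]))[2]
```

For example, with SATD values of 3.0, 1.5, and 0.5 for three candidates in that order, the third candidate is returned as soon as its SATD is seen, and no coeff value is needed for it.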

403: The set of transform kernels corresponding to the second coding blocks is taken as the plurality of transform kernels.

In this embodiment, after determining the transform core corresponding to each second coding block, that is, the optimal transform core, the set of transform cores may be used as the plurality of transform cores.

In an alternative embodiment, before the multiple transform processes are performed on each second coding block according to its candidate transform kernels, whether processing of the second coding block can be skipped may be determined from the information of the second coding block and of its adjacent coding blocks, which further improves the overall transform efficiency. Specifically, the mode information of the adjacent coding blocks of each second coding block, for example the left and upper coding blocks, may first be determined. If the mode information satisfies a third condition, for example the mode information is skip mode, and the size information of the second coding block satisfies a fourth condition, for example the size is greater than 16×16, processing of the second coding block may be ended; that is, the second coding block is skipped and the next second coding block is processed.

If the mode information satisfies the third condition and the prediction mode of the second coding block satisfies a fifth condition, for example the prediction mode is NEAREST_NEARESTMV, the subsequent processing may be performed only on a designated region, namely the 1/4 region at the upper right corner of the second coding block, to reduce the computing power required and the computational complexity.
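The pre-check of this alternative embodiment might look like the following sketch. All names (`Action`, `pre_check`, the mode strings) are hypothetical. The sketch assumes that both the left and upper neighbours must be in skip mode, that "greater than 16×16" is compared by block area, and that the size check takes precedence over the prediction-mode check; the text does not settle these points.

```python
from enum import Enum


class Action(Enum):
    SKIP_BLOCK = "skip_block"          # end processing of this block
    PARTIAL_REGION = "partial_region"  # process only the designated 1/4 region
    FULL_SEARCH = "full_search"        # run the normal kernel search


def pre_check(left_mode, above_mode, width, height, prediction_mode):
    """Decide how much work to spend on one second coding block."""
    # Third condition (assumed): both neighbours are in skip mode.
    neighbours_skip = left_mode == "SKIP" and above_mode == "SKIP"
    if not neighbours_skip:
        return Action.FULL_SEARCH
    # Fourth condition (assumed): block area larger than 16x16.
    if width * height > 16 * 16:
        return Action.SKIP_BLOCK
    # Fifth condition: prediction mode is NEAREST_NEARESTMV.
    if prediction_mode == "NEAREST_NEARESTMV":
        return Action.PARTIAL_REGION
    return Action.FULL_SEARCH
```

A caller would run this check before the kernel search and move on to the next second coding block whenever `Action.SKIP_BLOCK` is returned.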

205: Each second coding block is transformed according to its corresponding transform kernel among the plurality of transform kernels to obtain a plurality of rate-distortion costs.

In this embodiment, a plurality of rate-distortion costs (Rate Distortion Cost, RD Cost) are in one-to-one correspondence with a plurality of transform kernels.

206: The transform kernel corresponding to the minimum of the plurality of rate-distortion costs, and the partition size of the second coding block corresponding to that minimum, are taken as the transform parameters of the first coding block.
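Steps 205 and 206 amount to a minimum search over the evaluated combinations, which can be sketched as follows. The function names and the tuple layout are hypothetical. The text does not state how the rate-distortion cost is computed, so `rd_cost` uses the common Lagrangian form J = D + λ·R purely as an assumption.

```python
def rd_cost(distortion, rate, lam):
    """Common Lagrangian RD cost, J = D + lambda * R (assumed form)."""
    return distortion + lam * rate


def choose_transform_parameters(evaluations):
    """Return (partition_size, kernel) with the smallest RD cost.

    evaluations: non-empty iterable of (partition_size, kernel, cost)
    tuples, one per second coding block and its selected kernel.
    """
    best = min(evaluations, key=lambda e: e[2])
    return best[0], best[1]


# Illustrative values only; they are not measurements from the source.
example = [
    ((64, 64), "DCT_DCT", rd_cost(100.0, 10.0, 2.0)),
    ((32, 32), "ADST_DCT", rd_cost(80.0, 9.0, 2.0)),
    ((16, 16), "DCT_DCT", rd_cost(90.0, 6.0, 2.0)),
]
chosen = choose_transform_parameters(example)
```

In this illustration the costs are 120.0, 98.0, and 102.0, so the 32×32 partition with the ADST_DCT kernel is chosen.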

In summary, in the transform parameter determining method provided by the present invention, the maximum depth of the first coding block is determined and used to determine a plurality of candidate partition sizes of the first coding block, which screens the partition sizes and reduces the number that must be traversed. The first coding block is then divided according to each of the candidate partition sizes to obtain a plurality of second coding blocks. Next, the optimal transform kernel corresponding to each second coding block is determined according to the attribute information of that block, yielding a plurality of transform kernels. Finally, each second coding block is transformed with its corresponding transform kernel to obtain a plurality of rate-distortion costs, and the transform kernel and the partition size of the second coding block corresponding to the minimum rate-distortion cost are taken as the transform parameters of the first coding block. The optimal transform parameters of the first coding block can therefore be determined by traversing only the combinations of the candidate partition sizes determined from the maximum depth with the optimal transform kernels of the second coding blocks at those partition sizes.
Compared with prior-art schemes that traverse all possible combinations of partition size and transform kernel, this scheme screens both the partition sizes and the transform kernels and thereby greatly reduces the number of combinations to be traversed. The optimal partition size and transform kernel of the current coding block can then be determined quickly without reducing the compression performance of the AV1 video encoder, which lowers the computational complexity of the encoder and increases coding speed.

Referring to fig. 5, fig. 5 is a functional block diagram of a transformation parameter determining device according to an embodiment of the present application. As shown in fig. 5, the transformation parameter determining device 500 includes:

an analysis module 501, configured to determine a maximum depth of the first coding block, where the maximum depth is used to identify a maximum number of partitions of the first coding block, determine a plurality of candidate partition sizes of the first coding block according to the maximum depth, and partition the first coding block according to each candidate partition size of the plurality of candidate partition sizes to obtain a plurality of second coding blocks;

The traversal module 502 is configured to determine a plurality of transformation kernels according to attribute information of each of the plurality of second encoding blocks, where the plurality of transformation kernels are in one-to-one correspondence with the plurality of second encoding blocks;

and the processing module 503 is configured to perform a transform process on the second coding block corresponding to each transform core according to each transform core in the plurality of transform cores, to obtain a plurality of rate-distortion costs, where the plurality of rate-distortion costs are in one-to-one correspondence with the plurality of transform cores, and the transform core corresponding to the minimum value and the partition size of the second coding block corresponding to the minimum value in the plurality of rate-distortion costs are used as the transform parameters of the first coding block.

In an embodiment of the present invention, in determining a plurality of transformation kernels according to the attribute information of each of the plurality of second coding blocks, the traversing module 502 is specifically configured to:

wherein the ith one of the plurality of transform processes includes:

In an embodiment of the present invention, the attribute information includes: picture content information and block type information, based on which, in determining a plurality of candidate transform kernels from the attribute information of each second encoded block, the traversal module 502 is specifically configured to:

In an embodiment of the present invention, when the plurality of candidate transform kernels are the transform kernels that do not include the FLIPADST transform, the candidate transform kernel A1 in the 1st transform process is the DCT_DCT transform kernel.

In an embodiment of the present invention, before determining a plurality of candidate transform kernels according to the attribute information of each second encoding block, the traversing module 502 is further configured to:

In an embodiment of the present invention, the analysis module 501 is specifically configured to, in determining the maximum depth of the first coding block:

Determining a block type of the first encoded block;

In an embodiment of the present invention, after determining the block type of the first encoded block, the analysis module 501 is further configured to:

Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 6, the electronic device 600 includes a transceiver 601, a processor 602, and a memory 603, which are connected by a bus 604. The memory 603 is used for storing computer programs and data, and the data stored in the memory 603 can be transferred to the processor 602.

The processor 602 is configured to read a computer program in the memory 603 to perform the following operations:

In an embodiment of the present invention, the processor 602 is specifically configured to, in determining a plurality of transform kernels according to attribute information of each of the plurality of second encoding blocks:

wherein the ith one of the plurality of transform processes includes:

In an embodiment of the present invention, the attribute information includes: picture content information and block type information, based on which the processor 602 is specifically configured to perform the following operations in determining a plurality of candidate transform kernels from the attribute information of each second encoded block:

In an embodiment of the present invention, before determining the plurality of candidate transform kernels according to the attribute information of each second encoding block, the processor 602 is further configured to:

In an embodiment of the present invention, the processor 602 is specifically configured to, in determining the maximum depth of the first encoded block, perform the following operations:

Determining a block type of the first encoded block;

In an embodiment of the present invention, after determining the block type of the first encoded block, the processor 602 is further configured to:

It should be understood that the transformation parameter determining device in the present application may include a smartphone (such as an Android phone, an iOS phone, or a Windows Phone device), a tablet computer, a palmtop computer, a notebook computer, a mobile internet device (MID), a robot, a wearable device, etc. The devices listed above are merely examples and are not exhaustive; the transformation parameter determining device includes but is not limited to them. In practical application, the transformation parameter determining device may further include intelligent vehicle terminals, computer devices, etc.

From the above description of embodiments, it will be apparent to those skilled in the art that the present invention may be implemented in software in combination with a hardware platform. With such understanding, all or part of the technical solution of the present invention contributing to the background art may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the methods described in the various embodiments or parts of the embodiments of the present invention.

Accordingly, the present application also provides a computer-readable storage medium storing a computer program that is executed by a processor to implement some or all of the steps of any one of the transformation parameter determination methods described in the above method embodiments. For example, the storage medium may include a hard disk, a floppy disk, an optical disk, a magnetic tape, a magnetic disk, a flash memory, etc.

Embodiments of the present application also provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform part or all of the steps of any one of the transformation parameter determination methods described in the method embodiments above.

It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are alternative embodiments, and that the acts and modules involved are not necessarily required for the present application.

In the foregoing embodiments, the descriptions of the embodiments are focused on, and for those portions of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, such as the division of the units, merely a logical function division, and there may be additional divisions when actually implemented, such as multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, or may be in electrical or other forms.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units described above may be implemented either in hardware or in software program modules.

The integrated units, if implemented in the form of software program modules and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application, in whole or in part, may be embodied in the form of a software product that is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned memory includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.

Those of ordinary skill in the art will appreciate that all or a portion of the steps in the various methods of the above embodiments may be implemented by a program that instructs associated hardware, and the program may be stored in a computer-readable memory, which may include: a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

The embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the application, and the description of the embodiments is intended only to help in understanding the method and core idea of the application. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific implementations and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims

1. A method of determining transformation parameters, the method comprising:

determining a maximum depth of a first coding block, wherein the maximum depth is used for identifying a maximum segmentation number of the first coding block;

Determining a plurality of candidate partition sizes of the first coding block according to the maximum depth;

determining a plurality of transformation cores according to the attribute information of each second coding block in the plurality of second coding blocks, wherein the plurality of transformation cores are in one-to-one correspondence with the plurality of second coding blocks; comprising the following steps:

If the picture content of each second coding block is screen content, determining the transformation cores that comprise the identity transform IDTX among all transformation cores as a plurality of candidate transformation cores of each second coding block; if the picture content of each second coding block is not screen content and each second coding block is an intra-frame coding block, determining the transformation cores that do not comprise the flipped asymmetric discrete sine transform FLIPADST among all transformation cores as a plurality of candidate transformation cores of each second coding block; if the picture content of each second coding block is not screen content and each second coding block is an inter coding block, determining all transformation cores as a plurality of candidate transformation cores of each second coding block; performing multiple transformation processing on each second coding block according to the plurality of candidate transformation cores to obtain a transformation core corresponding to each second coding block;

and taking a transformation kernel corresponding to a minimum value and a division size of a second coding block corresponding to the minimum value in the plurality of rate distortion costs as transformation parameters of the first coding block.

2. The method of claim 1, characterized in that

the i-th one of the plurality of transform processes includes:

transforming each second coding block according to a candidate transformation core Ai, and determining a Hadamard transformation absolute difference sum Bi, wherein the candidate transformation core Ai is an ith transformation core in the plurality of candidate transformation cores, and i is an integer greater than or equal to 1;

If the Hadamard transform absolute difference sum Bi is in a first range, determining the candidate transform kernel Ai as a transform kernel corresponding to each second coding block;

If the Hadamard transform absolute difference sum Bi is in a second range, determining a transformation coefficient coeff value Ci, and when the coeff value Ci is in a third range, determining the candidate transformation kernel Ai as a transformation kernel corresponding to each second coding block, otherwise, ending the i-th transformation process, and performing the (i+1)-th transformation process until the transformation kernel corresponding to each second coding block is determined or the plurality of candidate transformation kernels are traversed;

And if the Hadamard transform absolute difference sum Bi is in a fourth range, directly ending the i-th transform processing, and carrying out the (i+1)-th transform processing until the transform core corresponding to each second coding block is determined or the candidate transform cores are traversed.

3. The method of claim 1, characterized in that

when the plurality of candidate transform kernels are the transform kernels excluding FLIPADST among all the transform kernels, in the 1st transform process, the candidate transform kernel A1 is a DCT_DCT transform kernel, where DCT_DCT means that a discrete cosine transform is employed in both the horizontal direction and the vertical direction.

4. A method according to any of claims 1-3, wherein said determining the maximum depth of the first coded block comprises:

determining the current depth of the first coding block according to the size information of the first coding block;

Determining the maximum depth of the first coding block according to the current depth of the first coding block and the block type;

If the block type is an inter-coded block, the maximum depth of the first coded block is the current depth of the first coded block plus 1;

if the block type is an intra-coded block, the maximum depth of the first coded block is the current depth of the first coded block plus 2.

5. The method of claim 4, wherein after said determining the block type of the first encoded block, the method further comprises:

Determining whether the first coding block has a previous depth according to the current depth;

When the quantization coefficient is 0, the determination processing of the transform parameter of the first encoding block is ended.

6. A transformation parameter determining device, the device comprising:

An analysis module, configured to determine a maximum depth of a first coding block, where the maximum depth is used to identify a maximum number of partitions of the first coding block, determine a plurality of candidate partition sizes of the first coding block according to the maximum depth, and partition the first coding block according to each candidate partition size of the plurality of candidate partition sizes to obtain a plurality of second coding blocks;

the traversal module is used for determining a plurality of transformation cores according to the attribute information of each second coding block in the plurality of second coding blocks, wherein the plurality of transformation cores are in one-to-one correspondence with the plurality of second coding blocks; comprising the following steps:

And the processing module is used for carrying out transformation processing on the second coding block corresponding to each transformation core according to each transformation core in the plurality of transformation cores to obtain a plurality of rate distortion costs, wherein the plurality of rate distortion costs are in one-to-one correspondence with the plurality of transformation cores, and the transformation core corresponding to the minimum value and the division size of the second coding block corresponding to the minimum value in the plurality of rate distortion costs are used as transformation parameters of the first coding block.

7. An electronic device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor to implement the method of any of claims 1-5.

8. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which is executed by a processor to implement the method of any of claims 1-5.