CN110602491B - Intra-frame chroma prediction method, device and equipment and video coding and decoding system - Google Patents
- Publication number
- CN110602491B CN110602491B CN201910817597.1A CN201910817597A CN110602491B CN 110602491 B CN110602491 B CN 110602491B CN 201910817597 A CN201910817597 A CN 201910817597A CN 110602491 B CN110602491 B CN 110602491B
- Authority
- CN
- China
- Prior art keywords
- chroma
- chroma prediction
- prediction mode
- reconstructed
- decoded
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/184—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
Abstract
The embodiments of the present application disclose an intra-frame chroma prediction method, an intra-frame chroma prediction device, a video coding and decoding system, a terminal device, a video encoder, a video decoder, and a computer-readable storage medium. The method comprises the following steps: acquiring an encoded or decoded reconstructed luma component; downsampling the encoded or decoded reconstructed luma component; and inputting preset parameters into an image coloring sub-network in a pre-trained chroma prediction convolutional neural network model to obtain the chroma components output by the image coloring sub-network, wherein the preset parameters comprise the downsampled encoded or decoded reconstructed luma component, or comprise the downsampled encoded or decoded reconstructed luma component together with target parameters, the target parameters comprising at least one of a coding distortion degree and an encoded or decoded reconstructed neighboring chroma block. A chroma prediction result is then obtained from the chroma components. This intra-frame chroma prediction scheme based on a convolutional neural network has high universality and saves bit rate.
Description
Technical Field
The present application belongs to the technical field of video coding, and in particular relates to an intra-frame chroma prediction method, an intra-frame chroma prediction device, a video coding and decoding system, a terminal device, a video encoder, a video decoder, and a computer-readable storage medium.
Background
The video coding process mainly comprises modules such as prediction, transform and quantization, and entropy coding. Prediction can be divided into intra-frame prediction and inter-frame prediction, and intra-frame prediction can further comprise intra-frame chroma prediction and intra-frame luma prediction.
Currently, in the new-generation video coding standard Versatile Video Coding (VVC), in order to eliminate redundant information in the YCbCr color space, a Cross-Component Linear Model (CCLM) or a Multi-Model Linear Model (MMLM) is generally used to perform intra-frame chroma prediction by exploiting the linear correlation between the luma component and the chroma component within a coding block. However, this conventional intra chroma prediction method cannot be applied in all cases and consumes a relatively high bit rate.
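The linear model underlying CCLM-style prediction can be sketched as follows. Note this is an illustrative sketch only: the parameters here are fitted by least squares over neighbouring reconstructed samples, whereas VVC derives them from the min/max luma neighbour pairs; the function names are hypothetical.

```python
# Illustrative sketch of the linear model behind CCLM-style chroma
# prediction: pred_C = alpha * rec_L + beta, with alpha and beta fitted
# by least squares over neighbouring reconstructed samples.
# (VVC derives the parameters differently; this is for illustration only.)

def fit_linear_model(neigh_luma, neigh_chroma):
    """Fit alpha, beta minimising sum((alpha*L + beta - C)^2)."""
    n = len(neigh_luma)
    mean_l = sum(neigh_luma) / n
    mean_c = sum(neigh_chroma) / n
    cov = sum((l - mean_l) * (c - mean_c)
              for l, c in zip(neigh_luma, neigh_chroma))
    var = sum((l - mean_l) ** 2 for l in neigh_luma)
    alpha = cov / var if var else 0.0
    beta = mean_c - alpha * mean_l
    return alpha, beta

def predict_chroma(rec_luma_block, alpha, beta):
    """Apply the fitted model to the co-located downsampled luma block."""
    return [[alpha * l + beta for l in row] for row in rec_luma_block]
```

For neighbours with a perfectly linear luma/chroma relation, the model recovers it exactly; real content rarely is, which is the "cannot be applied in all cases" limitation the background refers to.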
Disclosure of Invention
The embodiments of the present application provide an intra-frame chroma prediction method and device, a video coding and decoding system, a terminal device, a video encoder, a video decoder, and a computer-readable storage medium, so as to solve the problems that existing intra-frame chroma prediction modes have low universality and consume a relatively high bit rate.
In a first aspect, an embodiment of the present application provides an intra chroma prediction method, including:
acquiring a coded or decoded reconstructed luminance component;
downsampling the encoded or decoded reconstructed luma component;
inputting preset parameters into an image coloring sub-network in a pre-trained chroma prediction convolutional neural network model to obtain chroma components output by the image coloring sub-network; wherein the preset parameters comprise the downsampled encoded or decoded reconstructed luma component, or comprise the downsampled encoded or decoded reconstructed luma component and target parameters, the target parameters comprising at least one of a coding distortion degree and an encoded or decoded reconstructed neighboring chroma block;
and cutting out a target chroma component block from the chroma components, the target chroma component block being the final chroma prediction result.
With reference to the first aspect, in a possible implementation manner, when the preset parameters include the encoded or decoded reconstructed neighboring chroma block, the method further includes:
cutting out a target luma component block from the encoded or decoded reconstructed luma component; performing chroma prediction on the target luma component block in a preset chroma prediction mode to obtain a predicted chroma; and taking the predicted chroma as an initial chroma component of the chroma block to be predicted.
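The "preset chroma prediction mode" used to initialise the chroma block is not fixed by this implementation manner; as an illustrative assumption, DC prediction (the mean of the neighbouring reconstructed chroma samples) is used below, and the function name is hypothetical.

```python
# Illustrative initialisation of the chroma block to be predicted: apply
# a preset chroma prediction mode to obtain an initial chroma component.
# DC prediction (mean of neighbouring reconstructed chroma samples) is
# used here purely as an example of such a preset mode.

def dc_initial_chroma(neigh_chroma, bh, bw):
    """Fill the bh x bw block with the mean of the neighbouring
    reconstructed chroma samples; fall back to mid-grey (128 for 8-bit)
    when no neighbours are available."""
    dc = round(sum(neigh_chroma) / len(neigh_chroma)) if neigh_chroma else 128
    return [[dc] * bw for _ in range(bh)]
```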
With reference to the first aspect, in one possible implementation manner, the chroma prediction convolutional neural network model further includes a luminance downsampling sub-network;
said downsampling said encoded or decoded reconstructed luma component, comprising: downsampling, by the luma downsampling sub-network, the encoded or decoded reconstructed luma component.
With reference to the first aspect, in a possible implementation manner, the deriving a chroma prediction result according to the chroma components includes: cutting out a target chroma component block from the chroma components, wherein the target chroma component block is the chroma prediction result corresponding to the encoded or decoded reconstructed luma component.
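The first-aspect flow (downsample, colour, crop) can be sketched end to end. This is a hedged sketch: `coloring_subnetwork` is a hypothetical stub standing in for the pre-trained CNN of the patent, and the 2x2-average downsampling is one simple choice matching 4:2:0 sampling, not the claimed sub-network.

```python
# Hedged sketch of the first-aspect flow: downsample the reconstructed
# luma component, feed it to the image-colouring sub-network, then crop
# the target chroma block.  `coloring_subnetwork` is a placeholder stub.

def downsample_2x(luma):
    """2:1 downsampling by 2x2 averaging (matches 4:2:0 chroma sampling)."""
    h, w = len(luma), len(luma[0])
    return [[(luma[y][x] + luma[y][x+1] + luma[y+1][x] + luma[y+1][x+1]) / 4
             for x in range(0, w, 2)] for y in range(0, h, 2)]

def coloring_subnetwork(ds_luma, target_params=None):
    """Placeholder for the CNN colouring sub-network: returns a
    constant-offset map so the pipeline is runnable."""
    return [[v + 128 for v in row] for row in ds_luma]

def crop_block(plane, y0, x0, bh, bw):
    """Cut the target chroma component block out of the network output."""
    return [row[x0:x0 + bw] for row in plane[y0:y0 + bh]]

def intra_chroma_predict(rec_luma, y0, x0, bh, bw, target_params=None):
    ds = downsample_2x(rec_luma)
    chroma = coloring_subnetwork(ds, target_params)
    return crop_block(chroma, y0, x0, bh, bw)
```

In a real codec the stub would be replaced by inference through the trained colouring sub-network, with the target parameters (distortion degree, neighbouring chroma blocks) as extra inputs.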
In a second aspect, an embodiment of the present application provides an intra chroma prediction method applied to a video encoder, the method including:
encoding the luma component to obtain a luma code stream;
acquiring an encoded and reconstructed luma component, encoded and reconstructed neighboring chroma information, and original chroma information corresponding to a chroma block to be encoded;
determining, through rate-distortion optimization, a target chroma prediction mode with the minimum rate-distortion cost from at least two chroma prediction modes; the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, wherein the first chroma prediction mode is the intra-frame chroma prediction method according to any one of claims 1 to 3;
generating indication information corresponding to the target chroma prediction mode according to the correspondence between chroma prediction modes and indication information;
subtracting the predicted chroma information from the original chroma information to obtain chroma residual information; the predicted chroma information is chroma information obtained by chroma prediction in the target chroma prediction mode;
and encoding the indication information and the chroma residual information to obtain a chroma code stream, and combining the chroma code stream with the luma code stream to obtain a video code stream.
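The mode decision step above can be sketched as a rate-distortion search. This is an illustrative sketch under stated assumptions: SSD as the distortion measure and a crude absolute-sum rate proxy stand in for a real encoder's distortion and entropy-coded rate; the function names and lambda value are hypothetical.

```python
# Hedged sketch of the second-aspect encoder steps: evaluate each chroma
# prediction mode by rate-distortion optimisation (cost = D + lambda*R),
# keep the cheapest, and return its flag, prediction, and residual.

def rd_cost(distortion, rate, lam):
    return distortion + lam * rate

def choose_mode(original, modes, lam):
    """modes: list of (flag_value, predicted_block) pairs.
    Returns (flag_value, prediction, residual) with minimal RD cost."""
    best = None
    for flag, pred in modes:
        resid = [[o - p for o, p in zip(ro, rp)]
                 for ro, rp in zip(original, pred)]
        dist = sum(r * r for row in resid for r in row)        # SSD distortion
        rate = 1 + sum(abs(r) for row in resid for r in row)   # flag + residual proxy
        cost = rd_cost(dist, rate, lam)
        if best is None or cost < best[0]:
            best = (cost, flag, pred, resid)
    return best[1], best[2], best[3]
```

The winning flag value is the indication information written to the chroma code stream, alongside the encoded residual.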
With reference to the second aspect, in a possible implementation manner, the indication information is specifically a flag bit value;
the generating indication information corresponding to the target chroma prediction mode according to the correspondence between chroma prediction modes and indication information includes:
and setting the flag bit to the corresponding value according to the correspondence between chroma prediction modes and flag bit values, so as to obtain the indication information corresponding to the target chroma prediction mode.
In a third aspect, an embodiment of the present application provides an intra chroma prediction method applied to a video decoder, where the method includes:
acquiring a video code stream output by a video encoder;
decoding the video code stream to obtain a decoded and reconstructed luma component, decoded and reconstructed neighboring chroma information, and indication information for determining a chroma prediction mode;
determining a target chroma prediction mode from at least two chroma prediction modes according to the indication information, wherein the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, and the first chroma prediction mode is the intra-frame chroma prediction method according to any one of claims 1 to 3;
performing chroma prediction on the chroma component in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed neighboring chroma information, to obtain a chroma prediction result;
and performing chroma reconstruction according to the residual obtained by decoding the chroma residual information in the video code stream and the chroma prediction result, to obtain output chroma.
With reference to the third aspect, in a possible implementation manner, the indication information is specifically a flag bit value;
the determining a target chroma prediction mode from at least two chroma prediction modes according to the indication information comprises:
when the flag bit value is a first value, determining the first chroma prediction mode as the target chroma prediction mode;
and when the flag bit value is a second value, determining the second chroma prediction mode as the target chroma prediction mode.
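The decoder-side mode selection and reconstruction can be sketched briefly. This is a hedged sketch: mapping flag value 0 to the first (CNN-colouring) mode and 1 to the second (conventional) mode is an illustrative assumption, as are the names.

```python
# Hedged sketch of the third-aspect decoder steps: map the decoded flag
# bit to a chroma prediction mode, predict, then add the decoded
# residual to reconstruct the output chroma.

FIRST_CLASS, SECOND_CLASS = "cnn_coloring", "conventional"

def select_mode(flag_bit):
    """Flag 0 -> first chroma prediction mode, flag 1 -> second
    (an illustrative assignment; the bitstream syntax defines the real one)."""
    return FIRST_CLASS if flag_bit == 0 else SECOND_CLASS

def reconstruct_chroma(prediction, residual):
    """Output chroma = prediction + decoded residual, element-wise."""
    return [[p + r for p, r in zip(rp, rr)]
            for rp, rr in zip(prediction, residual)]
```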
In a fourth aspect, an embodiment of the present application provides an intra chroma prediction method, including:
the video encoder encodes the luma component to obtain a luma code stream; acquires an encoded and reconstructed luma component, encoded and reconstructed neighboring chroma information, and original chroma information corresponding to a chroma block to be encoded; determines, through rate-distortion optimization, a target chroma prediction mode with the minimum rate-distortion cost from at least two chroma prediction modes, the at least two chroma prediction modes comprising a first chroma prediction mode and a second chroma prediction mode, wherein the first chroma prediction mode is the intra-frame chroma prediction method according to any one of claims 1 to 3; generates indication information of the target chroma prediction mode according to the correspondence between chroma prediction modes and indication information; subtracts the predicted chroma information from the original chroma information to obtain chroma residual information, the predicted chroma information being chroma information obtained by chroma prediction in the target chroma prediction mode; and encodes the indication information and the chroma residual to obtain a chroma code stream, and combines the chroma code stream with the luma code stream to obtain a video code stream;
the video decoder acquires the video code stream; decodes the video code stream to obtain a decoded and reconstructed luma component, decoded and reconstructed neighboring chroma information, and the indication information; determines the target chroma prediction mode from the at least two chroma prediction modes according to the indication information; performs chroma prediction on the chroma component in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed neighboring chroma information to obtain a chroma prediction result; and performs chroma reconstruction according to the residual obtained by decoding the chroma residual information in the video code stream and the chroma prediction result, to obtain output chroma.
In a fifth aspect, an embodiment of the present application provides a video coding and decoding system, including a video encoder and a video decoder;
the video encoder is configured to encode the luma component to obtain a luma code stream; acquire an encoded and reconstructed luma component, encoded and reconstructed neighboring chroma information, and original chroma information corresponding to a chroma block to be encoded; determine, through rate-distortion optimization, a target chroma prediction mode with the minimum rate-distortion cost from at least two chroma prediction modes, the at least two chroma prediction modes comprising a first chroma prediction mode and a second chroma prediction mode, wherein the first chroma prediction mode is the intra-frame chroma prediction method according to any one of claims 1 to 3; generate indication information of the target chroma prediction mode according to the correspondence between chroma prediction modes and indication information; subtract the predicted chroma information from the original chroma information to obtain chroma residual information, the predicted chroma information being chroma information obtained by chroma prediction in the target chroma prediction mode; and encode the indication information and the chroma residual to obtain a chroma code stream, and combine the chroma code stream with the luma code stream to obtain a video code stream;
the video decoder is configured to acquire the video code stream; decode the video code stream to obtain a decoded and reconstructed luma component, decoded and reconstructed neighboring chroma information, and the indication information; determine the target chroma prediction mode from the at least two chroma prediction modes according to the indication information; perform chroma prediction on the chroma component in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed neighboring chroma information to obtain a chroma prediction result; and perform chroma reconstruction according to the residual obtained by decoding the chroma residual information in the video code stream and the chroma prediction result, to obtain output chroma.
In a sixth aspect, an embodiment of the present application provides an intra chroma prediction apparatus, including:
a luma component acquisition module, configured to acquire an encoded or decoded reconstructed luma component;
a downsampling module, configured to downsample the encoded or decoded reconstructed luma component;
a coloring module, configured to input preset parameters into an image coloring sub-network in a pre-trained chroma prediction convolutional neural network model to obtain chroma components output by the image coloring sub-network; wherein the preset parameters comprise the downsampled encoded or decoded reconstructed luma component, or comprise the downsampled encoded or decoded reconstructed luma component and target parameters, the target parameters comprising at least one of a coding distortion degree and an encoded or decoded reconstructed neighboring chroma block;
and a prediction module, configured to cut out a target chroma component block from the chroma components, the target chroma component block being the final chroma prediction result.
In a seventh aspect, an embodiment of the present application provides an intra chroma prediction apparatus, including:
a luma encoding module, configured to encode the luma component to obtain a luma code stream;
an acquisition module, configured to acquire an encoded and reconstructed luma component, encoded and reconstructed neighboring chroma information, and original chroma information corresponding to a chroma block to be encoded;
a second determining module, configured to determine, through rate-distortion optimization, a target chroma prediction mode with the minimum rate-distortion cost from at least two chroma prediction modes; the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, wherein the first chroma prediction mode is the intra-frame chroma prediction method according to any one of claims 1 to 3;
a generating module, configured to generate the indication information corresponding to the target chroma prediction mode according to the correspondence between chroma prediction modes and indication information;
a subtraction module, configured to subtract the predicted chroma information from the original chroma information to obtain chroma residual information; the predicted chroma information is chroma information obtained by chroma prediction in the target chroma prediction mode;
and an encoding module, configured to encode the indication information and the chroma residual information to obtain a chroma code stream, and combine the chroma code stream with the luma code stream to obtain a video code stream.
In an eighth aspect, an embodiment of the present application provides an intra chroma prediction apparatus, including:
a code stream acquisition module, configured to acquire a video code stream output by a video encoder;
a decoding module, configured to decode the video code stream to obtain a decoded and reconstructed luma component, decoded and reconstructed neighboring chroma information, and indication information for determining a chroma prediction mode;
a first determining module, configured to determine, according to the indication information, a target chroma prediction mode from at least two chroma prediction modes, where the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, and the first chroma prediction mode is the intra-frame chroma prediction method according to any one of claims 1 to 3;
a chroma prediction module, configured to perform chroma prediction on the chroma component in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed neighboring chroma information, to obtain a chroma prediction result;
and a chroma reconstruction module, configured to perform chroma reconstruction according to the residual obtained by decoding the chroma residual information in the video code stream and the chroma prediction result, to obtain output chroma.
In a ninth aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor, when executing the computer program, implements the intra chroma prediction method according to any one of the first aspect.
In a tenth aspect, embodiments of the present application provide a video encoder, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor, when executing the computer program, implements the intra chroma prediction method according to any one of the second aspects.
In an eleventh aspect, an embodiment of the present application provides a video decoder, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the intra chroma prediction method according to any one of the third aspects.
In a twelfth aspect, embodiments of the present application provide a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the intra chroma prediction method according to any one of the first aspect, the second aspect, or the third aspect.
In a thirteenth aspect, embodiments of the present application provide a computer program product, which, when run on a terminal device or a video encoder or a video decoder, causes the terminal device or the video encoder or the video decoder to perform the intra chroma prediction method according to any one of the first aspect, the second aspect, or the third aspect.
Compared with the prior art, the embodiments of the present application have the following advantages: chroma prediction is performed through the image coloring sub-network in the chroma prediction convolutional neural network model with the corresponding input parameters; that is, the chroma prediction problem is modeled as an image coloring problem, which gives higher universality. In addition, chroma prediction based on the image coloring sub-network saves bit rate.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic block diagram illustrating a flow of an intra chrominance prediction method according to an embodiment of the present application;
fig. 2 is a schematic block diagram of a flow of a neighboring chroma block reconstruction process according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram of a neighboring chroma block reconstruction process according to an embodiment of the present disclosure;
fig. 4 is a schematic diagram of an intra chrominance prediction method based on a convolutional neural network according to an embodiment of the present application;
fig. 5 is a block diagram schematically illustrating a structure of an intra chroma prediction apparatus according to an embodiment of the present disclosure;
fig. 6 is a schematic block diagram illustrating a flow of an intra chroma prediction method according to an embodiment of the present application;
fig. 7 is a schematic diagram illustrating an encoding process of a video encoder according to an embodiment of the present application;
fig. 8 is a block diagram schematically illustrating a structure of an intra chroma prediction apparatus according to an embodiment of the present disclosure;
fig. 9 is a schematic block diagram illustrating a flow of an intra chroma prediction method according to an embodiment of the present application;
fig. 10 is a schematic diagram illustrating a decoding process of a video decoder according to an embodiment of the present application;
fig. 11 is a block diagram schematically illustrating a structure of an intra chroma prediction apparatus according to an embodiment of the present disclosure;
fig. 12 is a schematic block diagram illustrating a structure of a video encoding and decoding system according to an embodiment of the present application;
fig. 13 is a schematic diagram of an interaction between a video encoder and a video decoder according to an embodiment of the present application;
fig. 14 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 15 is a schematic structural diagram of a video encoder according to an embodiment of the present application;
fig. 16 is a schematic structural diagram of a video decoder according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details, such as particular system structures and techniques, are set forth in order to provide a thorough understanding of the embodiments of the present application.
The technical solutions provided in the embodiments of the present application will be described below with specific embodiments.
Example one
Referring to fig. 1, a schematic block diagram of the flow of an intra chroma prediction method according to an embodiment of the present application is provided, where the method includes the following steps:

Step 101: obtain the encoded or decoded reconstructed luma component.
It should be noted that the above-mentioned encoded or decoded reconstructed luma component may be the luma component of any color space and any video format; that is, the intra chroma prediction method provided in the embodiments of the present application may be applied to any color space and any video format. For example, the encoded or decoded reconstructed luma component may be the luma component Y in the YCbCr 4:2:0 format.
Step 102: downsample the encoded or decoded reconstructed luma component.
The downsampling method may be any of the downsampling methods in the related art, and may also be luminance downsampling by a convolutional neural network.
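As a concrete illustration of the conventional alternative, the following minimal numpy sketch downsamples a reconstructed luma plane by averaging non-overlapping 2 × 2 blocks (a common 4:2:0-style filter). This is an assumption for illustration only, not the patent's learned luma downsampling sub-network:

```python
import numpy as np

def downsample_luma_2x(luma: np.ndarray) -> np.ndarray:
    # Average each non-overlapping 2x2 block, halving both dimensions.
    h, w = luma.shape
    return luma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# A 4N x 4N reconstructed luma plane becomes 2N x 2N after downsampling.
luma = np.arange(16, dtype=np.float64).reshape(4, 4)
down = downsample_luma_2x(luma)
```

A 4N × 4N luma block thus becomes 2N × 2N, matching the sizes used throughout the embodiments.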
In some embodiments, the chroma prediction convolutional neural network model described above may also include a luma downsampling sub-network. At this time, the specific process of downsampling the encoded or decoded reconstructed luma component may include: the encoded or decoded reconstructed luma component is down-sampled by a luma down-sampling sub-network.
In addition to the image coloring sub-network described below, the chroma prediction convolutional neural network model may include a luma downsampling sub-network. The luma downsampling sub-network downsamples the input encoded or decoded reconstructed luma component to obtain the downsampled encoded or decoded reconstructed luma component. For example, a 4N × 4N encoded or decoded reconstructed luma component is input to the luma downsampling sub-network, and after downsampling, a 2N × 2N encoded or decoded reconstructed luma component is output, where N is 64.
In addition, the output layer of the luma downsampling sub-network may comprise one or more convolution kernels; that is, the luma downsampling sub-network may output one or more downsampled results, each corresponding to one downsampled encoded or decoded reconstructed luma component. Performing chroma prediction with multiple downsampled results may further improve chroma prediction performance compared with using a single downsampled result.
The luma downsampling sub-network may be embodied as a convolutional neural network. When the color space and video format is YCbCr 4:2:0, the hyper-parameters and structure of the luma downsampling sub-network may be as shown in table 1 below.
TABLE 1
It should be noted that the structure and hyper-parameters of the luma downsampling sub-network shown in table 1 are merely exemplary. In a specific application, the hyper-parameters and structure of the luma downsampling sub-network can be adjusted according to actual needs. For example, when the color space and video format is YCbCr 4:4:4, the stride of the second layer in the luma downsampling sub-network is set to 1; when the color space and video format is YCbCr 4:2:2, downsampling is performed only in the vertical or horizontal direction.
It should be noted that, compared to the conventional downsampling method, the downsampling performed by the luminance downsampling sub-network can obtain more luminance information, so as to further improve the performance of the subsequent chroma prediction.
Step 103: input preset parameters into the image coloring sub-network in a pre-trained chroma prediction convolutional neural network model to obtain the chroma components output by the image coloring sub-network.
the preset parameters comprise a down-sampled encoded or decoded reconstructed luminance component, or comprise a down-sampled encoded or decoded reconstructed luminance component and target parameters, and the target parameters comprise at least one of an encoding distortion degree and an encoded or decoded reconstructed adjacent chroma block.
It should be noted that the preset parameters may include only the down-sampled encoded or decoded reconstructed luma component, may include the down-sampled encoded or decoded reconstructed luma component and the coding distortion, may include the down-sampled encoded or decoded reconstructed luma component and the neighboring chroma blocks, and may also include the down-sampled encoded or decoded reconstructed luma component, the coding distortion and the encoded or decoded reconstructed neighboring chroma blocks.
The encoded or decoded reconstructed adjacent chroma blocks can improve both the chroma prediction performance of the image coloring network and the training speed of the network model, and the coding distortion can eliminate the negative influence of compression distortion. Accordingly, when the preset parameters include the downsampled encoded or decoded reconstructed luma component, the coding distortion and the adjacent chroma blocks together, the performance of the image coloring network is optimal, i.e., the intra chroma prediction performance is best. When the preset parameters include the downsampled encoded or decoded reconstructed luma component and the coding distortion, or the downsampled encoded or decoded reconstructed luma component and the adjacent chroma blocks, the performance of the image coloring network is second best. When the preset parameters include only the downsampled encoded or decoded reconstructed luma component, the performance of the image coloring network, i.e., the intra chroma prediction performance, is the worst of the three.
It should be understood that even if the preset parameters only include the encoded or decoded reconstructed luma component after downsampling, the chroma prediction result can still be obtained by the image coloring network, i.e. the object of the embodiments of the present application can still be achieved.
It is noted that the coding distortion may be embodied as an image block characterized by the quantization parameter, whose value may be any integer from 0 to 51. For example, when the coding distortion is 10, it is embodied as a 2N × 2N image block in which the value of every pixel is 10.
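The constant QP plane described above can be sketched as follows; the function name and the tiny block size are illustrative assumptions:

```python
import numpy as np

def distortion_plane(qp: int, size: int) -> np.ndarray:
    # The coding distortion input: a constant block whose every pixel
    # equals the quantization parameter (an integer from 0 to 51).
    assert 0 <= qp <= 51
    return np.full((size, size), qp, dtype=np.float32)

plane = distortion_plane(10, 4)  # QP 10 as a 4 x 4 plane (2N x 2N, N = 2)
```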
The above-mentioned adjacent chroma blocks refer to image blocks including adjacent chroma information, and the adjacent chroma blocks are reconstructed in advance. In some embodiments, when the preset parameters include an encoded or decoded reconstructed adjacent chroma block, referring to the flowchart of the adjacent chroma block reconstruction process shown in fig. 2, the intra chroma prediction method may further include:

Step 201: crop a target luma component block from the encoded or decoded reconstructed luma component.

Step 202: perform chroma prediction on the target luma component block in a preset chroma prediction mode to obtain predicted chroma.
It should be noted that the target luma component block mentioned above generally refers to a luma component block located at the lower right of the luma component that has been encoded or decoded and reconstructed. For example, the encoded or decoded reconstructed luma component is a 4N × 4N luma block, the 4N × 4N luma block is divided into 4 2N × 2N luma component blocks, and the lower right 2N × 2N luma component block is the target luma component block.
It should be noted that the preset chroma prediction mode may be any chroma prediction mode in the prior art, for example, the linear prediction model CCLM or the multi-directional linear model MDLM. Chroma prediction is performed on the target luma component block with a conventional linear chroma prediction model to obtain the predicted chroma Cb and Cr.
Step 203: take the predicted chroma as the initial chroma information of the chroma block to be predicted.
To better describe the above neighboring chroma block reconstruction process, the following description will be made with reference to the neighboring chroma block reconstruction process diagram shown in fig. 3.
As shown in fig. 3, the encoded or decoded reconstructed luma component 31 has a size of 4N × 4N and includes 2N × 2N luma component blocks numbered 1, 2, 3 and 4, where luma component block 1 is at the top left, block 2 at the top right, block 3 at the bottom left and block 4 at the bottom right. The 2N × 2N luma component block 32 is obtained by cropping and input into the linear prediction model CCLM to obtain the predicted chroma Cb and Cr 34; the N × N chroma blocks Cb and Cr are then filled into the vacant portions of the corresponding chroma blocks 35, i.e., the question-mark positions in fig. 3, to serve as the initial chroma information of the 2N × 2N chroma block to be predicted.
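The fig. 3 procedure can be sketched as follows. The CCLM parameters alpha and beta are placeholders here (CCLM actually derives them from neighbouring reconstructed samples), and the tiny block size is illustrative only:

```python
import numpy as np

N = 2  # illustrative size; the real blocks use N = 64

# 4N x 4N reconstructed luma; the bottom-right 2N x 2N block is the target.
luma = np.arange((4 * N) ** 2, dtype=np.float64).reshape(4 * N, 4 * N)
target = luma[2 * N:, 2 * N:]                 # cropped 2N x 2N luma block

# Downsample to chroma resolution (N x N) and apply a CCLM-style linear
# model chroma = alpha * luma + beta; alpha and beta are placeholders.
alpha, beta = 0.5, 8.0
down = target.reshape(N, 2, N, 2).mean(axis=(1, 3))
pred_cb = alpha * down + beta                 # N x N predicted chroma

# Fill the vacant bottom-right N x N of the 2N x 2N neighbouring chroma
# block to serve as the initial chroma of the block to be predicted.
neighbor_cb = np.zeros((2 * N, 2 * N))
neighbor_cb[N:, N:] = pred_cb
```

The same filling would be repeated for the Cr channel.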
The input of the image coloring sub-network is a grayscale image, and the output is the corresponding color image. In the embodiments of the present application, the chroma prediction problem is modeled as an image coloring problem; that is, intra chroma prediction is achieved through image coloring.
By way of example and not limitation, the structure and hyper-parameters of the image coloring sub-network may be as shown in table 2 below.
TABLE 2
It should be understood that the structure and hyper-parameters of the image coloring sub-network shown in table 2 above are only one example. In a specific application, the hyper-parameters and structure of the image coloring sub-network can be adjusted as required.
The chroma prediction convolutional neural network model comprises the image coloring sub-network and, in some embodiments, may further comprise a luminance down-sampling sub-network, the chroma prediction convolutional neural network model being pre-trained.
When the chroma prediction convolutional neural network model comprises a luminance downsampling sub-network and an image coloring sub-network, in the training process of the chroma convolutional neural network model, a loss function is specifically as follows:
L = λ·||Cb' − Cb||_2 + (1 − λ)·||Cr' − Cr||_2, where λ is a weight, Cb' and Cr' are the chroma components cropped from the output of the image coloring network, each of size N × N, and Cb, Cr are the ground-truth chroma components of size N × N.

Here (Cb', Cr') = F2(F1(Y), D, Cb, Cr), where F2 denotes the image coloring network; F1(Y) is the downsampled encoded or decoded reconstructed luma component, of size 2N × 2N; D is the coding distortion, of size 2N × 2N; and Cb, Cr are the adjacent chroma information in the adjacent chroma blocks, of size 2N × 2N. The batch size and learning rate during training are set to 128 and 1 × 10^-4, respectively, and λ may be set to 0.5. The training sample data set may include 886 images from the UCID database and 400 images from the DIV2K database.
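A minimal numpy version of this loss, assuming the plain (unsquared) L2-norm reading of the formula above; the tiny inputs are illustrative:

```python
import numpy as np

def chroma_loss(cb_pred, cb_true, cr_pred, cr_true, lam=0.5):
    # Weighted L2 loss over the two chroma channels:
    # L = lam * ||Cb' - Cb||_2 + (1 - lam) * ||Cr' - Cr||_2
    return (lam * np.linalg.norm(cb_pred - cb_true)
            + (1.0 - lam) * np.linalg.norm(cr_pred - cr_true))

cb_p = np.array([[3.0, 0.0], [0.0, 4.0]])
cb_t = np.zeros((2, 2))
cr_p = np.array([[6.0, 0.0], [0.0, 8.0]])
cr_t = np.zeros((2, 2))
loss = chroma_loss(cb_p, cb_t, cr_p, cr_t, lam=0.5)  # 0.5*5 + 0.5*10 = 7.5
```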
Step 104: crop a target chroma component block from the chroma components; the target chroma component block is the final chroma prediction result.
Specifically, after the preset parameters are input into the image coloring subnetwork, the image coloring subnetwork will output the corresponding chrominance components, and then cut out the corresponding chrominance component blocks from the output of the image coloring subnetwork to obtain the predicted chrominance. That is, in some embodiments, the specific process of deriving the chroma prediction result according to the chroma component may include: and cutting out a target chroma component block from the chroma components, wherein the target chroma component block is a chroma prediction result corresponding to the encoded or decoded and reconstructed brightness component.
For example, when the size of the chrominance components output by the color sub-network on an image is 2N × 2N, an N × N target chrominance component block, which is a lower right chrominance block of the 2N × 2N chrominance components, is cut out from the 2N × 2N chrominance components.
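The cropping step reduces to simple array slicing; the toy size below is illustrative:

```python
import numpy as np

N = 2  # illustrative; the patent uses N = 64
chroma_out = np.arange((2 * N) ** 2, dtype=np.float64).reshape(2 * N, 2 * N)
# The final prediction is the bottom-right N x N block of the 2N x 2N output.
target_chroma = chroma_out[N:, N:]
```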
In order to better describe the intra chroma prediction method provided in the embodiment of the present application, the following description will be made with reference to a schematic diagram of the convolutional neural network based intra chroma prediction method shown in fig. 4.
As shown in fig. 4, the chroma prediction convolutional neural network model includes a luma downsampling sub-network 41 and an image coloring sub-network 42. The encoded or decoded reconstructed luma component 43 has a size of 4N × 4N and contains four 2N × 2N luma component blocks, numbered 1, 2, 3 and 4. The 2N × 2N luma component block 4 is cropped out of the encoded or decoded reconstructed luma component 43 as the target luma component block and input to the linear chroma prediction model CCLM, and the output results Cb and Cr of the CCLM are filled into the vacant portion of the adjacent chroma block to serve as the initial chroma component of the chroma block to be predicted.
The encoded or decoded reconstructed luma component 43 is input to the luma downsampling sub-network 41, yielding multiple downsampled encoded or decoded reconstructed luma components. The 2N × 2N downsampled encoded or decoded reconstructed luma components 44, the reconstructed 2N × 2N adjacent chroma blocks 45 and the 2N × 2N coding distortion 46 are then input to the image coloring sub-network 42, which outputs two 2N × 2N chroma components 47. N × N blocks Cb' and Cr' are cropped from the two 2N × 2N chroma components, respectively, and these cropped N × N blocks Cb' and Cr' are the final chroma prediction results.
Accordingly, referring to a schematic block diagram of a structure of an intra chroma prediction apparatus shown in fig. 5, the apparatus may include:
a luminance component obtaining module 51 for obtaining a luminance component that has been encoded or decoded and reconstructed;
a down-sampling module 52 for down-sampling the encoded or decoded reconstructed luma component;
the coloring module 53 is configured to input preset parameters into an image coloring subnetwork in the pre-trained chroma prediction convolutional neural network model, so as to obtain a chroma component output by the image coloring subnetwork; the preset parameters comprise a coded or decoded and reconstructed brightness component after down-sampling, or comprise a coded or decoded and reconstructed brightness component after down-sampling and target parameters, and the target parameters comprise at least one of a coding distortion degree and a coded or decoded and reconstructed adjacent chroma block;
and the prediction module 54 is configured to cut out a target chroma component block from the chroma components, where the target chroma component block is a final chroma prediction result.
In some embodiments, when the preset parameter includes a neighboring chroma block, the apparatus may further include:
a cropping module for cropping a target luminance component block from the encoded or decoded reconstructed luminance components;
the chroma prediction module is used for carrying out chroma prediction on the target brightness component block in a preset chroma prediction mode to obtain predicted chroma;
and the reconstruction module takes the predicted chroma as an initial chroma component of the chroma block to be predicted.
In some embodiments, the chroma prediction convolutional neural network model further comprises a luma downsampling sub-network; the down-sampling module is specifically configured to: the encoded or decoded reconstructed luminance component is down-sampled by a luminance down-sampling sub-network.
It should be noted that the intra-frame chroma prediction apparatus and the intra-frame chroma prediction method are in one-to-one correspondence, and for related introduction, reference is made to the above corresponding contents, which are not described herein again.
It can be seen that, in the intra chroma prediction scheme based on the convolutional neural network provided in the embodiments of the present application, chroma prediction is performed through the image coloring sub-network and the corresponding input parameters, so that the chroma prediction problem is modeled as an image coloring problem, which is highly general. In addition, chroma prediction based on the image coloring sub-network can save bit rate: experiments show that the intra chroma prediction scheme based on the convolutional neural network saves 4.235% of bit rate on average compared with existing chroma prediction modes.
Example two
The intra-frame chroma prediction scheme based on the convolutional neural network can be applied to the video coding and decoding process. In order to further improve the video coding and decoding performance, rate-distortion cost competition can be carried out between an intra-frame chroma prediction method based on a convolutional neural network and a traditional chroma prediction method, and video coding and decoding are carried out in a chroma prediction mode with the minimum rate-distortion cost. The present embodiment will describe a chroma encoding process.
Referring to fig. 6, a schematic block diagram of a flow chart of an intra chroma prediction method provided in an embodiment of the present application, which may be specifically applied to a video encoder, may include the following steps:
Step 601: encode the luma component to obtain a luma code stream.

Step 602: obtain the encoded and reconstructed luma component, the encoded and reconstructed adjacent chroma information, and the original chroma information corresponding to the chroma block to be encoded.
It is to be understood that the above encoded or decoded reconstructed luma component Y, chroma components Cb, Cr and neighboring chroma information may be included in an encoded block. The adjacent chrominance information is embodied as an adjacent chrominance block.
It should be noted that the video encoder includes at least two chroma prediction modes, and the at least two chroma prediction modes include a first chroma prediction mode and a second chroma prediction mode. The first chroma prediction mode refers to an intra chroma prediction method based on a convolutional neural network provided in the embodiment of the present application, and the second chroma prediction mode may refer to a conventional intra chroma prediction method, where the conventional intra chroma prediction method includes angle prediction, a linear model CCLM, a multi-directional linear model MDLM, and so on. It should be understood that the second type of chroma prediction approach described above may include one or more conventional intra chroma prediction methods.
Step 603: determine, through rate-distortion optimization, the target chroma prediction mode with the minimum rate-distortion cost from the at least two chroma prediction modes. In some embodiments, this may include: respectively calculating the rate-distortion cost values corresponding to the at least two chroma prediction modes, and determining the chroma prediction mode with the minimum rate-distortion cost value as the target chroma prediction mode.
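Rate-distortion optimization can be sketched as follows; the Lagrangian form J = D + λ·R, the mode names and the numeric values are illustrative assumptions, not values from the patent:

```python
def select_mode(candidates, lam):
    # Rate-distortion optimization: choose the mode minimizing J = D + lam * R.
    costs = {mode: d + lam * r for mode, (d, r) in candidates.items()}
    return min(costs, key=costs.get)

# Illustrative (distortion, rate) pairs for a CNN-based mode and a
# conventional CCLM mode; the numbers are made up for the example.
modes = {"cnn": (100.0, 40.0), "cclm": (120.0, 30.0)}
best = select_mode(modes, lam=1.0)
```

With λ = 1.0 the CNN mode wins (J = 140 vs. 150); a larger λ penalizes rate more heavily and can flip the decision.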
Step 604: generate indication information corresponding to the target chroma prediction mode according to the association relationship between chroma prediction modes and indication information.
It should be noted that the association relationship is established in advance, and the indication information corresponding to each chroma prediction mode can be determined through the association relationship. For example, when the indication information is a binary flag bit value, a mapping relationship between each chroma prediction mode and the corresponding value is pre-established, which is specifically expressed as: the flag bit value corresponding to the first chroma prediction mode is 00, the flag bit value corresponding to the second chroma prediction mode is 01, and so on.
After the indication information content corresponding to the target chroma prediction mode is determined through the association relationship, the corresponding indication information can be generated. For example, when the value corresponding to the target chroma prediction mode is 1, the binary flag bit is set to 1 to generate the indication information corresponding to the target chroma prediction mode. The indication information indicates which chroma prediction mode is selected for chroma prediction.
In some embodiments, the indication information is a flag bit value. The specific process of generating the indication information corresponding to the target chroma prediction mode according to the association relationship may include: setting the flag bit to the corresponding value through the association relationship between chroma prediction modes and flag bit values, so as to obtain the indication information corresponding to the target chroma prediction mode.
In this embodiment, the binary flag bit value corresponding to the intra-frame chroma prediction method based on the convolutional neural network in the first embodiment may be set to 1, and the binary flag bit value corresponding to the second chroma prediction mode may be set to 0. At this time, if it is determined through rate distortion optimization that the chroma prediction mode based on the convolutional neural network has the minimum rate distortion cost, the value of the binary flag bit is set to 1, and conversely, if the chroma prediction mode of the second type has the minimum rate distortion cost, the value of the binary flag bit is set to 0. It should be understood that when the second type of chroma prediction mode includes multiple traditional chroma prediction modes, two or three binary flag bits may be used to indicate corresponding chroma prediction modes, for example, two binary flag bits are used to indicate corresponding chroma prediction modes, a value of a binary flag bit corresponding to a first traditional chroma prediction mode is 00, a value of a binary flag bit corresponding to a second traditional chroma prediction mode is 01, and so on.
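A hypothetical flag-bit mapping consistent with this description; the concrete mode names and bit strings are examples, not normative:

```python
# Hypothetical mapping between chroma prediction modes and flag-bit
# strings; the patent's actual values may differ.
MODE_TO_FLAG = {"cnn": "1", "conventional": "0"}
FLAG_TO_MODE = {flag: mode for mode, flag in MODE_TO_FLAG.items()}

def encode_flag(mode: str) -> str:
    # Encoder side: emit the flag bits for the selected mode.
    return MODE_TO_FLAG[mode]

def decode_flag(flag: str) -> str:
    # Decoder side: recover the selected mode from the flag bits.
    return FLAG_TO_MODE[flag]
```

With several conventional modes, the one-bit strings would simply be replaced by two- or three-bit strings ("00", "01", ...).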
Step 605: subtract the predicted chroma information from the original chroma information to obtain chroma residual information; the predicted chroma information is the chroma information obtained by performing chroma prediction in the target chroma prediction mode.
The chroma information obtained by prediction is obtained by performing chroma prediction on the chroma block to be predicted in the determined target chroma prediction mode, and the specific process is not repeated herein.
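Step 605 is an element-wise subtraction; a minimal sketch with illustrative 2 × 2 values (real blocks are N × N):

```python
import numpy as np

# Illustrative 2 x 2 chroma values; real blocks are N x N with N = 64.
original = np.array([[60.0, 62.0], [64.0, 66.0]])   # original chroma
predicted = np.array([[58.0, 63.0], [64.0, 60.0]])  # predicted chroma
residual = original - predicted  # chroma residual, coded into the stream
```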
Step 606: encode the indication information and the chroma residual information to obtain a chroma code stream, and combine the chroma code stream with the luma code stream to obtain the video code stream.
Specifically, the indication information is losslessly encoded, and residual coding is performed on the chroma residual information to obtain the output code stream of the video encoder.
To better describe the encoding process of the video encoder provided in this embodiment, the following description will be made with reference to the encoding process schematic diagram of the video encoder shown in fig. 7.
As shown in fig. 7, the luma component is encoded to obtain a luma code stream. A chroma coding block to be predicted 71, which includes information such as the encoded and reconstructed luma component and the encoded and reconstructed adjacent chroma information, is input to the video encoder 72. The video encoder executes both the conventional intra chroma prediction modes and the intra chroma prediction mode based on the convolutional neural network, calculates the rate-distortion cost value of each chroma prediction mode through rate-distortion optimization, compares these values, and selects the chroma prediction mode with the minimum rate-distortion cost value as the target chroma prediction mode. The binary flag bit is set to the corresponding value based on the target chroma prediction mode; the binary flag bit is then encoded and residual coding is performed on the chroma coding block to be predicted to obtain the chroma code stream 73. The chroma code stream and the luma code stream are combined into a video code stream, which is transmitted to the video decoder for the corresponding decoding process.
Accordingly, referring to a schematic block diagram of a structure of an intra chroma prediction apparatus shown in fig. 8, the apparatus may include:
a brightness encoding module 81, configured to encode the brightness component to obtain a brightness code stream;
an obtaining module 82, configured to obtain a luminance component that has been encoded and reconstructed, adjacent chroma information that has been encoded and reconstructed, and original chroma information corresponding to a chroma block to be encoded;
a second determining module 83, configured to determine, through rate distortion optimization, a target chroma prediction mode with a minimum rate distortion cost from the at least two chroma prediction modes; the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, wherein the first chroma prediction mode is the intra-frame chroma prediction method of any one of the above embodiments;
a generating module 84, configured to generate indication information corresponding to the target chroma prediction mode according to an association relationship between the chroma prediction mode and the indication information;
a subtraction module 85, configured to perform subtraction operation on the original chrominance information and the predicted chrominance information to obtain chrominance residual error information; the chroma information obtained by prediction is chroma information obtained by chroma prediction in a target chroma prediction mode;
and the coding module 86 is configured to code the indication information and the chroma residual information to obtain a chroma code stream, and combine the chroma code stream and the luma code stream to obtain a video code stream.
In some embodiments, the second determining module is specifically configured to: respectively calculating rate distortion cost values corresponding to at least two chrominance prediction modes; and determining the chroma prediction mode with the minimum rate distortion cost value as a target chroma prediction mode.
In some embodiments, the indication information is a flag bit value, and the generating module is specifically configured to: set the flag bit to the corresponding value through the association relationship between chroma prediction modes and flag bit values, so as to obtain the indication information corresponding to the target chroma prediction mode.
It should be noted that the intra chroma prediction apparatus and the intra chroma prediction method are in one-to-one correspondence, and for related introduction, reference is made to the above corresponding contents, which are not described herein again.
It can be seen that, by performing rate-distortion cost competition between the conventional intra-frame chroma prediction mode and the intra-frame chroma prediction mode based on the convolutional neural network provided in the embodiment of the present application, and adding indication information for indicating which chroma prediction mode to select, the chroma coding performance can be further improved.
EXAMPLE III
After the video encoding process is described, the present embodiment describes a video decoding process. The video decoding process of this embodiment corresponds to the video encoding process of the second embodiment described above.
Referring to fig. 9, a schematic flow chart of an intra chroma prediction method provided in an embodiment of the present application, which may be applied to a video decoder, may include the following steps:
Step 901: obtain the code stream output by the video encoder.
Step 902: decode the video code stream to obtain the decoded and reconstructed luma component, the decoded and reconstructed adjacent chroma information, and the indication information for determining the chroma prediction mode.
Specifically, the video decoder receives a video code stream output by the video encoder, and then decodes the code stream to obtain corresponding information. The indication information may be a binary flag bit.
Step 903: determine a target chroma prediction mode from at least two chroma prediction modes according to the indication information. Specifically, after the indication information is obtained by decoding, the selected target chroma prediction mode can be determined from it. For example, when the indication information is embodied as a flag bit value, the process may include: when the flag bit value is a first value, determining the first chroma prediction mode as the target chroma prediction mode; and when the flag bit value is a second value, determining the second chroma prediction mode as the target chroma prediction mode. The first value may be 1 and the second value 0, or the first value may be 0 and the second value 1.
Step 904: perform chroma prediction in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed adjacent chroma information, to obtain a chroma prediction result.
It can be understood that, after the target chroma prediction mode is selected according to the indication information, the target chroma prediction mode may be executed to perform chroma prediction so as to obtain a corresponding chroma prediction result. If the target chroma prediction mode is the intra-frame chroma prediction method based on the convolutional neural network in the first embodiment, the specific process of chroma prediction may refer to the corresponding contents above, and is not described herein again.
Step 905: perform chroma reconstruction according to the residual obtained by decoding the chroma residual information in the code stream and the chroma prediction result, to obtain the output chroma.
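Chroma reconstruction mirrors the encoder's subtraction: the decoded residual is added back to the prediction. A minimal sketch with illustrative values:

```python
import numpy as np

prediction = np.array([[58.0, 63.0], [64.0, 60.0]])  # from the selected mode
residual = np.array([[2.0, -1.0], [0.0, 6.0]])       # decoded chroma residual
output_chroma = prediction + residual                # reconstructed chroma
```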
To better describe the video decoding process, the following description will be made with reference to the decoding process diagram of the video decoder shown in fig. 10.
As shown in fig. 10, the video decoder 101 receives the input code stream 102 and first decodes the luma component. It then decodes the binary flag bit and determines, according to its value, whether to perform chroma prediction with a conventional intra chroma prediction mode or with the intra chroma prediction mode based on the convolutional neural network. The selected target chroma prediction mode is executed to obtain a chroma prediction result, and chroma reconstruction is performed based on the chroma prediction result and the residual decoding result to obtain the output chroma 103.
Accordingly, referring to the schematic block diagram of the structure of an intra chroma prediction apparatus shown in fig. 11, the apparatus may include:
a code stream obtaining module 111, configured to obtain a code stream output by a video encoder;
a decoding module 112, configured to decode the video code stream to obtain a decoded and reconstructed luminance component, decoded and reconstructed adjacent chrominance information, and indication information for determining a chrominance prediction mode;
a first determining module 113, configured to determine, according to the indication information, a target chroma prediction mode from at least two chroma prediction modes, where the at least two chroma prediction modes include a first chroma prediction mode and a second chroma prediction mode, and the first chroma prediction mode is an intra-frame chroma prediction method as in any one of the above embodiments;
the chroma prediction module 114 is configured to perform chroma prediction on the chroma component in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed adjacent chroma information, so as to obtain a chroma prediction result;
and the chroma reconstruction module 115 is configured to perform chroma reconstruction according to a residual obtained by decoding the chroma residual information in the video code stream and the chroma prediction result, so as to obtain the output chroma.
In some embodiments, the indication information is a flag bit value; the first determining module is specifically configured to: when the flag bit value is a first value, determine the first chroma prediction mode as the target chroma prediction mode; and when the flag bit value is a second value, determine the second chroma prediction mode as the target chroma prediction mode.
It should be noted that the intra chroma prediction apparatus corresponds to the intra chroma prediction method in the above embodiment one to one, and the related introduction refers to the above corresponding contents, which are not described herein again.
It can be seen that performing rate-distortion cost competition between the conventional intra-frame chroma prediction mode and the convolutional-neural-network-based intra-frame chroma prediction mode provided in the embodiments of the present application, together with adding indication information for indicating which chroma prediction mode is selected, can further improve chroma coding performance.
Example four
Referring to the schematic block diagram shown in fig. 12, a video codec system provided in an embodiment of the present application may include a video encoder 121 and a video decoder 122. The system may further include a code stream transmission subsystem 123 interposed between the video encoder and the video decoder for transmitting the code stream output by the video encoder to the video decoder.
The video encoder 121 is configured to: encode the luma component to obtain a luma code stream; obtain the encoded and reconstructed luma component, the encoded and reconstructed adjacent chroma information, and the original chroma information corresponding to the chroma block to be encoded; determine, through rate-distortion optimization, a target chroma prediction mode with the minimum rate-distortion cost from at least two chroma prediction modes, where the at least two chroma prediction modes include a first chroma prediction mode and a second chroma prediction mode, and the first chroma prediction mode is the intra-frame chroma prediction method of any one of the above embodiments; generate indication information of the target chroma prediction mode according to the association relationship between the chroma prediction mode and the indication information; subtract the predicted chroma information from the original chroma information to obtain chroma residual information, where the predicted chroma information is the chroma information obtained by performing chroma prediction in the target chroma prediction mode; and encode the indication information and the chroma residual to obtain a chroma code stream, and combine the chroma code stream with the luma code stream to obtain the video code stream;
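The encoder-side rate-distortion competition can be sketched as below. The Lagrangian cost J = D + λ·R with sum-of-squared-error distortion is a standard formulation assumed here for illustration; the actual cost function and λ value used by the encoder are not specified in this document, and the candidate rates are hypothetical.

```python
def sse(orig, pred):
    # Sum of squared errors as the distortion term D.
    return sum((o - p) ** 2 for o, p in zip(orig, pred))

def rd_cost(orig, pred, rate_bits, lam):
    # Lagrangian rate-distortion cost J = D + lambda * R.
    return sse(orig, pred) + lam * rate_bits

def choose_mode(orig, candidates, lam=10.0):
    """candidates: {mode_name: (predicted_chroma, rate_bits)}.
    Returns the mode with the minimum rate-distortion cost."""
    return min(candidates,
               key=lambda m: rd_cost(orig, candidates[m][0], candidates[m][1], lam))
```

The winning mode determines the flag bit written to the chroma code stream alongside the residual.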
the video decoder 122 is configured to obtain a code stream; decoding the code stream to obtain a decoded and reconstructed brightness component, decoded and reconstructed adjacent chroma information and indication information; according to the indication information, determining a target chroma prediction mode from at least two chroma prediction modes; according to the decoded and reconstructed brightness component and the decoded and reconstructed adjacent chroma information, chroma prediction is carried out on the chroma component in a target chroma prediction mode to obtain a chroma prediction result; and carrying out chroma reconstruction according to the residual error obtained by decoding the chroma residual error information in the code stream and a chroma prediction result to obtain output chroma.
It should be noted that, for the intra-frame chroma prediction method based on the convolutional neural network, the encoding process of the video encoder, and the decoding process of the video decoder, reference may be made to the above corresponding contents, which are not described herein again.
Accordingly, referring to the schematic interaction diagram between the video encoder and the video decoder shown in fig. 13, the interaction flow of the intra chroma prediction system may include the following steps:
and step 1301, the video encoder encodes the brightness component to obtain a brightness code stream.
Step 1302, the video encoder obtains the encoded and reconstructed luma component, the encoded and reconstructed adjacent chroma information, and the original chroma information corresponding to the chroma block to be encoded.
Step 1303, the video encoder determines, through rate-distortion optimization, a target chroma prediction mode with the minimum rate-distortion cost from at least two chroma prediction modes; the at least two chroma prediction modes include a first chroma prediction mode and a second chroma prediction mode, and the first chroma prediction mode is the intra-frame chroma prediction method according to any implementation of the first aspect.
Step 1304, the video encoder generates the indication information of the target chroma prediction mode according to the association relationship between the chroma prediction mode and the indication information.
Step 1305, the video encoder subtracts the predicted chroma information from the original chroma information to obtain chroma residual information.
Step 1306, the video encoder encodes the indication information and the chroma residual information to obtain a chroma code stream, and combines the chroma code stream with the luma code stream to obtain a video code stream.
Step 1307, the video decoder obtains the video code stream output by the video encoder.
Step 1308, the video decoder decodes the video code stream to obtain the decoded and reconstructed luma component, the decoded and reconstructed adjacent chroma information, and the indication information.
Step 1309, the video decoder determines the target chroma prediction mode from at least two chroma prediction modes according to the indication information.
Step 1310, the video decoder performs chroma prediction on the chroma component in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed adjacent chroma information to obtain a chroma prediction result.
Step 1311, the video decoder performs chroma reconstruction according to the residual obtained by decoding the chroma residual information in the code stream and the chroma prediction result to obtain the output chroma.
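Steps 1305 and 1311 together form a residual round trip, sketched below under the simplifying assumption that the residual is transmitted losslessly; a real codec would transform and quantize the residual, so the reconstruction would only approximate the original.

```python
def chroma_residual(original, prediction):
    # Encoder side (step 1305): residual = original - predicted chroma.
    return [o - p for o, p in zip(original, prediction)]

def reconstruct(prediction, residual):
    # Decoder side (step 1311): output chroma = prediction + decoded residual.
    return [p + r for p, r in zip(prediction, residual)]
```

When the residual survives coding exactly, `reconstruct(prediction, chroma_residual(original, prediction))` recovers the original chroma samples.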
It should be noted that, the interaction flow between the video encoder and the video decoder is the same as or similar to that in the above embodiments, and reference may be made to the above corresponding contents, which is not described herein again.
It should be noted that, for the information interaction and execution process between the above-mentioned devices and units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and thus reference may be made to the part of the embodiment of the method, and details are not described here.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by functions and internal logic of the process, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Example five
Fig. 14 is a schematic structural diagram of a terminal device according to an embodiment of the present application. As shown in fig. 14, the terminal device 14 of this embodiment includes: at least one processor 140, a memory 141, and a computer program 142 stored in the memory 141 and executable on the at least one processor 140, wherein the processor 140 executes the computer program 142 to implement the steps of any of the embodiments of the intra chroma prediction method in the first embodiment.
The terminal device 14 may be a computing device such as a desktop computer, a notebook computer, or a palmtop computer. The terminal device may include, but is not limited to, the processor 140 and the memory 141. Those skilled in the art will appreciate that fig. 14 is merely an example of the terminal device 14 and does not constitute a limitation on the terminal device 14, which may include more or fewer components than shown, or combine some components, or have different components, such as an input-output device, a network access device, etc.
The processor 140 may be a Central Processing Unit (CPU), and may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 141 may, in some embodiments, be an internal storage unit of the terminal device 14, for example, a hard disk or a memory of the terminal device 14. In other embodiments, the memory 141 may also be an external storage device of the terminal device 14, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash memory card (Flash Card) provided on the terminal device 14. Further, the memory 141 may include both an internal storage unit and an external storage device of the terminal device 14. The memory 141 is used for storing an operating system, application programs, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory 141 may also be used to temporarily store data that has been output or is to be output.
Fig. 15 is a schematic structural diagram of a video encoder according to an embodiment of the present application. As shown in fig. 15, the video encoder 15 of this embodiment includes: at least one processor 150, a memory 151, and a computer program 152 stored in the memory 151 and executable on the at least one processor 150, wherein the processor 150, when executing the computer program 152, implements the steps in any of the embodiments of the intra chroma prediction method of the second embodiment.
The video encoder may include, but is not limited to, the processor 150 and the memory 151. Those skilled in the art will appreciate that fig. 15 is merely an example of the video encoder 15 and does not constitute a limitation on the video encoder 15, which may include more or fewer components than shown, or combine some components, or have different components, such as an input-output device, a network access device, etc.
The processor 150 may be a Central Processing Unit (CPU), and may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 151 may, in some embodiments, be an internal storage unit of the video encoder 15, such as a hard disk or a memory of the video encoder 15. In other embodiments, the memory 151 may also be an external storage device of the video encoder 15, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash memory card (Flash Card) provided on the video encoder 15. Further, the memory 151 may include both an internal storage unit and an external storage device of the video encoder 15. The memory 151 is used for storing an operating system, application programs, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory 151 may also be used to temporarily store data that has been output or is to be output.
Fig. 16 is a schematic structural diagram of a video decoder according to an embodiment of the present application. As shown in fig. 16, the video decoder 16 of this embodiment includes: at least one processor 160, a memory 161, and a computer program 162 stored in the memory 161 and executable on the at least one processor 160, wherein the processor 160, when executing the computer program 162, implements the steps of any embodiment of the intra chroma prediction method in the third embodiment.
The video decoder may include, but is not limited to, the processor 160 and the memory 161. Those skilled in the art will appreciate that fig. 16 is merely an example of the video decoder 16 and does not constitute a limitation on the video decoder 16, which may include more or fewer components than shown, or combine some components, or have different components, such as an input-output device, a network access device, etc.
The Processor 160 may be a Central Processing Unit (CPU), and the Processor 160 may be other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 161 may, in some embodiments, be an internal storage unit of the video decoder 16, such as a hard disk or a memory of the video decoder 16. In other embodiments, the memory 161 may also be an external storage device of the video decoder 16, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash memory card (Flash Card) provided on the video decoder 16. Further, the memory 161 may include both an internal storage unit and an external storage device of the video decoder 16. The memory 161 is used for storing an operating system, application programs, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory 161 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the intra chroma prediction method according to any one of the first embodiment or the second embodiment or the third embodiment is implemented.
The present embodiment further provides a computer program product, which when running on a terminal device or a video encoder or a video decoder, causes the terminal device or the video encoder or the video decoder to perform the intra chroma prediction method according to any one of the first embodiment or the second embodiment or the third embodiment.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the embodiments of the methods described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or apparatus capable of carrying the computer program code to a terminal device, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random-Access Memory (RAM), an electrical carrier signal, a telecommunications signal, or a software distribution medium, such as a USB flash disk, a removable hard disk, a magnetic disk, or an optical disk. In some jurisdictions, in accordance with legislation and patent practice, computer-readable media may not include electrical carrier signals or telecommunications signals.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus, and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the embodiments of the present application, and they should be construed as being included in the present application.
Claims (14)
1. An intra chroma prediction method, comprising:
acquiring a coded or decoded reconstructed luminance component;
downsampling the encoded or decoded reconstructed luma component;
cutting out a target luminance component block from the encoded or decoded and reconstructed luminance component;
carrying out chroma prediction on the target luminance component block in a preset chroma prediction mode, and filling an output result into a vacant part in an adjacent chroma block to be used as predicted chroma; wherein the preset chroma prediction mode is: a cross-component linear model (CCLM) or a multi-directional linear model (MDLM);
the predicted chroma is used as an initial chroma component of a chroma block to be predicted, and a reconstructed adjacent chroma block is obtained;
inputting a plurality of down-sampled encoded or decoded and reconstructed luminance components, the reconstructed adjacent chroma block and a coding distortion degree into an image coloring sub-network in a pre-trained chroma prediction convolutional neural network model to obtain a chroma component output by the image coloring sub-network; wherein the coding distortion degree is represented as an image block characterized by a quantization parameter;
cutting out a target chroma component block from the chroma components, wherein the target chroma component block is a final chroma prediction result;
wherein the chroma prediction convolutional neural network model further comprises a luma downsampling sub-network; said downsampling said encoded or decoded reconstructed luma component, comprising:
and downsampling the encoded or decoded and reconstructed luminance component through the luminance downsampling sub-network, and outputting a plurality of sampling results, wherein the output layer of the luminance downsampling sub-network comprises a plurality of kernel functions, and one downsampling result corresponds to one downsampled encoded or decoded and reconstructed luminance component.
2. An intra chroma prediction method applied to a video encoder, the method comprising:
encoding the luma component to obtain a luma code stream;
acquiring an encoded and reconstructed luma component, encoded and reconstructed adjacent chroma information and original chroma information corresponding to a chroma block to be encoded;
determining a target chroma prediction mode with the minimum rate distortion cost from at least two chroma prediction modes through rate distortion optimization; the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, wherein the first chroma prediction mode is the intra-frame chroma prediction method according to claim 1;
generating indication information corresponding to the target chroma prediction mode according to the association relationship between the chroma prediction mode and the indication information;
subtracting chroma information obtained by prediction from the original chroma information to obtain chroma residual information; wherein the chroma information obtained by prediction is chroma information obtained by performing chroma prediction in the target chroma prediction mode;
and encoding the indication information and the chroma residual information to obtain a chroma code stream, and combining the chroma code stream and the luma code stream to obtain a video code stream.
3. The method according to claim 2, wherein the indication information is a flag bit value;
the generating of the indication information corresponding to the target chroma prediction mode through the association relationship between the chroma prediction mode and the indication information comprises:
and setting the zone bit as a corresponding numerical value through the incidence relation between the chroma prediction mode and the numerical value of the zone bit so as to obtain the indication information corresponding to the target chroma prediction mode.
4. A method for intra chroma prediction for use in a video decoder, the method comprising:
acquiring a video code stream output by a video encoder;
decoding the video code stream to obtain a decoded and reconstructed luma component, decoded and reconstructed adjacent chroma information and indication information for determining a chroma prediction mode;
according to the indication information, determining a target chroma prediction mode from at least two chroma prediction modes, wherein the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, and the first chroma prediction mode is the intra chroma prediction method according to claim 1;
according to the decoded and reconstructed luma component and the decoded and reconstructed adjacent chroma information, performing chroma prediction on the chroma component in the target chroma prediction mode to obtain a chroma prediction result;
and performing chroma reconstruction according to a residual obtained by decoding chroma residual information in the video code stream and the chroma prediction result to obtain output chroma.
5. The method according to claim 4, wherein the indication information is a flag bit value;
the determining a target chroma prediction mode from at least two chroma prediction modes according to the indication information comprises:
when the flag bit value is a first value, determining the first chroma prediction mode as the target chroma prediction mode;
and when the flag bit value is a second value, determining the second chroma prediction mode as the target chroma prediction mode.
6. A method for intra chroma prediction, comprising:
the video encoder encodes a luma component to obtain a luma code stream; acquires an encoded and reconstructed luma component, encoded and reconstructed adjacent chroma information and original chroma information corresponding to a chroma block to be encoded; determines, through rate-distortion optimization, a target chroma prediction mode with the minimum rate-distortion cost from at least two chroma prediction modes, wherein the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, and the first chroma prediction mode is the intra-frame chroma prediction method according to claim 1; generates indication information of the target chroma prediction mode according to an association relationship between the chroma prediction mode and the indication information; subtracts chroma information obtained by prediction from the original chroma information to obtain chroma residual information, wherein the chroma information obtained by prediction is chroma information obtained by performing chroma prediction in the target chroma prediction mode; and encodes the indication information and the chroma residual to obtain a chroma code stream, and combines the chroma code stream and the luma code stream to obtain a video code stream;
a video decoder acquires the video code stream; decodes the video code stream to obtain a decoded and reconstructed luma component, decoded and reconstructed adjacent chroma information and the indication information; determines the target chroma prediction mode from the at least two chroma prediction modes according to the indication information; performs chroma prediction on the chroma component in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed adjacent chroma information to obtain a chroma prediction result; and performs chroma reconstruction according to a residual obtained by decoding chroma residual information in the video code stream and the chroma prediction result to obtain output chroma.
7. A video coding and decoding system is characterized by comprising a video coder and a video decoder;
the video encoder is configured to: encode a luma component to obtain a luma code stream; acquire an encoded and reconstructed luma component, encoded and reconstructed adjacent chroma information and original chroma information corresponding to a chroma block to be encoded; determine, through rate-distortion optimization, a target chroma prediction mode with the minimum rate-distortion cost from at least two chroma prediction modes, wherein the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, and the first chroma prediction mode is the intra-frame chroma prediction method according to claim 1; generate indication information of the target chroma prediction mode according to the association relationship between the chroma prediction mode and the indication information; subtract chroma information obtained by prediction from the original chroma information to obtain chroma residual information, wherein the chroma information obtained by prediction is chroma information obtained by performing chroma prediction in the target chroma prediction mode; and encode the indication information and the chroma residual to obtain a chroma code stream, and combine the chroma code stream and the luma code stream to obtain a video code stream;
the video decoder is configured to: acquire the video code stream; decode the video code stream to obtain a decoded and reconstructed luma component, decoded and reconstructed adjacent chroma information and the indication information; determine the target chroma prediction mode from the at least two chroma prediction modes according to the indication information; perform chroma prediction on the chroma component in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed adjacent chroma information to obtain a chroma prediction result; and perform chroma reconstruction according to a residual obtained by decoding chroma residual information in the video code stream and the chroma prediction result to obtain output chroma.
8. An apparatus for intra chroma prediction, comprising:
a luma component acquisition module, configured to acquire a luma component that has been encoded or decoded and reconstructed;
a downsampling module, configured to downsample the encoded- or decoded-and-reconstructed luma component;
a cropping module, configured to crop a target luma component block from the encoded- or decoded-and-reconstructed luma component;
a chroma prediction module, configured to perform chroma prediction on the target luma component block in a preset chroma prediction mode, and to fill the output result into the vacant part of an adjacent chroma block as the predicted chroma, wherein the preset chroma prediction mode is the cross-component linear model (CCLM) or the multi-directional linear model (MDLM);
a reconstruction module, configured to take the predicted chroma as the initial chroma component of the chroma block to be predicted, obtaining a reconstructed adjacent chroma block;
a colorization module, configured to input the plurality of downsampled encoded- or decoded-and-reconstructed luma components, the reconstructed adjacent chroma block, and the coding distortion degree into the image colorization sub-network of a pre-trained chroma prediction convolutional neural network model, to obtain the chroma components output by the image colorization sub-network, wherein the coding distortion degree is represented as an image block characterized by a quantization parameter;
a prediction module, configured to crop a target chroma component block from the chroma components, the target chroma component block being the final chroma prediction result;
wherein the chroma prediction convolutional neural network model further comprises a luma downsampling sub-network, and the downsampling module is specifically configured to:
downsample the encoded- or decoded-and-reconstructed luma component through the luma downsampling sub-network to output a plurality of downsampling results, wherein the output layer of the luma downsampling sub-network comprises a plurality of kernel functions, and each downsampling result corresponds to one downsampled encoded- or decoded-and-reconstructed luma component.
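Claim 8's apparatus uses a learned downsampling sub-network to align the luma plane with the chroma resolution. As a point of reference, the conventional fixed-filter alternative for 4:2:0 video is a simple 2×2 average; the sketch below shows that fixed filter only as a stand-in, not the patented learned sub-network:

```python
import numpy as np

def downsample_luma(luma):
    # Fixed 2x2 block averaging, as conventionally used to bring a
    # luma plane down to 4:2:0 chroma resolution. The patented
    # apparatus replaces this with a trained downsampling sub-network
    # whose output layer has multiple kernels; this function is only
    # an illustrative baseline.
    h, w = luma.shape
    return luma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# A 4x4 luma block downsamples to a 2x2 block of 2x2-neighborhood means
luma = np.arange(16, dtype=np.float64).reshape(4, 4)
print(downsample_luma(luma))  # [[ 2.5  4.5] [10.5 12.5]]
```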
9. An apparatus for intra chroma prediction, comprising:
a luma encoding module, configured to encode the luma component to obtain a luma bitstream;
an acquisition module, configured to acquire the encoded and reconstructed luma component, the encoded and reconstructed adjacent chroma information, and the original chroma information corresponding to the chroma block to be encoded;
a second determining module, configured to determine, through rate-distortion optimization, a target chroma prediction mode with the minimum rate-distortion cost from among at least two chroma prediction modes, the at least two chroma prediction modes comprising a first chroma prediction mode and a second chroma prediction mode, wherein the first chroma prediction mode is the intra-frame chroma prediction method according to claim 1;
a generating module, configured to generate the indication information corresponding to the target chroma prediction mode according to the correspondence between chroma prediction modes and indication information;
a subtraction module, configured to subtract the predicted chroma information from the original chroma information to obtain chroma residual information, wherein the predicted chroma information is the chroma information obtained by performing chroma prediction in the target chroma prediction mode;
and an encoding module, configured to encode the indication information and the chroma residual information to obtain a chroma bitstream, and to combine the chroma bitstream with the luma bitstream to obtain a video bitstream.
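The subtraction module of claim 9 forms the residual that is entropy-coded. A minimal sketch of that step, assuming 8-bit chroma samples (the dtypes and block contents are illustrative):

```python
import numpy as np

def chroma_residual(original, predicted):
    # Residual = original chroma minus predicted chroma. Widening to
    # a signed type first is required: a uint8 subtraction would wrap
    # around for negative differences.
    return original.astype(np.int16) - predicted.astype(np.int16)

# Hypothetical 2x2 chroma block against a flat prediction of 128
orig = np.array([[130, 128], [126, 129]], dtype=np.uint8)
pred = np.full((2, 2), 128, dtype=np.uint8)
print(chroma_residual(orig, pred))  # [[ 2  0] [-2  1]]
```

Only this (typically small, signed) residual plus the mode indication needs to be coded into the chroma bitstream.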
10. An apparatus for intra chroma prediction, comprising:
a bitstream acquisition module, configured to acquire the video bitstream output by the video encoder;
a decoding module, configured to decode the video bitstream to obtain the decoded and reconstructed luma component, the decoded and reconstructed adjacent chroma information, and the indication information used to determine the chroma prediction mode;
a first determining module, configured to determine a target chroma prediction mode from among at least two chroma prediction modes according to the indication information, wherein the at least two chroma prediction modes comprise a first chroma prediction mode and a second chroma prediction mode, and the first chroma prediction mode is the intra-frame chroma prediction method according to claim 1;
a chroma prediction module, configured to perform chroma prediction on the chroma component in the target chroma prediction mode according to the decoded and reconstructed luma component and the decoded and reconstructed adjacent chroma information, to obtain a chroma prediction result;
and a chroma reconstruction module, configured to perform chroma reconstruction according to the chroma prediction result and the residual obtained by decoding the chroma residual information in the video bitstream, to obtain the output chroma.
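The chroma reconstruction module of claim 10 inverts the encoder's subtraction: prediction plus decoded residual, clipped to the valid sample range. A sketch under the assumption of 8-bit samples (the bit depth and block contents are illustrative):

```python
import numpy as np

def reconstruct_chroma(prediction, residual, bit_depth=8):
    # Decoder-side reconstruction: add the decoded residual to the
    # chroma prediction, then clip to [0, 2^bit_depth - 1]. The
    # bit_depth of 8 is an assumption for illustration.
    out = prediction.astype(np.int16) + residual
    return np.clip(out, 0, (1 << bit_depth) - 1).astype(np.uint8)

# Flat prediction of 128 plus the residual coded by the encoder
pred = np.full((2, 2), 128, dtype=np.uint8)
resid = np.array([[2, 0], [-2, 1]], dtype=np.int16)
print(reconstruct_chroma(pred, resid))  # [[130 128] [126 129]]
```

Note that reconstruction recovers the encoder's input exactly only when the residual survives quantization losslessly; in a real codec the reconstructed chroma approximates the original.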
11. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the intra chroma prediction method of claim 1 when executing the computer program.
12. A video encoder comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the intra chroma prediction method of any of claims 2 to 3 when executing the computer program.
13. A video decoder comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the intra chroma prediction method as claimed in any one of claims 4 to 5 when executing the computer program.
14. A computer-readable storage medium, in which a computer program is stored, wherein the computer program, when being executed by a processor, implements the intra chroma prediction method as claimed in any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910817597.1A CN110602491B (en) | 2019-08-30 | 2019-08-30 | Intra-frame chroma prediction method, device and equipment and video coding and decoding system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110602491A CN110602491A (en) | 2019-12-20 |
CN110602491B true CN110602491B (en) | 2022-07-19 |
Family
ID=68856546
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2580326A (en) | 2018-12-28 | 2020-07-22 | British Broadcasting Corp | Video encoding and video decoding |
GB2591806B (en) * | 2020-02-07 | 2023-07-19 | British Broadcasting Corp | Chroma intra prediction in video coding and decoding |
CN111953977A (en) * | 2020-07-09 | 2020-11-17 | 西安万像电子科技有限公司 | Image transmission method, system and device |
CN116472707A (en) * | 2020-09-30 | 2023-07-21 | Oppo广东移动通信有限公司 | Image prediction method, encoder, decoder, and computer storage medium |
CN114339216B (en) * | 2020-10-10 | 2024-11-08 | 阿里巴巴达摩院(杭州)科技有限公司 | Video processing method, device, electronic equipment and storage medium |
WO2022087901A1 (en) * | 2020-10-28 | 2022-05-05 | Oppo广东移动通信有限公司 | Image prediction method, encoder, decoder, and computer storage medium |
WO2022116085A1 (en) * | 2020-12-03 | 2022-06-09 | Oppo广东移动通信有限公司 | Encoding method, decoding method, encoder, decoder, and electronic device |
CN115190312B (en) * | 2021-04-02 | 2024-06-07 | 西安电子科技大学 | Cross-component chromaticity prediction method and device based on neural network |
US11909956B2 (en) | 2021-06-15 | 2024-02-20 | Tencent America LLC | DNN-based cross component prediction |
WO2023197189A1 (en) * | 2022-04-12 | 2023-10-19 | Oppo广东移动通信有限公司 | Coding method and apparatus, decoding method and apparatus, and coding device, decoding device and storage medium |
CN115422986B (en) * | 2022-11-07 | 2023-08-22 | 深圳传音控股股份有限公司 | Processing method, processing apparatus, and storage medium |
CN115834897B (en) * | 2023-01-28 | 2023-07-25 | 深圳传音控股股份有限公司 | Processing method, processing apparatus, and storage medium |
WO2024212191A1 (en) * | 2023-04-13 | 2024-10-17 | 上海传英信息技术有限公司 | Image processing method, processing device, and storage medium |
CN116593408B (en) * | 2023-07-19 | 2023-10-17 | 四川亿欣新材料有限公司 | Method for detecting chromaticity of heavy calcium carbonate powder |
CN118200572B (en) * | 2024-05-15 | 2024-07-26 | 天津大学 | Chroma prediction method, coloring model training method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103260018A (en) * | 2012-02-16 | 2013-08-21 | 乐金电子(中国)研究开发中心有限公司 | Intra-frame image predictive encoding and decoding method and video codec |
WO2018099579A1 (en) * | 2016-12-02 | 2018-06-07 | Huawei Technologies Co., Ltd. | Apparatus and method for encoding an image |
CN109842799A (en) * | 2017-11-29 | 2019-06-04 | 杭州海康威视数字技术股份有限公司 | The intra-frame prediction method and device of color component |
CN110087083A (en) * | 2019-03-12 | 2019-08-02 | 浙江大华技术股份有限公司 | The selection method of prediction mode for chroma, image processing equipment and storage equipment in frame |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019135636A1 (en) * | 2018-01-05 | 2019-07-11 | 에스케이텔레콤 주식회사 | Image coding/decoding method and apparatus using correlation in ycbcr |
Non-Patent Citations (2)
Title |
---|
Yue Li et al., "A Hybrid Neural Network for Chroma Intra Prediction," 2018 25th IEEE International Conference on Image Processing (ICIP), 10 Oct. 2018, pp. 1797-1801. *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110602491B (en) | Intra-frame chroma prediction method, device and equipment and video coding and decoding system | |
JP4085116B2 (en) | Compression and decompression of image data in units of blocks | |
KR102071764B1 (en) | Picture coding and decoding methods and devices | |
WO2020253828A1 (en) | Coding and decoding method and device, and storage medium | |
CN107743239B (en) | Method and device for encoding and decoding video data | |
CN104702962A (en) | Intra-frame coding and decoding method, coder and decoder | |
CN109819250B (en) | Method and system for transforming multi-core full combination mode | |
CN110087083B (en) | Method for selecting intra chroma prediction mode, image processing apparatus, and storage apparatus | |
CN116235496A (en) | Encoding method, decoding method, encoder, decoder, and encoding system | |
US20240098255A1 (en) | Video picture component prediction method and apparatus, and computer storage medium | |
CN107431822B (en) | Image coding/decoding method and equipment | |
CN118614061A (en) | Encoding/decoding method, encoder, decoder, and storage medium | |
WO2022155923A1 (en) | Encoding method, decoding method, encoder, decoder, and electronic device | |
Ma et al. | A cross channel context model for latents in deep image compression | |
CN117441186A (en) | Image decoding and processing method, device and equipment | |
JP2022521933A (en) | Coding method, decoding method, decoder, encoder and storage medium | |
CN107820084B (en) | Video perception coding method and device | |
JP7020466B2 (en) | Embedded Cordic (EBC) circuit for position-dependent entropy coding of residual level data | |
WO2021035717A1 (en) | Intra-frame chroma prediction method and apparatus, device, and video coding and decoding system | |
JP2020127188A5 (en) | ||
CN118158441A (en) | Image dividing method and device | |
CN115767085A (en) | Data processing method and device | |
CN112449186B (en) | Encoding method, decoding method, corresponding devices, electronic equipment and storage medium | |
CN111279706B (en) | Loop filtering method, device, computer system and mobile equipment | |
CN115086664A (en) | Decoding method, encoding method, decoder and encoder for unmatched pixels |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||