CN118413675B - Context-based progressive three-plane coding image compression algorithm and terminal equipment

Info

Publication number: CN118413675B
Application number: CN202410879262.3A
Authority: CN
Inventors: 赵作鹏; 胡建峰; 闵冰冰; 刘营
Original assignee: China University of Mining and Technology CUMT
Current assignee: China University of Mining and Technology CUMT
Priority date: 2024-07-02
Filing date: 2024-07-02
Publication date: 2024-09-24
Anticipated expiration: 2044-07-02
Also published as: CN118413675A

Abstract

The invention discloses a context-based progressive three-plane coding image compression algorithm and terminal equipment. The algorithm comprises the following steps: S1, compressing a video stream acquired by a visual camera with an MPEG algorithm to obtain a visual image encoded in H.265 format; S2, sequentially converting the visual image X into a potential tensor Y and a super potential tensor Z through an encoder and a super encoder, and obtaining the average value and standard deviation of the representation Y with a super decoder; S3, evaluating values with a probability calculation module using the average value, the standard deviation and the coded three planes; S4, designing a context-based rate reduction module, predicting the value of each tri-plane, fusing the values through a residual block and a convolution layer, and finally applying an activation function; S5, designing a context-based distortion reduction module and reconstructing the image after entropy decoding; and S6, finally, reconstructing the image from the improved potential tensor through a super decoder.

Description

Context-based progressive three-plane coding image compression algorithm and terminal equipment

Technical Field

The invention belongs to the technical field of image processing, and particularly relates to a context-based progressive three-plane coding image compression algorithm and terminal equipment.

Background

Image compression algorithms are a family of techniques for reducing the size of digital image files, so as to lower storage and transmission bandwidth requirements while losing as little image quality as possible. There are two main approaches: lossless compression and lossy compression. Lossless compression allows the original image data to be recovered completely and suits scenarios with very high image quality requirements. Lossy compression achieves higher compression rates by discarding information to which the eye is less sensitive, and is widely used in applications that store and transmit large amounts of image data. With the development of deep learning and artificial intelligence, new techniques such as context-based coding, generative adversarial networks and end-to-end learning are emerging in the field of image compression, and these advances open new possibilities for improving compression efficiency and image quality.

The core problem of image compression algorithms is balancing compression efficiency against image quality, that is, preserving as much of the original information as possible while reducing the amount of data. Efficient compression is usually accompanied by some degree of data loss, which is particularly evident in lossy compression. Meanwhile, complex algorithm designs may increase encoding and decoding time, which poses challenges for real-time processing and resource-constrained devices. In response to these problems, a more compact progressive coding scheme is proposed. It aims to refine the compression process step by step through intelligent coding, thereby achieving a higher compression ratio while maintaining image quality and opening up a new path for image compression technology.

Disclosure of Invention

The invention aims to provide a context-based progressive three-plane coding image compression algorithm and terminal equipment, which can predict ternary probabilities more accurately by combining context information, thereby realizing more compact coding. This not only improves compression efficiency but also helps preserve richer image details. In addition, the algorithm comprises a distortion reduction module that can intelligently extract key tensor information from the three planes, further reducing the loss of image quality during compression. This allows a higher level of compression performance while maintaining high-quality image reconstruction, and overcomes the incompatibility of conventional tri-planar coding with autoregressive context models.

In order to achieve the above object, the present invention provides a context-based progressive three-plane coded image compression algorithm, comprising the steps of:

S1, compressing an original video stream with an MPEG algorithm to obtain a visual image encoded in H.265 format;

S2, sequentially converting the visual image X into a potential tensor Y and a super potential tensor Z through an encoder and a super encoder, and, after quantization, obtaining the average value and standard deviation of the representation Y with the super decoder;

S3, evaluating the value by using the average value, the standard deviation and the coded three planes through a probability calculation module;

S4, designing a context-based rate reduction module (Rate Reduction Model, RRM), predicting the value of each tri-plane, fusing the values through a residual block and a convolution layer, and finally determining the value by using an activation function;

S5, designing a context-based distortion reduction module (Distortion Reduction Model, DRM) and, after entropy decoding, using the bias hidden tensor to reconstruct an image for any value within the range;

S6, finally, reconstructing the image from the improved potential tensor through a super decoder.

As a further scheme of the invention: the step S1 comprises the following steps:

An MPEG video stream image compression algorithm is adopted. Through motion compensation, the discrete cosine transform (DCT) and Huffman coding, it achieves good video quality at high compression ratios and low bit rates, and it is particularly good at handling motion scenes and reducing temporal redundancy. In addition, it overcomes the shortcomings of other families of algorithms in dynamic video compression efficiency and bandwidth adaptability, providing a more flexible and efficient video transmission solution.
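To illustrate the DCT stage mentioned in step S1, the following is a minimal sketch of an orthonormal two-dimensional DCT-II on a single 8x8 block, written in plain Python. The function name `dct_2d` and the sample block are illustrative assumptions; a real MPEG or H.265 codec uses optimized integer transforms together with motion compensation and entropy coding, none of which is modeled here.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


# A perfectly flat block: all of its energy ends up in the single DC coefficient.
flat_block = [[100.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat_block)
```

For smooth image regions most coefficients are close to zero after the transform, which is what makes the subsequent quantization and entropy coding effective.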

As a further scheme of the invention: the step S2 comprises the following steps:

The encoder extracts the key features of the image layer by layer with a convolutional neural network and finally outputs a potential tensor Y; the super encoder further refines and abstracts these features and further compresses the information to obtain a super potential tensor Z.

Furthermore, the quantized super potential tensor is calculated, and the average value M and the standard deviation are then obtained using the super decoder, as shown in equation (1):

(1)；
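As a rough illustration of the quantization that precedes the super decoder, the sketch below rounds each element of a small list of values to the nearest integer. Rounding is an assumption borrowed from common practice in learned image compression, and the function name `quantize` and the sample values are hypothetical; the super decoder itself is a learned network and is not modeled.

```python
import math


def quantize(values):
    """Round each value to the nearest integer (halves round up)."""
    return [math.floor(v + 0.5) for v in values]


# Hypothetical elements of a super potential tensor Z.
z = [0.2, -1.7, 3.5, 2.49]
z_hat = quantize(z)

# Rounding never moves a value by more than half a quantization step.
max_error = max(abs(a - b) for a, b in zip(z, z_hat))
```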

As a further scheme of the invention: the step S3 comprises the following steps:

The probability calculation module uses the mean, the standard deviation and the coded three planes, where C represents the number of channels, and H and W represent the height and width, respectively.

The entropy parameter M, the standard deviation and the coded three-plane values are used for the evaluation. Entropy parameters are typically used to adjust and control the information entropy during encoding, thereby affecting the compression rate and the loss of information.
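One way to picture how a mean and a standard deviation can yield a probability for each of three possible symbol values is to assume a Gaussian model, split an interval into three equal parts and normalize the mass of each part. The sketch below does exactly that. The Gaussian assumption, the interval bounds and all function names are illustrative and are not taken from the patent's formulas.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def interval_probability(low, high, mean, std):
    """Probability mass that the Gaussian assigns to [low, high)."""
    return gaussian_cdf(high, mean, std) - gaussian_cdf(low, mean, std)


def three_way_probabilities(low, high, mean, std):
    """Split [low, high) into three equal parts and normalize their masses."""
    width = (high - low) / 3.0
    masses = [
        interval_probability(low + i * width, low + (i + 1) * width, mean, std)
        for i in range(3)
    ]
    total = sum(masses)
    return [m / total for m in masses]


# With a zero mean and unit standard deviation, the middle third dominates.
probs = three_way_probabilities(-4.5, 4.5, mean=0.0, std=1.0)
```

The three probabilities sum to one, so they can be passed directly to an entropy coder for a three-valued symbol.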

As a further scheme of the invention: the step S4 comprises the following steps:

By designing a context-based rate reduction module (Rate Reduction Model, RRM), potential elements are no longer predicted in raster scan order; instead, each plane is predicted. In addition, the RRM module refines the probability estimates to produce an updated tensor, which requires fewer bits in entropy coding than the original estimate, thereby improving rate-distortion (RD) performance.
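The link between better probability estimates and fewer bits can be checked with the ideal code length of an entropy coder, which is -log2(p) bits for a symbol coded with probability p. The sketch below compares a uniform estimate with a hypothetical refined estimate for four three-valued symbols. The numbers are made up for illustration and are not outputs of the RRM.

```python
import math


def ideal_bits(symbols, probabilities):
    """Sum of -log2 p(symbol) over all positions."""
    return sum(-math.log2(p[s]) for s, p in zip(symbols, probabilities))


# Symbols actually being coded (each is 0, 1 or 2).
symbols = [0, 2, 1, 1]

# Uninformed estimate: every value is equally likely at every position.
initial = [[1 / 3, 1 / 3, 1 / 3]] * 4

# Hypothetical refined estimate that puts more mass on the true symbols.
refined = [
    [0.70, 0.20, 0.10],
    [0.10, 0.20, 0.70],
    [0.20, 0.60, 0.20],
    [0.15, 0.70, 0.15],
]

bits_initial = ideal_bits(symbols, initial)
bits_refined = ideal_bits(symbols, refined)
```

Whenever the refined distribution assigns higher probability to the symbols that actually occur, the total code length falls, which is the effect the rate reduction module aims for.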

As shown in Fig. 2, the RRM extracts features from its input and from the context separately and fuses them through a residual block and a convolution layer. The fused tensor has the same spatial resolution as the input but four times as many channels, and it is divided by channel into an additive term and a scale term.

First, S is converted to B as shown in equation (2):

(2)；

Specifically, each of its elements lies within a specified interval. The additive term is then added to the probability tensor, and the sum is modulated by B to produce an updated probability tensor.

As a further scheme of the invention: the step S5 comprises the following steps:

A context-based distortion reduction module (Distortion Reduction Model, DRM) is designed for use after entropy decoding. Unlike non-progressive codecs, the algorithm proposed herein can reconstruct an image from a bias hidden tensor at any point in the progression. Therefore, after decoding, the DRM module uses the context to reduce errors, thereby reducing image distortion.

Fig. 3 shows the architecture of the DRM. Using M and the accompanying tensor as context, the DRM module applies an offset to the tensor, as shown in formula (3):

(3)；

Unlike the RRM, the DRM does not use the probability tensors as context. They contain only probability information about data that has already been decoded and used for reconstruction, so they provide hardly any additional information. In addition, the DRM module is a regressor for reducing distortion, whereas the RRM is a classifier for reducing bit rate.
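The progressive behaviour described above can be pictured with a toy example, under the assumption that the coded planes are ternary digit planes transmitted from most to least significant. After each plane the decoder knows only an interval that contains the true value and reconstructs with its midpoint; the remaining error is the kind of distortion a learned refinement step would then try to reduce. The function names and the sample value are hypothetical, and the learned DRM offset is not modeled.

```python
def to_planes(value, num_planes):
    """Ternary digits of a non-negative integer, most significant first."""
    digits = []
    for _ in range(num_planes):
        digits.append(value % 3)
        value //= 3
    return digits[::-1]


def partial_reconstruction(digits, num_planes):
    """Midpoint of the interval consistent with the decoded leading digits."""
    low = 0
    for i, d in enumerate(digits):
        low += d * 3 ** (num_planes - 1 - i)
    # Number of integers still possible given the digits decoded so far.
    width = 3 ** (num_planes - len(digits))
    return low + (width - 1) / 2.0


value, num_planes = 50, 4  # 50 = 1*27 + 2*9 + 1*3 + 2
digits = to_planes(value, num_planes)

# Reconstruction error after decoding 0, 1, 2, 3 and 4 planes.
errors = [
    abs(value - partial_reconstruction(digits[:k], num_planes))
    for k in range(num_planes + 1)
]
```

The error never grows as more planes arrive and reaches zero once all planes are decoded, which is the defining property of a progressive code.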

The DRM module is trained to minimize the loss as shown in equation (4):

(4)；

As a further scheme of the invention: the step S6 comprises the following steps:

Finally, image reconstruction is performed on the improved latent tensor through the super decoder to obtain the reconstructed image.

To achieve the above object, another aspect of the present invention provides a terminal device, including a processor, a memory, and a program or instructions stored on the memory and executable on the processor; the processor implements the image compression algorithm described above when executing the program or instructions.

Compared with the prior art, the invention has the following beneficial effects:

Context information integration: by developing a context-based probability reduction module, the algorithm can more accurately predict and estimate the ternary probability of potential elements; the integration of context information provides a more accurate coding basis for the data of the three planes, so that the coding process is more efficient and compact.

Intelligent tensor extraction: by designing a context-based distortion reduction module, the module can intelligently identify and extract critical potential tensor information from three planes; the process not only optimizes the data compression efficiency, but also remarkably improves the quality of the reconstructed image, and ensures the maximum reservation of image details in the compression process.

These improvements bring a new performance gain to the field of deep progressive image compression and are particularly suitable for application scenarios with high requirements on compression efficiency and image quality.

Drawings

Fig. 1 is a flowchart of the overall image compression method provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of the context-based rate reduction module provided by an embodiment of the present invention;

Fig. 3 is a schematic diagram of the context-based distortion reduction module provided by an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a terminal device provided by an embodiment of the present invention.

Detailed Description

The invention is further illustrated by the following examples.

As shown in fig. 1, a context-based progressive tri-planar coded image compression algorithm includes the steps of:

Further, S2 includes the following steps:

The encoder extracts the key features of the image layer by layer with a convolutional neural network and finally outputs a potential tensor Y; the super encoder further refines and abstracts these features and further compresses the information to obtain the super potential tensor Z.

Furthermore, the quantized super potential tensor is calculated, and the average value M and the standard deviation are then obtained using the super decoder, as shown in equation (1):

(1)。

Further, S3 includes the following steps:

Further, S4 includes the following steps:

As shown in Fig. 2, the RRM extracts features from its input and from the context separately and fuses them through a residual block and a convolution layer. The fused tensor has the same spatial resolution as the input but four times as many channels, and it is divided by channel into an additive term and a scale term.

First, S is converted to B as shown in equation (2):

(2)；

Further, S5 includes the following steps:

Fig. 3 shows the architecture of the DRM. Using M and the accompanying tensor as context, the DRM module applies an offset to the tensor, as shown in formula (3):

(3)；

The DRM module is trained to minimize the loss as shown in equation (4):

(4)；

As shown in fig. 4, a terminal device includes:

The terminal device comprises a processor, a memory, and a program or instructions stored on the memory and executable on the processor. The processor comprises one or more processing cores and is connected to the memory through a bus; the processor implements the image compression algorithm described above when executing the program or instructions.

The memory may be implemented by any type of volatile or nonvolatile memory device or combination such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.

In addition, the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the steps of the context-based progressive three-plane coding image compression algorithm when being executed by a processor. Optionally, the present invention also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of a context-based progressive three-plane coded image compression algorithm of the above aspects.

It will be appreciated by those of ordinary skill in the art that the processes for implementing all or part of the steps of the above embodiments may be implemented by hardware, or may be implemented by a program for instructing the relevant hardware, and the program may be stored in a computer readable storage medium, where the above storage medium may be a read-only memory, a magnetic disk or optical disk, etc.

Claims

1. A context-based progressive three-plane coded image compression algorithm, comprising the steps of:

S4, designing a context-based rate reduction module, predicting the value of each tri-plane, fusing the values through a residual block and a convolution layer, and finally determining the value by using an activation function;

S5, designing a context-based distortion reduction module and, after entropy decoding, using the bias hidden tensor to reconstruct an image for any value within the range;

S6, finally, reconstructing the improved potential tensor through a super decoder;

the step S1 comprises the following steps:

an MPEG video stream image compression algorithm is adopted, and video quality at high compression ratios and low bit rates is improved through motion compensation, the discrete cosine transform (DCT) and Huffman coding;

The step S2 comprises the following steps:

the encoder extracts the key features in the image by using a convolutional neural network layer by layer, and finally outputs a potential tensor Y; the super encoder further refines and abstracts the features, further compresses information and the like to obtain a super potential tensor Z;

The quantized super potential tensor is calculated, and the average value M and the standard deviation are then obtained using the super decoder, as shown in equation (1):

（1）；

The step S3 comprises the following steps:

The entropy parameter M, the standard deviation and the coded three-plane values are used for the evaluation, wherein entropy parameters are typically used to adjust and control the information entropy during encoding, thereby affecting the compression rate and the loss of information;

the step S4 comprises the following steps:

By designing a context-based rate reduction module, potential elements are no longer predicted in raster scan order; instead, each plane is predicted. In addition, the rate reduction module refines the probability estimates to produce an updated tensor, which requires fewer bits in entropy coding than the original estimate, thereby improving rate-distortion (RD) performance;

the rate reduction module extracts features from its input and from the context separately and fuses them through a residual block and a convolution layer, wherein the fused tensor has the same spatial resolution as the input but four times as many channels, and it is divided by channel into an additive term and a scale term;

First, S is converted to B as shown in formula (2):

（2）；

specifically, each of its elements lies within a specified interval; the additive term is then added to the probability tensor, and the sum is modulated by B to produce an updated probability tensor;

The step S5 comprises the following steps:

A context-based distortion reduction module is designed; using a bias hidden tensor, an image can be reconstructed at any point in the progression; therefore, after decoding, the distortion reduction module uses the context to reduce errors, thereby reducing image distortion;

in the architecture of the distortion reduction module, M and the accompanying tensor serve as context, and the distortion reduction module applies an offset to the tensor, as shown in formula (3):

（3）；

The distortion reduction module is trained to minimize losses as shown in equation (4):

（4）；

the step S6 comprises the following steps:

Image reconstruction is performed on the improved latent tensor through the super decoder to obtain the reconstructed image.

2. A terminal device, comprising:

A processor, a memory, and a program or instructions stored on the memory and executable on the processor; the processor, when executing the program or instructions, implements the image compression algorithm of claim 1.