CN105791980B - Method for renovating film and television works based on resolution improvement
- Publication number
- CN105791980B (application number CN201610109909.XA)
- Authority
- CN
- China
- Prior art keywords
- resolution video
- expert
- resolution
- video frame
- low
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440263—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440236—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Image Processing (AREA)
- Television Systems (AREA)
Abstract
The present invention addresses film and television works with low resolution and low definition and proposes a renovation method based on resolution improvement. The scheme is as follows: first, the resolution of the original video and the target resolution are obtained and the scaling factor is calculated; second, the input video is divided into a set of frames according to a given partitioning scheme; then, the frames are transformed according to a pre-stored mapping relation to obtain high-resolution video frames; finally, the high-resolution video frames are combined into a high-resolution video. The pre-stored mapping relation is learned with a mixture-of-experts (hybrid expert) model, and this learning is completed offline on a computer. The method is adaptive, fast, effective, and extensible.
Description
Technical Field
The invention belongs to the field of computer vision and image processing, relates to a method for renewing film and television works, and particularly relates to a method and a system for renewing the film and television works based on resolution improvement.
Background
With the development of video acquisition, transmission, storage, and display technologies, film and television works keep moving toward higher resolution. Viewers' expectations rise accordingly, and they increasingly seek high-resolution, high-definition works. Meanwhile, the advent of high-resolution display devices (such as 4K and 5K televisions and monitors) has made it possible to popularize high-resolution film and television works.
On the other hand, many older classic film and television works still have low resolution, low definition, and poor visual quality because of the technical limitations of their time. In addition, after long storage, and sometimes after damage from external factors such as natural disasters and war, the film may suffer various kinds of quality degradation, such as damage, flicker, noise, and jitter. People want to revisit classic film and television works, yet they also have new expectations of picture quality. Film and television work renovation technology emerged to meet both needs. In essence, renovation applies image and video processing techniques to the original video to remove these degradations and improve its visual quality.
Like books, film and television works are important cultural carriers of human society, and some classic works retain irreplaceable cultural value however old they are. Renovating and remastering old film and television works is therefore of great significance. Specifically, the significance of renovation includes the following aspects:
1. Some classic film and television works, such as documentaries, are precious historical material. Renovating them allows this material to be better preserved and passed on.
2. Renovating classic film and television works lets more present-day viewers appreciate them, which is an important form of cultural and artistic inheritance.
3. Renovation gives classic works a new lease of life, which is the greatest respect and tribute to the artists who made them.
Existing methods for improving the visual quality of film and television works focus mainly on video enhancement, such as denoising, deblurring, deinterlacing, contrast enhancement, and color enhancement. These methods can improve the visual quality of the original video but do not raise its resolution, so they do not fundamentally meet the need to renovate classic works.
Resolution improvement means generating a high-resolution video quickly and effectively from a low-resolution video (or video frame). The difficulty lies in going beyond the pixel count of the original low-resolution video: pixels that did not exist must be filled in while preserving the structure and texture of the original, so that the result looks natural and plausible to the human eye.
Traditional resolution improvement methods fall into three groups: interpolation-based, reconstruction-based, and learning-based. Interpolation-based methods fill each missing pixel with a linear combination of existing pixels; they are simple and fast but prone to mosaic artifacts or over-smoothing. Reconstruction-based methods perform registration and reconstruction using the similarity between multiple frames, but in practice they often amount to a simple combination of frames, and the results are unsatisfactory. Learning-based methods use a certain amount of training data to learn a mapping from low-resolution to high-resolution video with a specific algorithm; they place high demands on the model, tend to over-fit or under-fit, and are computationally expensive, slow, and of limited practicality. Video resolution improvement has therefore remained an open problem for users.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: the invention provides a method and a system for renewing film and television works based on video resolution improvement, which convert low-resolution film and television works (usually lower than 720P) into higher-resolution videos (such as 1080P, 4K and the like) through a resolution improvement technology to realize the renewal of the film and television works.
The technical solution of the invention is as follows: first, the resolution of the original video and the target resolution are acquired and the scaling factor is calculated; second, the input video is divided into image frames according to a given partitioning scheme; then, the frames are transformed according to a pre-stored mapping relation to obtain high-resolution video frames; finally, the high-resolution video frames are combined into a high-resolution video. The pre-stored mapping relation model is learned with a hybrid expert model, and the model training is completed offline on a computer. The specific steps are:
learning a mapping relation model:
(1) preprocessing training video
(1.1) selecting a high-resolution video as a training sample, and splitting the high-resolution video into high-resolution video frames;
(1.2) convolving the high-resolution video frame obtained in step (1.1) with a Gaussian kernel;
(1.3) calculating the magnification factor from the original low-resolution video and the target high-resolution video, and downsampling at intervals according to this factor to obtain the corresponding low-resolution video frames;
(1.4) dividing the high-resolution video frames and the sampled low-resolution video frames into blocks, respectively, as training data.
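Steps (1.2) and (1.3) can be sketched as follows. This is a minimal illustration, not the patented implementation: the kernel radius, the clamped border handling, and the function names `gaussian_kernel`, `blur`, `decimate`, and `make_low_res` are all assumptions, and a frame is modeled as a plain 2-D list of grayscale values.

```python
import math


def gaussian_kernel(sigma=1.0, radius=2):
    """Normalized 1-D Gaussian weights for offsets -radius..radius (radius is an assumption)."""
    weights = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _clamp(i, n):
    """Clamp index i into 0..n-1 (border handling is an assumption)."""
    return min(max(i, 0), n - 1)


def blur(frame, kernel):
    """Separable Gaussian blur of a 2-D list: one horizontal pass, then one vertical pass."""
    h, w = len(frame), len(frame[0])
    r = len(kernel) // 2
    rows = [[sum(kernel[d + r] * frame[y][_clamp(x + d, w)] for d in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(kernel[d + r] * rows[_clamp(y + d, h)][x] for d in range(-r, r + 1))
             for x in range(w)] for y in range(h)]


def decimate(frame, factor):
    """Keep every factor-th row and column."""
    return [row[::factor] for row in frame[::factor]]


def make_low_res(high_res_frame, factor, sigma=1.0):
    """Blur a high-resolution frame, then subsample it by the magnification factor."""
    return decimate(blur(high_res_frame, gaussian_kernel(sigma)), factor)


# Usage: an 8 x 8 frame reduced by a factor of 4 gives a 2 x 2 frame.
low = make_low_res([[5.0] * 8 for _ in range(8)], 4)
```

Blurring before subsampling limits aliasing; each (high-resolution frame, low-resolution frame) pair can then be cut into blocks for step (1.4).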
(2) Obtaining a mapping relation model based on a hybrid expert model
(2.1) initializing a hybrid expert model. The hybrid expert model comprises two parts of an expert and a gate function, and the structure of the hybrid expert model is a tree shape, as shown in the attached figure 2. The leaf nodes in the tree structure in the graph are called experts and are responsible for mapping and transforming data; the root node is called the gate function and is responsible for selecting the appropriate expert for the data. The present invention uses a linear function as an expert function:
y=Wx
where W is an expert function parameter and x and y represent a block of low resolution video frames and a corresponding block of high resolution video frames, respectively.
The gate function decides which expert is selected to transform the data. In the present invention, the i-th gate function is expressed as:
$$g_i(x, v_i) = \frac{\exp(v_i^{T} x)}{\sum_{j=1}^{K} \exp(v_j^{T} x)}$$
where x and y represent a low-resolution video frame block and the corresponding high-resolution video frame block, respectively, $v_i$ is the i-th gate function parameter, $v_j$ is the j-th gate function parameter, and K is the number of experts in the hybrid expert model, that is, the number of leaf nodes in the tree structure. The initialization of the hybrid expert model specifically comprises the following steps:
(2.1.1) specifying a number K of experts;
(2.1.2) assume that the probability distribution of each expert is Gaussian: $p(y \mid x, W_i) = N(y(x, W_i), \sigma)$, where $W_i$ is the parameter of the i-th expert and σ is the standard deviation of the Gaussian distribution. Assume that the parameter $W_i$ also follows a Gaussian distribution: $p(W_i) = N(0, \mu)$, where μ is the spread parameter of this zero-mean Gaussian distribution.
(2.1.3) clustering the training data into K clusters with the K-means algorithm; the initial value $W_i^{(0)}$ of each expert parameter is set to the within-class slope, and the initial value $v_i^{(0)}$ of each gate function parameter is set to the cluster center;
(2.1.4) calculate the initial value of each gate function:
$$g_i^{(0)}(x, v_i^{(0)}) = \frac{\exp\big((v_i^{(0)})^{T} x\big)}{\sum_{j=1}^{K} \exp\big((v_j^{(0)})^{T} x\big)}$$
where x denotes a low-resolution video frame block, $v_i^{(0)}$ is the initial value of the i-th gate function parameter, and K is the number of experts in the hybrid expert model, that is, the number of leaf nodes in the tree structure.
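The gate computation can be sketched as below. The sketch assumes the gate is a softmax over the inner products of the block x with the gate parameters, which is the usual mixture-of-experts form; the function name `gate_outputs` and the flattening of a block into a plain list are illustrative assumptions.

```python
import math


def gate_outputs(x, v):
    """Softmax gate (assumed form): g_i = exp(v_i . x) / sum_j exp(v_j . x).

    x is a flattened low-resolution block; v holds one parameter vector per expert.
    """
    scores = [sum(vi_d * x_d for vi_d, x_d in zip(vi, x)) for vi in v]
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: three experts, a two-element block; the outputs are positive and sum to 1.
g = gate_outputs([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
```

With the initial parameters $v_i^{(0)}$ taken from the cluster centers, the same function gives the initial gate values of step (2.1.4).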
(2.2) iteratively optimizing the hybrid expert model with the training data obtained in step (1.4) until the iteration converges; the model parameters finally obtained constitute the mapping relation model, which includes the gate function parameters and the expert parameters.
(2.2.1) specifying an allowable error epsilon at the termination of the iteration;
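(2.2.2) calculating the posterior probability of each gate function in the current iteration:
$$h_i^{(k)} = \frac{g_i^{(k)}(x, v_i^{(k)})\, p_i(y \mid x, W_i^{(k)})}{\sum_{j=1}^{K} g_j^{(k)}(x, v_j^{(k)})\, p_j(y \mid x, W_j^{(k)})}$$
where k is the iteration index, $h_i^{(k)}$ is the posterior probability of the i-th gate function, $p_i(y \mid x, W_i^{(k)})$ and $p_j(y \mid x, W_j^{(k)})$ denote the probability distributions of the experts, and $g_i^{(k)}(x, v_i^{(k)})$ is the value of the i-th gate function at the k-th iteration.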
(2.2.3) updating each expert parameter:
where k is the iteration index, X is the vector formed by all low-resolution video frame blocks x in the training data, Y is the vector formed by all high-resolution video frame blocks y in the training data, $X^{T}$ denotes the transpose of X, I denotes the identity matrix, and $H_i^{(k+1)}$ denotes the vector formed by the posterior probabilities of all low-resolution video frame blocks x with respect to the i-th expert at step k + 1.
(2.2.4) updating each gate function parameter:
where $v_i^{(k)}$ denotes the i-th gate function parameter in the k-th iteration, $h_i^{(k)}$ is the posterior probability of the i-th gate function in the k-th iteration, and $x^{(t)}$ denotes the t-th low-resolution video frame block.
(2.2.5) calculating the output of each gate function in the current iteration:
(2.2.6) calculating the likelihood probability in the current iteration:
where $p_i(y \mid x, W_i^{(k+1)})$ denotes the probability distribution of the expert and $p(W_i^{(k+1)})$ denotes the probability distribution of the expert parameters.
(2.2.7) judging whether the iteration has converged. The iteration ends when the absolute value of the difference between the likelihood of the current round and that of the previous round is smaller than the allowable error ε. Otherwise, steps (2.2.2)-(2.2.7) are repeated.
The gate function parameters $v_i$ obtained at the end of the iteration, together with the number of experts K, the expert parameters $W_i$, the standard deviation σ of the probability distribution of the experts, and the parameter μ of the probability distribution of the expert parameters, are stored on disk as the final mapping relation model.
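The posterior computation of step (2.2.2) can be sketched as follows, under stated assumptions: every expert is the linear map y = Wx with an isotropic Gaussian likelihood of standard deviation σ, so the Gaussian normalizing constant is the same for all experts and cancels in the ratio. The names `matvec` and `posteriors` are illustrative, and the gate values are assumed to be strictly positive.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def posteriors(x, y, experts, gates, sigma):
    """Posterior of each expert: h_i = g_i * p_i / sum_j (g_j * p_j).

    p_i is taken proportional to exp(-||y - W_i x||^2 / (2 sigma^2)) (assumption).
    The computation is done in the log domain to avoid underflow.
    """
    logs = []
    for W, g in zip(experts, gates):
        prediction = matvec(W, x)
        squared_error = sum((a - b) ** 2 for a, b in zip(y, prediction))
        logs.append(math.log(g) - squared_error / (2.0 * sigma ** 2))
    top = max(logs)
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]


# Usage: two scalar experts; the second one predicts y exactly and takes almost all the weight.
h = posteriors([1.0], [2.0], [[[1.0]], [[2.0]]], [0.5, 0.5], sigma=0.32)
```

These posteriors are the weights that the expert and gate updates of steps (2.2.3) and (2.2.4) consume.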
After the mapping relation model is learned and stored, the resolution of the video is improved by using the stored mapping relation model:
(3) pre-processing low resolution video to be processed
(3.1) splitting the low-resolution video into low-resolution video frames;
(3.2) dividing the low-resolution video frame obtained in the step (3.1) into low-resolution video frame blocks;
(4) upscaling the low-resolution video to a high-resolution video according to the mapping relation model obtained in step (2), comprising the following steps:
(4.1) taking the low-resolution video frame block obtained in step (3) as the input of the gate functions, and calculating the output of each gate function using the gate function parameters of the mapping relation model obtained in step (2):
$$g_i(x, v_i) = \frac{\exp(v_i^{T} x)}{\sum_{j=1}^{K} \exp(v_j^{T} x)}$$
where x is the input low-resolution video frame block, K is the number of experts in the hybrid expert model, and $v_i$ is the i-th gate function parameter; K and $v_i$ are obtained in step (2.2).
(4.2) calculating the corresponding high-resolution video frame block by using the parameter of the expert function corresponding to the gate function with the maximum output value, wherein the parameter of the expert function is obtained in the step (2);
(4.2.1) finding the index of the gate function with the maximum output value: $i = \arg\max_{j} g_j$,
where $g_j$ is the output of the j-th gate function, obtained in step (4.1).
(4.2.2) computing the high-resolution video frame block with the i-th expert function: $y = W_i x$,
where $W_i$ is the parameter of the i-th expert function and y is the high-resolution video frame block corresponding to the input low-resolution video frame block x.
(4.3) carrying out resolution enhancement on each low-resolution video frame block according to the steps of (4.1) and (4.2) to obtain a corresponding high-resolution video frame block, and splicing all the high-resolution video frame blocks into corresponding high-resolution video frames according to the positions of the corresponding low-resolution video frame blocks in the low-resolution video frames;
(4.4) obtaining the high-resolution video frames corresponding to all the low-resolution video frames, and combining them into a high-resolution video.
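Steps (4.1) and (4.2) for a single block can be sketched as below. The sketch assumes blocks are flattened to plain lists and that the gate output grows with the inner product of x and the gate parameter (as it does for a softmax gate), so the winning expert can be found from the raw scores without exponentiation. The names `dot` and `upscale_block` are illustrative.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


def upscale_block(x, gate_params, expert_params):
    """Pick the expert whose gate score is largest, then apply its linear map y = W_i x."""
    scores = [dot(v, x) for v in gate_params]
    i = max(range(len(scores)), key=scores.__getitem__)  # index of the winning gate
    return [dot(row, x) for row in expert_params[i]]


# Usage with toy sizes: 2-element input blocks, 4-element output blocks, two experts.
gate_params = [[1.0, 0.0], [0.0, 1.0]]
expert_params = [
    [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],  # expert 0 duplicates each input value
    [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],  # expert 1 outputs zeros
]
y = upscale_block([3.0, 1.0], gate_params, expert_params)
```

In the embodiment described later, x would have 100 entries (a 10 × 10 block) and y 1600 entries (a 40 × 40 block), so each $W_i$ would be a 1600 × 100 matrix.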
In the step (4), there is no dependency relationship between video frame blocks and between video frames, so that the step can be accelerated in parallel by using a GPU processor.
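Because each block is transformed independently, the per-block work maps directly onto a parallel `map`. The sketch below uses a CPU thread pool from the Python standard library as a stand-in for the GPU mentioned above; this substitution, the function name `map_blocks_parallel`, and the `workers` parameter are assumptions made for illustration. In CPython, threads do not speed up pure-Python arithmetic, so the sketch only shows that no block needs another block's result.

```python
from concurrent.futures import ThreadPoolExecutor


def map_blocks_parallel(blocks, transform, workers=4):
    """Apply transform to every block concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transform, blocks))


# Usage: a toy transform that doubles every value of a block.
doubled = map_blocks_parallel([[1, 2], [3, 4]], lambda block: [2 * v for v in block])
```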
The method for renewing the film and television works based on the resolution improvement can be used in the form of a computer software player, and can also be integrated into a hardware platform (such as a set top box, an intelligent television and the like) for use.
The method for renewing film and television works based on resolution improvement can be combined with other video enhancement methods, which serve as preprocessing or post-processing steps, to further improve the visual effect.
Compared with the prior art, the invention has the advantages that: from the visual effect, the high-resolution video obtained by implementing the scheme of the invention has complete details, clear edges, good texture maintenance, and is fast and stable. Specifically, the features of the present invention include:
1. Adaptive. The scheme calculates the scaling factor adaptively and can accommodate different resolution improvement requirements.
2. Fast. Since there is no dependency between video frames, processing can be accelerated by parallelism. In addition, the algorithm applies a linear mapping to each video frame block, and the mapping parameters can be loaded into memory in advance, which further increases processing speed.
3. Effective. The mapping parameters used for resolution improvement are learned with the hybrid expert model, which overcomes a shortcoming of traditional learning-based resolution improvement algorithms, namely that data partitioning and sub-model learning are carried out separately. The approach also combines the robustness of statistical methods with the accuracy of learning-based algorithms, can exploit large amounts of data where earlier learning-based algorithms could not, achieves higher precision than purely statistical methods, and can process even high-resolution video well and quickly.
4. Extensible. Because video frames do not depend on one another, parallel processing can be implemented with techniques such as GPU acceleration to increase processing speed. In addition, the proposed algorithm can be applied directly to image resolution improvement.
Drawings
Fig. 1 is a flowchart of a method for refreshing a movie or television work based on resolution enhancement according to the present invention.
FIG. 2 is a schematic diagram of a hybrid expert model according to the present invention.
Fig. 3 is a schematic diagram illustrating the division of a video frame into video frame blocks according to the present invention.
Detailed Description
The process according to the invention is illustrated in the following detailed description by way of example.
According to the method for refreshing the film and television works based on resolution enhancement, the process of enhancing a part of video with the original resolution of 768 × 432 to 3072 × 1728 comprises the following steps:
(1) preprocessing training video
(1.1) selecting a high-resolution film or television work, reading its video stream with video processing software, and storing each frame of the stream as a video frame. In this embodiment, the work is 1200 seconds long with a frame rate of 25 frames/second, so the total number of video frames obtained is 1200 × 25 = 30000;
(1.2) performing convolution on the video frame obtained in the step (1.1) by using a Gaussian kernel with the average value of 0 and the standard deviation of 1;
(1.3) acquiring the resolution of the original low-resolution video and the target resolution, and calculating the magnification from them. The original resolution is 768 × 432 and the target resolution is 3072 × 1728, so the magnification is 3072/768 = 4. Accordingly, the convolved video frames are downsampled to 1/4 of their original size in each dimension to obtain the corresponding low-resolution video frames.
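The magnification arithmetic can be checked with a few lines. The function name `magnification` and the decision to reject targets that change the aspect ratio are assumptions of this sketch; the text above only computes the horizontal ratio.

```python
from fractions import Fraction


def magnification(source, target):
    """Return the common scale factor from source (width, height) to target (width, height).

    Raises ValueError when the two axes would need different factors (assumed policy).
    """
    source_w, source_h = source
    target_w, target_h = target
    factor_x = Fraction(target_w, source_w)
    factor_y = Fraction(target_h, source_h)
    if factor_x != factor_y:
        raise ValueError(f"aspect ratio changes: {factor_x} horizontally, {factor_y} vertically")
    return factor_x


# Usage with the resolutions of this embodiment: both axes scale by the same factor, 4.
scale = magnification((768, 432), (3072, 1728))
```

Using exact fractions keeps non-integer factors (for example 720 to 1080 lines, a factor of 3/2) free of rounding error.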
(1.4) each of the low resolution video frames obtained is divided into non-overlapping small blocks of 10 × 10 pixels by the existing division standard, as shown in fig. 3, and 1,000,000 blocks are selected as training data.
(2) Obtaining a mapping relation model based on a hybrid expert model
(2.1) initializing the hybrid expert model
(2.1.1) the number of experts K is specified. In this embodiment, K is taken to be 100;
(2.1.2) specifying the parameters σ and μ of the probability distribution of the experts and of the probability distribution of the expert parameters; in this embodiment, σ = 0.32 and μ = 0.58;
(2.1.3) clustering the training data into K clusters with the K-means algorithm; the parameter $W_i^{(0)}$ of each expert is initialized to the within-class slope, and the gate function parameter $v_i^{(0)}$ is initialized to the cluster center;
(2.1.4) calculating the initial value of each gate function according to:
$$g_i^{(0)}(x, v_i^{(0)}) = \frac{\exp\big((v_i^{(0)})^{T} x\big)}{\sum_{j=1}^{K} \exp\big((v_j^{(0)})^{T} x\big)}$$
where x denotes a low-resolution video frame block, $v_i^{(0)}$ is the initial value of the i-th gate function parameter, and K is the number of experts in the hybrid expert model, that is, the number of leaf nodes in the tree structure.
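The clustering of step (2.1.3) can be sketched with a plain Lloyd's-algorithm K-means. The seeding from the first K points, the fixed iteration count, and the function name `kmeans` are assumptions of this sketch; the text does not specify them. The returned centers would serve as the initial gate parameters, and the labels identify which blocks to use when fitting each expert's initial slope.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on lists of floats; returns (centers, labels)."""
    centers = [list(p) for p in points[:k]]  # deterministic seeding (assumption)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(column) / len(members) for column in zip(*members)]
    return centers, labels


# Usage: four 2-D points forming two obvious groups.
centers, labels = kmeans([[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]], k=2)
```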
(2.2) using the training data obtained in the step (1.4) to carry out iterative optimization on the hybrid expert model obtained in the step (2.1):
(2.2.1) specify the allowable error ε at the end of the iteration. In this embodiment, the error epsilon allowed by the termination of the model iteration is 0.005.
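(2.2.2) calculating the posterior probability of each gate function in the current iteration:
$$h_i^{(k)} = \frac{g_i^{(k)}(x, v_i^{(k)})\, p_i(y \mid x, W_i^{(k)})}{\sum_{j=1}^{K} g_j^{(k)}(x, v_j^{(k)})\, p_j(y \mid x, W_j^{(k)})}$$
where k is the iteration index, $h_i^{(k)}$ is the posterior probability of the i-th gate function, $p_i(y \mid x, W_i^{(k)})$ and $p_j(y \mid x, W_j^{(k)})$ denote the probability distributions of the experts, and $g_i^{(k)}(x, v_i^{(k)})$ is the value of the i-th gate function at the k-th iteration.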
(2.2.3) updating each expert parameter:
where k is the iteration index, X is the vector formed by all low-resolution video frame blocks x in the training data, Y is the vector formed by all high-resolution video frame blocks y in the training data, $X^{T}$ denotes the transpose of X, I denotes the identity matrix, and $H_i^{(k+1)}$ denotes the vector formed by the posterior probabilities of all low-resolution video frame blocks x with respect to the i-th expert at step k + 1.
(2.2.4) updating each gate function parameter:
where $v_i^{(k)}$ denotes the i-th gate function parameter in the k-th iteration, $h_i^{(k)}$ is the posterior probability of the i-th gate function in the k-th iteration, and $x^{(t)}$ denotes the t-th low-resolution video frame block.
(2.2.5) calculating the output of each gate function in the current iteration:
(2.2.6) calculating the likelihood probability in the current iteration:
where $p_i(y \mid x, W_i^{(k+1)})$ denotes the probability distribution of the expert and $p(W_i^{(k+1)})$ denotes the probability distribution of the expert parameters.
(2.2.7) judging whether the iteration has converged. The iteration ends when the absolute value of the difference between the likelihood of the current round and that of the previous round is smaller than the allowable error ε. Otherwise, steps (2.2.2)-(2.2.7) are repeated.
The gate function parameters $v_i$ obtained at the end of the iteration, together with the number of experts K, the expert parameters $W_i$, the standard deviation σ of the probability distribution of the experts, and the parameter μ of the probability distribution of the expert parameters, are stored on disk as the final mapping relation model. Here $v_i$, K, σ, and μ are referred to as the gate function parameters of the mapping relation model, and $W_i$ is referred to as the expert parameters of the mapping relation model.
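The stopping rule of step (2.2.7) can be sketched as a small driver loop. The callable `step` stands for one full round of steps (2.2.2)-(2.2.6) and is assumed to return the likelihood after that round; the name `run_until_converged` and the `max_iterations` safety cap are assumptions that do not appear in the text.

```python
def run_until_converged(step, epsilon=0.005, max_iterations=1000):
    """Repeat step() until two successive likelihoods differ by less than epsilon.

    Returns (last likelihood, number of rounds executed).
    """
    previous = step()
    for rounds_done in range(1, max_iterations):
        current = step()
        if abs(current - previous) < epsilon:
            return current, rounds_done + 1
        previous = current
    return previous, max_iterations


# Usage with a scripted likelihood sequence standing in for real training rounds.
values = iter([1.0, 2.0, 2.5, 2.503, 9.0])
result = run_until_converged(lambda: next(values))
```

With the scripted sequence, the loop stops at the fourth round, where the change (0.003) first drops below ε = 0.005.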
(3) Pre-processing low resolution video to be processed
(3.1) splitting the low-resolution video to be processed into low-resolution video frames. In this embodiment, the work is 2000 seconds long with a frame rate of 25 frames/second, so the total number of video frames obtained is 2000 × 25 = 50000;
(3.2) dividing the low resolution video frame obtained in step (3.1) into 10 × 10 video frame blocks, as shown in fig. 3;
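Splitting a frame into non-overlapping blocks can be sketched as follows. Note that 768 and 432 are not multiples of 10, and the text does not say how blocks at the right and bottom edges are handled; this sketch pads them by repeating the border pixels, which is purely an assumption, as is the function name `split_into_blocks`.

```python
def split_into_blocks(frame, size=10):
    """Return {(block_row, block_col): block} of non-overlapping size x size blocks.

    Edge blocks that would run past the frame repeat the last row or column (assumption).
    """
    height, width = len(frame), len(frame[0])
    blocks = {}
    for top in range(0, height, size):
        for left in range(0, width, size):
            blocks[(top // size, left // size)] = [
                [frame[min(top + dy, height - 1)][min(left + dx, width - 1)]
                 for dx in range(size)]
                for dy in range(size)
            ]
    return blocks


# Usage: a 3 x 5 toy frame cut into 2 x 2 blocks gives 2 x 3 = 6 blocks.
toy = [[row * 10 + col for col in range(5)] for row in range(3)]
toy_blocks = split_into_blocks(toy, size=2)
```

Keying each block by its grid position keeps the information needed to put the upscaled blocks back in place in step (4.3).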
(4) mapping the low-resolution video to a high-resolution video, comprising:
(4.1) taking the low-resolution video frame block obtained in step (3) as the input of the gate functions, and calculating the output of each gate function using the gate function parameters of the mapping relation model obtained in step (2):
$$g_i(x, v_i) = \frac{\exp(v_i^{T} x)}{\sum_{j=1}^{K} \exp(v_j^{T} x)}$$
where x is the input low-resolution video frame block, K is the number of experts in the hybrid expert model, and $v_i$ is the i-th gate function parameter; K and $v_i$ are obtained in step (2.2).
(4.2) calculating a corresponding high-resolution video frame block by using the expert function parameter corresponding to the gate function with the maximum output value;
(4.2.1) finding the index of the gate function with the maximum output value: $i = \arg\max_{j} g_j$, where $g_j$ is the output of the j-th gate function, obtained in step (4.1).
(4.2.2) computing a high resolution video frame block using the ith expert function:
$y = W_i x$
where $W_i$ is the i-th expert function parameter obtained in step (2), x is the input low-resolution video frame block of size 10 × 10, and y is the resulting high-resolution video frame block of size 40 × 40.
(4.3) carrying out resolution enhancement on each low-resolution video frame block according to the steps of (4.1) and (4.2) to obtain a corresponding high-resolution video frame block, and splicing all the high-resolution video frame blocks into corresponding high-resolution video frames according to the positions of the corresponding low-resolution video frame blocks in the low-resolution video frames;
(4.4) obtaining the high-resolution video frames corresponding to all the low-resolution video frames, and combining them into a high-resolution video.
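The splicing of step (4.3) can be sketched as below. The sketch assumes each upscaled block has already been reshaped into a 2-D list and is keyed by the grid position of the low-resolution block it came from; values that would fall outside the target frame are dropped. The function name `stitch_blocks` and this cropping policy are assumptions.

```python
def stitch_blocks(blocks, block_size, height, width):
    """Place each block at (block_row * block_size, block_col * block_size) in a new frame."""
    frame = [[0] * width for _ in range(height)]
    for (block_row, block_col), block in blocks.items():
        for dy, row in enumerate(block):
            y = block_row * block_size + dy
            if y >= height:
                break  # the rest of this block lies below the frame
            for dx, value in enumerate(row):
                x = block_col * block_size + dx
                if x < width:
                    frame[y][x] = value
    return frame


# Usage: two 2 x 2 blocks side by side form a 2 x 4 frame.
frame = stitch_blocks({(0, 0): [[1, 2], [3, 4]], (0, 1): [[5, 6], [7, 8]]}, 2, 2, 4)
```

In this embodiment `block_size` would be 40 and the target frame 3072 × 1728.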
Claims (3)
1. A method for renovating film and television works based on resolution improvement, characterized in that the method comprises two parts: learning a mapping relation model, and performing resolution improvement according to the mapping relation model;
the method for learning the mapping relation model comprises the following two steps:
(1) pre-processing a training video, comprising:
(1.1) selecting a high-resolution video as a training sample, and splitting the high-resolution video into high-resolution video frames;
(1.2) convolving the high-resolution video frame obtained in the step (1.1) by using a Gaussian kernel;
(1.3) calculating the magnification factor from the original low-resolution video and the target high-resolution video, and performing interlaced sampling according to that factor to obtain the corresponding low-resolution video frames;
(1.4) respectively dividing the high-resolution video frame and the sampled low-resolution video frame into blocks as training data;
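The preprocessing of steps (1.1) to (1.4) might look like the sketch below. It is an illustration under stated assumptions rather than the patented implementation: the Gaussian kernel width (`sigma`, `radius`) is hypothetical, "interlaced sampling" is read as keeping every `factor`-th row and column, and the magnification factor of 4 is taken from the 10 × 10 and 40 × 40 block sizes of the embodiment.

```python
import numpy as np

def gaussian_kernel(sigma=1.0, radius=2):
    """Normalised 1-D Gaussian kernel (sigma and radius are assumed values)."""
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(frame, sigma=1.0, radius=2):
    """Separable Gaussian convolution with edge replication; output keeps the input size."""
    k = gaussian_kernel(sigma, radius)
    padded = np.pad(frame, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def make_training_pairs(hi_frame, factor=4, lo_block=10):
    """Blur, subsample, and cut matching low/high-resolution blocks."""
    lo_frame = gaussian_blur(hi_frame)[::factor, ::factor]
    hi_block = lo_block * factor
    rows, cols = lo_frame.shape[0] // lo_block, lo_frame.shape[1] // lo_block
    X, Y = [], []
    for r in range(rows):
        for c in range(cols):
            X.append(lo_frame[r * lo_block:(r + 1) * lo_block,
                              c * lo_block:(c + 1) * lo_block].ravel())
            Y.append(hi_frame[r * hi_block:(r + 1) * hi_block,
                              c * hi_block:(c + 1) * hi_block].ravel())
    return np.array(X), np.array(Y)

hi_frame = np.random.default_rng(0).random((80, 120))  # toy high-resolution frame
X, Y = make_training_pairs(hi_frame)
```

Each row of `X` is a flattened low-resolution block and the same row of `Y` is the high-resolution block at the corresponding position, so the pairs can be fed directly to the model fitting of step (2).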
(2) obtaining a mapping relation model based on a hybrid expert model, comprising:
(2.1) initializing a hybrid expert model, comprising the steps of:
① specifying the number of experts K;
② assuming that the probability distribution of each expert follows a Gaussian distribution, $p(y \mid x, W_i) = N(y(x, W_i), \sigma)$, where x denotes a low-resolution video frame block, y denotes a high-resolution video frame block, $W_i$ denotes the parameter of the ith expert, and σ is the standard deviation of the Gaussian distribution; and assuming that the distribution of the parameter $W_i$ also follows a Gaussian distribution, $p(W_i) = N(0, \mu)$, where μ represents the mean of a Gaussian distribution;
③ clustering the training data into K clusters, one per expert, using the K-means algorithm; setting the initial value $W_i^{(0)}$ of each expert parameter to the within-class slope, and setting the initial value $v_i^{(0)}$ of each gate function parameter to the cluster center;
④ calculating the initial value of each gate function:
where $v_i^{(0)}$ represents the initial value of the ith gate function parameter, and K is the number of experts in the hybrid expert model, that is, the number of leaf nodes in the tree structure;
(2.2) using the training data obtained in step (1.4) to iteratively optimize the hybrid expert model until the iteration converges, the model parameters finally obtained being the mapping relation model; the mapping relation model comprises the gate function parameters and the expert parameters; the iterative optimization of the model comprises the following steps:
① specifying the allowable error ε for terminating the iteration;
② calculating the posterior probability of each gate function in the current iteration:
where k is the iteration index, $p_i(y \mid x, W_i^{(k)})$ and $p_j(y \mid x, W_j^{(k)})$ represent the probability distributions of the experts, and $g_i^{(k)}(x, v_i^{(k)})$ represents the kth iteration value of the ith gate function;
③ updating each expert parameter:
where k is the iteration index, X is the vector formed by all low-resolution video frame blocks x in the training data, Y is the vector formed by all high-resolution video frame blocks y in the training data, $X^T$ denotes the transpose of X, I denotes the identity matrix, and $H_i^{(k+1)}$ represents the vector formed by the posterior probabilities of all low-resolution video frame blocks x corresponding to the ith expert in step k + 1;
④ updating each gate function parameter:
where the update uses the ith gate function parameter in the kth iteration and the posterior probability of the ith gate function in the kth iteration, and $x^{(t)}$ represents the tth low-resolution video frame block;
⑤ calculating the output of each gate function in the current iteration:
⑥ calculating the likelihood in the current iteration:
where $p_i(y \mid x, W_i^{(k+1)})$ represents the probability distribution of the experts and $p(W_i^{(k+1)})$ represents the probability distribution of the expert parameters;
⑦ judging whether the iteration has converged: when the absolute value of the difference between the likelihood of the current iteration and that of the previous iteration is less than the allowable error ε, the iteration ends; otherwise, steps ② to ⑦ are repeated;
the gate function parameters $v_i$ obtained at the end of the iteration, together with the number of experts K, the expert parameters $W_i$, the standard deviation σ of the probability distribution of the experts and the mean μ of the probability distribution of the expert parameters, are stored on a magnetic disk as the final mapping relation model;
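The iteration of step (2.2) can be illustrated with the sketch below. The update formulas themselves are not reproduced in the text above, so every formula in the code is an assumption based on the standard expectation-maximisation treatment of mixture-of-experts models: a softmax gate, a posterior proportional to $g_i p_i$, a posterior-weighted ridge regression for the expert update (suggested by the symbols $X^T$, I and $H_i$ named above), and a single gradient step for the gate update. The code treats μ as the variance of the prior on $W_i$ (`prior_var`), stores training blocks as rows so that the model reads Y ≈ X W_i, and omits the K-means initialisation; all names are hypothetical.

```python
import numpy as np

def log_joint(X, Y, W, V, sigma):
    """Return log g_i and log(g_i * p_i) per sample and expert, each of shape (N, K)."""
    S = X @ V.T
    S = S - S.max(axis=1, keepdims=True)
    log_g = S - np.log(np.exp(S).sum(axis=1, keepdims=True))  # assumed softmax gate
    resid = Y[None] - np.einsum("nd,kdm->knm", X, W)
    # Gaussian log-density up to a constant shared by all experts
    log_p = -0.5 * (resid ** 2).sum(axis=2).T / sigma ** 2
    return log_g, log_g + log_p

def em_fit(X, Y, W, V, sigma=1.0, prior_var=100.0, lr=0.1, eps=1e-6, max_iter=100):
    W, V = W.copy(), V.copy()
    lam = sigma ** 2 / prior_var          # ridge weight implied by the Gaussian prior
    prev = -np.inf
    for _ in range(max_iter):
        # step 2: posterior probability of each gate function
        log_g, lj = log_joint(X, Y, W, V, sigma)
        H = np.exp(lj - lj.max(axis=1, keepdims=True))
        H /= H.sum(axis=1, keepdims=True)
        # step 3: expert update as posterior-weighted ridge regression
        for i in range(len(W)):
            Xw = X * H[:, [i]]
            W[i] = np.linalg.solve(Xw.T @ X + lam * np.eye(X.shape[1]), Xw.T @ Y)
        # step 4: gate update, one gradient-ascent step on sum_t sum_i h_i log g_i
        V += lr * (H - np.exp(log_g)).T @ X / len(X)
        # steps 5 and 6: new gate outputs and penalised log-likelihood
        _, lj = log_joint(X, Y, W, V, sigma)
        m = lj.max(axis=1, keepdims=True)
        ll = float((m[:, 0] + np.log(np.exp(lj - m).sum(axis=1))).sum()
                   - 0.5 * (W ** 2).sum() / prior_var)
        # step 7: stop when the likelihood changes by less than eps
        if abs(ll - prev) < eps:
            break
        prev = ll
    return W, V, ll

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))             # toy low-resolution blocks (rows)
Y = X @ rng.normal(size=(4, 6))           # toy high-resolution blocks (rows)
W1, V1, ll1 = em_fit(X, Y, np.zeros((1, 4, 6)), np.zeros((1, 4)))
```

With a single expert the posterior is identically 1, so the fit reduces to ordinary ridge regression, which gives a quick sanity check of the expert update.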
the resolution improvement according to the mapping relation model comprises the following two steps:
(3) pre-processing low resolution video to be processed, comprising:
(3.1) splitting a low-resolution video to be processed into low-resolution video frames;
(3.2) partitioning the low resolution video frame into blocks;
(4) upscaling the low-resolution video into a high-resolution video according to the mapping relation model obtained in step (2), comprising:
(4.1) taking the low-resolution video frame blocks obtained in step (3) as the input of the gate functions of the hybrid expert model, and calculating the output of each gate function by using the gate function parameters in the mapping relation model obtained in step (2);
(4.2) calculating the corresponding high-resolution video frame block by using the parameter of the expert function corresponding to the gate function with the maximum output value, wherein the parameter of the expert function is obtained in the step (2);
(4.3) carrying out resolution enhancement on each low-resolution video frame block according to the steps of (4.1) and (4.2) to obtain a corresponding high-resolution video frame block, and splicing all the high-resolution video frame blocks into corresponding high-resolution video frames according to the positions of the corresponding low-resolution video frame blocks in the low-resolution video frames;
(4.4) obtaining the high-resolution video frames corresponding to all the low-resolution video frames, and combining them into a high-resolution video.
2. The method for renovating film and television works based on resolution improvement according to claim 1, characterized in that the hybrid expert model of step (2.1) comprises two parts, the experts and the gate functions;
the expert is responsible for mapping and transforming the data, and the invention uses a linear function as the expert function:
$y = Wx$
where W is the expert parameter, and x and y represent a low-resolution video frame block and the corresponding high-resolution video frame block, respectively;
the gate function is responsible for deciding which expert is selected to transform the data, and the ith gate function in the invention is expressed as:
where $v_i$ represents the ith gate function parameter, $v_j$ represents the jth gate function parameter, and K is the number of experts in the hybrid expert model.
3. The method for renovating film and television works based on resolution improvement according to claim 1, characterized in that calculating, in step (4.2), the corresponding high-resolution video frame block by using the parameter of the expert function corresponding to the gate function with the maximum output value comprises the following steps:
① determining the index of the gate function with the maximum output value: $i = \arg\max(g_i)$, where $g_i$, the output of the ith gate function, is obtained in step (4.1);
② using the ith expert function to calculate the high-resolution video frame block: $y = W_i x$, where $W_i$ is the parameter of the ith expert function, and y is the high-resolution video frame block corresponding to the input low-resolution video frame block x.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610109909.XA CN105791980B (en) | 2016-02-29 | 2016-02-29 | Films and television programs renovation method based on increase resolution |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105791980A (en) | 2016-07-20 |
CN105791980B (en) | 2018-09-14 |
Family
ID=56403789
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610109909.XA Active CN105791980B (en) | 2016-02-29 | 2016-02-29 | Films and television programs renovation method based on increase resolution |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105791980B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109151339B (en) * | 2018-08-27 | 2021-08-13 | 武汉有文心文化传媒有限公司 | Method for synthesizing characters in recommendation video and related products |
EP3861497A4 (en) * | 2018-11-06 | 2022-06-08 | Film It Live, Inc. | High resolution film creation and management system |
CN109981991A (en) * | 2019-04-17 | 2019-07-05 | 北京旷视科技有限公司 | Model training method, image processing method, device, medium and electronic equipment |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8520736B2 (en) * | 2009-04-14 | 2013-08-27 | Fastvdo, Llc | Real-time superresolution and video transmission |
CN101639937B (en) * | 2009-09-03 | 2011-12-14 | 复旦大学 | Super-resolution method based on artificial neural network |
CN104778671B (en) * | 2015-04-21 | 2017-09-22 | 重庆大学 | A kind of image super-resolution method based on SAE and rarefaction representation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||