
CN105791980B - Films and television programs renovation method based on increase resolution - Google Patents

Films and television programs renovation method based on increase resolution

Info

Publication number
CN105791980B
Authority
CN
China
Prior art keywords
resolution video
expert
resolution
video frame
low
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610109909.XA
Other languages
Chinese (zh)
Other versions
CN105791980A (en)
Inventor
张宏志
赵秋实
左旺孟
石坚
张垒磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Super-Resolution Fx Technology Co Ltd
Original Assignee
Harbin Super-Resolution Fx Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Super-Resolution Fx Technology Co Ltd filed Critical Harbin Super-Resolution Fx Technology Co Ltd
Priority to CN201610109909.XA priority Critical patent/CN105791980B/en
Publication of CN105791980A publication Critical patent/CN105791980A/en
Application granted granted Critical
Publication of CN105791980B publication Critical patent/CN105791980B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440263Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440236Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Processing (AREA)
  • Television Systems (AREA)

Abstract

For film and television works of lower resolution and lower definition, the present invention proposes a film and television work renovation method based on resolution improvement. The specific scheme is as follows: first, the resolution of the original video and the target resolution are obtained, and the scaling factor is calculated; second, the input video is split into a set of video frames according to a certain partitioning scheme; then, the frames are transformed according to a pre-stored mapping relation to obtain high-resolution video frames; finally, the high-resolution video frames are combined into a high-resolution video. The pre-stored mapping relation is obtained based on mixture-of-experts model learning, a process completed offline in a computer. The method of the invention has the advantages of good adaptivity, high speed, good effect and scalability.

Description

Movie and television work renovation method based on resolution improvement
Technical Field
The invention belongs to the field of computer vision and image processing, relates to a method for renewing film and television works, and particularly relates to a method and a system for renewing the film and television works based on resolution improvement.
Background
With the development of video acquisition, transmission, storage and display technologies, movie and television works are continuously moving towards higher resolution. People's expectations for video quality keep rising, and they continuously pursue high-resolution, high-definition film and television works. Meanwhile, the advent of high-resolution display devices (such as 4K and 5K televisions and monitors) has made the popularization of high-resolution film and television works possible.
On the other hand, many older classic film and television works still have lower resolution, lower definition and poorer visual effect due to the limitations of the technical means of their time. Moreover, because the films have been stored for a long time and have suffered damage from external factors such as natural disasters and war, various kinds of quality degradation such as damage, flicker, noise and jitter can occur. On one hand, people want to revisit classic film and television works; on the other hand, they have new expectations for picture quality. To meet the demand for revisiting classic works and pursuing high-quality video, film and television work renovation technology has emerged. The essence of film and television work renovation is to process the original video with image/video processing technology and eliminate various kinds of quality degradation so as to improve its visual effect.
Like books, film and television works are important cultural carriers of human society, and some classic works retain irreplaceable cultural value even after a long time. Therefore, renovating and re-presenting film and television works from long ago is of great significance. Specifically, the significance of film and television work renovation includes the following aspects:
1. some classical film and television works, such as documentaries, are precious historical data. The historical data can be better stored and transmitted by renewing the film and television works.
2. Renovating classic film and television works so that more people today can appreciate them is an important form of cultural and artistic inheritance.
3. Renovating classic film and television works makes classic artworks shine again, and is the greatest respect and tribute to the artists.
The existing methods for improving the visual effect of film and television works mainly focus on video enhancement, such as denoising, deblurring, de-interlacing, contrast enhancement and color enhancement. These methods can enhance the visual effect of the original video, but they do not improve its resolution, so they do not fundamentally meet people's demand for the renovation of classic film and television works.
Resolution improvement means generating a high-resolution video quickly and effectively from a low-resolution video (or video frames). The difficulty lies in breaking through the limit imposed by the number of pixels in the original low-resolution video and filling in pixels that did not originally exist, while preserving the structure and texture of the original video and keeping the result natural and reasonable to the human eye.
Traditional resolution improvement methods mainly include interpolation-based, reconstruction-based and learning-based methods. Interpolation-based methods linearly combine existing pixels to stand in for the missing ones; they are simple and fast, but tend to produce mosaic artifacts or over-smoothing. Reconstruction-based algorithms perform registration and reconstruction using the similarity of multiple frames, but they are usually only a simple combination of those frames, and the effect is not ideal. Learning-based algorithms use a certain amount of training data to learn the mapping relation from low-resolution to high-resolution video with a specific training algorithm; they place high demands on the model, are prone to over-fitting or under-fitting, and involve heavy computation, so they are slow and of limited practicality. These problems have long troubled users who need video resolution improvement.
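As a point of reference, the interpolation-based baseline mentioned above amounts to a single resize call. The sketch below is an illustration of that baseline only, not the method of the invention; the file names and scale factor are assumed for the example.

```python
import cv2

# Interpolation-based baseline: missing pixels are filled by bicubic
# combinations of existing pixels, which is fast but prone to over-smoothing.
frame = cv2.imread("low_res_frame.png")            # assumed input frame
scale = 4                                          # assumed magnification
h, w = frame.shape[:2]
upscaled = cv2.resize(frame, (w * scale, h * scale),
                      interpolation=cv2.INTER_CUBIC)
cv2.imwrite("bicubic_baseline.png", upscaled)
```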
Disclosure of Invention
The technical problem to be solved by the invention is as follows: the invention provides a method and a system for renovating film and television works based on video resolution improvement, which convert low-resolution film and television works (usually below 720P) into higher-resolution videos (such as 1080P, 4K and the like) through resolution improvement technology, so as to realize the renovation of the film and television works.
The technical solution of the invention is as follows: first, the resolution of the original video and the target resolution are acquired, and the scaling factor is calculated; second, the input video is divided into video frames according to a certain partitioning scheme; then, the frames are transformed according to a pre-stored mapping relation to obtain high-resolution video frames; finally, the high-resolution video frames are combined into a high-resolution video. The pre-stored mapping relation model is obtained based on mixture-of-experts model learning, and the model training process is completed offline in a computer. The specific steps include:
learning a mapping relation model:
(1) preprocessing training video
(1.1) selecting a high-resolution video as a training sample, and splitting the high-resolution video into high-resolution video frames;
(1.2) convolving the high-resolution video frame obtained in step (1.1) with a Gaussian kernel;
(1.3) calculating magnification times according to the original low-resolution video and the target high-resolution video, and carrying out interlaced sampling according to the obtained times to obtain corresponding low-resolution video frames;
and (1.4) respectively dividing the high-resolution video frame and the sampled low-resolution video frame into blocks as training data.
(2) Obtaining a mapping relation model based on a hybrid expert model
(2.1) initializing a hybrid expert model. The hybrid expert model comprises two parts, experts and a gate function, and has a tree structure, as shown in figure 2. The leaf nodes of the tree structure are called experts and are responsible for mapping and transforming the data; the root node is called the gate function and is responsible for selecting the appropriate expert for the data. The present invention uses a linear function as the expert function:
y=Wx
where W is an expert function parameter and x and y represent a block of low resolution video frames and a corresponding block of high resolution video frames, respectively.
The gate function is responsible for deciding which expert to select for transforming the data, and in the present invention, the ith gate function is expressed as:
where x and y represent a low-resolution video frame block and the corresponding high-resolution video frame block, respectively, v_i denotes the i-th gate function parameter, v_j denotes the j-th gate function parameter, and K is the number of experts in the mixed expert model, i.e. the number of leaf nodes in the tree structure. The initialization of the hybrid expert model specifically comprises the following steps:
(2.1.1) specifying a number K of experts;
(2.1.2) assume that the probability distribution of each expert follows a Gaussian distribution: p(y | x, W_i) = N(y(x, W_i), σ), where W_i represents the parameter of the i-th expert and σ is the standard deviation of the Gaussian distribution. Assume that the distribution of the parameter W_i also follows a Gaussian distribution: p(W_i) = N(0, μ), where μ denotes the mean of the Gaussian distribution.
(2.1.3) clustering the training data according to the number K of experts by the K-means algorithm, specifying the initial value W_i^(0) of each expert parameter as the within-class slope, and specifying the initial value v_i^(0) of each gate function parameter as the cluster center;
(2.1.4) calculate the initial value of each gate function:
where x denotes a low-resolution video frame block, v_i^(0) denotes the initial value of the i-th gate function parameter, and K is the number of experts in the mixed expert model, i.e. the number of leaf nodes in the tree structure.
And (2.2) using the training data obtained in the step (1.4) to perform iterative optimization on the hybrid expert model until the iterative process is converged, wherein the finally obtained model parameters are the mapping relation model. The mapping relation model includes a gate function parameter and an expert parameter.
(2.2.1) specifying an allowable error epsilon at the termination of the iteration;
(2.2.2) calculating the posterior probability of each gate function in the current iteration:
where k is the iteration step number, p_i(y | x, W_i^(k)) and p_j(y | x, W_j^(k)) denote the probability distributions of the experts, and g_i^(k)(x, v_i^(k)) denotes the value of the i-th gate function at the k-th iteration.
(2.2.3) updating each expert parameter:
where k is the iteration step number, X is the matrix formed by all low-resolution video frame blocks x in the training data, Y is the matrix formed by all high-resolution video frame blocks y in the training data, X^T denotes the transpose of X, I denotes the identity matrix, and H_i^(k+1) denotes the vector formed by the posterior probabilities of all low-resolution video frame blocks x with respect to the i-th expert at step k+1.
(2.2.4) updating each gate function parameter:
where v_i^(k) denotes the i-th gate function parameter at the k-th iteration, the posterior probability of the i-th gate function at the k-th iteration is the quantity computed in step (2.2.2), and x^(t) denotes the t-th low-resolution video frame block.
(2.2.5) calculating the output of each gate function in the current iteration:
(2.2.6) calculating the likelihood probability in the current iteration:
where p_i(y | x, W_i^(k+1)) denotes the probability distribution of the i-th expert and p(W_i^(k+1)) denotes the probability distribution of the expert parameters.
(2.2.7) judging whether the iteration has converged: the iteration ends when the absolute value of the difference between the likelihood probability of the current iteration and that of the previous iteration is smaller than the allowable termination error ε; otherwise, steps (2.2.2)-(2.2.7) are repeated.
The gate function parameters v_i obtained at the end of the iteration, together with the number of experts K, the expert parameters W_i, the standard deviation σ of the experts' probability distribution, and the mean μ of the distribution of the expert parameters, are stored on a disk as the final mapping relation model.
After the mapping relation model is learned and stored, the resolution of the video is improved by using the stored mapping relation model:
(3) pre-processing low resolution video to be processed
(3.1) splitting the low-resolution video into low-resolution video frames;
(3.2) dividing the low-resolution video frame obtained in the step (3.1) into low-resolution video frame blocks;
(4) upgrading the low-resolution video into a high-resolution video according to the mapping relation model obtained in step (2), comprising the following steps:
(4.1) taking the low-resolution video frame block obtained in the step (3) as an input of a gate function, and calculating the output of each gate function by using gate function parameters in the mapping relation model obtained in the step (2):
where x is the input low-resolution video frame block, K is the number of experts in the mixed expert model, and v_i denotes the i-th gate function parameter obtained in step (2.2).
(4.2) calculating the corresponding high-resolution video frame block by using the parameter of the expert function corresponding to the gate function with the maximum output value, wherein the parameter of the expert function is obtained in the step (2);
(4.2.1) calculating the index of the gate function with the maximum output value: i = arg max(g_i)
where g_i is the output of the i-th gate function, obtained in step (4.1).
(4.2.2) computing a high-resolution video frame block using the i-th expert function: y = W_i x
where W_i is the parameter of the i-th expert function and y is the high-resolution video frame block corresponding to the input low-resolution video frame block x.
(4.3) carrying out resolution enhancement on each low-resolution video frame block according to the steps of (4.1) and (4.2) to obtain a corresponding high-resolution video frame block, and splicing all the high-resolution video frame blocks into corresponding high-resolution video frames according to the positions of the corresponding low-resolution video frame blocks in the low-resolution video frames;
and (4.4) obtaining high-resolution video frames corresponding to all the low-resolution video frames, and combining the high-resolution video frames into a high-resolution video.
In the step (4), there is no dependency relationship between video frame blocks and between video frames, so that the step can be accelerated in parallel by using a GPU processor.
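To make the parallelism concrete, the per-block computation can be expressed as a handful of batched matrix products, which is exactly the workload a GPU handles well. The following NumPy sketch groups blocks by their selected expert and applies each expert as one matrix multiplication; the array names and the use of a linear gate score for the hard selection (whose arg max coincides with that of a softmax-style gate) are assumptions for illustration.

```python
import numpy as np

def apply_experts_batched(blocks, V, W):
    """blocks: (N, d_lo) flattened low-resolution blocks from any frames,
       V: (K, d_lo) gate parameters, W: (K, d_hi, d_lo) linear experts.
       Returns the (N, d_hi) high-resolution blocks."""
    scores = blocks @ V.T                    # gate scores; arg max matches a softmax gate
    choice = scores.argmax(axis=1)           # hard expert selection per block
    out = np.empty((blocks.shape[0], W.shape[1]))
    for i in range(W.shape[0]):              # one matrix multiply per expert group
        idx = np.where(choice == i)[0]
        if idx.size:
            out[idx] = blocks[idx] @ W[i].T  # y = W_i x for every block in the group
    return out
```

On a GPU the same grouping maps onto batched matrix multiplies, so throughput grows with the number of independent blocks processed at once.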
The method for renovating film and television works based on resolution improvement can be used in the form of a computer software player, or integrated into a hardware platform (such as a set-top box or a smart television).
The method for renovating film and television works based on resolution improvement can also be combined with other video enhancement methods as a preprocessing or post-processing step, further improving the visual effect.
Compared with the prior art, the invention has the following advantages: in terms of visual effect, the high-resolution video obtained by implementing the scheme of the invention has complete details, clear edges and well-preserved texture, and the method is fast and stable. Specifically, the features of the present invention include:
1. Adaptivity. The scheme of the invention adaptively calculates the scaling factor and can adapt to different resolution-improvement requirements.
2. High speed. Since there is no dependency between the video frames in a sequence, the processing can be accelerated in parallel. In addition, the algorithm applies a linear mapping transformation to the video frame sequence, and the mapping parameters used can be stored in memory in advance, which further improves the processing speed.
3. Good effect. The mapping parameters used in the resolution improvement process are learned with the mixture-of-experts model, which overcomes the drawback of traditional learning-based resolution improvement algorithms in which data partitioning and sub-model learning are performed separately. At the same time, it combines the robustness of statistical methods with the accuracy of learning-based algorithms, overcomes the inability of earlier learning-based algorithms to exploit large amounts of data, achieves higher precision than purely statistical methods, and can handle even high-resolution input video well and at high speed.
4. Scalability. Because the video frame sequences have no dependency relationship, parallel processing can be realized by technical means such as GPU acceleration, improving the processing speed. In addition, the proposed algorithm can be applied directly to the field of image resolution improvement.
Drawings
Fig. 1 is a flowchart of a method for refreshing a movie or television work based on resolution enhancement according to the present invention.
FIG. 2 is a schematic diagram of a hybrid expert model according to the present invention.
Fig. 3 is a schematic diagram illustrating the division of a video frame into video frame blocks according to the present invention.
Detailed Description
The process according to the invention is illustrated in the following detailed description by way of example.
According to the method for renovating film and television works based on resolution improvement, the process of upgrading a video segment with an original resolution of 768 × 432 to 3072 × 1728 comprises the following steps:
(1) preprocessing training video
(1.1) selecting a high-resolution movie and television work, reading its video stream with video processing software, and storing each frame of the stream as a video frame; in this embodiment, the length of the work is 1200 seconds, the frame rate is 25 frames/second, and the total number of video frames obtained is 1200 × 25 = 30000;
(1.2) performing convolution on the video frame obtained in the step (1.1) by using a Gaussian kernel with the average value of 0 and the standard deviation of 1;
and (1.3) acquiring the resolution of the original low-resolution video and the target resolution, and calculating the magnification from them. The original resolution is 768 × 432 and the target resolution is 3072 × 1728, so the magnification is 3072/768 = 4. Accordingly, the convolved video frames are downsampled to 1/4 of the original size to obtain the corresponding low-resolution video frames.
(1.4) each of the obtained low-resolution video frames is divided into non-overlapping small blocks of 10 × 10 pixels according to the adopted partitioning scheme, as shown in fig. 3, and 1,000,000 blocks are selected as training data.
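For illustration, a minimal Python sketch of this preprocessing stage is given below (OpenCV and NumPy). The training file name, the grayscale handling and the helper names are assumptions made for the example; the text itself only specifies a Gaussian blur, stride (interlaced) sampling by the magnification factor, and 10 × 10 blocks, and does not address frames whose size is not a multiple of the block size.

```python
import cv2
import numpy as np

SCALE, BLOCK = 4, 10          # magnification and low-resolution block size

def degrade(hr_frame):
    """Gaussian blur followed by interlaced (stride) sampling, steps (1.2)-(1.3)."""
    blurred = cv2.GaussianBlur(hr_frame, (0, 0), 1.0)   # kernel size derived from sigma = 1
    return blurred[::SCALE, ::SCALE]

def block_pairs(hr_frame, lr_frame):
    """Step (1.4): pair each 10x10 low-resolution block with the co-located
       40x40 high-resolution block; both are returned flattened."""
    pairs = []
    h, w = lr_frame.shape[:2]
    for r in range(0, h - BLOCK + 1, BLOCK):
        for c in range(0, w - BLOCK + 1, BLOCK):
            x = lr_frame[r:r + BLOCK, c:c + BLOCK]
            y = hr_frame[r * SCALE:(r + BLOCK) * SCALE,
                         c * SCALE:(c + BLOCK) * SCALE]
            pairs.append((x.reshape(-1), y.reshape(-1)))
    return pairs

cap = cv2.VideoCapture("training_movie.mp4")   # assumed high-resolution training work
training = []
ok, frame = cap.read()
while ok and len(training) < 1_000_000:        # 1,000,000 blocks as in step (1.4)
    hr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float64)  # grayscale: assumption
    training.extend(block_pairs(hr, degrade(hr)))
    ok, frame = cap.read()
cap.release()
```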
(2) Obtaining a mapping relation model based on a hybrid expert model
(2.1) initializing the hybrid expert model
(2.1.1) the number of experts K is specified. In this embodiment, K is taken to be 100;
(2.1.2) parameters σ and μ that specify the probability distribution of the expert and the probability distribution of the expert parameter, where σ is 0.32 and μ is 0.58 in the present embodiment;
(2.1.3) clustering the training data according to the number K of experts by the K-means algorithm, initializing each expert parameter W_i^(0) to the within-class slope and each gate function parameter v_i^(0) to the cluster center;
(2.1.4) calculating the initial value of each gate function according to:
where x denotes a low-resolution video frame block, v_i^(0) denotes the initial value of the i-th gate function parameter, and K is the number of experts in the mixed expert model, i.e. the number of leaf nodes in the tree structure.
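A sketch of the initialization in steps (2.1.1)-(2.1.4), under explicit assumptions: scikit-learn's KMeans stands in for the K-means step, the "within-class slope" W_i^(0) is taken to be a per-cluster least-squares linear map from low-resolution to high-resolution blocks, and the gate is assumed to be a softmax over linear scores v_i · x (the gate formula itself is not reproduced in this text).

```python
import numpy as np
from sklearn.cluster import KMeans

K = 100                                    # number of experts, step (2.1.1)

def init_mixture(X, Y, k=K, seed=0):
    """X: (N, 100) flattened LR blocks, Y: (N, 1600) flattened HR blocks
       (e.g. stacked from the (x, y) pairs gathered in the preprocessing sketch).
       Returns initial gate parameters V0 (k, 100) and experts W0 (k, 1600, 100)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    V0 = km.cluster_centers_               # v_i^(0): cluster centres, step (2.1.3)
    W0 = np.zeros((k, Y.shape[1], X.shape[1]))
    for i in range(k):
        idx = km.labels_ == i
        if not idx.any():
            continue                       # (rare) empty cluster: keep zero initialisation
        # "within-class slope": least-squares linear map from LR to HR blocks of cluster i
        Wi, *_ = np.linalg.lstsq(X[idx], Y[idx], rcond=None)
        W0[i] = Wi.T
    return V0, W0

def gate(X, V):
    """Assumed softmax gate over linear scores v_i . x, used for step (2.1.4)."""
    s = X @ V.T
    s -= s.max(axis=1, keepdims=True)      # numerical stability
    e = np.exp(s)
    return e / e.sum(axis=1, keepdims=True)
```

With one million 100-dimensional blocks the K-means step dominates the initialization cost; clustering a random subsample of the blocks is a common shortcut that does not change the rest of the procedure.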
(2.2) using the training data obtained in the step (1.4) to carry out iterative optimization on the hybrid expert model obtained in the step (2.1):
(2.2.1) specify the allowable error ε at the end of the iteration. In this embodiment, the error epsilon allowed by the termination of the model iteration is 0.005.
(2.2.2) calculating the posterior probability of each gate function in the current iteration:
where k is the iteration step number, p_i(y | x, W_i^(k)) and p_j(y | x, W_j^(k)) denote the probability distributions of the experts, and g_i^(k)(x, v_i^(k)) denotes the value of the i-th gate function at the k-th iteration.
(2.2.3) updating each expert parameter:
where k is the iteration step number, X is the matrix formed by all low-resolution video frame blocks x in the training data, Y is the matrix formed by all high-resolution video frame blocks y in the training data, X^T denotes the transpose of X, I denotes the identity matrix, and H_i^(k+1) denotes the vector formed by the posterior probabilities of all low-resolution video frame blocks x with respect to the i-th expert at step k+1.
(2.2.4) updating each gate function parameter:
where v_i^(k) denotes the i-th gate function parameter at the k-th iteration, the posterior probability of the i-th gate function at the k-th iteration is the quantity computed in step (2.2.2), and x^(t) denotes the t-th low-resolution video frame block.
(2.2.5) calculating the output of each gate function in the current iteration:
(2.2.6) calculating the likelihood probability in the current iteration:
where p_i(y | x, W_i^(k+1)) denotes the probability distribution of the i-th expert and p(W_i^(k+1)) denotes the probability distribution of the expert parameters.
(2.2.7) judging whether the iteration has converged: the iteration ends when the absolute value of the difference between the likelihood probability of the current iteration and that of the previous iteration is smaller than the allowable termination error ε; otherwise, steps (2.2.2)-(2.2.7) are repeated.
The gate function parameters v_i obtained at the end of the iteration, together with the number of experts K, the expert parameters W_i, the standard deviation σ of the experts' probability distribution, and the mean μ of the distribution of the expert parameters, are stored on a disk as the final mapping relation model. Among them, v_i, K, σ and μ are referred to as the gate function parameters of the mapping relation model, and W_i are referred to as the expert parameters of the mapping relation model.
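The update equations of steps (2.2.2)-(2.2.7) are not reproduced in this text, so the sketch below fills them in with the standard EM treatment of a mixture of linear-Gaussian experts: an E-step posterior proportional to g_i · N(y; W_i x, σ), a weighted ridge-regression M-step for W_i (the σ²/μ regulariser reflecting the Gaussian prior on W_i is an assumption), and a single gradient step to refit the assumed softmax gate. Treat it as an illustrative stand-in rather than the patent's exact procedure; the learning rate lr is likewise assumed.

```python
import numpy as np

def softmax(S):
    """Row-wise softmax used for the (assumed) gate g_i(x)."""
    S = S - S.max(axis=1, keepdims=True)
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def em_train(X, Y, V, W, sigma=0.32, mu=0.58, eps=0.005, max_iter=100, lr=1e-3):
    """Illustrative EM-style optimisation of the mixture of linear experts.
       X: (N, d_lo) LR blocks, Y: (N, d_hi) HR blocks,
       V: (K, d_lo) gate parameters, W: (K, d_hi, d_lo) expert parameters.
       sigma, mu and eps follow the embodiment; the update rules are assumptions."""
    K, d_lo = V.shape
    prev_ll = -np.inf
    for _ in range(max_iter):
        # E-step (2.2.2): responsibilities proportional to g_i * N(y; W_i x, sigma).
        G = softmax(X @ V.T)                                   # (N, K) gate outputs
        logp = np.stack([-0.5 * ((Y - X @ W[i].T) ** 2).sum(axis=1) / sigma ** 2
                         for i in range(K)], axis=1)           # expert log-likelihoods
        L = np.log(G + 1e-12) + logp
        m = L.max(axis=1, keepdims=True)
        H = np.exp(L - m)
        norm = H.sum(axis=1, keepdims=True)
        H = H / norm                                           # posterior of each gate
        # M-step (2.2.3): weighted ridge regression per expert; the sigma^2/mu
        # regulariser reflects the Gaussian prior on W_i and is an assumption.
        for i in range(K):
            h = H[:, i][:, None]
            A = (X * h).T @ X + (sigma ** 2 / mu) * np.eye(d_lo)
            B = (X * h).T @ Y
            W[i] = np.linalg.solve(A, B).T
        # Gate update (2.2.4): one gradient step fitting the softmax gate to H.
        V = V + lr * (H - G).T @ X
        # Likelihood and convergence test (2.2.6)-(2.2.7).
        ll = (m.squeeze(1) + np.log(norm.squeeze(1))).sum()
        if abs(ll - prev_ll) < eps:
            break
        prev_ll = ll
    return V, W

# The learned model can then be persisted, mirroring the storage step:
# np.savez("mapping_model.npz", V=V, W=W, sigma=0.32, mu=0.58)
```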
(3) Pre-processing low resolution video to be processed
(3.1) splitting the low-resolution video to be processed into low-resolution video frames; in this embodiment, the length of the movie work is 2000 seconds, the frame rate is 25 frames/second, and the total number of video frames obtained is 2000 × 25 = 50000;
(3.2) dividing the low resolution video frame obtained in step (3.1) into 10 × 10 video frame blocks, as shown in fig. 3;
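A sketch of step (3): decode the low-resolution work into frames and tile each frame into 10 × 10 blocks, keeping the block positions so the upscaled blocks can be pasted back later. The file name and grayscale handling are assumptions; frames whose width or height is not a multiple of 10 would need padding, which the text does not address.

```python
import cv2
import numpy as np

BLOCK = 10

def frame_to_blocks(frame):
    """Tile a low-resolution frame into flattened 10x10 blocks and record each
       block's top-left position so the upscaled blocks can be re-assembled."""
    h, w = frame.shape[:2]
    blocks, positions = [], []
    for r in range(0, h - BLOCK + 1, BLOCK):
        for c in range(0, w - BLOCK + 1, BLOCK):
            blocks.append(frame[r:r + BLOCK, c:c + BLOCK].reshape(-1))
            positions.append((r, c))
    return np.array(blocks, dtype=np.float64), positions

cap = cv2.VideoCapture("old_movie_768x432.mp4")   # assumed low-resolution work
lr_frames = []
ok, frame = cap.read()
while ok:
    lr_frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))  # grayscale: assumption
    ok, frame = cap.read()
cap.release()
```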
(4) mapping the low-resolution video to a high-resolution video, comprising:
(4.1) taking the low-resolution video frame block obtained in the step (3) as an input of a gate function, and calculating the output of each gate function by using gate function parameters in the mapping relation model obtained in the step (2):
where x is the input low-resolution video frame block, K is the number of experts in the mixed expert model, and v_i denotes the i-th gate function parameter obtained in step (2.2).
(4.2) calculating a corresponding high-resolution video frame block by using the expert function parameter corresponding to the gate function with the maximum output value;
(4.2.1) calculating the index of the gate function with the maximum output value: i = arg max(g_i), where g_i is the output of the i-th gate function, obtained in step (4.1).
(4.2.2) computing a high resolution video frame block using the ith expert function:
y = W_i x
where W_i is the i-th expert function parameter obtained in step (2), x is the input low-resolution video frame block of size 10 × 10, and y is the resolution-improved high-resolution video frame block of size 40 × 40.
(4.3) carrying out resolution enhancement on each low-resolution video frame block according to the steps of (4.1) and (4.2) to obtain a corresponding high-resolution video frame block, and splicing all the high-resolution video frame blocks into corresponding high-resolution video frames according to the positions of the corresponding low-resolution video frame blocks in the low-resolution video frames;
and (4.4) obtaining high-resolution video frames corresponding to all the low-resolution video frames, and combining the high-resolution video frames into a high-resolution video.
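Putting step (4) together: for every 10 × 10 block, evaluate the gates, pick the expert with the largest gate output, apply y = W_i x, and paste the resulting 40 × 40 block at four times the block's original position. The sketch reuses frame_to_blocks from the step (3) sketch and the V, W arrays from the training sketch; the linear gate score used for the arg max is again an assumption (its arg max coincides with that of a softmax-style gate).

```python
import numpy as np

SCALE, BLOCK = 4, 10

def upscale_frame(lr_frame, V, W):
    """Steps (4.1)-(4.3) for one frame.
       V: (K, 100) gate parameters, W: (K, 1600, 100) expert parameters."""
    blocks, positions = frame_to_blocks(lr_frame)   # helper from the step (3) sketch
    scores = blocks @ V.T                           # gate outputs (assumed linear score)
    choice = scores.argmax(axis=1)                  # expert with the largest gate output
    h, w = lr_frame.shape[:2]
    hr = np.zeros((h * SCALE, w * SCALE))
    for x, (r, c), i in zip(blocks, positions, choice):
        y = W[i] @ x                                # y = W_i x, step (4.2.2)
        hr[r * SCALE:(r + BLOCK) * SCALE,
           c * SCALE:(c + BLOCK) * SCALE] = y.reshape(BLOCK * SCALE, BLOCK * SCALE)
    return hr

# Step (4.4): upscale every frame and re-encode the result as the renovated video.
# hr_frames = [upscale_frame(f, V, W) for f in lr_frames]
```

Because blocks and frames are processed independently, the loop can be vectorised or distributed across frames, matching the parallelism noted in the disclosure.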

Claims (3)

1. A method for renewing film and television works based on resolution improvement is characterized in that: the method comprises two parts of learning a mapping relation model and carrying out resolution improvement according to the mapping relation model;
the method for learning the mapping relation model comprises the following two steps:
(1) pre-processing a training video, comprising:
(1.1) selecting a high-resolution video as a training sample, and splitting the high-resolution video into high-resolution video frames;
(1.2) convolving the high-resolution video frame obtained in the step (1.1) by using a Gaussian kernel;
(1.3) calculating magnification times according to the original low-resolution video and the target high-resolution video, and carrying out interlaced sampling according to the obtained times to obtain corresponding low-resolution video frames;
(1.4) respectively dividing the high-resolution video frame and the sampled low-resolution video frame into blocks as training data;
(2) obtaining a mapping relation model based on a hybrid expert model, comprising:
(2.1) initializing a hybrid expert model, comprising the steps of:
① specifies the number of experts K;
② assume that the probability distribution of each expert follows a Gaussian distribution p(y | x, W_i) = N(y(x, W_i), σ), where x denotes a low-resolution video frame block, y denotes a high-resolution video frame block, W_i represents the parameter of the i-th expert, and σ is the standard deviation of the Gaussian distribution; assume that the distribution of the parameter W_i also follows a Gaussian distribution: p(W_i) = N(0, μ), where μ represents the mean of the Gaussian distribution;
③ clustering the training data according to the number K of experts by the K-means algorithm, specifying the initial value W_i^(0) of each expert parameter as the within-class slope, and specifying the initial value v_i^(0) of each gate function parameter as the cluster center;
④ calculate the initial value of each gate function:
where v_i^(0) denotes the initial value of the i-th gate function parameter, and K is the number of experts in the mixed expert model, i.e. the number of leaf nodes in the tree structure;
(2.2) using the training data obtained in the step (1.4) to perform iterative optimization on the hybrid expert model until the iterative process is converged, wherein the finally obtained model parameters are the mapping relation model; the mapping relation model comprises a gate function parameter and an expert parameter; the iterative optimization of the model comprises the following steps:
① specifies the allowable error ε at the end of the iteration;
② the posterior probability of each gate function in the current iteration is calculated:
where k is the iteration step number, p_i(y | x, W_i^(k)) and p_j(y | x, W_j^(k)) denote the probability distributions of the experts, and g_i^(k)(x, v_i^(k)) denotes the value of the i-th gate function at the k-th iteration;
③ update each expert parameter:
where k is the iteration step number, X is the matrix formed by all low-resolution video frame blocks x in the training data, Y is the matrix formed by all high-resolution video frame blocks y in the training data, X^T denotes the transpose of X, I denotes the identity matrix, and H_i^(k+1) denotes the vector formed by the posterior probabilities of all low-resolution video frame blocks x with respect to the i-th expert at step k+1;
④ update each gate function parameter:
where v_i^(k) denotes the i-th gate function parameter at the k-th iteration, the posterior probability of the i-th gate function at the k-th iteration is the quantity computed in step ②, and x^(t) denotes the t-th low-resolution video frame block;
⑤ the output of each gate function in the current iteration is calculated:
⑥ likelihood probabilities in the current iteration are calculated:
where p_i(y | x, W_i^(k+1)) denotes the probability distribution of the i-th expert and p(W_i^(k+1)) denotes the probability distribution of the expert parameters;
⑦ judging whether the iteration has converged: when the absolute value of the difference between the likelihood probability of the current iteration and that of the previous iteration is smaller than the allowable termination error ε, ending the iteration; otherwise, repeating steps ②-⑦;
the gate function parameters v_i obtained at the end of the iteration, together with the number of experts K, the expert parameters W_i, the standard deviation σ of the experts' probability distribution, and the mean μ of the distribution of the expert parameters, are stored on a disk as the final mapping relation model;
the resolution improvement according to the mapping relation model comprises the following two steps:
(3) pre-processing low resolution video to be processed, comprising:
(3.1) splitting a low-resolution video to be processed into low-resolution video frames;
(3.2) partitioning the low resolution video frame into blocks;
(4) upgrading the low-resolution video into a high-resolution video according to the mapping relation model obtained in step (2), comprising the following steps:
(4.1) taking the low-resolution video frame block obtained in the step (3) as the input of a mixed expert model gate function, and calculating the output of each gate function by using the gate function parameters in the mapping relation model obtained in the step (2);
(4.2) calculating the corresponding high-resolution video frame block by using the parameter of the expert function corresponding to the gate function with the maximum output value, wherein the parameter of the expert function is obtained in the step (2);
(4.3) carrying out resolution enhancement on each low-resolution video frame block according to the steps of (4.1) and (4.2) to obtain a corresponding high-resolution video frame block, and splicing all the high-resolution video frame blocks into corresponding high-resolution video frames according to the positions of the corresponding low-resolution video frame blocks in the low-resolution video frames;
and (4.4) obtaining high-resolution video frames corresponding to all the low-resolution video frames, and combining the high-resolution video frames into a high-resolution video.
2. The method for refreshing a movie or television work based on resolution enhancement according to claim 1, wherein the hybrid expert model of step (2.1) comprises two parts of an expert and a gate function;
the expert is responsible for mapping and transforming the data, and the mapping and transforming in the invention uses a linear function as an expert function:
y=Wx
wherein W is an expert parameter, and x and y represent a low resolution video frame block and a corresponding high resolution video frame block, respectively;
the gate function is responsible for deciding which expert to select for transforming the data, and the ith gate function in the invention is expressed as:
where v_i denotes the i-th gate function parameter, v_j denotes the j-th gate function parameter, and K is the number of experts in the mixed expert model.
3. The method for refreshing a movie or television work based on resolution enhancement according to claim 1, wherein: the step (4.2) of calculating the corresponding high resolution video frame block by using the parameter of the expert function corresponding to the gate function with the maximum output value comprises the following steps:
① calculating the index of the gate function with the maximum output value: i = arg max(g_i),
where g_i is the output of the i-th gate function, obtained in step (4.1);
② calculating the high-resolution video frame block using the i-th expert function: y = W_i x,
where W_i is the parameter of the i-th expert function and y is the high-resolution video frame block corresponding to the input low-resolution video frame block x.
CN201610109909.XA 2016-02-29 2016-02-29 Films and television programs renovation method based on increase resolution Active CN105791980B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610109909.XA CN105791980B (en) 2016-02-29 2016-02-29 Films and television programs renovation method based on increase resolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610109909.XA CN105791980B (en) 2016-02-29 2016-02-29 Films and television programs renovation method based on increase resolution

Publications (2)

Publication Number Publication Date
CN105791980A CN105791980A (en) 2016-07-20
CN105791980B true CN105791980B (en) 2018-09-14

Family

ID=56403789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610109909.XA Active CN105791980B (en) 2016-02-29 2016-02-29 Films and television programs renovation method based on increase resolution

Country Status (1)

Country Link
CN (1) CN105791980B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109151339B (en) * 2018-08-27 2021-08-13 武汉有文心文化传媒有限公司 Method for synthesizing characters in recommendation video and related products
EP3861497A4 (en) * 2018-11-06 2022-06-08 Film It Live, Inc. High resolution film creation and management system
CN109981991A (en) * 2019-04-17 2019-07-05 北京旷视科技有限公司 Model training method, image processing method, device, medium and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8520736B2 (en) * 2009-04-14 2013-08-27 Fastvdo, Llc Real-time superresolution and video transmission
CN101639937B (en) * 2009-09-03 2011-12-14 复旦大学 Super-resolution method based on artificial neural network
CN104778671B (en) * 2015-04-21 2017-09-22 重庆大学 A kind of image super-resolution method based on SAE and rarefaction representation

Also Published As

Publication number Publication date
CN105791980A (en) 2016-07-20

Similar Documents

Publication Publication Date Title
CN110120011B (en) Video super-resolution method based on convolutional neural network and mixed resolution
CN107507134B (en) Super-resolution method based on convolutional neural network
CN105744357B (en) A kind of reduction network video bandwidth occupancy method based on online increase resolution
US20200334894A1 (en) 3d motion effect from a 2d image
CN112543317B (en) Method for converting high-resolution monocular 2D video into binocular 3D video
CN110717868B (en) Video high dynamic range inverse tone mapping model construction and mapping method and device
US20150071545A1 (en) Image Enhancement Using Self-Examples and External Examples
US20240169479A1 (en) Video generation with latent diffusion models
CN113129212B (en) Image super-resolution reconstruction method and device, terminal device and storage medium
CN110958469A (en) Video processing method and device, electronic equipment and storage medium
CN105791980B (en) Films and television programs renovation method based on increase resolution
US20180122047A1 (en) Super resolution using fidelity transfer
CN109886906B (en) Detail-sensitive real-time low-light video enhancement method and system
CN111696034B (en) Image processing method and device and electronic equipment
Liu et al. Facial image inpainting using multi-level generative network
CN115170388A (en) Character line draft generation method, device, equipment and medium
WO2022164680A1 (en) Simultaneously correcting image degradations of multiple types in an image of a face
JP2011070283A (en) Face image resolution enhancement device and program
Wang Single image super-resolution with u-net generative adversarial networks
CN114663285B (en) Old movie super-resolution system based on convolutional neural network
US11928855B2 (en) Method, device, and computer program product for video processing
Barua et al. ArtHDR-Net: Perceptually Realistic and Accurate HDR Content Creation
Park et al. Dual-stage Super-resolution for edge devices
Chen et al. NLUT: Neural-based 3D Lookup Tables for Video Photorealistic Style Transfer
König et al. Enhancing traffic scene predictions with generative adversarial networks

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant