Summary of the invention
The purpose of the present invention is to assess panoramic video quality in virtual reality systems. To this end, a panoramic video assessment method and system based on multi-level quality factors are proposed. The system takes as input one lossless panoramic video and one impaired video of the same content, and outputs a quality assessment result for the impaired video, thereby realizing automatic assessment of the impaired video.
The idea of the invention is to compute multiple quality factors based on regions of interest at multiple levels, so as to account for the fact that important regions of a panoramic video have a greater influence on perceived video quality; the multi-level quality factors are then combined by a fusion model whose parameters can be learned from subjective data, so as to account for the subjective behavior of panoramic video viewers.
The purpose of the present invention is achieved by the following technical scheme: a panoramic video assessment method and system based on multi-level quality factors, comprising a panoramic video assessment method based on multi-level quality factors and a panoramic video assessment system based on multi-level quality factors.
In the following, the panoramic video assessment method based on multi-level quality factors is referred to as "this method", and the panoramic video assessment system based on multi-level quality factors is referred to as "this system".
This system comprises a panoramic video input module, a region-of-interest extraction module, a multi-level quality factor computation module, a temporal processing module, and a multi-level quality factor fusion module.
The modules of this system are connected as follows:
The panoramic video input module is connected to the region-of-interest extraction module; the region-of-interest extraction module is connected to the multi-level quality factor computation module; the multi-level quality factor computation module is connected to the temporal processing module; and the temporal processing module is connected to the multi-level quality factor fusion module.
The functions of the modules of this system are as follows:
The panoramic video input module decodes the input video files to obtain pairs of panoramic frame images; the region-of-interest extraction module extracts the multi-level region-of-interest matrices of a panoramic image; the multi-level quality factor computation module computes the quality factors of a panoramic image from the region-of-interest matrices; the temporal processing module computes the quality factors of the panoramic video from the quality factors of the individual panoramic images; and the multi-level quality factor fusion module fuses the quality factors of the panoramic video into the automatic assessment result for the impaired video.
A panoramic video assessment method based on multi-level quality factors comprises the following steps:
Step 1: the panoramic video input module performs video processing and decoding on a pair of panoramic video source files input to this system, obtaining pairs of panoramic frame images.
The pair of input panoramic video source files consists of a lossless reference video S' and an impaired video S with the same content as the reference video. The impairments in the impaired video S include artificially introduced degradations, chiefly blurring, added noise, and coding distortion, as well as degradations arising during network transmission, chiefly from packet loss and bit errors.
The lossless reference video is also simply called the reference video.
Step 1.1: judge whether the pair of panoramic video source files input to this system have the same resolution, frame rate, and duration, and the same mapping format (the mapping formats include equirectangular (longitude-latitude) mapping, cube mapping, and pyramid mapping), and proceed according to the result:
1.1A: if the pair of panoramic video source files input to this system have the same resolution, frame rate, and duration, and the same mapping format, skip to step 1.2;
1.1B: if the pair of panoramic video source files input to this system do not have the same resolution, frame rate, and duration, and the same mapping format, the panoramic video input module processes the impaired video, chiefly by pixel interpolation, frame duplication, and mapping conversion, so that the impaired video and the reference video have the same resolution, frame rate, and duration, and the same mapping format.
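The consistency check of step 1.1 and the interpolation fallback of case 1.1B can be sketched as follows. This is a minimal sketch under stated assumptions: frames are represented as numpy arrays, and nearest-neighbour interpolation stands in for whatever pixel interpolation the input module actually applies.

```python
import numpy as np

def check_pair(ref_frames, imp_frames, ref_fps, imp_fps):
    """Step 1.1: True if the two decoded sequences already match in
    frame rate, frame count (duration), and per-frame resolution."""
    return (ref_fps == imp_fps
            and len(ref_frames) == len(imp_frames)
            and all(r.shape == d.shape for r, d in zip(ref_frames, imp_frames)))

def resize_nearest(img, h, w):
    """Case 1.1B: nearest-neighbour pixel interpolation, a minimal
    stand-in for the interpolation the input module would apply."""
    rows = (np.arange(h) * img.shape[0] / h).astype(int)
    cols = (np.arange(w) * img.shape[1] / w).astype(int)
    return img[rows][:, cols]
```

When `check_pair` fails only on resolution, mapping each impaired frame through `resize_nearest` restores the match; frame duplication and mapping conversion would be handled analogously.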
Step 1.2: using an ffmpeg-based decoding tool, decode the pair of panoramic video source files input to this system according to their coded format, so that each panoramic video is decoded into a sequence of frame images, yielding pairs of panoramic frame images. Where each panoramic video source file has N frames, N pairs of panoramic frame images are obtained, comprising the N reference frame images from the reference video and the N impaired frame images from the impaired video; each panoramic frame image has width W and height H.
Step 2: the region-of-interest extraction module applies image processing and computer vision algorithms to the panoramic frame images output by step 1 to perform region-of-interest extraction, and outputs multi-level region-of-interest matrix sets.
Specifically, region-of-interest (ROI) extraction is performed on the reference frame image I' of each panoramic frame image pair output by step 1.
The multi-level region-of-interest matrix sets are all the matrices in the low-level region-of-interest matrix set {M_l^1, …, M_l^(n_l)}, the middle-level region-of-interest matrix set {M_m^1, …, M_m^(n_m)}, the high-level region-of-interest matrix set {M_h^1, …, M_h^(n_h)}, the temporal-level region-of-interest matrix set {M_t^1, …, M_t^(n_t)}, and the mapping-level region-of-interest matrix M_p. Here M denotes a two-dimensional matrix of size H × W, i.e. one region-of-interest matrix of the image I'; the elements of M take values in the range [0, 1], and the larger the value of M(i, j), i.e. the element in row i and column j of the matrix, the more easily the pixel I'(i, j) at the corresponding position of the reference frame image I' is noticed by the viewer, and the greater its influence on video quality.
The subscripts l, m, h, t, p of M indicate that the matrix is obtained by the low-level, middle-level, high-level, temporal-level, or mapping-level region-of-interest extraction method, respectively; the superscripts 1, 2, …, n of M indicate that the matrix is obtained by the n-th method of the given level. Here n_l, n_m, n_h, n_t are integers greater than or equal to 1, meaning that the low, middle, high, and temporal levels may each use one or more methods to obtain one or more region-of-interest matrices, while the mapping level may select only one method to obtain its region-of-interest matrix.
The above counts of region-of-interest matrices are given for a single reference frame image I'; for the N reference frame images of the N panoramic frame image pairs output by step 1, the number of region-of-interest matrices output by step 2 is (n_l + n_m + n_h + n_t + 1) × N.
The multi-level region-of-interest matrices are generated by steps 2.1 to 2.5, respectively, as follows:
Step 2.1: compute the low-level regions of interest of the reference frame image using pixel-level image processing methods, and output the low-level region-of-interest matrix set {M_l^1, …, M_l^(n_l)}.
The pixel-level image processing methods are chiefly based on color contrast and edge detection.
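One pixel-level method of step 2.1 can be sketched with edge detection. This is a minimal sketch, not the authors' exact method: a hand-rolled Sobel filter scores each pixel by edge strength and normalises the result to [0, 1], matching the value range required of every M.

```python
import numpy as np

def sobel_edge_roi(gray):
    """Low-level ROI matrix from edge strength (step 2.1).
    `gray` is an H x W float array; output values lie in [0, 1]."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    pad = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):           # correlate with both Sobel kernels
        for j in range(3):
            win = pad[i:i + h, j:j + w]
            gx += kx[i, j] * win
            gy += ky[i, j] * win
    mag = np.hypot(gx, gy)       # gradient magnitude per pixel
    return mag / mag.max() if mag.max() > 0 else mag
```

A color-contrast saliency map, the other option the text names, would be produced the same way: per-pixel score, then normalisation to [0, 1].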
Step 2.2: compute the middle-level regions of interest of the reference frame image using superpixel processing methods, and output the middle-level region-of-interest matrix set {M_m^1, …, M_m^(n_m)}.
The superpixel processing methods are chiefly based on ranking superpixel blocks by saliency.
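The superpixel saliency ranking of step 2.2 can be sketched crudely as follows. This is an illustrative simplification, not the actual method: a fixed grid of square blocks stands in for real superpixels (which a SLIC-style segmentation would produce), and each block is scored by the contrast between its mean intensity and the global mean.

```python
import numpy as np

def block_saliency_roi(gray, block=4):
    """Middle-level ROI sketch (step 2.2): grid blocks stand in for
    superpixels; each block's saliency is its contrast against the
    global mean, normalised to [0, 1]."""
    h, w = gray.shape
    roi = np.zeros((h, w))
    gmean = gray.mean()
    for y in range(0, h, block):
        for x in range(0, w, block):
            patch = gray[y:y + block, x:x + block]
            roi[y:y + block, x:x + block] = abs(patch.mean() - gmean)
    m = roi.max()
    return roi / m if m > 0 else roi
```

Replacing the grid with an actual superpixel segmentation changes only how the pixel groups are formed; the score-and-normalise structure stays the same.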
Step 2.3: compute the high-level regions of interest of the reference frame image using computer vision methods, typically covering the regions viewers attend to most readily, chiefly those containing people, animals, and vehicles, and output the high-level region-of-interest matrix set {M_h^1, …, M_h^(n_h)}.
The computer vision methods are chiefly based on object segmentation and semantic segmentation.
Step 2.4: compute the temporal-level regions of interest from two adjacent reference frames using image processing methods, typically covering the moving objects that viewers tend to follow, and output the temporal-level region-of-interest matrix set {M_t^1, …, M_t^(n_t)}.
The image processing methods are chiefly based on optical flow estimation and motion estimation.
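The temporal-level extraction of step 2.4 can be sketched as follows. Plain frame differencing is used here as a lightweight stand-in for the optical-flow and motion-estimation methods the text names; it highlights the same moving regions, just with less precision.

```python
import numpy as np

def temporal_roi(prev_gray, cur_gray):
    """Temporal-level ROI sketch (step 2.4): absolute difference
    between two adjacent reference frames, normalised to [0, 1].
    Static content maps to 0; the strongest motion maps to 1."""
    diff = np.abs(cur_gray.astype(float) - prev_gray.astype(float))
    m = diff.max()
    return diff / m if m > 0 else diff
```

Note that for a static pair of frames this yields the zero matrix, consistent with embodiment 1, where M_t is the zero matrix when inter-frame motion information is unused.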
Step 2.5: select the weight matrix corresponding to the mapping format of the pair of panoramic video source files input to this system, and output this weight matrix as the mapping-level region-of-interest matrix M_p.
For example, for the equirectangular mapping format, the polar weights of the corresponding weight matrix are smaller than the equatorial weights; for the pyramid mapping format, the base-face weights of the corresponding weight matrix are larger than the side-face weights.
The mapping-level region-of-interest matrix output by step 2.5 depends only on the video mapping format and not on the frame image itself; once the mapping format of the input is determined, the region-of-interest matrix is identical for every frame.
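For the equirectangular case of step 2.5, the weight matrix can be sketched as below. The cosine-of-latitude rule used here is an assumption for illustration; the text only requires that polar weights be smaller than equatorial weights, which the cosine rule satisfies while also compensating for the projection's oversampling of the poles.

```python
import numpy as np

def equirect_weights(h, w):
    """Mapping-level ROI matrix M_p for the equirectangular format
    (step 2.5): every row is weighted by the cosine of its latitude,
    so polar rows get small weights and the equator the largest."""
    # row centres mapped to latitudes in (-pi/2, pi/2)
    lat = (np.arange(h) + 0.5) / h * np.pi - np.pi / 2
    row_w = np.cos(lat)
    return np.tile(row_w[:, None], (1, w))   # constant along each row
```

Because the matrix depends only on the mapping format, it is computed once and reused for all N frames, exactly as the text states.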
Step 3: the multi-level quality factor computation module applies a quality assessment algorithm to compute, based on the multi-level region-of-interest matrix sets output by step 2, the weighted differences of the panoramic frame image pairs output by step 1, and outputs the multi-level quality factor sets of the N frame image pairs.
The multi-level quality factor sets are all the values in the low-level quality factor set {f_l^1, …, f_l^(n_l)}, the middle-level quality factor set {f_m^1, …, f_m^(n_m)}, the high-level quality factor set {f_h^1, …, f_h^(n_h)}, the temporal-level quality factor set {f_t^1, …, f_t^(n_t)}, and the mapping-level quality factor f_p, where f denotes a positive number whose subscript and superscript are consistent with those of M in step 2, indicating that the quality factor is obtained from the corresponding region-of-interest matrix. This processing is completed by the following steps:
Step 3.1: from the panoramic frame image pairs output by step 1 and the low-, middle-, high-, temporal-, and mapping-level region-of-interest matrices output by step 2, form N groups in frame order, where each group comprises one lossless panoramic image, one impaired panoramic image, and the multi-level region-of-interest matrix set.
Step 3.2: compute the quality difference matrix D of the lossless and impaired panoramic images using a pixel-difference assessment method; D is a two-dimensional matrix of size H × W, and D(i, j) denotes the color/luminance difference between the lossless and impaired panoramic images at pixel position (i, j), which may be computed by the Euclidean distance method.
Step 3.3: multiply each region-of-interest matrix M element-wise with the difference matrix D to obtain the set of weighted difference matrices, each carrying the same subscript and superscript as its M.
Step 3.4: map each weighted difference matrix to a quality factor of the impaired image using a traditional objective image quality assessment method, yielding the multi-level quality factor set.
The traditional objective image quality assessment methods are chiefly based on MSE, PSNR, and SSIM.
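Steps 3.2 to 3.4 can be sketched in one function, using PSNR as the traditional metric (the choice embodiment 1 also makes). The normalisation by the mean ROI weight is an assumption for illustration; it makes an all-ones ROI reduce exactly to ordinary PSNR.

```python
import numpy as np

def weighted_psnr(ref, imp, roi, peak=255.0):
    """Steps 3.2-3.4 in one sketch: squared pixel differences (step
    3.2), weighted element-wise by an ROI matrix (step 3.3), collapsed
    to a PSNR-style quality factor (step 3.4)."""
    ref = ref.astype(float)
    imp = imp.astype(float)
    d = (ref - imp) ** 2 * roi                  # weighted difference matrix D
    wmse = d.sum() / max(roi.sum(), 1e-12)      # ROI-normalised MSE
    if wmse == 0:
        return float("inf")                     # identical images
    return 10.0 * np.log10(peak ** 2 / wmse)
```

Calling this once per region-of-interest matrix in a group yields that frame pair's multi-level quality factor set.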
Step 4: the temporal processing module takes as input the N groups of multi-level quality factor sets obtained in step 3 and fuses them into a single group according to a temporal processing method, outputting the multi-level quality factor set of the video S.
The temporal processing methods are chiefly based on averaging and weighted averaging.
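The plain-averaging variant of step 4 can be sketched as follows, assuming the per-frame factors are arranged as an N × K array (one row of K factors per frame):

```python
import numpy as np

def temporal_average(per_frame_factors):
    """Step 4 sketch: average each of the K quality factors over the
    N frames, fusing N per-frame sets into one video-level set.  A
    weighted average over frames is the other option the text names."""
    return np.asarray(per_frame_factors, float).mean(axis=0)
```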
Step 5: the multi-level quality factor fusion module takes as input the multi-level quality factors obtained in step 4 and fuses them by a fusion model into a single quality assessment result, outputting the result Q, i.e. the quality assessment result of the video S.
The fusion model is chiefly based on linear regression, nonlinear regression, or a neural network model.
The parameters of the fusion model can be obtained by empirical design or trained by machine learning. The machine-learning approach is chiefly completed by the following steps: first design a BP neural network structure, then train the parameters of the BP network on training data so that the fused result of the quality factors approximates the subjective scores.
The training data consist of the quality scores of a number of panoramic videos obtained by subjective experiments, together with the video quality factors of those videos obtained by steps 1 to 4.
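The BP-network fusion of step 5 can be sketched as below, using the 6-10-1 layout described in embodiment 1 (6 input nodes, 10 hidden nodes, 1 output node). The training loop is plain batch gradient descent on squared error against subjective scores; the actual BP training procedure and hyperparameters of the invention are not specified, so these are assumptions.

```python
import numpy as np

class BPFusion:
    """Step 5 sketch: a 6-10-1 BP network fusing the six video-level
    quality factors into one score in [0, 1]."""

    def __init__(self, n_in=6, n_hidden=10, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0, 0.5, (n_hidden, 1))
        self.b2 = np.zeros(1)

    @staticmethod
    def _sig(x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(self, x):
        self.h = self._sig(x @ self.w1 + self.b1)   # hidden activations
        return self._sig(self.h @ self.w2 + self.b2)

    def train(self, x, y, lr=0.5, epochs=2000):
        """Batch gradient descent on squared error (backprop)."""
        for _ in range(epochs):
            out = self.forward(x)                    # N x 1 predictions
            g_out = (out - y) * out * (1 - out)      # output-layer delta
            g_h = (g_out @ self.w2.T) * self.h * (1 - self.h)
            self.w2 -= lr * self.h.T @ g_out / len(x)
            self.b2 -= lr * g_out.mean(axis=0)
            self.w1 -= lr * x.T @ g_h / len(x)
            self.b1 -= lr * g_h.mean(axis=0)
```

The sigmoid output node keeps every fused score inside [0, 1], matching the output range the embodiment specifies.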
Thus, through steps 1 to 5, this method, i.e. a panoramic video assessment method based on multi-level quality factors, is completed.
Beneficial effects
Compared with the prior art, the panoramic video assessment method and system based on multi-level quality factors of the present invention have the following beneficial effects:
This method is well suited to panoramic video quality assessment: compared with existing quality assessment methods for ordinary video and existing quality assessment methods for panoramic video, the method of the invention considers and fuses the influence of the user's regions of interest at multiple levels on video quality, so its quality assessment of the impaired video agrees more closely with the results of subjective experiments, making it better suited to automatic assessment of panoramic video quality.
Embodiment 1
This embodiment illustrates the method and system of the invention on two 4K-resolution panoramic videos: the lossless panoramic video concert.mp4 and the impaired panoramic video concert_3M.mp4.
Fig. 1 is the module diagram of the panoramic video quality assessment system based on multi-level quality factors of the present invention.
As can be seen from Fig. 1, this system feeds the reference video and the impaired video into the panoramic video input module for decoding, then into the region-of-interest extraction module to extract the low-level, middle-level, high-level, temporal-level, and mapping-level regions of interest. Based on these region-of-interest matrices, the multi-level quality factor computation module computes the low-level, middle-level, high-level, temporal-level, and mapping-level quality factor sets of the panoramic image pairs. These quality factors are then fed into the temporal processing module to obtain the multi-level quality factor set of the panoramic video, and finally the multi-level quality factor fusion module fuses these quality factors into a single quality score, which is output as the automatic assessment result for the impaired video.
The panoramic video assessment method based on multi-level quality factors, as implemented by this system, processes the lossless 4K panoramic video concert.mp4 and the impaired panoramic video concert_3M.mp4 of this embodiment through the following steps:
Step A: the panoramic video input module decodes the pair of input panoramic video source files. Both videos are 10-second, 30 fps, equirectangular-format panoramic videos with resolution 4096*2048. The impaired video was obtained from the lossless video by H.264 compression; the bit rate of the lossless video is 50 Mbps and that of the impaired video is 3 Mbps. After decoding, 300 pairs of panoramic images are obtained, each image 4096 pixels wide and 2048 pixels high. Fig. 2(A) shows the 5th frame panoramic image of the lossless video.
Step B: the region-of-interest extraction module performs region-of-interest extraction on the 300 lossless images. This processing is completed by the following steps:
Step B.1: using a color-contrast saliency method, compute the 300 low-level region-of-interest matrices M_l^1 of the 300 images, each of size 2048 × 4096; the result for the 5th frame, mapped to image space (the values in the range [0, 1] multiplied by 256), is shown in Fig. 2(B). In addition, a second low-level region-of-interest matrix M_l^2 is added: an all-ones matrix of size 2048 × 4096.
Step B.2: segment the image into superpixels, then use two superpixel saliency-ranking methods to compute the middle-level region-of-interest matrices M_m^1 and M_m^2 of the reference frame image; mapped to image space, they are shown in Fig. 2(C, D).
Step B.3: using a fully convolutional neural network, perform semantic object segmentation on the reference frame image and take the resulting mask as the high-level region-of-interest matrix M_h, shown mapped to a binary image in Fig. 2(E); matrix elements equal to 1 belong to target regions, chiefly people, animals, and vehicles, and elements equal to 0 belong to the background.
Step B.4: this embodiment does not use inter-frame motion information, so the temporal-level region-of-interest matrix M_t in this embodiment is the zero matrix.
Step B.5: the mapping format of the input video is equirectangular, so the corresponding weight matrix M_p is selected; mapped to [0, 255], it is shown in Fig. 2(F). The value of each element of the matrix is determined by its latitude, as given by formula (1).
Step B.6: steps B.1 to B.5 of this embodiment yield the 6 region-of-interest matrices of each frame image, 1800 matrices in total.
Step C: the multi-level quality factor computation module of this example uses the PSNR quality assessment algorithm; based on the multi-level region-of-interest matrix sets output by step B, it computes the weighted difference matrix sets of the 300 frame image pairs and outputs the multi-level quality factor sets. This processing is completed by the following steps:
Step C.1: from the panoramic image pairs output by step A and the multiple region-of-interest matrices output by step B, form 300 groups in frame order, where each group comprises one lossless panoramic image, one impaired panoramic image, and 6 region-of-interest matrices.
Step C.2: compute the weighted difference matrices between the pixels of the two images as shown in formula (2), where I(i, j), I'(i, j), and M(i, j) are the values of the corresponding elements of the impaired image, the lossless image, and the weighting matrix, respectively; if the image has three channels, the weighted difference matrix is computed separately for each channel:
D(i, j) = (I(i, j) - I'(i, j))^2 × M(i, j)    (2)
Step C.3: compute the quality factor set from the weighted difference matrices using the PSNR calculation, as shown in formula (3); for a three-channel image, this embodiment takes the average of the three per-channel quality factors as the quality factor of the impaired image.
Step C.4: steps C.1 to C.3 yield the 6 quality factors of each impaired frame image; the 300 resulting sets are the output of this module.
Step D: the temporal processing module takes as input the 300 multi-level quality factor sets obtained in step C. Following the temporal averaging method, this example averages the quality factors at corresponding positions across the sets, as shown in formula (4), where x and y denote the level index of the quality factor and the index of the region-of-interest method within that level, respectively; the output is the multi-level quality factor set of the impaired video concert_3M.mp4.
Step E: the multi-level quality factor fusion module takes as input the multi-level quality factor set obtained in step D and fuses it with a BP neural network, obtaining the final quality assessment score Q(I, I') of the video concert_3M.mp4.
Step E.1: the BP neural network used is shown in Fig. 3. The network has 6 input nodes, connected to the 6 quality factors obtained in step D, 10 hidden nodes, and 1 output node, which outputs a quality assessment result in the range [0, 1].
Step E.2: the parameters of the fusion model are obtained by training on panoramic video data that does not include the test video concert_3M.mp4.
In this example, the quality assessment values obtained by fusing the 6 multi-level quality factors are more linearly correlated with the subjective results than any single-factor result. As shown in Table 1, removing the quality factors of any one level in turn yields a Spearman rank-order correlation coefficient (SROCC) with the subjective scores that is smaller than the SROCC obtained using the quality factors of all levels. The values in the table were obtained by training the BP network parameters on 12 original videos and the 288 impaired videos of corresponding content, then testing on another 4 original videos and the 96 impaired videos of corresponding content; the larger the SROCC, the better the automatic assessment method.
Table 1: comparison of the full multi-level quality factors with the quality factors after removing each level in turn
The above detailed description further explains the purpose, technical scheme, and beneficial effects of the invention. It should be understood that the above is only a specific embodiment of the present invention and is not intended to limit the scope of protection of the present invention; any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.