Multichannel video stream encoder and decoder based on depth-image-based rendering
Technical field
The present invention relates to moving-image processing technology, and in particular to a multichannel video stream encoder and decoder based on depth-image-based rendering.
Background technology
Television systems have evolved from black-and-white to color and from analog to digital. The two-dimensional television systems developed so far offer viewers only flat images, whereas a three-dimensional television system can offer a viewing experience closer to natural vision. The evolution from two-dimensional to three-dimensional systems is therefore a natural and foreseeable continuation of the development of present two-dimensional digital television systems.
Studies of the human visual system (HVS) show that when the two eyes observe the same object, the two images formed exhibit parallax. Two theories address the formation of human stereoscopic vision. Fusion theory holds that if the images observed by the two eyes differ, and the difference is kept within a certain range, the visual system fuses the two images and stereoscopic vision is formed. Suppression theory holds that, in the process of forming stereoscopic vision, the perceived depth and the overall quality of the stereoscopic image depend mainly on whichever single-eye image has the better quality. The Double-Stimulus Continuous-Quality Scale (DSCQS) subjective tests designed by Lew Stelmach et al. have confirmed the latter theory to a certain extent.
Digital video technology has found increasingly wide application with the rapid development of the Internet and mobile communications. Digital video, however, carries a large amount of information and places high bandwidth demands on transmission networks, so digital video signals are generally compressed before storage or transmission in order to save storage space and network bandwidth.
Forming stereoscopic vision requires at least two channels of digital video. Present autostereoscopic displays support several viewers simultaneously, and their multiple stereoscopic viewing points (stereoscopic viewpoints) require several channels of digital video as input. A good coding and decoding method must therefore weigh the compression ratio, the image quality after decoding and reconstruction, and the viewers' stereoscopic experience, striking a balance between compression ratio and stereoscopic-viewpoint image quality under limited bandwidth.
Current methods for coding multi-channel (or two-channel) digital video fall broadly into four classes: the first is based on the MPEG video coding standards, the second on depth-image-based rendering (DIBR), the third on object-based coding, and the fourth on three-dimensional mesh (3D mesh) techniques.
The first class of methods is based on the MPEG video coding standards.
The Multi-View Profile (MVP) of MPEG-2 uses the temporal scalability (TS) tool to support the coding of two-channel digital video (stereoscopic video). MVP uses a two-layer coding structure, with the left-view channel as the base layer and the right-view channel as the enhancement layer. See: X. Chen and A. Luthra, MPEG-2 Multi-View Profile and its application in 3DTV, in Proceedings of SPIE, vol. 3021, pp. 212-223, 1997. When MVP is used to encode multi-channel digital video, its picture prediction structure is similar to that of the Multi-view Video Coding (MVC) standard currently under international study, but because MVP uses MPEG-2 as its coding tool, its coding efficiency falls short of the current international video coding standard H.264/AVC.
In May 2003, the Joint Video Team (JVT), formed jointly by experts from ITU-T and ISO/IEC, finalized the international video coding standard H.264/AVC. H.264 adopts a hybrid coding framework together with advanced techniques such as variable-block-size motion prediction down to 4x4 blocks, multiple reference frames, and context-adaptive binary arithmetic coding; compared with MPEG-2 it achieves higher compression efficiency at the same picture quality.
JVT is currently studying and defining the Multi-view Video Coding (MVC) international standard. MVC exploits the correlation between image frames both within a viewpoint and across different viewpoints, and performs the compression with H.264/AVC. Because joint temporal and spatial prediction is used, experiments to date show that, compared with simulcast (independent coding of each viewpoint), joint spatio-temporal coding yields gains of 0.5 dB to 3 dB depending on the video content. See: P. Merkle, A. Smolic and K. Muller, Efficient prediction structures for multiview video coding, IEEE Trans. CSVT, vol. 17, no. 11, pp. 1461-1473, 2007.
MVC uses disparity prediction to exploit the correlation between viewpoints. However, because the camera mounting positions, shooting positions and illumination conditions are not identical, the same region of the image frames captured from several viewpoints may differ in luminance and chrominance. This inconsistency degrades the accuracy of disparity prediction and the coding efficiency; one remedy is to add luminance and chrominance compensation terms to the matching cost function. See: J. H. Hur, S. Cho and Y. L. Lee, Adaptive local illumination change compensation method for H.264/AVC-based multiview video coding, IEEE Trans. CSVT, vol. 17, no. 11, pp. 1496-1505, 2007.
The coding structure of MVC is rather complex, requiring a large amount of computation, long encoding delay and a large reference-frame store. MVC must encode every viewpoint channel, so the bit rate grows as the number of viewpoints increases. Because MVC encodes, transmits and decodes all viewpoints, the captured image size and camera spacing are tied to the image size and viewing distance at the display end, which limits the flexibility of viewing positions at the display.
In 2006, AVS (Advanced Video Coding Standard) was adopted as the national video coding standard. AVS likewise uses a hybrid coding framework, with advanced techniques such as integer transforms with pre-scaling, variable block structures, multiple reference frames and arithmetic coding. AVS can also be used to encode and decode multichannel video streams.
The second class of methods is based on depth-image-based rendering (DIBR) technology.
The Advanced Three-Dimensional Television System Technology (ATTEST) project of the European Information Society Technologies (IST) programme adopted the DIBR method. See: C. Fehn, Depth-Image-Based Rendering (DIBR), compression and transmission for a new approach on 3D-TV, in Proceedings of SPIE, Stereoscopic Displays and Virtual Reality Systems XI, USA, pp. 93-104, 2004.
At the encoding end, the ATTEST system encodes only the two-dimensional video of a single channel (the central channel) together with the depth maps of that channel. At the decoding end, the DIBR method uses the depth information and the camera parameters to project the decoded central-channel image frames into three-dimensional space and then re-project them onto the image planes of virtual cameras, thereby reconstructing several virtual two-dimensional video channels.
DIBR renders several video channels from the depth information of a single channel. Compared with MVC it achieves a higher compression ratio and avoids the luminance and chrominance mismatches caused by differing camera positions and parameters. However, because of occlusion, holes appear inside the rendered virtual-viewpoint image frames, and since the quality of the virtual-viewpoint images declines at viewing positions away from the central position, the viewers' stereoscopic experience is affected.
Several approaches are currently used to alleviate the holes that appear inside rendered image frames: first, filling the holes with the surrounding texture; second, smoothing the depth map by filtering; third, encoding and transmitting the depth maps of several channels and using the image frames and depth maps of those channels to render the image to be synthesized for the same virtual viewpoint; and fourth, using the more complex Layered Depth Image (LDI) technique. See: S. U. Yoon and Y. S. Ho, Multiple color and depth video coding using a hierarchical representation, IEEE Trans. CSVT, vol. 17, no. 11, 2007.
The third class of methods is based on object coding. In MPEG-4, a video object can be represented by three kinds of features, namely shape, motion and texture, and an auxiliary component (AC) can be used to carry the disparity map. When MPEG-4 MAC is used to encode two-channel video, the left channel is usually encoded with standard MPEG-4 and the disparity information is stored in the Multiple Auxiliary Component (MAC). See: S. Cho, K. Yun, C. Ahn and S. Lee, Disparity-compensated stereoscopic video coding using the MAC in MPEG-4, ETRI Journal, vol. 27, no. 3, pp. 326-329, 2005. When object-based coding is applied to natural scenes, the multiple objects in the scene must be segmented and extracted, and the algorithms involved are complex.
The fourth class of methods is based on three-dimensional mesh (3D mesh) techniques. Triangle meshes are used to approximate object surfaces piecewise linearly; the approximation error is closely related to the number of triangles: the more triangles, the smaller the error, but a very large number of mesh elements in turn creates storage and transmission problems. See: J. L. Peng, C. S. Kim and C. C. J. Kuo, Technologies for 3D mesh compression: A survey, Journal of Visual Communication and Image Representation, vol. 16, no. 6, pp. 688-733, 2005.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing a multichannel video stream encoder and decoder based on depth-image-based rendering.
The multichannel video stream encoder comprises:
an image rectification unit, used to rectify the input image frames of the multiple channel video streams so that corresponding points lie on the same horizontal scan line;
a channel selection unit, used to select a central channel and auxiliary channels from the multiple input video channels;
a depth generation unit, used to generate the depth map of each image frame of the central-channel and auxiliary-channel video streams;
an auxiliary channel prediction unit, used to produce the prediction images of the auxiliary-channel image frames from the reconstructed frames produced by the channel reconstruction unit and the depth maps produced by the depth generation unit;
a central channel coding unit, used to encode the depth stream composed of the central-channel video stream and its depth maps according to a standard video coding method, generating the central-channel bitstream; the standard video coding methods include the international video coding standards MPEG-x and H.26x and the national video coding standard AVS;
an auxiliary channel coding unit, used to encode the occlusion maps of the auxiliary-channel image frames according to the standard video coding method, generating the auxiliary-channel bitstreams;
a channel reconstruction unit, used to decode the central-channel bitstream and the auxiliary-channel bitstreams according to the standard video coding method, generating the reconstructed central-channel image frames, the reconstructed depth maps and the reconstructed auxiliary-channel occlusion maps, and to produce the reconstructed auxiliary-channel frames from the reconstructed occlusion maps and the prediction images produced by the auxiliary channel prediction unit; and
a multiplexer, used to combine the camera parameters, the central-channel bitstream and the auxiliary-channel bitstreams in time-division fashion, generating the multichannel compressed video bitstream.
The depth generation unit produces the depth map of the current central-channel image frame from that frame and the image frame, at the same time instant, of any auxiliary channel adjacent to the central channel; it also produces the depth map of a reconstructed frame of the current channel from that reconstructed frame, produced by the channel reconstruction unit, and the reconstructed frame of the adjacent channel at the same time instant.
The auxiliary channel prediction unit takes a reconstructed frame produced by the channel reconstruction unit and the depth map of that frame produced by the depth generation unit and, by the depth-image-based rendering method, synthesizes the prediction image of the image frame, at the same time instant, of the channel adjacent to the channel of that reconstructed frame.
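As an illustration of this rendering step, the following Python sketch forward-warps a reconstructed frame into an adjacent viewpoint using its depth map, assuming rectified views with a purely horizontal camera displacement, a pinhole model with focal length f and baseline b, and simple z-buffering; all function and variable names are illustrative and not part of the claimed apparatus.

```python
import numpy as np

def dibr_warp(image, depth, focal, baseline, z_near=0.1):
    """Forward-warp `image` (H x W x 3) into a horizontally shifted virtual view.

    depth    : H x W array of positive depth values (same units as baseline)
    focal    : focal length in pixels
    baseline : signed horizontal camera displacement; the sign selects left/right
    Returns the warped image and a mask of holes (pixels with no source).
    """
    h, w = depth.shape
    warped = np.zeros_like(image)
    zbuf = np.full((h, w), np.inf)
    hole = np.ones((h, w), dtype=bool)

    # Disparity for rectified cameras: d = f * b / Z (in pixels).
    disp = focal * baseline / np.maximum(depth, z_near)

    for y in range(h):
        for x in range(w):
            xt = int(round(x + disp[y, x]))          # target column in the virtual view
            if 0 <= xt < w and depth[y, x] < zbuf[y, xt]:
                zbuf[y, xt] = depth[y, x]            # keep the nearest surface (z-buffer)
                warped[y, xt] = image[y, x]
                hole[y, xt] = False
    return warped, hole
```

The pixels flagged by the hole mask are exactly the occluded regions that the occlusion map of the next unit is intended to carry.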
The auxiliary channel coding unit takes the difference between an auxiliary-channel image frame and the prediction image of that frame produced by the auxiliary channel prediction unit, generating the auxiliary-channel occlusion map; the occlusion map reflects the information that, because of occlusion, does not appear in the prediction image.
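A minimal sketch of how the occlusion map could be formed and later used, assuming the frame and its DIBR prediction are aligned 8-bit arrays; the signed residual is kept so that prediction plus reconstructed residual returns the frame, as the channel reconstruction unit and the decoder require. The helper names are illustrative only.

```python
import numpy as np

def occlusion_map(frame, prediction):
    """R_a = I_a - P_a: a signed residual, large wherever the prediction has holes."""
    return frame.astype(np.int16) - prediction.astype(np.int16)

def reconstruct_from_occlusion(prediction, decoded_residual):
    """I'_a = P_a + R'_a, clipped back to the 8-bit range."""
    rec = prediction.astype(np.int16) + decoded_residual
    return np.clip(rec, 0, 255).astype(np.uint8)
```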
The multichannel video stream decoder comprises:
a demultiplexer, used to decompose the multichannel compressed video bitstream into the camera parameters, the central-channel bitstream and the auxiliary-channel bitstreams;
a central channel decoding unit, used to decode the central-channel bitstream according to the standard video coding method, generating the reconstructed central-channel image frames and the reconstructed depth maps; the standard video coding methods include the international video coding standards MPEG-x and H.26x and the national video coding standard AVS;
a depth generation unit, used to produce the depth maps of the reconstructed auxiliary-channel image frames;
a channel prediction unit, used to produce the prediction images of the auxiliary-channel image frames and the prediction images of the virtual-channel image frames;
an auxiliary channel decoding unit, used to decode the auxiliary-channel bitstreams according to the standard video coding method, generating the reconstructed occlusion maps of the auxiliary-channel image frames, and to add each reconstructed occlusion map to the prediction image produced by the channel prediction unit, generating the reconstructed auxiliary-channel image frames; and
an image inverse-rectification unit, used to apply inverse rectification to the reconstructed central-channel image frames, the reconstructed auxiliary-channel image frames and the virtual-channel prediction image frames produced by decoding, so that the image frames of each channel are restored to their capture positions.
The depth generation unit produces the depth map of a reconstructed image frame of the current auxiliary channel from that reconstructed frame, produced by the auxiliary channel decoding unit, and the reconstructed image frame, at the same time instant, of the channel adjacent to the current auxiliary channel.
The channel prediction unit produces, by the depth-image-based rendering method, the prediction image of the image frame, at the same time instant, of the auxiliary channel adjacent to the central channel, from the reconstructed central-channel image frame produced by the central channel decoding unit and the depth map of that frame; likewise, from the reconstructed image frame of the current auxiliary channel produced by the auxiliary channel decoding unit and the depth map of that frame produced by the depth generation unit, it produces, by the depth-image-based rendering method, the prediction image of the image frame, at the same time instant, of the not-yet-reconstructed auxiliary channel adjacent to the current auxiliary channel.
The channel prediction unit also produces, at the centre between two adjacent channels, the prediction image of a virtual-channel image frame from the reconstructed image frames and depth maps of those two channels at the same time instant; the optical centre of the virtual camera of this virtual channel lies at the midpoint of the line connecting the optical centres of the cameras of the two adjacent channels, and the optical axis of the virtual camera is parallel to the optical axis of the central-channel camera.
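The sketch below illustrates the virtual-channel prediction at the midpoint of two adjacent channels, assuming rectified views a and k separated by a baseline b_ak and any forward-warping routine of the form sketched earlier; warping each neighbour halfway toward the centre and filling one view's holes from the other is one simple realization under those assumptions, not the only one.

```python
def synthesize_midpoint(img_a, depth_a, img_k, depth_k, focal, b_ak, warp):
    """Predict the virtual channel halfway between adjacent channels a and k.

    warp(image, depth, focal, baseline) -> (warped_image, hole_mask) is any
    forward-warping routine, e.g. the dibr_warp sketch given earlier.
    """
    warp_a, hole_a = warp(img_a, depth_a, focal, +0.5 * b_ak)   # shift a halfway toward k
    warp_k, hole_k = warp(img_k, depth_k, focal, -0.5 * b_ak)   # shift k halfway toward a

    virtual = warp_a.copy()
    virtual[hole_a] = warp_k[hole_a]        # fill channel-a holes from the other view
    remaining = hole_a & hole_k             # pixels still empty after both views (rare)
    return virtual, remaining
```

Because two real views contribute to every virtual pixel, far fewer holes remain than when a single channel is warped, which is the quality advantage claimed for the virtual-channel prediction.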
For compressed video streams of N channels input to the decoder, the image inverse-rectification unit outputs uncompressed video streams of 2N-1 channels, comprising the N reconstructed uncompressed channels recovered by decoding and the N-1 virtual-channel uncompressed video streams synthesized by the depth-image-based rendering technique.
The 2N-1 output channels of the image inverse-rectification unit are divided into reconstructed channels and virtual channels: the N reconstructed channels are arranged in the order of the relative positions of their real cameras, and each virtual channel is inserted at the centre of the two reconstructed channels adjacent to it. The 2N-1 output channels can produce 2N-2 stereoscopic viewing positions in total, each consisting of one reconstructed channel and one virtual channel.
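A small sketch of the output ordering just described: N reconstructed channels interleaved with N-1 virtual channels give 2N-1 outputs and 2N-2 adjacent stereoscopic pairs. The list-of-labels representation is purely illustrative.

```python
def arrange_outputs(n):
    """Interleave reconstructed channels R1..Rn with virtual channels V1..V(n-1)."""
    out = []
    for i in range(1, n + 1):
        out.append(f"R{i}")                      # reconstructed channel at its camera position
        if i < n:
            out.append(f"V{i}")                  # virtual channel centred between R{i} and R{i+1}
    pairs = list(zip(out[:-1], out[1:]))         # each adjacent pair is one stereo viewpoint
    return out, pairs

channels, viewpoints = arrange_outputs(4)
# channels   -> ['R1', 'V1', 'R2', 'V2', 'R3', 'V3', 'R4']   (2N-1 = 7 channels)
# viewpoints -> 6 adjacent pairs (2N-2), each one reconstructed + one virtual channel
```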
In the multichannel video stream encoder of the present invention, the depth stream composed of the central-channel video stream and its depth maps is encoded according to a standard video coding method, exploiting the temporal correlation between image frames and between depth maps within the central channel; for the image frames of the auxiliary channels, prediction images are synthesized by the DIBR method and the occlusion maps of the auxiliary-channel image frames are encoded according to the standard video coding method, exploiting the spatial correlation between adjacent channels at the same time instant.
In the multichannel video stream decoder of the present invention, the DIBR method is used to synthesize the prediction image of a virtual channel at the centre of two adjacent channels. Because this prediction image is synthesized from the image frames and depth maps of both adjacent channels, its quality is greatly improved. From the compressed bitstream of N input channels the decoder synthesizes N-1 virtual-channel prediction streams and can output up to 2N-1 uncompressed channels in total; since every two adjacent channels form a stereoscopic viewpoint, the number of stereoscopic viewpoints at the display end is increased. When the display end supports several stereoscopic viewpoints, each viewpoint comprises one higher-quality channel (the central channel or an auxiliary channel) and one virtual channel of slightly lower frame quality, and viewers perceive an essentially unchanged sense of scene depth.
In the multichannel video stream decoder of the present invention, when the display end supports only flat (2D) display, the video of the central channel or of any auxiliary channel can be sent to the display device; when the display end supports two-channel stereoscopic display, any two adjacent channels can be sent to the display device; when the display end supports several stereoscopic viewpoints, up to 2N-1 channels can be sent to the display device, N being the number of channels input to the decoder.
A three-dimensional television system using the encoder of the present invention exploits, at the encoding end, the correlation within each viewpoint by means of the standard video coding method and the correlation between viewpoints by means of the depth-image-based rendering (DIBR) method; at the decoding end it uses the DIBR method and the physiological properties of the HVS to obtain more stereoscopic viewpoints. Compared with MVC, the system achieves a lower bit rate; compared with ATTEST, viewers obtain a better stereoscopic experience.
Description of drawings
Fig. 1 is a schematic diagram of the multichannel video stream encoder according to the present invention;
Fig. 2 is a schematic diagram of the multichannel video stream decoder according to the present invention.
Embodiment
Fig. 1 is a schematic diagram of the multichannel video stream encoder according to the present invention. The encoder compression-encodes the input multichannel video streams together with the camera parameters and outputs the encoded compressed bitstream. The encoder comprises an image rectification unit 11, a channel selection unit 12, a depth generation unit 13, an auxiliary channel prediction unit 14, a central channel coding unit 15, an auxiliary channel coding unit 16, a channel reconstruction unit 17 and a multiplexer 18.
Referring to Fig. 1, the encoder encodes the multichannel video streams in the following 11 steps:
Step 1: the image rectification unit 11 receives the input multichannel video streams and camera parameters and rectifies the image frames with a standard rectification algorithm. After rectification, at any given time instant the corresponding points of the image frames of different channels lie on the same horizontal line.
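One possible realization of step 1 for a pair of adjacent cameras is sketched below using OpenCV's standard stereo rectification; the intrinsic and extrinsic variables are assumed inputs, and the choice of OpenCV is an illustration of "a standard rectification algorithm", not a requirement of the invention.

```python
import cv2

def rectify_pair(img1, img2, K1, d1, K2, d2, R, T):
    """Rectify two camera views so that corresponding points share a scan line.

    K1, K2 : 3x3 intrinsic matrices;  d1, d2 : distortion coefficient vectors
    R, T   : rotation and translation of camera 2 relative to camera 1
    """
    size = (img1.shape[1], img1.shape[0])
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, d1, K2, d2, size, R, T)
    m1a, m1b = cv2.initUndistortRectifyMap(K1, d1, R1, P1, size, cv2.CV_32FC1)
    m2a, m2b = cv2.initUndistortRectifyMap(K2, d2, R2, P2, size, cv2.CV_32FC1)
    rect1 = cv2.remap(img1, m1a, m1b, cv2.INTER_LINEAR)
    rect2 = cv2.remap(img2, m2a, m2b, cv2.INTER_LINEAR)
    return rect1, rect2
```

The inverse of this mapping is what the image inverse-rectification unit of the decoder applies in step 8 below to restore each channel to its capture position.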
Step 2: the channel selection unit 12 classifies the multiple input channels, selecting one central channel; the remaining channels serve as auxiliary channels. The central channel is selected as follows: the N captured channels are numbered in order 1, 2, ..., N, where N is a positive integer and N ≥ 2, and channel c, lying in the middle of the camera arrangement, is chosen as the central channel, the symbol ⌊·⌋ in the selection expression denoting rounding down. The remaining N-1 channels serve as auxiliary channels.
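A sketch of this channel selection follows; the concrete rule c = N // 2 + 1 used below is an assumed instantiation of the middle-channel choice with rounding down, given for illustration only.

```python
def select_channels(n):
    """Pick the central channel c from cameras numbered 1..n (n >= 2)."""
    if n < 2:
        raise ValueError("at least two channels are required")
    c = n // 2 + 1                          # assumed middle-channel rule using floor division
    auxiliary = [i for i in range(1, n + 1) if i != c]
    return c, auxiliary

# select_channels(5) -> (3, [1, 2, 4, 5]);  select_channels(4) -> (3, [1, 2, 4])
```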
Referring to Fig. 1, denote the image frames of the central-channel video stream by I_c, the image frames of an auxiliary-channel video stream by I_a, the depth map of a central-channel image frame by Z_c, and the depth map of an auxiliary-channel image frame by Z_a, where the subscript a is a positive integer satisfying 1 ≤ a ≤ N and a ≠ c.
Step 3: the depth generation unit 13 performs stereo matching on I_c and I_a at the same time instant to generate the central-channel depth map Z_c for that instant, where a may be taken as a = c-1 or a = c+1.
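The depth generation of step 3 can be realized in many ways; the sketch below uses simple SAD block matching along the scan line (valid because the frames were rectified in step 1) and converts disparity to depth with Z = f·b/d. The window size, search range, focal length and baseline are assumed parameters, and the routine is illustrative rather than the claimed method.

```python
import numpy as np

def block_match_depth(left, right, focal, baseline, max_disp=64, win=4):
    """Estimate a dense depth map for `left` by SAD block matching against `right`.

    left, right : rectified grayscale frames (H x W)
    Returns an H x W depth map; zero disparity is clamped to one pixel
    to avoid division by zero.
    """
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.float32)

    for y in range(win, h - win):
        for x in range(win + max_disp, w - win):
            block = left[y - win:y + win + 1, x - win:x + win + 1]
            best, best_d = np.inf, 0
            for d in range(max_disp):            # search along the same scan line only
                cand = right[y - win:y + win + 1, x - d - win:x - d + win + 1]
                sad = np.abs(block - cand).sum()
                if sad < best:
                    best, best_d = sad, d
            disp[y, x] = best_d

    depth = focal * baseline / np.maximum(disp, 1.0)   # Z = f * b / d
    return depth
```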
Step 4: the central channel coding unit 15 encodes the depth stream composed of the central-channel video stream and its depth maps according to the standard video coding method, generating the central-channel bitstream.
Step 5: the channel reconstruction unit 17 decodes the central-channel bitstream, generating the reconstructed image frames of the central-channel video stream and the reconstructed depth maps, denoted I'_c and Z'_c respectively.
Step 6: the auxiliary channel prediction unit 14 uses I'_c and Z'_c and the depth-image-based rendering (DIBR) method to synthesize the prediction image of the auxiliary channel adjacent to the central channel, denoted P_a, where a may be taken as a = c-1 or a = c+1.
For convenience of description, first take a = c-1 and carry out steps 7 to 10, then take a = c+1 and carry out steps 7 to 10. An auxiliary channel with a < c is called a left auxiliary channel, and one with a > c a right auxiliary channel. In the coding order described below the left auxiliary channels are encoded first and the right auxiliary channels afterwards, but this description should not be construed as limiting the invention: the right auxiliary channels may equally be encoded first and the left ones afterwards, or the left and right auxiliary channels may be encoded alternately.
Step 7: the auxiliary channel coding unit 16 first takes the difference between I_a and P_a of channel a at the same time instant, producing the occlusion map of channel a, denoted R_a, and then encodes R_a according to the standard video coding method, producing the bitstream of channel a. If a equals 1, return to step 6; if a equals N, jump to step 11; if 1 < a < N, carry out the next step.
Step 8: the channel reconstruction unit 17 decodes the bitstream of channel a, producing the reconstructed occlusion map of channel a, denoted R'_a; R'_a and P_a are summed to produce the image frame of the reconstructed video stream of channel a, denoted I'_a.
Step 9: the depth generation unit 13 generates the depth map of the reconstructed image frame of channel a, denoted Z'_a, from the input reconstructed frame I'_a and the reconstructed frame of the adjacent channel. If a < c, channel a+1 has been reconstructed before channel a, and unit 13 performs stereo matching on I'_a and I'_{a+1} to generate Z'_a; if a > c, channel a-1 has been reconstructed before channel a, and unit 13 performs stereo matching on I'_a and I'_{a-1} to generate Z'_a.
Step 10: the auxiliary channel prediction unit 14 uses I'_a and Z'_a and the DIBR method to synthesize the prediction image of the not-yet-encoded auxiliary channel adjacent to channel a; denote this auxiliary channel by j. If a < c, j equals a-1 and a is then set to a-1; if a > c, j equals a+1 and a is then set to a+1. Return to step 7.
Step 11: the multiplexer 18 combines the camera parameters, the central-channel bitstream and the auxiliary-channel bitstreams in time-division fashion, generating the compressed bitstream of the N channels.
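The sketch below shows one simple way to time-division multiplex the camera parameters and the per-channel bitstreams into a single compressed stream; the length-prefixed packet layout is an assumption made for illustration, since the invention does not fix a particular packet syntax here.

```python
import struct

def multiplex(camera_params, channel_streams):
    """Pack camera parameters (bytes) and per-channel bitstreams ({id: bytes}).

    Each packet is: 1-byte type (0 = camera parameters, 1 = channel data),
    1-byte channel id, 4-byte big-endian payload length, then the payload.
    """
    out = bytearray()
    out += struct.pack(">BBI", 0, 0, len(camera_params)) + camera_params
    for cid in sorted(channel_streams):                  # deterministic time slots
        payload = channel_streams[cid]
        out += struct.pack(">BBI", 1, cid, len(payload)) + payload
    return bytes(out)

def demultiplex(stream):
    """Inverse of multiplex(): recover camera parameters and channel bitstreams."""
    pos, cam, channels = 0, b"", {}
    while pos < len(stream):
        ptype, cid, length = struct.unpack_from(">BBI", stream, pos)
        pos += 6
        payload = stream[pos:pos + length]
        pos += length
        if ptype == 0:
            cam = payload
        else:
            channels[cid] = payload
    return cam, channels
```

The demultiplex() half corresponds to step 1 of the decoder described with reference to Fig. 2.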
Through the above 11 steps the encoder finally generates the compressed bitstream of the N input channels. In steps 6 and 10 above, the auxiliary channel prediction unit 14 uses the reconstructed image frame I'_c (or I'_a) and the reconstructed depth map Z'_c (or Z'_a), rather than I_c (or I_a) and Z_c (or Z_a), to synthesize the prediction image P_a of channel a, in order to remain consistent with the decoder.
In step 7 above, the auxiliary channel coding unit 16 encodes the occlusion maps of the auxiliary channels and transmits them to the decoding end, where they compensate for the holes produced when the prediction images are synthesized with the DIBR method, so that high-quality auxiliary-channel image frames can be reconstructed at the decoding end.
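The control flow of steps 6 to 10 is an outward sweep from the central channel, first to one side and then to the other, as noted above. A compact sketch of that traversal is given below, with the per-channel operations reduced to caller-supplied placeholder callables; the helper names are illustrative only and do not prescribe an implementation.

```python
def encode_auxiliary_channels(n, c, central_depth, predict, encode_occlusion,
                              reconstruct, estimate_depth):
    """Sweep outward from central channel c, mirroring encoder steps 6-10.

    predict(src, src_depth, dst)  -> prediction image of channel dst from channel src
    encode_occlusion(a, pred)     -> (bitstream, reconstructed occlusion map) of channel a
    reconstruct(pred, occ)        -> reconstructed frame of channel a
    estimate_depth(frame, prev)   -> depth map of the reconstructed channel
    """
    bitstreams = {}
    depths = {c: central_depth}                      # Z'_c obtained in step 5
    for step in (-1, +1):                            # left sweep, then right sweep
        prev = c
        a = c + step
        while 1 <= a <= n:
            pred = predict(prev, depths[prev], a)            # steps 6 / 10: DIBR prediction P_a
            bitstreams[a], occ = encode_occlusion(a, pred)   # step 7: code occlusion map R_a
            if a in (1, n):                                  # boundary channel ends this sweep
                break
            frame = reconstruct(pred, occ)                   # step 8: I'_a = P_a + R'_a
            depths[a] = estimate_depth(frame, prev)          # step 9: Z'_a by stereo matching
            prev = a
            a += step                                        # step 10: move outward
    return bitstreams
```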
Fig. 2 is a schematic diagram of the multichannel video stream decoder according to the present invention. The decoder decodes the input compressed bitstream together with the camera parameters and outputs the decoded uncompressed video streams and the uncompressed video streams synthesized for the virtual channels. The decoder comprises a demultiplexer 21, a central channel decoding unit 22, a depth generation unit 23, a channel prediction unit 24, an auxiliary channel decoding unit 25 and an image inverse-rectification unit 26.
Referring to Fig. 2, the decoder decodes the compressed bitstream in the following 8 steps:
Step 1: the demultiplexer 21 decomposes the input compressed bitstream into the camera parameters, the central-channel bitstream and the auxiliary-channel bitstreams.
Step 2: the central channel decoding unit 22 decodes the input central-channel bitstream according to the standard video coding method, generating the image frames I'_c of the reconstructed central-channel video stream and the reconstructed depth maps Z'_c.
Step 3: the channel prediction unit 24 uses the input I'_c and Z'_c and the DIBR method to synthesize the prediction image P_a of auxiliary channel a adjacent to the central channel, where a may be taken as a = c-1 or a = c+1.
For convenience of description, first take a = c-1 and carry out steps 4 to 7, then take a = c+1 and carry out steps 4 to 7. Auxiliary channels with a < c are called left auxiliary channels and those with a > c right auxiliary channels. In the decoding order described below the left auxiliary channels are decoded first and the right auxiliary channels afterwards, but this description should not be construed as limiting the invention: the right auxiliary channels may equally be decoded first and the left ones afterwards, or the left and right auxiliary channels may be decoded alternately.
Step 4: the auxiliary channel decoding unit 25 decodes the auxiliary-channel bitstream according to the standard video coding method, generating the reconstructed occlusion map R'_a of auxiliary channel a; the prediction image P_a of auxiliary channel a and the reconstructed occlusion map R'_a are summed to generate the reconstructed image frame I'_a of auxiliary channel a.
Step 5: the depth generation unit 23 generates the depth map Z'_a of the reconstructed image frame of channel a for the current time instant from the input reconstructed frame I'_a and the reconstructed image frame, at the same time instant, of the adjacent channel. If a < c, channel a+1 has been reconstructed before channel a, and the depth generation unit 23 performs stereo matching on I'_a and I'_{a+1} to generate Z'_a; if a > c, channel a-1 has been reconstructed before channel a, and the depth generation unit 23 performs stereo matching on I'_a and I'_{a-1} to generate Z'_a.
Step 6: the channel prediction unit 24 uses the reconstructed image frame I'_a and depth map Z'_a of channel a, together with the reconstructed image frame I'_k and depth map Z'_k, at the same time instant, of the adjacent auxiliary channel k, and the DIBR method, to synthesize the virtual-channel prediction image located at the centre between channel a and channel k, denoted V_a. The virtual camera corresponding to the position of this virtual channel has its optical centre at the midpoint of the line connecting the optical centres of the channel-a and channel-k cameras, and its optical axis parallel to that of the central channel. If a < c, k equals a+1; if a > c, k equals a-1. After V_a has been synthesized, if a equals 1, return to step 3; if a equals N, jump to step 8; if 1 < a < N, carry out the next step.
Step 7: the channel prediction unit 24 uses I'_a and Z'_a and the DIBR method to synthesize the prediction image of the not-yet-decoded auxiliary channel q adjacent to channel a. If a < c, q equals a-1 and a is then set to a-1; if a > c, q equals a+1 and a is then set to a+1. Return to step 4.
Step 8: the image inverse-rectification unit 26 applies, according to the input camera parameters, inverse rectification to the reconstructed central-channel image frames, the reconstructed auxiliary-channel image frames and the virtual-channel prediction images, so that each channel is restored to its capture position. The virtual camera parameters required for a virtual channel are obtained by linearly weighted interpolation of the camera parameters of the two channels adjacent to it.
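For the interpolation mentioned in step 8, the sketch below blends the parameters of the two adjacent real cameras with equal weights for the midpoint virtual camera, treating the parameters as plain arrays. Averaging the rotation matrices and re-orthogonalizing them via an SVD is an assumption that is adequate for the small angles between rectified neighbouring cameras; it is given for illustration and is not prescribed by the specification.

```python
import numpy as np

def interpolate_camera(cam1, cam2, w=0.5):
    """Linearly interpolate two camera parameter sets.

    cam1, cam2 : dicts with 'K' (3x3 intrinsics), 'R' (3x3 rotation) and
                 'C' (3-vector camera centre in world coordinates).
    w = 0.5 places the virtual camera at the midpoint of the two optical centres.
    """
    K = (1 - w) * cam1["K"] + w * cam2["K"]          # intrinsics: plain weighted average
    C = (1 - w) * cam1["C"] + w * cam2["C"]          # optical centre on the joining line
    R_blend = (1 - w) * cam1["R"] + w * cam2["R"]
    u, _, vt = np.linalg.svd(R_blend)                # re-orthogonalize the blended rotation
    R = u @ vt
    return {"K": K, "R": R, "C": C}
```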