CN102209243B - Depth map intra prediction method based on linear model - Google Patents
Depth map intra prediction method based on linear model
- Publication number
- CN102209243B (grant of application CN201110140471A)
- Authority
- CN
- China
- Prior art keywords
- pixel
- current coding block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention discloses a depth map intra prediction method based on a linear model. The gray values and coordinates of the adjacent pixels in the row above and the column to the left of the current coding block are used to determine the parameters of a linear model; the gray value of each pixel of the current coding block is then predicted from these parameters and the pixel's coordinates. Because the method exploits the spatial characteristics of depth maps, its predictions are accurate. Moreover, since the model parameters are computed from the row above and the column to the left of the current coding block, the encoder does not need to code the model parameters: the decoder can derive the same parameters directly. The method can be applied in three-dimensional video coding standards.
Description
Technical field
The present invention relates to a depth map intra prediction method for three-dimensional (3D) stereoscopic video coding standards, and belongs to the field of communication technology.
Background technology
Three-dimensional (3D) stereoscopic video is regarded as a major video application of the future: through a 3D display device, users can enjoy truly stereoscopic video content. Related technologies, such as the capture, coding, and display of 3D stereoscopic video, have attracted wide attention. To promote the standardization of 3D video technology, the Moving Picture Experts Group (MPEG) proposed the concept of Free Viewpoint Television (FTV) in 2002. FTV provides a vivid, realistic, and interactive 3D audiovisual system: the user can watch the 3D video of the scene from different angles, which creates a strong sense of immersion. FTV can be widely applied in broadcasting and communication, entertainment, education, medical care, and video surveillance. To let users watch 3D video from any angle, the server side of an FTV system captures video at certain viewpoints with a calibrated camera array, rectifies the videos from the different viewpoints, and synthesizes the view at a virtual viewpoint from the rectified videos using a view-synthesis technique. MPEG currently recommends the view-synthesis technique of Depth-Image Based Rendering (DIBR), in which depth information is generally represented by depth maps. The main steps of virtual view synthesis are as follows:
1) Determine the position of the desired virtual viewpoint relative to the camera array.
2) Select the texture videos used to synthesize the virtual view.
3) Obtain the depth maps corresponding to the texture videos of step 2).
4) Apply the DIBR technique to the texture videos and depth maps of steps 2) and 3) to synthesize the virtual view.
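The DIBR warping in step 4) can be sketched for the simple case of a horizontally rectified camera pair: each pixel shifts by its disparity d = f·B/Z, where f is the focal length, B the baseline to the virtual camera, and Z the depth. This is only an illustrative sketch; the focal length, baseline, texture and depth values below are invented, not taken from the patent.

```python
# Minimal 1-D DIBR sketch for a horizontally rectified camera pair.
# f (focal length in pixels) and baseline are illustrative assumptions.

def synthesize_row(texture_row, depth_row, f=500.0, baseline=0.002):
    """Warp one scanline of texture into the virtual view.

    Each pixel is shifted by its disparity d = f * B / Z; when two
    pixels land on the same target column, the nearer one (larger
    disparity) wins. Unfilled positions remain None (holes).
    """
    width = len(texture_row)
    out = [None] * width
    best_disp = [-1.0] * width
    for x, (colour, z) in enumerate(zip(texture_row, depth_row)):
        d = f * baseline / z          # disparity in pixels
        xt = int(round(x - d))        # target column in the virtual view
        if 0 <= xt < width and d > best_disp[xt]:
            out[xt] = colour
            best_disp[xt] = d
    return out

row = synthesize_row([10, 20, 30, 40], [1.0, 1.0, 5.0, 5.0])
```

Near pixels (small Z) shift further than far ones, which is what creates the parallax of the virtual view; real DIBR additionally fills the resulting holes by inpainting.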
The standardization of FTV proceeds in two stages. The first stage (2006 to 2008) was the H.264/AVC extension MVC (Multi-View Video Coding) formulated by the JVT (Joint Video Team). MVC can code multi-view texture video, but to realize the full functionality of an FTV system the depth information must also be coded. FTV standardization has now entered the second stage, 3DVC (Three-Dimensional Video Coding). 3DVC focuses on the representation and coding of depth information, and on the joint coding of texture video and depth information. In 3DVC, depth information is represented by depth maps.
The main indicators of 3DVC performance are the quality of the synthesized virtual view and the coding bit rates of the texture video and the depth map. Quality of the virtual view: video quality is usually measured by the peak signal-to-noise ratio (PSNR), computed as in formula ①:

PSNR = 10 · log10(255² / MSE)  ①

where MSE denotes the mean squared error between the original view and the synthesized virtual view. It measures the distortion of the virtual view, which accumulates the coding distortion of the texture video and the coding distortion of the depth map.
In practice there is no captured view at the virtual viewpoint, i.e. no original view exists there. Since 3DVC is chiefly concerned with coding performance, its performance is therefore measured as follows: first a virtual view V_orig is synthesized from the uncoded texture video and its corresponding depth map; then a virtual view V_rec is synthesized from the texture video and depth map reconstructed after coding; finally the MSE between V_rec and V_orig is computed and converted to PSNR, which measures the performance of 3DVC.
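The quality measurement described above reduces to computing the MSE between V_rec and V_orig and converting it to PSNR per formula ①. A minimal plain-Python sketch for 8-bit samples (flattened pixel lists):

```python
import math

# MSE and PSNR per formula ① for 8-bit video (peak value 255).

def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((pa - pb) ** 2 for pa, pb in zip(a, b)) / len(a)

def psnr(a, b, peak=255.0):
    """PSNR = 10 * log10(peak^2 / MSE); infinite for identical inputs."""
    e = mse(a, b)
    return float('inf') if e == 0 else 10.0 * math.log10(peak * peak / e)
```

In the 3DVC setting, `a` would hold the pixels of V_orig and `b` those of V_rec.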
Coding bit rates of the texture video and the depth map:
The coding bit rate R is the total number of bits used to code the texture video and the depth map (B_T and B_D) divided by the video frame rate F (the number of frames displayed per second), as in formula ②:

R = (B_T + B_D) / F  ②

The coding bit rate R can also be expressed as the sum of the texture-video bit rate R_T and the depth-map bit rate R_D, as in formula ③:

R = R_T + R_D  ③

where R_T and R_D are given by formulas ④ and ⑤ respectively:

R_T = B_T / F  ④
R_D = B_D / F  ⑤
The present invention concerns the intra prediction of depth maps: it improves depth-map coding efficiency, i.e. it reduces the depth-map bit rate as much as possible while keeping the quality of the synthesized virtual view the same.
The existing depth-map intra prediction technique is the intra prediction adopted in H.264/AVC. In the H.264/AVC video coding standard, each frame is divided into macroblocks (MBs), and each MB can be split into sixteen 4×4 sub-blocks or four 8×8 sub-blocks, or kept at its original size (16×16). The prediction modes of the different block types are shown in Fig. 1.
Fig. 1(a) shows the nine intra prediction modes of 4×4 and 8×8 luma blocks. Modes 0, 1, and 3 to 8 are eight directional prediction modes, each with a corresponding code number (CN); the neighbouring pixels A to L used to predict the current block are called the prediction pixels. The DC prediction mode (CN 2) uses the average luma of pixels A to L as the prediction. The prediction modes of 8×8 luma blocks are identical to those of 4×4 luma blocks. Fig. 1(b) shows the four prediction modes of 16×16 luma blocks, with CN 0 (vertical), 1 (horizontal), 2 (DC), and 3 (plane); H and V denote the neighbouring pixels to the left of and above the current block, respectively. The prediction modes of chroma blocks (size 8×8) are the same as those of 16×16 luma blocks, but with different CNs: 0 (DC), 1 (horizontal), 2 (vertical), 3 (plane). The optimal prediction mode is selected by rate-distortion optimization (RDO) [2]. When coding the prediction mode of a 4×4 or 8×8 luma block, the most probable mode (MPM) of the current block is first inferred from the modes of the neighbouring blocks [1]; if the CN of the current block's mode equals the CN of the MPM, only a 1-bit flag needs to be coded; otherwise the CN of the current mode is coded when it is smaller than the CN of the MPM, and CN − 1 is coded when it is not. H.264/AVC codes the prediction mode of a 16×16 luma block jointly with the coded block pattern (CBP, which indicates whether the quantized coefficients of the current block are coded) [1]; for chroma blocks, the CN of the prediction mode is coded directly.
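The MPM-relative mode signalling described above can be sketched as follows. The bit-level entropy coding is omitted; the syntax element names in the comments follow H.264/AVC, but the tuple representation of the coded symbols is an assumption made for this sketch.

```python
# Sketch of H.264/AVC-style 4x4 luma intra mode signalling relative to
# the most probable mode (MPM): a 1-bit flag if the mode equals the
# MPM, otherwise the mode number, decremented when it exceeds the MPM.

def encode_mode(mode, mpm):
    if mode == mpm:
        return (1,)                       # prev_intra4x4_pred_mode_flag = 1
    rem = mode if mode < mpm else mode - 1
    return (0, rem)                       # flag = 0, rem_intra4x4_pred_mode

def decode_mode(symbols, mpm):
    if symbols[0] == 1:
        return mpm
    rem = symbols[1]
    return rem if rem < mpm else rem + 1  # undo the decrement
```

Because one of the nine candidates (the MPM itself) never appears in `rem`, the remaining eight values always fit in three bits.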
The intra prediction method adopted by H.264/AVC does not take the inherent characteristics of depth maps into account, so the coding efficiency for depth maps can still be improved.
Summary of the invention
To address the low coding efficiency that results from applying the H.264/AVC intra prediction method, which ignores the inherent characteristics of depth maps, the present invention proposes a more efficient depth map intra prediction method based on a linear model, derived from the spatial characteristics of depth maps.
The linear-model-based depth map intra prediction method of the present invention computes the parameters of a linear model from the gray values and coordinates of the pixels adjacent to the current coding block, and then predicts the gray value of each pixel of the current coding block from the model parameters and the pixel's coordinates. The concrete steps are as follows:
① Obtain the coordinates (x_i, y_i) and gray values L_i of the pixels in the column immediately to the left of and the row immediately above the current coding block.
② From the coordinates and gray values obtained in step ①, set up the following system of equations:

L_i = a·x_i + b·y_i + c,  i = 1, …, n

and compute the parameters a, b, c by linear regression, where n is the number of neighbouring pixels. The linear regression is carried out by solving the following system of (normal) equations:

(Σ x_i²)·a + (Σ x_i y_i)·b + (Σ x_i)·c = Σ x_i L_i
(Σ x_i y_i)·a + (Σ y_i²)·b + (Σ y_i)·c = Σ y_i L_i
(Σ x_i)·a + (Σ y_i)·b + n·c = Σ L_i
③ From the parameters a, b, c obtained in step ② and the coordinates (x_i′, y_i′) of the pixels in the current coding block, compute the gray-value prediction L_i′ of each pixel of the current block:

L_i′ = a·x_i′ + b·y_i′ + c,  i = 1, …, m

where m is the number of pixels in the current coding block;
④ Subtract the gray-value predictions of step ③ from the gray values of the pixels in the current coding block to obtain the residual of the current coding block;
⑤ Apply the discrete cosine transform, quantization, and entropy coding to the residual signal of step ④, and compute the bit rate R_D;
⑥ Decode the entropy-coded data produced in step ⑤, apply inverse quantization and the inverse discrete cosine transform, and reconstruct the gray values of the pixels of the current block;
⑦ Compare the reconstructed gray values of step ⑥ with the original gray values of the current coding block and compute the distortion D_D;
⑧ From the bit rate R_D of step ⑤ and the distortion D_D of step ⑦, compute the rate-distortion cost J of the current coding block, J = D_D + λ·R_D, where λ is the Lagrange multiplier;
⑨ Compare the rate-distortion cost of the current coding block computed in step ⑧ with the rate-distortion costs of the other prediction methods defined in the H.264 video coding standard, select the prediction method with the smallest rate-distortion cost as the final prediction method, and signal the selected method in the bitstream.
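The plane fitting and prediction of steps ① through ③ can be sketched in plain Python. The neighbour coordinates and gray values below are invented for illustration, and the 3×3 normal-equation system is solved with Cramer's rule; this is only a sketch, not the patented implementation.

```python
# Least-squares fit of L = a*x + b*y + c to the neighbouring pixels by
# solving the 3x3 normal equations, followed by prediction of the block.

def fit_plane(samples):
    """samples: list of (x, y, L) triples. Returns (a, b, c)."""
    n = len(samples)
    sxx = sum(x * x for x, y, L in samples)
    sxy = sum(x * y for x, y, L in samples)
    syy = sum(y * y for x, y, L in samples)
    sx = sum(x for x, y, L in samples)
    sy = sum(y for x, y, L in samples)
    sxL = sum(x * L for x, y, L in samples)
    syL = sum(y * L for x, y, L in samples)
    sL = sum(L for x, y, L in samples)

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxL, syL, sL]
    d = det3(A)
    solution = []
    for j in range(3):                 # Cramer's rule, column by column
        m = [row[:] for row in A]
        for i in range(3):
            m[i][j] = rhs[i]
        solution.append(det3(m) / d)
    return tuple(solution)

def predict_block(params, coords):
    """Step ③: evaluate the fitted plane at each block coordinate."""
    a, b, c = params
    return [a * x + b * y + c for x, y in coords]

# Neighbours lying exactly on the plane L = 2x + 3y + 1 (invented data):
neighbours = [(0, 0, 1), (1, 0, 3), (2, 0, 5), (0, 1, 4), (0, 2, 7)]
a, b, c = fit_plane(neighbours)
pred = predict_block((a, b, c), [(1, 1), (2, 1), (1, 2), (2, 2)])
```

Because the example neighbours lie exactly on a plane, the fit recovers a = 2, b = 3, c = 1; with noisy depth samples the same code returns the least-squares estimate.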
Compared with the prior art, the present invention has the following advantages:
1) Because the invention exploits the spatial characteristics of depth maps, the prediction of the current coding block is more accurate, which lowers the coding bit rate and improves depth-map coding efficiency.
2) Because the model parameters are computed from the neighbouring pixels in the row above and the column to the left of the current block, the computed parameters a, b, c are accurate.
3) For the same reason, the decoder can obtain the parameters a, b, c directly by the same computation, so they need not be transmitted.
Description of drawings
Fig. 1 is an illustration of the existing H.264/AVC intra prediction method.
Fig. 2 is the flow chart of the coding steps of the present invention.
Fig. 3 is the flow chart of the decoding steps of the present invention.
Fig. 4 compares the rate-distortion curves obtained by coding depth maps with the method of the present invention and with the H.264/AVC method.
Embodiment
The linear-model-based depth map intra prediction method of the present invention computes the linear-model parameters from the gray values and coordinates of the pixels adjacent to the current coding block, and then predicts the gray value of each pixel of the current coding block from the model parameters and the pixel coordinates. This requires changes to the codec, at both the encoder and the decoder.
The encoder-side procedure, shown in Fig. 2, comprises the following steps:
Step 1: Analysis shows that the spatial distribution of a depth map can be represented by the model

L = a·x + b·y + c,

where L is the gray value of a pixel of the depth map, (x, y) are the pixel coordinates, and a, b, c are the model parameters.
Step 2: Obtain the coordinates (x_i, y_i) and gray values L_i of the pixels in the column immediately to the left of and the row immediately above the current coding block.

Step 3: From these coordinates and gray values, set up the system of equations L_i = a·x_i + b·y_i + c (i = 1, …, n), and compute the parameters a, b, c by linear regression, where n is the number of neighbouring pixels; the linear regression amounts to solving the normal equations

(Σ x_i²)·a + (Σ x_i y_i)·b + (Σ x_i)·c = Σ x_i L_i
(Σ x_i y_i)·a + (Σ y_i²)·b + (Σ y_i)·c = Σ y_i L_i
(Σ x_i)·a + (Σ y_i)·b + n·c = Σ L_i.

Step 4: From a, b, c and the coordinates (x_i′, y_i′) of the pixels in the current coding block, compute the gray-value prediction L_i′ = a·x_i′ + b·y_i′ + c of each pixel, where m is the number of pixels in the current coding block.

Step 5: Subtract the predictions of step 4 from the gray values of the pixels in the current coding block to obtain the residual of the current coding block.
Step 6: Apply the discrete cosine transform, quantization, and entropy coding to the residual of step 5, and compute the bit rate R_D.

Step 7: Decode the entropy-coded data of step 6, apply inverse quantization and the inverse discrete cosine transform, and reconstruct the gray values of the current block.

Step 8: Compare the reconstructed gray values of step 7 with the original gray values of the current coding block and compute the distortion D_D.
Step 9: From the bit rate R_D of step 6 and the distortion D_D of step 8, compute the rate-distortion cost J of the current coding block when the prediction method of the present invention is used, J = D_D + λ·R_D, where λ is the Lagrange multiplier.
Step 10: Compare the rate-distortion cost of the current coding block computed in step 9 with the rate-distortion costs of the existing prediction methods, and select the prediction method with the smallest rate-distortion cost as the final prediction method.
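Steps 9 and 10 amount to picking the candidate predictor with minimum Lagrangian cost J = D + λ·R. A minimal sketch; the mode names, cost pairs, and λ values below are purely illustrative:

```python
# Rate-distortion mode decision: keep the candidate with smallest
# J = D + lam * R.

def best_mode(candidates, lam):
    """candidates: mapping mode_name -> (distortion, rate)."""
    return min(candidates,
               key=lambda m: candidates[m][0] + lam * candidates[m][1])

costs = {'linear_model': (100.0, 40.0),   # invented (D, R) pairs
         'h264_intra':   (90.0, 70.0)}
choice = best_mode(costs, lam=0.5)
```

Note how λ steers the trade-off: a large λ penalizes rate and favours the cheaper-to-code candidate, while a small λ favours the lower-distortion one.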
The decoder-side procedure of the present invention, shown in Fig. 3, comprises the following steps:
Step 1: Parse the bitstream and obtain the prediction method of the current decoding block.
Step 2: If the current decoding block uses the prediction method of the present invention, read the coordinates (x_i, y_i) and gray values L_i of the pixels in the column immediately to the left of and the row immediately above the current decoding block.

Step 3: Set up the system of equations L_i = a·x_i + b·y_i + c (i = 1, …, n) and compute the parameters a, b, c by linear regression, where n is the number of neighbouring pixels; as at the encoder, the regression amounts to solving the corresponding normal equations.

Step 4: From a, b, c and the coordinates (x_i′, y_i′) of the pixels in the current decoding block, compute the gray-value prediction L_i′ = a·x_i′ + b·y_i′ + c of each pixel, where m is the number of pixels in the current decoding block; the block is reconstructed by adding the decoded residual to these predictions.
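The decoder steps above end with adding the decoded residual to the plane prediction. A minimal sketch with invented parameter values and residuals; in the actual method a, b, c would be refit from the already reconstructed neighbouring pixels, exactly as at the encoder, so they never appear in the bitstream.

```python
# Decoder-side reconstruction sketch: plane prediction a*x + b*y + c
# plus decoded residual, clipped to the 8-bit gray-value range.

def reconstruct_block(a, b, c, coords, residuals, peak=255):
    out = []
    for (x, y), r in zip(coords, residuals):
        v = a * x + b * y + c + r
        out.append(max(0, min(peak, int(round(v)))))  # clip to [0, peak]
    return out

rec = reconstruct_block(2.0, 3.0, 1.0, [(1, 1), (2, 1)], [4, -100])
```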
The effect of the present invention is further illustrated by experiments.
The experiments measured, under different quantization parameters, the coding bit rate obtained when the depth map is coded with the present invention and the objective quality (PSNR) of the synthesized virtual view. Fig. 4 compares the rate-distortion curves after intra coding depth maps with the present invention and with the H.264/AVC method: Fig. 4(a) shows the results for the depth map of the 3D video sequence Balloons, and Fig. 4(b) those for the depth map of the 3D video sequence Dancer. As Fig. 4 shows, compared with H.264/AVC intra prediction, coding with the intra prediction method of the present invention yields a lower depth-map bit rate at the same objective quality of the synthesized virtual view, demonstrating that the invention improves depth-map coding efficiency. For the sequence Balloons the depth-map bit rate decreases by 5.81% on average, and for the sequence Dancer by 7.68%.
Claims (1)
1. A depth map intra prediction method based on a linear model, which computes the parameters of a linear model from the gray values and coordinates of the pixels adjacent to the current coding block, and then predicts the gray value of each pixel of the current coding block from the model parameters and the pixel's coordinates; the concrete steps are as follows:
① Obtain the coordinates (x_i, y_i) and gray values L_i of the pixels in the column immediately to the left of and the row immediately above the current coding block.
② From the coordinates and gray values obtained in step ①, set up the following system of equations:

L_i = a·x_i + b·y_i + c,  i = 1, …, n

and compute the parameters a, b, c by linear regression, where n is the number of neighbouring pixels; the linear regression is carried out by solving the following system of (normal) equations:

(Σ x_i²)·a + (Σ x_i y_i)·b + (Σ x_i)·c = Σ x_i L_i
(Σ x_i y_i)·a + (Σ y_i²)·b + (Σ y_i)·c = Σ y_i L_i
(Σ x_i)·a + (Σ y_i)·b + n·c = Σ L_i

③ From the parameters a, b, c obtained in step ② and the coordinates (x_i′, y_i′) of the pixels in the current coding block, compute the gray-value prediction L_i′ of each pixel of the current block:

L_i′ = a·x_i′ + b·y_i′ + c,  i = 1, …, m

where m is the number of pixels in the current coding block;
④ Subtract the gray-value predictions of step ③ from the gray values of the pixels in the current coding block to obtain the residual of the current coding block;
⑤ Apply the discrete cosine transform, quantization, and entropy coding to the residual signal of step ④, and compute the bit rate R_D;
⑥ Decode the entropy-coded data produced in step ⑤, apply inverse quantization and the inverse discrete cosine transform, and reconstruct the gray values of the pixels of the current block;
⑦ Compare the reconstructed gray values of step ⑥ with the original gray values of the current coding block and compute the distortion D_D;
⑧ From the bit rate R_D of step ⑤ and the distortion D_D of step ⑦, compute the rate-distortion cost J of the current coding block, J = D_D + λ·R_D, where λ is the Lagrange multiplier;
⑨ Compare the rate-distortion cost of the current coding block computed in step ⑧ with the rate-distortion costs of the other prediction methods defined in the H.264 video coding standard, select the prediction method with the smallest rate-distortion cost as the final prediction method, and signal the selected method in the bitstream.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110140471 CN102209243B (en) | 2011-05-27 | 2011-05-27 | Depth map intra prediction method based on linear model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110140471 CN102209243B (en) | 2011-05-27 | 2011-05-27 | Depth map intra prediction method based on linear model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102209243A CN102209243A (en) | 2011-10-05 |
CN102209243B true CN102209243B (en) | 2012-10-24 |
Family
ID=44697878
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201110140471 Expired - Fee Related CN102209243B (en) | 2011-05-27 | 2011-05-27 | Depth map intra prediction method based on linear model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102209243B (en) |
Families Citing this family (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102407474B1 (en) | 2011-10-18 | 2022-06-10 | LG Electronics Inc. | Method for intra prediction and device |
SI2773117T1 (en) * | 2011-10-24 | 2019-02-28 | Infobridge Pte. Ltd. | Image decoding apparatus |
CN102595166B (en) * | 2012-03-05 | 2014-03-05 | 山东大学 | Lagrange factor calculation method applied for depth image encoding |
CN103379321B (en) * | 2012-04-16 | 2017-02-01 | 华为技术有限公司 | Prediction method and prediction device for video image component |
WO2013155662A1 (en) * | 2012-04-16 | 2013-10-24 | Mediatek Singapore Pte. Ltd. | Methods and apparatuses of simplification for intra chroma lm mode |
WO2013159300A1 (en) * | 2012-04-25 | 2013-10-31 | Nokia Corporation | An apparatus, a method and a computer program for video coding and decoding |
CN104396250B (en) * | 2012-07-02 | 2018-04-03 | 高通股份有限公司 | Method and apparatus for the intra-coding of the depth map of 3D video codings |
EP2920964B1 (en) | 2013-03-26 | 2018-05-09 | MediaTek Inc. | Method of cross color intra prediction |
CN104104959B (en) * | 2013-04-10 | 2018-11-20 | 乐金电子(中国)研究开发中心有限公司 | Depth image intra-frame prediction method and device |
CN105594214B (en) * | 2013-04-12 | 2019-11-26 | 联发科技股份有限公司 | The inner frame coding method and device of depth block in 3-dimensional encoding system |
WO2014166116A1 (en) * | 2013-04-12 | 2014-10-16 | Mediatek Inc. | Direct simplified depth coding |
JP6866157B2 (en) * | 2013-09-27 | 2021-04-28 | クゥアルコム・インコーポレイテッドQualcomm Incorporated | Residual coding for depth intra prediction mode |
WO2019009749A1 (en) * | 2017-07-05 | 2019-01-10 | Huawei Technologies Co., Ltd | Apparatus and method for directional intra prediction using a fitting plane and a plurality of primary reference samples as well as a plurality of secondary reference samples |
EP3652936A1 (en) * | 2017-07-05 | 2020-05-20 | Huawei Technologies Co., Ltd. | Devices and methods for video coding |
EP3643068B1 (en) | 2017-07-05 | 2021-05-05 | Huawei Technologies Co., Ltd. | Planar intra prediction in video coding |
CN109168004B (en) * | 2018-09-27 | 2020-10-27 | 北京奇艺世纪科技有限公司 | Interpolation method and device for motion compensation |
EP4336839A3 (en) | 2019-06-25 | 2024-04-10 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Image component prediction method and device, and computer storage medium |
CN111787320B (en) * | 2020-07-03 | 2022-02-08 | 北京博雅慧视智能技术研究院有限公司 | Transform coding system and method |
CN112672150A (en) * | 2020-12-22 | 2021-04-16 | 福州大学 | Video coding method based on video prediction |
CN113422959A (en) * | 2021-05-31 | 2021-09-21 | 浙江智慧视频安防创新中心有限公司 | Video encoding and decoding method and device, electronic equipment and storage medium |
CN116582688B (en) * | 2023-05-04 | 2024-08-02 | 光线云(杭州)科技有限公司 | Depth map compression method and device adapting to cloud drawing system |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008125066A1 (en) * | 2007-04-17 | 2008-10-23 | Huawei Technologies Co., Ltd. | An encoding and decoding method and means of multiple viewpoint |
CN101951511A (en) * | 2010-08-19 | 2011-01-19 | 深圳市亮信科技有限公司 | Method for layering video scenes by analyzing depth |
- 2011-05-27: CN 201110140471 filed in China; granted as CN102209243B; current status: not active (Expired - Fee Related)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008125066A1 (en) * | 2007-04-17 | 2008-10-23 | Huawei Technologies Co., Ltd. | An encoding and decoding method and means of multiple viewpoint |
CN101951511A (en) * | 2010-08-19 | 2011-01-19 | 深圳市亮信科技有限公司 | Method for layering video scenes by analyzing depth |
Non-Patent Citations (3)
Title |
---|
Masayoshi Aoki, "Image Processing in ITS," 1998 IEEE International Conference on Intelligent Vehicles, 1998. |
Xiaoxian Liu et al., "A Depth Estimation Method for Edge Precision Improvement of Depth Map," 2010 International Conference on Computer and Communication Technologies in Agriculture Engineering, 2010. |
Zhang Qin, Li Guangming, "Research on a Linear-Approximation Algorithm for Depth Image Acquisition," Computer Simulation (《计算机仿真》), vol. 20, no. 6, 2003. |
Also Published As
Publication number | Publication date |
---|---|
CN102209243A (en) | 2011-10-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102209243B (en) | Depth map intra prediction method based on linear model | |
CN102484719B (en) | Method and apparatus for encoding video, and method and apparatus for decoding video | |
CN101385356B (en) | Process for coding images using intra prediction mode | |
CN102484704B (en) | Method and apparatus for encoding video, and method and apparatus for decoding video | |
CN101835056B (en) | Allocation method for optimal code rates of texture video and depth map based on models | |
CN101729891B (en) | Method for encoding multi-view depth video | |
CN103581647A (en) | Depth map sequence fractal coding method based on motion vectors of color video | |
CN106170092A (en) | Fast encoding method for lossless coding | |
CN101404766B (en) | Multi-view point video signal encoding method | |
CN107257485A (en) | Multi-view signal codec | |
CN103067704B (en) | A kind of method for video coding of skipping in advance based on coding unit level and system | |
CN106028037A (en) | Equipment for decoding images | |
CN103596004A (en) | Intra-frame prediction method and device based on mathematical statistics and classification training in HEVC | |
CN103238334A (en) | Image intra prediction method and apparatus | |
CN107888929A (en) | Video coding coding/decoding method, equipment and generation and the method for stored bits stream | |
CN104429062A (en) | Apparatus for coding a bit stream representing a three-dimensional video | |
CN103546758A (en) | Rapid depth map sequence interframe mode selection fractal coding method | |
CN107864380A (en) | 3D HEVC fast intra-mode prediction decision-making techniques based on DCT | |
CN104469336B (en) | Coding method for multi-view depth video signals | |
CN103119935A (en) | Image interpolation method and apparatus | |
CN102187668A (en) | Encoding and decoding with elimination of one or more predetermined predictors | |
CN104702959B (en) | A kind of intra-frame prediction method and system of Video coding | |
CN103634600B (en) | A kind of Video Encoding Mode system of selection based on SSIM evaluation, system | |
CN102158710B (en) | Depth view encoding rate distortion judgment method for virtual view quality | |
CN103188500B (en) | Encoding method for multi-view video signals |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20121024 Termination date: 20160527 |