JP2006014086A

JP2006014086A - Moving image encoding apparatus and moving image encoding method

Info

Publication number: JP2006014086A
Application number: JP2004190305A
Authority: JP
Inventors: Hiroki Kishi; 裕樹岸; Hiroshi Kajiwara; 浩梶原
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2004-06-28
Filing date: 2004-06-28
Publication date: 2006-01-12
Also published as: WO2006001490A1

Abstract

PROBLEM TO BE SOLVED: To suppress image quality deterioration in inter frames when encoding a moving image using motion prediction.
SOLUTION: An encoding section (206) for encoding a moving image using inter-frame motion prediction includes: a dividing part (302) for dividing each frame into a plurality of divided areas; an ROI tile determining part (317) for determining an important area within the frame; inter-frame predicting means (310, 314) for searching, within the range of the important area of the preceding frame, for a highly correlated pixel set for each divided area of the frame to be encoded, and outputting differential data obtained by taking the difference between the data in the divided area and the data of the found pixel set; and encoding means (303-308) for encoding the differential data.
COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to a moving image encoding apparatus and method, and more particularly to a moving image encoding apparatus and method for encoding a moving image using motion prediction.

In recent years, the content flowing through networks has grown in volume and diversity, from text to still images and further to moving images. In step with this, encoding techniques for compressing the amount of information have been developed, and these techniques have spread widely through international standardization.

Meanwhile, networks themselves have grown in capacity and diversity, so that a single piece of content passes through a variety of environments on its way from the sender to the receiver. The processing performance of sending and receiving devices has also diversified. PCs, which are the main sending and receiving devices, continue to improve greatly in CPU and graphics performance, while devices with very different processing capabilities, such as PDAs, mobile phones, TVs, and hard disk recorders, are gaining network connectivity. For this reason, attention has turned to scalability, a function that lets a single set of data accommodate varying communication line capacity and receiver processing performance.

The JPEG 2000 encoding method is widely known as a still image encoding method with this scalability function. The method is internationally standardized and is described in detail in Non-Patent Document 1. Its characteristic is that a discrete wavelet transform (DWT) is applied to the input image data to separate it into a plurality of frequency bands, the resulting coefficients are quantized, and the quantized values are arithmetically encoded bit plane by bit plane. Encoding or decoding only as many bit planes as needed enables fine-grained control of layers.

JPEG 2000 also provides a technique not found in earlier encoding methods, called ROI (Region Of Interest), which relatively improves the image quality of a region of interest within an image.

FIG. 23 shows an encoding unit based on the JPEG 2000 encoding method. A tile dividing unit 9001 divides an input image into a plurality of regions (tiles); this function is optional. A DWT unit 9002 performs a discrete wavelet transform to separate the data into frequency bands. A quantization unit 9003 quantizes each coefficient. An ROI designation unit 9007 is optional and allows a region of interest to be set. The quantization unit 9003 performs the shift-up. An entropy encoding unit 9004 performs entropy encoding using the EBCOT (Embedded Block Coding with Optimized Truncation) method, and a bit truncation unit 9005 truncates lower bits of the encoded data as necessary for rate control. A code forming unit 9006 adds header information, selects among the various scalability functions, and outputs the encoded data.

FIG. 24 shows a decoding unit based on the JPEG 2000 encoding method. A code analysis unit 9020 analyzes the header and obtains the information needed to configure the layers. A bit truncation unit 9021 truncates lower bits of the input encoded data according to the capacity of the internal buffer and the decoding capability. An entropy decoding unit 9022 decodes the EBCOT-encoded data to obtain quantized wavelet transform coefficients. An inverse quantization unit 9023 inversely quantizes these coefficients, and an inverse DWT unit applies an inverse discrete wavelet transform to reproduce the image data. A tile combining unit 9025 combines the tiles to reproduce the image data.

The Motion JPEG 2000 method, which encodes a moving image by applying this JPEG 2000 encoding method to each frame, has also been recommended (see, for example, Non-Patent Document 2). In this method each frame is encoded independently and temporal correlation is not used, so redundancy remains between frames. As a result, it is difficult to reduce the code amount as effectively as moving image encoding methods that exploit temporal correlation.

In contrast, the MPEG encoding method performs motion compensation to improve encoding efficiency (see, for example, Non-Patent Document 3). FIG. 25 shows the configuration of its encoding unit. A block dividing unit 9031 divides the image into 8 × 8 blocks, a difference unit 9032 subtracts the prediction data obtained by motion compensation, a DCT unit 9033 performs a discrete cosine transform, and a quantization unit 9034 performs quantization. The result is encoded by an entropy encoding unit 9035, and a code forming unit 9036 adds header information and outputs the encoded data.

At the same time, in parallel with the processing of the entropy encoding unit 9035, an inverse quantization unit 9037 performs inverse quantization, an inverse DCT unit 9038 applies the inverse of the discrete cosine transform, and an addition unit 9039 adds the prediction data and stores the result in a frame memory 9040. A motion compensation unit 9041 obtains a motion vector by referring to the input image and the reference frame stored in the frame memory 9040, and generates the prediction data.

Non-Patent Document 1: ISO/IEC 15444-1 (Information technology -- JPEG 2000 image coding system -- Part 1: Core coding system)
Non-Patent Document 2: ISO/IEC 15444-3 (Information technology -- JPEG 2000 image coding system -- Part 3: Motion JPEG 2000)
Non-Patent Document 3: "Latest MPEG Textbook", page 76 and elsewhere, ASCII Publishing, 1994

To raise the efficiency of JPEG 2000 encoding, there are compression methods that add motion compensation to JPEG 2000. In such moving image compression methods, as shown in FIG. 26, when the data referenced for prediction is partially discarded, for example by truncating lower bit planes, prediction errors accumulate and the image quality of inter frames deteriorates significantly.

The present invention has been made in view of the above problem, and its object is to suppress image quality deterioration in inter frames when a moving image is encoded using motion prediction.

To achieve the above object, a moving image encoding apparatus of the present invention for encoding a moving image using inter-frame motion prediction includes: dividing means for dividing each frame into a plurality of divided areas; determining means for determining an important area within the frame; inter-frame prediction means for searching, within the range of the important area of the preceding frame, for a highly correlated pixel set for each divided area of the frame to be encoded, and outputting difference data obtained by taking the difference between the data of each divided area and the data of the found pixel set; and encoding means for encoding the difference data.

Further, a moving image encoding method of the present invention for encoding a moving image using inter-frame motion prediction includes: a dividing step of dividing each frame into a plurality of divided areas; a determining step of determining an important area within the frame; an inter-frame prediction step of searching, within the range of the important area of the preceding frame, for a highly correlated pixel set for each divided area of the frame to be encoded, and outputting difference data obtained by taking the difference between the data of each divided area and the data of the found pixel set; and an encoding step of encoding the difference data.

According to another configuration, a moving image encoding apparatus of the present invention for encoding a moving image using inter-frame motion prediction includes: dividing means for dividing each frame into a plurality of divided areas; determining means for determining an important area within the frame; transform means for performing a data transform on each divided area to generate transform coefficients; inter-frame prediction means for searching, among the transform coefficients corresponding to the range of the important area of the preceding frame, for highly correlated transform coefficients for the transform coefficients of each divided area of the frame to be encoded, and outputting difference data obtained by taking the difference between the transform coefficients of each divided area and the found transform coefficients; and encoding means for encoding the difference data.

Further, a moving image encoding method of the present invention for encoding a moving image frame by frame using inter-frame motion prediction includes: a dividing step of dividing each frame into a plurality of divided areas; a determining step of determining an important area within the frame; a transform step of performing a data transform on each divided area to generate transform coefficients; an inter-frame prediction step of searching, among the transform coefficients corresponding to the range of the important area of the preceding frame, for highly correlated transform coefficients for the transform coefficients of each divided area of the frame to be encoded, and outputting difference data obtained by taking the difference between the transform coefficients of each divided area and the found transform coefficients; and an encoding step of encoding the difference data.

With the above configuration, image quality deterioration in inter frames can be suppressed when a moving image is encoded using motion prediction.

The best mode for carrying out the present invention will now be described in detail with reference to the accompanying drawings.

(First embodiment)
As shown in FIG. 1, a moving image to be processed in the present invention is composed of images and audio, and the images are in turn composed of frames, each representing the information at one instant.

FIG. 2 is a block diagram showing the configuration of the moving image processing apparatus according to the first embodiment. In the figure, 200 is a CPU, 201 is a memory, 202 is a terminal, 203 is a storage unit, 204 is an imaging unit, 205 is a display unit, and 206 is an encoding unit.

<Description of Processing of Encoding Unit 206>
Next, the frame data encoding processing in the encoding unit 206 will be described with reference to the configuration of the encoding unit 206 shown in FIG. 3 and the flowchart of FIG. 4. Details such as how the header is created are as described in the ISO/IEC recommendation and are omitted here.

The following description assumes that the frame data to be encoded is 8-bit monochrome frame data. The form of the frame data is not limited to this, however: the invention is also applicable to monochrome images in which each pixel is represented by a bit depth other than 8 bits, such as 4, 10, or 12 bits, and to color multi-valued frame data in which each color component (RGB/Lab/YCrCb) of each pixel is represented by 8 bits. It is also applicable to multi-valued information representing the state of each pixel of an image, for example a multi-valued index value representing the color of each pixel. In these applications, each kind of multi-valued information may be treated as the monochrome frame data described below.

First, the pixel data constituting the frame data of the image to be encoded is input from the imaging unit 204 to the frame data input unit 301 in raster scan order and output to the tile dividing unit 302.

The tile dividing unit 302 divides each image input from the frame data input unit 301 into N tiles as shown in FIG. 5 (step S401). To identify the tiles, the first embodiment assigns tile numbers 0, 1, 2, ..., N-1 in raster scan order. Hereinafter, the data representing each tile is called "tile data". FIG. 5 shows an example in which the image is divided into 48 tiles, 8 horizontally by 6 vertically, but the number of tiles can of course be changed as appropriate. The generated tile data is sent in order to the discrete wavelet transform unit 303, and the processing from the discrete wavelet transform unit 303 onward is performed tile by tile.
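The tile division of step S401 can be sketched as follows. This is a minimal illustration, assuming the frame is a list of pixel rows whose size is an exact multiple of the tile grid; the function name `split_into_tiles` and the sample frame are hypothetical and do not come from the patent.

```python
def split_into_tiles(frame, cols, rows):
    """Split a frame (list of pixel rows) into cols x rows tiles in raster order."""
    height, width = len(frame), len(frame[0])
    if width % cols or height % rows:
        raise ValueError("frame size must be a multiple of the tile grid")
    tw, th = width // cols, height // rows
    tiles = []
    for ty in range(rows):            # tile rows, top to bottom
        for tx in range(cols):        # tile columns, left to right
            tile = [row[tx * tw:(tx + 1) * tw]
                    for row in frame[ty * th:(ty + 1) * th]]
            tiles.append(tile)        # list index == tile number
    return tiles


# A 16 x 12 test frame divided 8 across by 6 down, as in the FIG. 5 example.
frame = [[x + 16 * y for x in range(16)] for y in range(12)]
tiles = split_into_tiles(frame, cols=8, rows=6)
```

The position of a tile in the returned list plays the role of the tile number 0, 1, ..., N-1.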

The ROI tile determination unit 317 then determines which tiles are to be encoded with high image quality (ROI tiles) (step S402). FIG. 6 shows an example of the determined ROI tiles. The ROI tile determination unit 317 designates as ROI tiles the tiles making up the region that contains the priority area the user specifies with an input device (not shown). Next, in step S403, a counter i identifying the tile being processed is set to i = 0.
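One way to picture step S402 is to take the user's priority area as a rectangle and mark every tile it touches. The sketch below assumes exactly that; the rectangle representation, the function name `roi_tiles`, and the 40 x 40 pixel tile size are illustrative assumptions, since the patent does not say how the priority area is represented.

```python
def roi_tiles(rect, tile_w, tile_h, cols, rows):
    """Return the raster-order numbers of the tiles overlapped by rect.

    rect is (left, top, right, bottom) in pixels; right and bottom are exclusive.
    """
    left, top, right, bottom = rect
    first_col = max(left // tile_w, 0)
    last_col = min((right - 1) // tile_w, cols - 1)
    first_row = max(top // tile_h, 0)
    last_row = min((bottom - 1) // tile_h, rows - 1)
    return [ty * cols + tx
            for ty in range(first_row, last_row + 1)
            for tx in range(first_col, last_col + 1)]


# 320 x 240 frame, 8 x 6 tiles of 40 x 40 pixels, priority area 100..169 x 50..129.
selected = roi_tiles((100, 50, 170, 130), tile_w=40, tile_h=40, cols=8, rows=6)
```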

Next, the frame attribute determination unit 316 determines whether the frame to be encoded is an I frame (Intra frame) or a P frame (Predictive frame) (step S404). If it is an I frame, the tile data is output to the discrete wavelet transform unit 303. If it is a P frame, the frame data is copied to the motion compensation (MC) prediction unit 310.

[When the encoding target frame is an I frame]
When the encoding target frame is an I frame, in step S405 the discrete wavelet transform unit 303 performs a discrete wavelet transform using the data of a plurality of pixels (reference pixels) (hereinafter, "reference pixel data") in one tile data x(n) of the frame data of one frame image input from the tile dividing unit 302.

The frame data after the discrete wavelet transform (the discrete wavelet transform coefficients) is given by equation (1):
Y(2n) = X(2n) + floor{ (Y(2n-1) + Y(2n+1) + 2) / 4 }
Y(2n+1) = X(2n+1) - floor{ (X(2n) + X(2n+2)) / 2 } ... (1)

Y(2n) and Y(2n+1) are discrete wavelet transform coefficient sequences: Y(2n) is the low-frequency subband and Y(2n+1) is the high-frequency subband. In equation (1), floor{X} denotes the largest integer not exceeding X. FIG. 7 schematically shows this discrete wavelet transform.
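Equation (1) can be tried out with a short one-dimensional sketch. Two points are assumptions rather than statements of the patent: the input length is taken to be even, and samples beyond the ends are obtained by mirroring (X(-1) = X(1), X(N) = X(N-2), Y(-1) = Y(1)), which is the boundary handling commonly used with this kind of lifting transform. The function name `dwt53_forward` is hypothetical.

```python
def dwt53_forward(x):
    """One level of the integer lifting transform of equation (1).

    Returns (low, high) with low[n] = Y(2n) and high[n] = Y(2n+1).
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("this sketch expects an even length of at least 2")
    half = n // 2
    # High-frequency samples first, because Y(2n) depends on Y(2n-1) and Y(2n+1).
    high = []
    for k in range(half):
        right = x[2 * k + 2] if 2 * k + 2 < n else x[n - 2]   # mirror X(N) -> X(N-2)
        high.append(x[2 * k + 1] - (x[2 * k] + right) // 2)
    low = []
    for k in range(half):
        left = high[k - 1] if k > 0 else high[0]              # mirror Y(-1) -> Y(1)
        low.append(x[2 * k] + (left + high[k] + 2) // 4)
    return low, high


low, high = dwt53_forward([5, 3, 8, 1])
```

Python's `//` rounds toward negative infinity, so it matches floor{ } in equation (1) for negative values as well.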

Equation (1) applies to one-dimensional data. Applying this transform in the horizontal direction and then the vertical direction gives a two-dimensional transform that divides the data into the four subbands LL, HL, LH, and HH shown in FIG. 8(a), where L denotes a low-frequency subband and H a high-frequency subband. The LL subband is then divided into four subbands in the same way (FIG. 8(b)), and the LL subband among those is divided into four more (FIG. 8(c)), giving ten subbands in total. The ten subbands are called HH1, HL1, ... as in FIG. 8(c). The number in each subband name indicates the level of that subband: the level 1 subbands are HL1, HH1, and LH1; the level 2 subbands are HL2, HH2, and LH2; and the level 3 subbands are HL3, HH3, and LH3. The LL subband is the level 0 subband; since there is only one LL subband, it carries no suffix. A decoded image obtained by decoding the subbands from level 0 to level n is called a level n decoded image. The higher the level of a decoded image, the higher its resolution.

The transform coefficients of the ten subbands are temporarily stored in the buffer 304 and output to the coefficient quantization unit 305 in the order LL, HL1, LH1, HH1, HL2, LH2, HH2, HL3, LH3, HH3, that is, from the lowest-level subband to the highest-level subband.
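The subband naming and output order described above can be made concrete with a small sketch that lists each subband and its size for a given tile. It assumes the tile dimensions are divisible by 2 to the power of the number of decompositions; the function name `subband_layout` and the 64 x 64 tile size are illustrative only.

```python
def subband_layout(tile_w, tile_h, levels=3):
    """List (name, width, height) in the output order LL, HL1, LH1, HH1, ..."""
    if tile_w % (1 << levels) or tile_h % (1 << levels):
        raise ValueError("tile size must be divisible by 2**levels in this sketch")
    layout = [("LL", tile_w >> levels, tile_h >> levels)]   # level 0
    for level in range(1, levels + 1):
        # Level 1 comes from the last decomposition and is therefore the smallest.
        shift = levels - level + 1
        w, h = tile_w >> shift, tile_h >> shift
        layout += [(f"HL{level}", w, h), (f"LH{level}", w, h), (f"HH{level}", w, h)]
    return layout


layout = subband_layout(64, 64)
```

For a 64 x 64 tile this gives an 8 x 8 LL subband, 8 x 8 level 1 subbands, 16 x 16 level 2 subbands, and 32 x 32 level 3 subbands, whose areas add up to the area of the tile.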

The coefficient quantization unit 305 quantizes the transform coefficients of each subband output from the buffer 304 with the quantization step defined for each frequency component, and outputs the quantized values (coefficient quantization values) to the entropy encoding unit 306 (step S406). With X the coefficient value and q the quantization step for the frequency component to which the coefficient belongs, the quantized coefficient value Q(X) is given by equation (2):
Q(X) = floor{ (X/q) + 0.5 } ... (2)

FIG. 9 shows the correspondence between frequency components and quantization steps in this embodiment. As the figure shows, higher-level subbands are given larger quantization steps. The quantization step for each subband is assumed to be stored in advance in a memory (not shown) such as a RAM or ROM. After all transform coefficients in one subband have been quantized, the coefficient quantization values are output to the entropy encoding unit 306 and the inverse coefficient quantization unit 312.
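The quantization of equation (2) with a per-subband step can be sketched as below. The step values in `STEPS` are placeholders chosen only so that higher levels get larger steps; the actual values of FIG. 9 are not reproduced in the text, and the names `quantize` and `quantize_subband` are hypothetical.

```python
import math

# Hypothetical step table: larger steps for higher-level subbands (not the FIG. 9 values).
STEPS = {"LL": 1, "HL1": 2, "LH1": 2, "HH1": 2,
         "HL2": 4, "LH2": 4, "HH2": 4,
         "HL3": 8, "LH3": 8, "HH3": 8}


def quantize(x, q):
    """Equation (2): Q(X) = floor(X / q + 0.5)."""
    return math.floor(x / q + 0.5)


def quantize_subband(coeffs, name, steps=STEPS):
    """Quantize every coefficient of one subband with that subband's step."""
    q = steps[name]
    return [quantize(x, q) for x in coeffs]


quantized = quantize_subband([13, -13, 14, -14, 0], "HL2")
```

Because equation (2) uses floor rather than rounding away from zero, a value exactly halfway between two levels moves upward for both signs: 14 becomes 4 while -14 becomes -3 with a step of 4.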

The inverse coefficient quantization unit 312 uses the quantization steps of FIG. 9 to inversely quantize the coefficient quantization values according to equation (3) (step S407):
Y = q * Q ... (3)

Here, q is the quantization step, Q is the coefficient quantization value, and Y is the inversely quantized value.
The inverse discrete wavelet transform unit 313 applies an inverse discrete wavelet transform to the inversely quantized values according to equation (4) (step S408):
X(2n) = Y(2n) - floor{ (Y(2n-1) + Y(2n+1) + 2) / 4 }
X(2n+1) = Y(2n+1) + floor{ (X(2n) + X(2n+2)) / 2 } ... (4)
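Steps S407 and S408 can be sketched in one dimension as follows. As an assumption not stated in the patent, samples beyond the ends are mirrored (Y(-1) = Y(1), X(N) = X(N-2)) and the two subbands have equal length; the names `dequantize` and `dwt53_inverse` are hypothetical. The coefficient values in the example were worked out by hand from equation (1) for the input [5, 3, 8, 1] under the same mirroring assumption.

```python
def dequantize(qvals, q):
    """Equation (3): Y = q * Q for every coefficient quantization value."""
    return [q * v for v in qvals]


def dwt53_inverse(low, high):
    """Undo one level of the lifting transform following equation (4).

    low[n] = Y(2n), high[n] = Y(2n+1); returns the samples X(0), X(1), ...
    """
    half = len(low)
    if half == 0 or len(high) != half:
        raise ValueError("this sketch expects two subbands of equal, non-zero length")
    n = 2 * half
    x = [0] * n
    # Even samples first: X(2n) = Y(2n) - floor((Y(2n-1) + Y(2n+1) + 2) / 4).
    for k in range(half):
        left = high[k - 1] if k > 0 else high[0]              # mirror Y(-1) -> Y(1)
        x[2 * k] = low[k] - (left + high[k] + 2) // 4
    # Odd samples next: X(2n+1) = Y(2n+1) + floor((X(2n) + X(2n+2)) / 2).
    for k in range(half):
        right = x[2 * k + 2] if 2 * k + 2 < n else x[n - 2]   # mirror X(N) -> X(N-2)
        x[2 * k + 1] = high[k] + (x[2 * k] + right) // 2
    return x


restored = dwt53_inverse([4, 6], [-3, -7])
```

With a quantization step of 1 the round trip is exact; with larger steps, `dequantize` returns multiples of q and the restored samples only approximate the input.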

The resulting decoded pixels are then recorded in the frame memory 311 (step S409).

Meanwhile, the entropy encoding unit 306 entropy-encodes the input coefficient quantization values (step S410). First, as shown in FIG. 10, each subband, which is a set of input coefficient quantization values, is divided into rectangles called "code blocks". The code block size is set to, for example, 2^m × 2^n (m and n are integers of 2 or more). Each code block is further divided into bit planes as shown in FIG. 11. The bits of each bit plane are then classified into three types according to predetermined classification rules, as shown in FIG. 12, and three kinds of coding passes are generated, each collecting bits of the same type: the significance propagation pass, which codes insignificant coefficients that have significant coefficients around them; the magnitude refinement pass, which codes significant coefficients; and the cleanup pass, which codes the remaining coefficient information.

The input coefficient quantization values are entropy-encoded by binary arithmetic coding in units of the coding passes obtained here, and entropy-encoded values are generated.

Within one code block, entropy encoding proceeds from the upper bit planes to the lower bit planes, and within one bit plane of that code block, the three kinds of passes shown in FIG. 12 are encoded in order from the top. FIG. 12 shows the classification of coding passes in the fourth bit plane of FIG. 11.
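The bit-plane decomposition of a code block can be sketched as below. The sketch covers only the split into planes, most significant plane first; it does not attempt the three-pass classification or the arithmetic coder. Treating the sign separately from the magnitude is an assumption of this sketch, and the name `bit_planes` is hypothetical.

```python
def bit_planes(code_block):
    """Split a code block of quantized values into sign bits and magnitude bit planes.

    Returns (signs, planes); planes[0] is the most significant bit plane.
    """
    magnitudes = [abs(v) for row in code_block for v in row]
    depth = max(max(magnitudes).bit_length(), 1)
    planes = [[[(abs(v) >> bit) & 1 for v in row] for row in code_block]
              for bit in range(depth - 1, -1, -1)]
    signs = [[1 if v < 0 else 0 for v in row] for row in code_block]
    return signs, planes


# A 4 x 4 code block, the smallest size allowed by m = n = 2.
block = [[5, -3, 0, 2],
         [1, 0, 0, -6],
         [0, 0, 7, 0],
         [4, 0, 0, 1]]
signs, planes = bit_planes(block)
```

Encoding the planes in list order corresponds to processing the code block from the upper bit planes to the lower ones, so dropping trailing planes discards the least significant information first.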

The entropy-encoded coding passes are output to the tile encoded data generation unit 307.

The tile encoded data generation unit 307 forms one or more layers from the input coding passes and generates tile encoded data with these layers as the unit of data (step S411). The layer structure is described below.

As shown in FIG. 13, the tile encoded data generation unit 307 collects entropy-encoded coding passes from a plurality of code blocks in a plurality of subbands and forms layers from them. FIG. 13 shows the case where five layers are generated. When coding passes are taken from any code block, they are always selected in order starting from the uppermost coding pass remaining in that code block, as shown in FIG. 14. The tile encoded data generation unit 307 then arranges the generated layers in order from the uppermost layer, as shown in FIG. 15, and prepends a tile header to generate the tile encoded data. This header stores information identifying the tile, the code length of the tile encoded data, the various parameters used for compression, and so on. The tile encoded data generated in this way is output to the frame encoded data generation unit 308.
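The rule that coding passes are always taken from the top of each code block can be sketched with a small layer builder. How many passes each code block contributes to each layer is not specified in the text, so the allocation table below is purely an assumption, as are the names `build_layers`, `cb0`, and `cb1`.

```python
def build_layers(passes_by_block, passes_per_layer):
    """Form layers by taking coding passes from the top of each code block.

    passes_by_block: {block_id: [pass, ...]}, uppermost pass first.
    passes_per_layer: one {block_id: count} dict per layer (hypothetical allocation).
    """
    taken = {block_id: 0 for block_id in passes_by_block}
    layers = []
    for allocation in passes_per_layer:
        layer = []
        for block_id, count in allocation.items():
            start = taken[block_id]
            chosen = passes_by_block[block_id][start:start + count]
            layer.extend((block_id, p) for p in chosen)
            taken[block_id] = start + len(chosen)
        layers.append(layer)
    return layers


passes = {"cb0": ["p0", "p1", "p2"], "cb1": ["p0", "p1"]}
layers = build_layers(passes, [{"cb0": 2, "cb1": 1}, {"cb0": 1, "cb1": 1}])
tile_encoded_data = ["tile header"] + layers   # header first, then layers from the top
```

Because every layer continues where the previous one stopped in each code block, discarding the last layers removes only the least significant passes.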

Next, in step S412, whether tile data remains to be encoded is determined by comparing the value of the counter i with the tile number. If tile data remains (that is, i < N-1), the counter i is incremented by 1 in step S413, and the process returns to step S405 to repeat the processing up to step S412 for the next tile. If no tile data remains (that is, i = N-1), the process proceeds to step S426.

In step S426, the frame encoded data generation unit 308 arranges the tile encoded data shown in FIG. 15 in a predetermined order (for example, tile number order) as shown in FIG. 16, and prepends a header to generate the frame encoded data. This header stores the vertical and horizontal sizes of the input image and of the tiles, the various parameters used for compression, and so on. The frame encoded data generated in this way is output from the frame encoded data output unit 309 to the recording unit 212.

In the above description, the processing of steps S407 to S409 is performed before the processing of steps S410 and S411, but the two may be performed in the reverse order or in parallel.

[When the encoding target frame is a P frame]
Next, the processing performed when the encoding target frame is determined in step S404 to be a P frame will be described. In this case, as described above, the tile dividing unit 302 copies the frame data to the MC prediction unit 310, and the MC prediction unit 310 performs MC prediction between the frame recorded in the frame memory 311 (the previous frame) and the encoding target frame (step S414). Here, as shown in FIG. 17, the data referenced by MC prediction is limited to the ROI tiles of the previous frame. This is to avoid the image quality degradation of non-ROI tiles caused by the accumulation of data discarded in the tile encoded data generation unit.
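The restriction of the motion search to the ROI tiles of the previous frame can be sketched with a simple block-matching search. The patent does not name a matching criterion or search strategy; the sum of absolute differences (SAD), the exhaustive search, and the rectangular ROI given as (left, top, right, bottom) are all assumptions of this sketch, as are the names `sad` and `search_in_roi`.

```python
def sad(block, ref, x0, y0):
    """Sum of absolute differences between block and ref at offset (x0, y0)."""
    return sum(abs(block[y][x] - ref[y0 + y][x0 + x])
               for y in range(len(block)) for x in range(len(block[0])))


def search_in_roi(block, prev_frame, roi_rect):
    """Find the best match for block using only pixels inside roi_rect.

    roi_rect = (left, top, right, bottom), right and bottom exclusive.
    Returns ((x, y), residual) where residual = block - matched pixels.
    """
    bh, bw = len(block), len(block[0])
    left, top, right, bottom = roi_rect
    best = None
    for y0 in range(top, bottom - bh + 1):       # candidates stay inside the ROI
        for x0 in range(left, right - bw + 1):
            cost = sad(block, prev_frame, x0, y0)
            if best is None or cost < best[0]:
                best = (cost, x0, y0)
    if best is None:
        raise ValueError("ROI is smaller than the block")
    _, bx, by = best
    residual = [[block[y][x] - prev_frame[by + y][bx + x] for x in range(bw)]
                for y in range(bh)]
    return (bx, by), residual


prev_frame = [[10 * y + x for x in range(8)] for y in range(8)]
position, residual = search_in_roi([[33, 34], [43, 44]], prev_frame, (2, 2, 6, 6))
```

A block whose true match lies outside the ROI still receives a reference inside it, at the price of a larger residual. The intent is that every prediction refers to data encoded with high image quality.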

The subtractor 314 calculates the difference between the previous frame and the encoding target frame based on the prediction result (step S415). The resulting difference data is then processed in the same way as an I frame: discrete wavelet transform (step S416), quantization (step S417), inverse quantization (step S418), inverse discrete wavelet transform (step S419), entropy encoding (step S422), tile encoded data generation (step S423), tile number determination (step S424), and frame encoded data generation (step S426).

The difference from the processing of an I frame is that the sum calculator 315 adds the difference data to the previous frame to restore the encoding target frame (step S420), and the decoded frame obtained in this way is recorded in the frame memory 311 (step S421). The MC prediction in step S414 described above uses the decoded frame recorded here.
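The local decoding loop of steps S415 to S421 can be illustrated with the sketch below. To keep it short, the discrete wavelet transform and its inverse are omitted and a simple dead-zone scalar quantizer with a single step size stands in for the quantization; these are assumptions of the example, as are the names quantize and encode_p_samples.

```python
def quantize(d, step):
    """Dead-zone scalar quantizer: sign(d) * floor(|d| / step)."""
    q = abs(d) // step
    return -q if d < 0 else q

def encode_p_samples(cur, prediction, step):
    """Return (quantized difference data, decoded samples for frame memory).

    cur and prediction are flat lists of sample values of equal length.
    """
    # Step S415 (difference) followed by quantization.
    residual_q = [quantize(c - p, step) for c, p in zip(cur, prediction)]
    # Inverse quantization, then step S420: add back to the prediction.
    decoded = [p + q * step for p, q in zip(prediction, residual_q)]
    return residual_q, decoded

residual_q, decoded = encode_p_samples([10, 20, 33], [8, 25, 30], 4)
```

The point of the sketch is that the frame memory receives the decoded samples rather than the original ones, so the encoder predicts from the same data a decoder would have.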

The processing of steps S414 to S423 is repeated, with the counter i incremented by 1 in step S425, until it is determined in step S424 that no tile data remains to be encoded.

The unit of data used in prediction may be a tile, or a block obtained by further dividing a tile.

In FIG. 4, the processing of steps S418 to S421 is performed before the processing of steps S422 and S423, but the two may be performed in the reverse order or in parallel.

As described above, according to the first embodiment, setting only the ROI tiles of the previous frame as the MC prediction destination avoids the degradation of P-frame image quality caused by accumulated data discarding in the tile encoded data generation unit.

(Second Embodiment)
The first embodiment showed a method of avoiding the degradation of P-frame image quality caused by accumulated data discarding in the tile encoded data generation unit, by limiting the prediction destination data to ROI tiles.

In general, a user designates a certain object as the ROI, and the tiles containing that object are determined to be ROI tiles. The pixel distribution and characteristics of the ROI tiles are therefore similar between consecutive frames, so prediction between ROI tiles can be expected to achieve high coding efficiency. Prediction between ROI tiles and non-ROI tiles, however, may not achieve such high coding efficiency, and if it does not, the MC prediction processing may be wasted. In the second embodiment, therefore, MC prediction is performed only between ROI tiles. The second embodiment differs from the first embodiment only in the processing of step S415 of the encoding process shown in FIG. 4, so only that point is described.

FIG. 18 shows the processing in the MC prediction unit 310 that is performed in step S415 in the second embodiment. As shown in FIG. 18, MC prediction is performed only between ROI tiles, and no MC prediction is performed for non-ROI tiles.

As described above, in the second embodiment, performing MC prediction only between ROI tiles eliminates wasted computation while avoiding the degradation of P-frame image quality.

(Third Embodiment)
In the third embodiment, the ROI region is set in the discrete wavelet transform coefficient space rather than for each tile. The degradation of P-frame image quality is then avoided by limiting the prediction destination to ROI coefficients.

FIG. 19 is a block diagram of the encoding unit 206 in the third embodiment. The moving image processing apparatus is assumed to be the same as that shown in FIG. 2. Compared with the block diagram of the encoding unit 206 in the first embodiment, the configuration shown in FIG. 19 replaces the ROI tile determination unit 317 with the ROI determination unit 417. The difference is that the ROI tile determination unit 317 determines the region in units of tiles, whereas the ROI determination unit 417 determines it in units of pixels. For example, the ROI tile determination unit 317 determines the tiles that contain the region extracted by an object extraction unit (not shown) to be ROI tiles, whereas the ROI determination unit 417 determines the extracted region itself, in units of pixels, to be the ROI region.

Further differences are that the position of the subtractor 314 has changed because the data to be predicted has changed from pixels to coefficients, that the ROI unit 418 and the inverse ROI unit 419 have been added, and that the inverse discrete wavelet transform unit 313 is no longer needed. The inverse ROI unit 419 processes the coefficients from (c) to (a) in FIG. 21.

FIG. 20 is a flowchart showing the encoding process in the third embodiment. Processing that is the same as in the flowchart of FIG. 3 is given the same reference numbers, and its description is omitted.

[When the encoding target frame is an I frame]
In the third embodiment, when the encoding target frame is an I frame, the transform coefficients produced by the discrete wavelet transform unit 303 are quantized (step S406), and then, in step S506, the ROI unit 418 changes each coefficient quantization value according to whether or not it belongs to the ROI, based on the following equation (5).
Q' = Q * 2^B (Q: absolute value of a coefficient quantization value obtained from pixels in the ROI)
Q" = Q (Q: absolute value of any other coefficient quantization value) ... (5)

Here, B is given for each subband, and is set so that, in the subband of interest, each Q' is larger than any Q". In other words, the bits are shifted up so that the bits making up the original coefficient quantization value of Q' and the bits making up the original coefficient quantization value of Q" never occupy the same digit position.
With the above processing, only the coefficient quantization values related to the ROI are shifted up by B bits.
FIG. 21(a) shows the ROI and non-ROI in each subband, and FIGS. 21(b) and 21(c) are conceptual diagrams showing how the coefficient quantization values change with the shift-up. In FIG. 21(b), each of three subbands contains three coefficient quantization values, and the shaded values are those that make up the ROI. After the shift-up, they appear as shown in FIG. 21(c).

In step S507, the inverse ROI unit 419 shifts down the ROI region whose bits were shifted up by the ROI unit 418.
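The shift-up of the ROI unit 418 and the shift-down of the inverse ROI unit 419 can be sketched for one subband as follows. The rule used here to pick B (the bit length of the largest non-ROI magnitude) is one assumption that satisfies the stated condition for nonzero ROI values; the text does not specify how B is computed. The function names and sample values are hypothetical.

```python
def choose_shift(non_roi_values):
    """Smallest B such that 2**B exceeds every non-ROI magnitude."""
    return max(non_roi_values, default=0).bit_length()

def shift_up(values, roi_mask, b):
    """Shift the magnitudes of ROI coefficients up by b bits."""
    return [v << b if in_roi else v for v, in_roi in zip(values, roi_mask)]

def shift_down(values, roi_mask, b):
    """Undo shift_up for the ROI coefficients."""
    return [v >> b if in_roi else v for v, in_roi in zip(values, roi_mask)]

# Magnitudes of coefficient quantization values in one subband (sample data).
values = [3, 5, 1, 6]
roi_mask = [True, False, True, False]

b = choose_shift([v for v, m in zip(values, roi_mask) if not m])
shifted = shift_up(values, roi_mask, b)
restored = shift_down(shifted, roi_mask, b)
```

With this choice of B, every non-ROI magnitude fits in the lowest B bits, while the shifted ROI magnitudes only use bit B and above, so the two groups never share a digit position. An ROI value of zero stays zero after the shift.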

[When the encoding target frame is a P frame]
When the encoding target frame is a P frame, the third embodiment first performs the discrete wavelet transform in step S514. MC prediction is then performed in the discrete wavelet transform coefficient space in step S515. Here, as shown in FIG. 22, the MC prediction unit 310 limits the prediction target data to only the DWT coefficients related to the ROI.

In step S516, the difference between the previous frame and the encoding target frame (difference data) is calculated based on the prediction result. The coefficient quantization unit 305 quantizes this difference data (step S417). Then, in step S517, the ROI unit 418 changes each coefficient quantization value of the difference data according to whether or not it belongs to the ROI, based on equation (5) above.

In step S518, the inverse ROI unit 419 shifts down the ROI region whose bits were shifted up by the ROI unit 418.

As described above, in the third embodiment, performing MC prediction using only the coefficients related to the ROI avoids the degradation of P-frame image quality.

(Other Embodiments)
The first to third embodiments describe the invention in terms of the discrete wavelet transform, but embodiments that apply the discrete cosine transform also fall within the scope of the present invention.

The present invention may be applied as part of a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, and a printer), or as part of an apparatus consisting of a single device (for example, a copying machine or a digital camera).

Needless to say, the object of the present invention is also achieved by supplying a system or apparatus with a storage medium (or recording medium) on which the program code of software that realizes the functions of the above-described embodiments is recorded, and having the computer (or CPU or MPU) of that system or apparatus read and execute the program code stored in the storage medium. In this case, the program code read from the storage medium itself realizes the functions of the above-described embodiments, and the storage medium storing the program code constitutes the present invention. The functions of the above-described embodiments are realized not only when the computer executes the program code it has read; needless to say, the invention also covers the case where an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing. Examples of the storage medium for storing the program code include a flexible disk, hard disk, ROM, RAM, magnetic tape, nonvolatile memory card, CD-ROM, CD-R, DVD, optical disk, magneto-optical disk, and MO. A computer network such as a LAN (local area network) or WAN (wide area network) can also be used to supply the program code.

Furthermore, needless to say, the invention also covers the case where the program code read from the storage medium is written into a memory provided on a function expansion card inserted into the computer or in a function expansion unit connected to the computer, a CPU or the like provided on that function expansion card or in that function expansion unit then performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.

When the present invention is applied to the above storage medium, the storage medium stores program code corresponding to the flowcharts described above.

FIG. 1 is a diagram showing the concept of a moving image to be encoded in the embodiments of the present invention.
FIG. 2 is a block diagram showing the configuration of the moving image processing apparatus in the embodiments of the present invention.
FIG. 3 is a block diagram showing the configuration of the encoding unit in the first embodiment of the present invention.
FIG. 4 is a flowchart showing the encoding process in the first embodiment of the present invention.
FIG. 5 is an explanatory diagram of tile division.
FIG. 6 is a diagram showing an example of ROI tiles.
FIG. 7 is an explanatory diagram of the one-dimensional discrete wavelet transform.
FIG. 8: (a) is a diagram of decomposition into four subbands, (b) is a diagram in which the LL subband of (a) is further decomposed into four subbands, and (c) is a diagram in which the LL subband of (b) is further decomposed into four subbands.
FIG. 9 is an explanatory diagram of the quantization step.
FIG. 10 is an explanatory diagram of code block division.
FIG. 11 is an explanatory diagram of bit-plane division.
FIG. 12 is an explanatory diagram of coding passes.
FIG. 13 is an explanatory diagram of layer generation.
FIG. 14 is an explanatory diagram of layer generation.
FIG. 15 is an explanatory diagram of the structure of tile encoded data.
FIG. 16 is an explanatory diagram of the structure of frame encoded data.
FIG. 17 is a diagram showing the concept of the MC prediction destination data in the first embodiment of the present invention.
FIG. 18 is a diagram showing the concept of the MC prediction destination data in the second embodiment of the present invention.
FIG. 19 is a block diagram showing the configuration of the encoding unit in the third embodiment of the present invention.
FIG. 20 is a flowchart showing the encoding process in the third embodiment of the present invention.
FIG. 21: (a) shows the ROI and non-ROI in each subband, and (b) and (c) are conceptual diagrams showing the change in coefficient quantization values caused by the shift-up.
FIG. 22 is a diagram showing the concept of the MC prediction destination data in the third embodiment of the present invention.
FIG. 23 is a block diagram showing an encoding unit based on the JPEG2000 encoding method.
FIG. 24 is a block diagram showing a decoding unit based on the JPEG2000 encoding method.
FIG. 25 is a block diagram showing an encoding unit based on the MPEG encoding method.
FIG. 26 is a diagram showing the concept of conventional MC prediction destination data.

Explanation of symbols

301 Frame data input unit
302 Tile division unit
303 Discrete wavelet transform unit
304 Buffer
305 Coefficient quantization unit
306 Entropy encoding unit
307 Tile encoded data generation unit
308 Frame encoded data generation unit
309 Frame encoded data output unit
310 Motion compensation (MC) prediction unit
311 Frame memory
312 Inverse coefficient quantization unit
313 Inverse discrete wavelet transform unit
314 Subtractor
315 Sum calculator
316 Frame attribute determination unit
317 ROI tile determination unit
417 ROI determination unit
418 ROI unit

Claims

1. A moving image encoding apparatus that encodes a moving image using inter-frame motion prediction, comprising:
dividing means for dividing each frame into a plurality of divided areas;
determination means for determining an important area within the frame;
inter-frame prediction means for searching, within the range of the important area of the previous frame, for a highly correlated pixel set for each divided area of the encoding target frame, and for outputting difference data by taking the difference between the data of each divided area and the data of the found pixel set; and
encoding means for encoding the difference data.

2. The moving image encoding apparatus according to claim 1, wherein the encoding means discards data preferentially from areas outside the important area in order to adjust the code amount.

3. The moving image encoding apparatus according to claim 1 or 2, further comprising judgment means for judging whether the encoding target frame is a frame to be intra-frame encoded or a frame to be inter-frame encoded,
wherein, when the judgment means judges that the frame is to be intra-frame encoded, the processing by the inter-frame prediction means is not performed and the encoding means encodes the data of each divided area of the encoding target frame.

4. The moving image encoding apparatus according to any one of claims 1 to 3, wherein the inter-frame prediction means performs processing only on the important area determined by the determination means among the divided areas of the encoding target frame.

5. The moving image encoding apparatus according to claim 1, wherein the encoding means performs a discrete wavelet transform.

6. The moving image encoding apparatus according to claim 5, wherein the encoding means performs encoding in accordance with the JPEG2000 encoding method.

7. The moving image encoding apparatus according to claim 1, wherein the encoding means performs a discrete cosine transform.

8. A moving image encoding apparatus that encodes a moving image using inter-frame motion prediction, comprising:
dividing means for dividing each frame into a plurality of divided areas;
determination means for determining an important area within the frame;
transform means for performing a data transform on each divided area and generating transform coefficients;
inter-frame prediction means for searching, among the transform coefficients corresponding to the range of the important area of the previous frame, for a highly correlated transform coefficient for each transform coefficient of each divided area of the encoding target frame, and for outputting difference data by taking the difference between the transform coefficient of each divided area and the found transform coefficient; and
encoding means for encoding the difference data.

9. The moving image encoding apparatus according to claim 8, wherein the encoding means discards data preferentially from areas outside the important area in order to adjust the code amount.

10. The moving image encoding apparatus according to claim 8 or 9, further comprising judgment means for judging whether the encoding target frame is a frame to be intra-frame encoded or a frame to be inter-frame encoded,
wherein, when the judgment means judges that the frame is to be intra-frame encoded, the processing by the inter-frame prediction means is not performed and the encoding means encodes the transform coefficients of each divided area of the encoding target frame.

11. The moving image encoding apparatus, wherein the inter-frame prediction means performs processing only on the transform coefficients of the important area determined by the determination means among the divided areas of the encoding target frame.

12. The moving image encoding apparatus according to claim 8, wherein the transform means performs a discrete wavelet transform.

13. The moving image encoding apparatus according to claim 8, wherein the transform means performs a discrete cosine transform.

14. A moving image encoding method for encoding a moving image using inter-frame motion prediction, comprising:
a dividing step of dividing each frame into a plurality of divided areas;
a determination step of determining an important area within the frame;
an inter-frame prediction step of searching, within the range of the important area of the previous frame, for a highly correlated pixel set for each divided area of the encoding target frame, and outputting difference data by taking the difference between the data of each divided area and the data of the found pixel set; and
an encoding step of encoding the difference data.

15. A moving image encoding method for encoding a moving image frame by frame using inter-frame motion prediction, comprising:
a dividing step of dividing each frame into a plurality of divided areas;
a determination step of determining an important area within the frame;
a transform step of performing a data transform on each divided area and generating transform coefficients;
an inter-frame prediction step of searching, among the transform coefficients corresponding to the range of the important area of the previous frame, for a highly correlated transform coefficient for each transform coefficient of each divided area of the encoding target frame, and outputting difference data by taking the difference between the transform coefficient of each divided area and the found transform coefficient; and
an encoding step of encoding the difference data.

16. A program executable by an information processing apparatus, comprising program code for realizing the moving image encoding method according to claim 14.

17. A storage medium readable by an information processing apparatus, storing the program according to claim 16.