JP3655088B2

JP3655088B2 - Wavelet transform device and encoding / decoding device

Info

Publication number: JP3655088B2
Application number: JP10437798A
Authority: JP
Inventors: 啓行 ▲高▼橋; 正喜佐藤
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1997-12-19
Filing date: 1998-04-15
Publication date: 2005-06-02
Anticipated expiration: 2018-04-15
Also published as: JPH11239060A

Description

【０００１】
【発明の属する技術分野】
本発明は、データ圧縮及び伸長の分野に係り、特に、ウェーブレット変換を利用する符号化／復号化装置及びウェーブレット変換装置に関する。
【０００２】
【従来の技術】
データ圧縮は、大量のデータの蓄積や伝送のために非常に有用なツールである。例えば、文書のファクシミリ伝送や、ワールドワイドウェブのような画像の伝送に要する時間は、圧縮を使って画像の再生に必要とされるビット数を減らすと、飛躍的に短縮される。
【０００３】
従来より、多様なデータ圧縮手法が存在している。最も広く普及している圧縮方式としてＪＰＥＧ（Ｊoint Ｐhotographics Ｅxperts Ｇroup）の圧縮方式がある。ＪＰＥＧの圧縮方式においては、入力シンボルまたは輝度データは量子化されてから出力符号語へ変換される。量子化は、データの重要な特徴量を保存しつつ、重要でない特徴量を除去することを目的としている。量子化に先立ち、エネルギー集中をするために変換が用いられるが、ＪＰＥＧではＤＣＴ(ＤiscreteＣosine Ｔransform）が採用されている。ところが、ＤＣＴを用いるＪＰＥＧ方式に対し様々な欠点が指摘されている。例えば、ブロックノイズやモスキートノイズ（蚊が飛んでいるように見えるところから、このように呼ばれる）である。画像信号処理においては、これらの欠点を解消する効率的かつ高精度のデータ圧縮符号化方式を追求することに関心が集まっている。その方式の中に、ウェーブレット（wavelet）ピラミッド処理方式がある。
【０００４】
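When the wavelet transform is applied to a two-dimensional signal such as an image signal, the input signal is first split, using a horizontal low-pass filter HL (Horizontal Low) and a horizontal high-pass filter HH (Horizontal High), into a horizontal low-band signal (S (smooth) coefficients) and a horizontal high-band signal (D (detail) coefficients). A vertical low-pass filter VL (Vertical Low) and a vertical high-pass filter VH (Vertical High) are then applied to the S and D coefficients, splitting them into a horizontal-low/vertical-low signal (SS coefficients), a horizontal-low/vertical-high signal (SD coefficients), a horizontal-high/vertical-low signal (DS coefficients), and a horizontal-high/vertical-high signal (DD coefficients). This series of operations is called a level, and the output of one horizontal pass followed by one vertical pass is called the level-1 output. The four kinds of signals above are called frequency band signals. When output at level 2 or higher is desired, the same processing is applied recursively to the SS coefficients. The level-2 output consists of seven frequency band signals: the SS coefficients, the 1SD and 2SD coefficients, the 1DS and 2DS coefficients, and the 1DD and 2DD coefficients. In the description above the filters are applied horizontally first and then vertically, but the order may be reversed.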
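The separable decomposition described in this paragraph can be sketched as follows. This is only an illustration: the pairwise average and difference used for the low-pass and high-pass filters are assumptions made for brevity (the filters actually assumed in this description are introduced later), and the function names `split_1d`, `level`, and `decompose` are hypothetical.

```python
def split_1d(seq):
    """Split an even-length sequence into (S, D) halves.

    Assumed filters, for illustration only: S is the pairwise integer
    average (low band) and D is the pairwise difference (high band).
    """
    half = len(seq) // 2
    s = [(seq[2 * i] + seq[2 * i + 1]) // 2 for i in range(half)]
    d = [seq[2 * i] - seq[2 * i + 1] for i in range(half)]
    return s, d


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def level(image):
    """One level: horizontal pass, then vertical pass -> SS, SD, DS, DD."""
    rows = [split_1d(r) for r in image]
    s_half = [s for s, _ in rows]   # horizontal low band (S coefficients)
    d_half = [d for _, d in rows]   # horizontal high band (D coefficients)

    def vertical(half):
        cols = [split_1d(c) for c in transpose(half)]
        low = transpose([lo for lo, _ in cols])
        high = transpose([hi for _, hi in cols])
        return low, high

    ss, sd = vertical(s_half)
    ds, dd = vertical(d_half)
    return ss, sd, ds, dd


def decompose(image, levels):
    """Apply `level` recursively to SS; returns (final SS, per-level bands)."""
    bands = []
    ss = image
    for _ in range(levels):
        ss, sd, ds, dd = level(ss)
        bands.append((sd, ds, dd))
    return ss, bands
```

With two levels, `decompose` yields one SS band plus three detail bands per level, that is, the seven frequency band signals mentioned above.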
【０００５】
図２８にレベル４までの処理を行う場合の従来の構成を示した。図中、１０００はウェーブレット変換部、１１００は符号化部である。符号化部１１００は、ウェーブレット変換部１０００の出力データを符号化して圧縮コードを出力し、あるいは、外部から入力する同様に圧縮されたコードを復号化して各レベルの周波数帯信号に伸長する機能を持つ。
【０００６】
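In FIG. 28, 1001 to 1012 are filters. Filters 1001, 1004, 1007, and 1010 (hereinafter written filter1H, filter2H, filter3H, and filter4H) are horizontal filters, each containing a horizontal low-pass filter HL and a horizontal high-pass filter HH. The digit 1 to 4 in each name is the level number, and H indicates a horizontal filter. Similarly, filters 1002 (filter1V1) and 1003 (filter1V2), 1005 (filter2V1) and 1006 (filter2V2), 1008 (filter3V1) and 1009 (filter3V2), and 1011 (filter4V1) and 1012 (filter4V2) are vertical filters, each containing a vertical low-pass filter VL and a vertical high-pass filter VH. In these names V indicates a vertical filter, the digit 1 to 4 before the V is the level number, a 1 after the V indicates a filter whose input is the horizontal low-band signal (S coefficients), and a 2 after the V indicates a filter whose input is the horizontal high-band signal (D coefficients). These filters may have any configuration, but the following description assumes that the horizontal low-pass filter HL and the vertical low-pass filter VL are 2-tap filters that operate on two data items. It also assumes that the horizontal high-pass filter HH and the vertical high-pass filter VH are 6-tap filters that operate on three sets of data in total: the S coefficient output by the low-pass filter HL or VL at the current position, the one before it, and the one after it.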
【０００７】
このようなフィルタを用いた場合の演算の例を図３０に示す。但し、この図におけるデータのマッピングは演算の方法を説明するためのものであり、実際のメモリへのマッピングは例えば図３２から図３５に示すようになることに注意されたい。図３０の（ａ）は水平方向フィルタの処理を説明するもので、［００］は０ライン目の０画素目のデータを意味し、［１２］は１ライン目の２画素目のデータを意味する（このようにライン、画素とも０番目から数えるものとする）。水平方向低域通過型フィルタＨＬの０画素目の出力［Ｓ００］は、［００］データ及び［０１］データから求められ、また、１画素目の出力［Ｓ０１］は［０２］データ及び［０３］データから求められる。これに対し、水平方向高域通過型フィルタＨＨの０画素目の出力［Ｄ００］は、［００］データの２つ前及び１つ前のデータ（実在しない）と、［００］データと、［０１］データと、［０２］データと、［０３］データとから求められる。ここで、実在しない［００］データの２つ前と１つ前のデータを得るため、ミラーと呼ばれる処理を施す。具体的には、データを鏡像関係で折り返す処理を行う。これにより、２つ前と１つ前のデータは［０１］データと［００］データとなる。このようにして、［Ｄ００］は６画素のデータから計算される。
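The mirror processing and the 2-tap/6-tap structure described above can be sketched as follows. The text does not give the filter coefficients, so the arithmetic in `forward_1d` is a placeholder chosen for illustration; only the data dependencies (S from one pair of samples, D from the current pair plus the S coefficients before and after it) and the fold-back at the edges follow the description. The names `mirror` and `forward_1d` are hypothetical.

```python
def mirror(seq, i):
    """Read seq[i], folding out-of-range indices back as a mirror image."""
    n = len(seq)
    if i < 0:
        return seq[-i - 1]          # -1 -> 0, -2 -> 1
    if i >= n:
        return seq[2 * n - i - 1]   # n -> n-1, n+1 -> n-2
    return seq[i]


def forward_1d(x):
    """Return (S, D) for one line of even length.

    S[n] uses the pair x[2n], x[2n+1] (2 taps).
    D[n] uses the current pair plus S[n-1] and S[n+1], i.e. three pairs
    of input samples (6 taps); missing neighbours are mirrored.
    The weights below are assumed for illustration, not taken from the text.
    """
    half = len(x) // 2
    s = [(x[2 * n] + x[2 * n + 1]) // 2 for n in range(half)]
    d = []
    for n in range(half):
        prev_s = mirror(s, n - 1)   # S[-1] becomes S[0]
        next_s = mirror(s, n + 1)   # S[half] becomes S[half-1]
        d.append(x[2 * n] - x[2 * n + 1] + (next_s - prev_s + 2) // 4)
    return s, d
```

For the first output, `mirror` supplies S[0] in place of the nonexistent S[-1], which corresponds to reusing the [01] and [00] samples as the two samples before [00].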
【０００８】
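FIG. 30(b) illustrates the vertical filter processing. This processing is performed in the vertical direction on the S and D coefficients produced by the horizontal filter processing. Coefficients that do not actually exist are supplied by mirror processing, as in the horizontal case.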
【０００９】
図３１乃至図３５に、ウェーブレット処理の結果の格納方法を例示する。図３１はフレームメモリにラスタ順に格納されたイメージデータを示す。フレームメモリからデータを読み出して水平処理を行い、その結果を再びフレームメモリに書き込む。この書き込みの際に、未処理のデータに上書きしてしまわないように、例えば図３２に示すようなマッピングでＳ係数及びＤ係数を書き込んでいく。図３２において、［１Ｓ００］はレベル１のアドレス００のＳ係数を意味する。図３３は垂直処理を行った後の各係数を書き込む際のマッピングの例を示す。ここまでがレベル１の各係数の格納方法である。図３４はレベル２の水平方向の各係数の格納方法の例を示す。レベル２の処理は１ＳＳ係数に対してのみ行われるため、網掛けされた部分のデータは用いられないことに注意されたい。ついで、図３５に示すようなマッピングで、レベル２の各係数が格納され、レベル２の処理が終了する。以上の処理がレベル４まで繰り返される。
【００１０】
図２９は、図２８に示す構成のウェーブレット変換部１０００のタイミングチャートである。ただし、このタイミングチャートは処理手順の説明のために用いるものであり、メモリアクセス等に必要な時間は考慮されておらず、横軸（時間軸）のスケールはリニアでないことに注意されたい。また、以下の説明では、画素数もしくはライン数を０画素目もしくは０ライン目、というように０から数える。ｄａｔａより入力されるイメージデータ（ラスタデータ）は３２画素×３２ライン（０から３１）であり、１つのデータの区切り（×＝×）が１ラインに相当するものとする。
【００１１】
時刻ｔ０から、０ライン目のデータが０画素目から順次入力され、１画素目が入力されるとfilter１Ｈより０画素目の［１Ｓ００］データが出力される。ついで［１Ｓ０１］データが出力されると、Ｄ係数の計算に必要となる３組のＳ係数（［１Ｓ００］，［１Ｓ００］，［１Ｓ０１］）が揃い（１つ前のデータはミラー処理により得られた）、Ｄ係数［１Ｄ００］が出力される。これが１ライン分、繰り返される。なお、タイミングチャート上では１ラインの時間単位で示されているが、拡大すれば画素単位でのずれが生じていることに注意されたい。
【００１２】
時刻ｔ１から１ライン目のデータの入力が始まり、filter１Ｈより［１Ｓ１０］、［１Ｄ１０］、とＳ係数及びＤ係数が順次出力される。２Ｈの処理（レベル２の水平方向処理）が始まる。［１Ｓ１０］が出力された時点で１Ｖの処理（レベル１の垂直方向処理）が始まり、filter１Ｖ１より［１ＳＳ００］が、filter１Ｖ２より［１ＤＳ００］が出力される。［１Ｓ１１］が出力された時点でfilter１Ｖ１及びfilter１Ｖ２においてＤ係数の計算に必要な３組のデータが揃う。すなわち、filter１Ｖ１においては［１Ｓ１０］，［１Ｓ１０］，［１Ｓ１１］、filter１Ｖ２においては［１Ｄ１０］，［１Ｄ１０］，［１Ｄ１１］が揃い（１つ前のデータはミラー処理により得られる）、レベル１の出力データ［１ＳＳ００］，［１ＳＤ００］，［１ＤＳ００］，［１ＤＤ００］が得られる。これが１ライン分、繰り返される。
【００１３】
時刻ｔ２で、２Ｖの処理（レベル２の垂直方向処理）のための１ライン目の入力が始まり、２Ｖの処理が始まる。以下、同様のタイミング関係で時刻ｔ９まで処理が繰り返され、レベル４までの各周波数帯信号が出力される。
【００１４】
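The frequency band signals of each level obtained in this way are encoded and compressed by the encoding unit 1100. Because encoding normally involves bit-level processing, the frequency band signals must first be written to storage, which is generally semiconductor memory. The encoding unit 1100 refers to the frequency band signals written in the storage, performs bit processing to encode them, and outputs the generated code as code. Image data is restored from the compressed code by performing the operations described above in reverse order.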
【００１５】
なお、本発明に関連する符号化／復号化装置、ウェーブレット変換部、あるいはフィルタについてのより詳細な情報は、特開平８−１１６２６５号公報、特開平８−１３９９３５号公報、特開平９−２７７５２号公報、特開平９−２７９１２号公報などを参照されたい。また、類似の従来技術については、特開平３−２７６８７号公報、特開平５−１６７９９７号公報、特開平５−１８３３８６号を参照されたい。
【００１６】
次に、ウェーブレット変換のための処理時間について説明するが、ここではウェーブレット変換部１０００により生成される各周波数帯信号のストレージとして、一度に読み出し又は書き込みの一方しかできない一般的な半導体メモリ等を用いる場合として説明する。
【００１７】
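As explained with the timing chart of FIG. 29, the frequency band signals are output in parallel at the same time, so the writes to memory would also have to be performed in parallel; however, ordinary memory can read or write only one data item at a time. The range entry at the lower left of FIG. 29 shows, with arrows (←→), the span of processing time occupied by each of the processes 1H, 1V, ..., 4V between times t0 and t9. The r/w cycles entry below range gives the number of memory accesses needed in each span, that is, the sum of the writes and reads within that span; where different levels are processed at the same time, the figure shown is the sum of the counts for those levels. The values on the right side of FIG. 29 are the numbers of memory accesses (writes plus reads) needed for the horizontal or vertical processing of each level. As for the access counts: at every level, both horizontal and vertical processing read all of the data exactly once and overwrite all of it with filter output data, so the number of read/write accesses required is twice the total number of pixels.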
【００１８】
【発明が解決しようとする課題】
前述のように、符号化は各レベルの各周波数帯信号を用いて行われるのであるが、符号化は通常ビット処理が行われるため、ウェーブレット変換後の出力データを一旦ストレージに貯える必要があり、データを単に入力するのに要する時間に比べて数倍の処理時間がかかるという問題があった。前述したところから明らかなように、例えば入力されるイメージデータのサイズを３２画素×３２ライン、レベル数を４とした場合、イメージデータの入力に必要なサイクル数が１０２４＝３２×３２であるのに対し、必要な処理時間は５倍以上の５４４０サイクルとなる。入力データのサイズが増加すれば、処理時間はさらに大幅に増大することは明かである。例えば、６４画素×６４ラインの場合は、図２９に点線で示すように、１Ｈの処理が時刻ｔ１０まで行われる結果、パラレルに出力される区間が増加するため、処理時間は大幅に増大する。レベル数が増えた場合も同様に処理時間が大幅に増大する。
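The 5440-cycle figure can be reproduced with a short calculation. The sketch below assumes that every memory access is serialized and that each horizontal or vertical pass reads and rewrites all of its data once, as stated for the conventional configuration; the function name `conventional_accesses` is hypothetical.

```python
def conventional_accesses(width, height, levels):
    """Total read/write accesses when every access is serialized."""
    total = 0
    pixels = width * height
    for _ in range(levels):
        per_pass = 2 * pixels      # one read and one write per datum
        total += 2 * per_pass      # horizontal pass + vertical pass
        pixels //= 4               # the next level works on SS only
    return total


cycles = conventional_accesses(32, 32, 4)
input_cycles = 32 * 32
ratio = cycles / input_cycles      # how many times the plain input time
```

For a 32 x 32 image and four levels this gives 5440 accesses against 1024 input cycles, a ratio a little above 5.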
【００１９】
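In addition, because the frequency band signals of every level are output at the same time, pipeline processing was required. Since the data input timing differs for every filter, each filter had to be designed individually with a built-in controller suited to the place where it is used. These controllers could support only a single combination of pixel count and level count, so it was difficult to accommodate a change in the pixel count, the level count, or both. Furthermore, the sequences of the encoding process (converting image data into frequency band signals, the forward wavelet transform) and the decoding process (converting frequency band signals into image data, the inverse wavelet transform) normally differ, and careful design was needed to take this difference into account so that the overall pipeline operation did not break down.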
【００２０】
本発明の目的は、ウェーブレット変換装置及びウェーブレット変換を利用する符号化／復号化装置における前述の諸問題点を改善することにあり、より具体的に列挙すれば、より高速の動作を可能にする、パイプライン処理を不要にする、設計もしくは設計変更を容易にする、回路規模の増大を回避する等々である。
【００２１】
【課題を解決するための手段】
前記目的を達成するため、請求項１の発明によるウェーブレット変換装置は、ウェーブレット変換の各レベルに１対１に対応付けられた、互いに独立した複数のメモリと、ウェーブレット変換の全レベルに共通のフィルタ部と、装置の内部のデータ転送及び装置の外部とのデータ転送の制御並びに前記複数のメモリの動作の制御を行う制御部とを具備する構成とされる。請求項２の発明によれば、各レベルに１対１に対応付けられた複数のメモリはそれぞれ、読み出しアドレスと書き込みアドレスが独立し、読み出しと書き込みが同時に可能なタイプのメモリとされる。
【００２２】
前記目的を達成するため、請求項３の発明は、請求項１又は２記載のウェーブレット変換装置において、前記制御部を少なくとも、前記複数のメモリに対しアドレス及びイネーブル信号の制御信号を選択的に供給する部分、装置の内部のデータ転送及び装置の外部とのデータ転送を制御する部分、及びそれら２つの部分の制御及び装置外部との制御情報の授受を行う部分とに分割することを特徴とする。請求項４の発明によれば、制御部を構成する少なくとも３つの部分はそれぞれエンコード時専用の部分とデコード時専用の部分とに分割される。
【００２３】
前記目的を達成するため、請求項５の発明によるウェーブレット変換装置は、ウェーブレット変換の各レベルに１対１に対応付けられた、互いに独立した複数のメモリと、前記複数のメモリ中の最大のメモリと同じワード数を少なくとも有するウェーブレット変換の全レベルに共通のバッファメモリと、ウェーブレット変換の全レベルに共通のフィルタ部と、装置の内部のデータ転送及び装置の外部とのデータ転送並びに前記複数のメモリ及び前記バッファメモリの動作を制御する制御部とを具備する構成とされる。請求項６の発明によれば、請求項５記載のフィルタ部は独立した２つのフィルタ部から構成され、この２つのフィルタ部の一方を水平処理に割り当て、他方を垂直処理に割り当てる。
【００２４】
前記目的を達成するため、請求項７の発明によるウェーブレット変換装置は、ウェーブレット変換の各レベルに１対１に対応付けられた、互いに独立した複数のメモリと、ウェーブレット変換の全レベルに共通のラインメモリと、ウェーブレット変換の全レベルに共通のフィルタ部と、装置の内部のデータ転送及び装置の外部とのデータ転送並びに前記複数のメモリ及び前記ラインメモリの動作を制御する制御部とを具備する構成とされる。請求項８の発明によれば、請求項７記載のフィルタ部として水平処理用フィルタ部と垂直処理用フィルタ部を具備せしめる。
【００２５】
前記目的を達成するため、請求項９の発明は、請求項１又は２記載のウェーブレット変換装置において、前記複数のメモリをそれぞれウェーブレット変換係数の種類数と等しい個数の独立したメモリとし、前記フィルタ部として水平処理及び垂直処理の両方に利用される独立した２つのフィルタ部を具備することを特徴とする。
【００２６】
前記目的を達成するため、請求項１０の発明は、請求項５記載のウェーブレット変換装置において、前記複数のメモリの各メモリ及び前記バッファメモリをそれぞれウェーブレット変換係数の種類数と等しい個数の独立したメモリとし、前記フィルタ部として水平処理及び垂直処理の両方に利用される独立した２つのフィルタ部を具備することを特徴とする。
【００２７】
前記目的を達成するため、請求項１１の発明は、請求項１又は２記載のウェーブレット変換装置において、前記複数のメモリをそれぞれウェーブレット変換係数の全種類のビット深さの総和に等しいビット深さを少なくとも持つメモリとすることを特徴とする。
【００２８】
前記目的を達成するため、請求項１２の発明は、請求項５記載のウェーブレット変換装置において、前記複数のメモリの各メモリ及び前記バッファメモリをそれぞれウェーブレット変換係数の全種類のビット深さの総和に等しいビット深さを持つメモリとすることを特徴とする。
【００３２】
また、前記目的を達成するため、請求項１３の発明による符号化／復号化装置は、請求項１乃至１２のいずれか１項記載のウェーブレット変換装置と、該ウェーブレット変換装置と接続し、該ウェーブレット変換装置の出力データを符号化し、又は外部から入力される符号化データを復号化して該ウェーブレット変換装置に入力する符号化部を具備する構成とされる。
【００３３】
【発明の実施の形態】
以下、図面を用いて本発明の実施の形態を説明する。なお、説明の煩雑化を避けるため、複数の図面において、対応する部分には同一又は類似の参照符号が用いられる。また、複数の実施例で同一又は同様の構成については、関連した図面から適宜省略しまたは簡略化し、あるいは他の実施例の図面を援用する。
【００３４】
＜第１実施例＞
図１は本発明の第１実施例による符号化／復号化装置のブロック図である。この符号化／復号化装置は、ウェーブレット変換部１００と符号化部２００からなる。ウェーブレット変換部１００は、レベル４まで処理可能である。もちろん、レベル数はいくつに設定してもよく、レベル数が増えるほど本発明はその効果を発揮する。さらに、入力データ数（画素数）が増えるほど本発明はその効果を発揮する。
【００３５】
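The wavelet transform unit 100 comprises memories 102_1 to 102_4 (hereinafter mem1, mem2, mem3, and mem4) for storing the frequency band signal data of levels 1, 2, 3, and 4 respectively; a control unit (controller) 110 consisting of a control signal selection unit (s_mux) 111, a data selection unit (d_mux) 114, and a main control unit (main) 118 that controls them; and a filter unit (filter) 130 containing a low-pass filter and a high-pass filter.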
【００３６】
ウェーブレット変換部１００は、外部とのイメージデータの入出力ｄａｔａと符号化部２００との間の周波数帯信号データの入出力を持つ。エンコード時には、ウェーブレット変換部１００に一度に処理できる最大数（画素数）以下のイメージデータが外部から入出力ｄａｔａより入力される。このデータの最大数はｍｅｍ１のワード数Ｗ１と等しい。ｍｅｍ２のワード数Ｗ２は、Ｗ２＝Ｗ１×（１／４）に設定する。ｍｅｍ３のワード数Ｗ３はＷ３＝Ｗ２×（１／４）に設定し、ｍｅｍ４のワード数Ｗ４はＷ４＝Ｗ３×（１／４）に設定する。したがって、メモリｍｅｍ１〜ｍｅｍ４の総ワード数は、入力データ総数に対して
1+(1/4)+(1/4)^2+(1/4)^3 ＝１＋２１／６４
となり、約３３％の増加となる。
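The memory sizing above (W2 = W1/4, W3 = W2/4, W4 = W3/4) can be checked with a few lines of exact arithmetic. This is a small sketch; `level_memory_words` is a hypothetical helper and the 1024-word mem1 corresponds to the 32 x 32 example used elsewhere in this description.

```python
from fractions import Fraction


def level_memory_words(w1, levels=4):
    """Word counts of mem1..memN, each 1/4 of the previous one."""
    words = [w1]
    for _ in range(levels - 1):
        words.append(words[-1] // 4)
    return words


words = level_memory_words(1024)                 # mem1 .. mem4
overhead = Fraction(sum(words), words[0]) - 1    # extra words relative to mem1
```

Here `overhead` evaluates to 21/64, about 0.33, which is the roughly 33% increase stated above.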
【００３７】
制御部１１０の主制御部１１８は、外部との制御信号の授受及び制御信号選択部１１１及びデータ選択部１１４の制御を行う部分であり、制御信号選択部１１１はメモリｍｅｍ１〜ｍｅｍ４のアクセス制御を行う部分である。データ選択部１１４は、ウェーブレット変換部１００の内部のデータ転送の制御及び外部とのイメージデータの転送の制御及び符号化部２００との間の周波数帯信号データの転送の制御を行う部分である。主制御部１１８は、ウェーブレット変換を開始させるための外部入力ｓｔａｒｔ、エンコード（ウェーブレット順変換）又はデコード（ウェーブレット逆変換）の動作を選択するための外部入力ｄｉｒ、ウェーブレット変換の終了を外部に通知するための出力ｅｎｄ、制御信号選択部１１１に対する制御信号の出力ｃｎｔ１、データ選択部１１４に対する制御信号の出力ｃｎｔ２を有する。制御信号選択部１１１の出力mem_cnt1〜mem_cnt4はメモリに対する制御信号で、アドレス及び一連のイネーブル信号等を含む。データ選択部１１４はメモリに対する出力mem_out1〜mem_out4と、メモリからの入力mem_in1〜mem_in4を持ち、さらにフィルタ部１３０に対する出力fil_outと、フィルタ部１３０からの入力fil_inを有する。
【００３８】
以下、本実施例の動作について、図２及び図３に示したメモリマップと、図４及び図５に示したタイミングチャートを参照して説明する。
【００３９】
まず、エンコード動作について説明する。制御部１１０の主制御部１１８にｓｔａｒｔ信号が入力され、それから数画素時間程度遅れてｄａｔａよりイメージデータが入力されるものとする。図４は、このエンコード時のタイミングチャートである。エンコード動作であることはｄｉｒ信号により通知される。
【００４０】
図４の時刻ｔ０でｓｔａｒｔ信号が入力されると、主制御部１１８内のレジスタ等がリセットされ、レベル１のエンコード処理であること、かつ水平処理であることがｃｎｔ１及びｃｎｔ２を通して制御信号選択部１１１及びデータ選択部１１４にそれぞれ通知される。これらの信号を受けて、まず入力されたイメージデータがデータ選択部１１４によりfil_outを通してフィルタ部１３０に入力され、フィルタ部１３０より水平方向の低域通過型フィルタからのＳ係数及び高域通過型フィルタからのＤ係数がfil_inを通してデータ選択部１１４に出力される。データ選択部１１４に入力されたデータは、mem_out1を通してメモリｍｅｍ１に書き込まれる。この時、メモリｍｅｍ１に対し、書き込みアドレス及び各種イネーブル信号が制御信号選択部１１１よりmem_cnt1を通して与えられる。これらの制御のタイミングは主制御部１１８によりスケジューリングされる。これらの動作が時刻ｔ１までラスタ順に行われ、水平方向の処理が終了する。
【００４１】
３２画素×３２ラインのイメージデータを入力した場合、レベル１の水平処理（１Ｈ）に必要なリード／ライトサイクルの総数は１０２４サイクルとなる。Ｓ，Ｄ係数データは、例えば図２に示すように、メモリｍｅｍ１の左半分にＳデータが、右半分にＤデータがそれぞれ書き込まれる。
【００４２】
ついで、主制御部１１８より、垂直処理であることがｃｎｔ１及びｃｎｔ２を通して制御信号選択部１１１及びデータ選択部１１４にそれぞれ通知される。これらの信号を受けて、制御信号選択部１１１からメモリｍｅｍ１に対し、読み出しアドレス及び各種イネーブル信号がmem_cnt1を通して出力される。ただし、この際に行われるアドレッシングは、ラスタ方向（水平方向）ではなく、ライン方向（垂直方向）であることに注意されたい。まず、１Ｈの処理により得られたＳ係数がメモリｍｅｍ１からmem_in1に出力され、これがデータ選択部１１４によりfil_outを通してフィルタ部１３０に入力される。そして、フィルタ部１３０より、垂直方向の低域通過型フィルタからのＳＳ係数及び高域通過型フィルタからのＳＤ係数がfil_inを通してデータ選択部１１４に出力される。次に、１Ｈの処理結果であるＤ係数が同様にメモリｍｅｍ１からmem_in1に出力され、これがデータ選択部１１４からfil_outを通してフィルタ部１３０に入力される。そして、フィルタ部１３０より垂直方向の低域通過型フィルタからのＤＳ係数及び高域通過型フィルタからのＤＤ係数がfil_inを通してデータ選択部１１４に出力される。ただし、ここでは最初にＳ係数を処理し、次にＤ係数を処理したが、順番はそれと逆にしても構わない。さて、フィルタ部１３０からデータ選択部１１４に入力されたＳＳ，ＳＤ，ＤＳ，ＤＤの各係数データは、mem_out1を通してメモリｍｅｍ１に書き込まれる。この時、メモリｍｅｍ１に対する書き込みアドレス及び各種イネーブル信号は、制御信号選択部１１１よりmem_cnt1を通して与えられる。以上の動作が時刻ｔ２まで繰り返し行われ、レベル１の垂直処理（１Ｖ）が終了する。
【００４３】
３２画素×３２ラインのイメージデータを入力した場合、１Ｖの処理に必要なリード／ライトサイクルの総数は、フィルタ部１３０への入力がＳ係数とＤ係数の２種類になるため、２０４８サイクルとなる。
【００４４】
１Ｖ処理の結果のデータは、例えば図３に示すように、メモリｍｅｍ１の左上四半分にＳＳ係数が、左下四半分にＳＤ係数が、右上四半分にＤＳ係数が、右下四半分にＤＤ係数が、それぞれ書き込まれる。
【００４５】
次に時刻ｔ２から時刻ｔ４まで、レベル２の処理が行われるのであるが、その水平処理の対象となるデータはレベル１のＳＳ係数である。したがって、時刻ｔ２から時刻ｔ３までのレベル２の水平処理（２Ｈ）においては、制御信号選択部１１１及びデータ選択部１１４は、メモリｍｅｍ１のＳＳ係数が格納されている領域から入力データを読み出し、フィルタ部１３０の出力データをレベル２のためのメモリｍｅｍ２に書き込むことになり、１Ｈの処理の場合と同様に、データの読み出しと書き込みを同時に行うことができる。レベル２の垂直処理（２Ｖ）では、メモリｍｅｍ２に対しデータの読み出しと書き込みを行うことになる。したがって、２Ｈ処理及び２Ｖ処理に必要なリード／ライトサイクルの総数はそれぞれ、２５６サイクルと５１２サイクルとなる。
【００４６】
続いて時刻ｔ４からｔ６までがレベル３の処理であり、その水平処理（３Ｈ）ではメモリｍｅｍ２のＳＳ係数の読み出しとメモリｍｅｍ３へのデータの書き込みが同時に行われ、垂直処理（３Ｖ）ではメモリｍｅｍ３に対する読み出しと書き込みが行われる。したがって、３Ｈ処理及び３Ｖ処理に必要なリード／ライトサイクルの総数はそれぞれ、６４サイクルと１２８サイクルとなる。続いて時刻ｔ６からｔ８までがレベル４の処理となり、その水平処理（４Ｈ）及び垂直処理（４Ｖ）に必要なリード／ライトサイクルの総数はそれぞれ、１６サイクルと３２サイクルとなる。したがって、レベル１からレベル４までのリード／ライトサイクルの総数は、４０８０サイクルとなる。
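The cycle counts quoted for the first embodiment can be tabulated as below. The sketch simply takes the per-pass figures given in the text (N cycles for a horizontal pass over N data and 2N cycles for the vertical pass) and sums them; `first_embodiment_cycles` is a hypothetical name.

```python
def first_embodiment_cycles(width, height, levels=4):
    """Return [(H cycles, V cycles)] per level for the first embodiment."""
    per_level = []
    n = width * height
    for _ in range(levels):
        horizontal = n        # figure given in the text for each H pass
        vertical = 2 * n      # figure given in the text for each V pass
        per_level.append((horizontal, vertical))
        n //= 4               # the next level processes the SS quarter
    return per_level


table = first_embodiment_cycles(32, 32)
total = sum(h + v for h, v in table)
```

For a 32 x 32 image, `table` lists 1024/2048, 256/512, 64/128, and 16/32 cycles, and `total` is 4080.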
【００４７】
次にデコード時の動作を説明する。図５はデコード時のタイミングチャートである。デコード時は、符号化部２００から周波数帯信号データがウェーブレット変換部１００に入力され、ウェーブレット変換部１００はエンコード時とは逆順で動作してウェーブレット逆変換を行い、復元したイメージデータをｄａｔａより外部へ出力する。デコード動作であることはｄｉｒ信号で通知される。
【００４８】
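In FIG. 5, when the start signal is input at time t0, level-4 vertical processing begins on the input level-4 coefficient data. The input coefficients are sent from the data selection unit 114 to the filter unit 130 through fil_out, the output data of the filter unit 130 enters the data selection unit 114 through fil_in, and it is written to memory mem4 under the control of the control signal selection unit 111. This ends at time t1, and level-4 horizontal processing then begins: the data in memory mem4 is read out and sent to the filter unit 130 via the data selection unit 114, and the data output by the filter unit 130, namely the level-3 SS coefficients, is written to the corresponding area of memory mem3 via the data selection unit 114. This horizontal processing ends at time t2. Level-3 processing starts at time t2: vertical processing is performed on the input level-3 SD, DS, and DD coefficients together with the SS coefficients read from memory mem3, and the result is written to the corresponding area of memory mem3. Level-3 vertical processing ends at time t3 and is followed by level-3 horizontal processing, whose result, the level-2 SS coefficient data, is written to the corresponding area of memory mem2. Level-3 processing ends at time t4 and is followed by level-2 vertical processing, with level-2 horizontal processing starting at time t5. Level-1 vertical processing runs from time t6 to time t7, followed by level-1 horizontal processing until time t8, and the restored image data is output to the outside through data.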
【００４９】
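At each level, the number of output pixels is twice the number of input pixels in both the horizontal and the vertical direction. Consequently, the time from t2 to t3, or from t3 to t4, is four times the time from t0 to t1, or from t1 to t2. The number of cycles needed for decoding is therefore 4080, the same as for encoding.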
【００５０】
なお、図２及び図３に示したメモリマップは、処理のイメージを掴みやすくするために示した一例にすぎず、アドレッシングの容易さ、あるいは設計の容易さを考慮して、いかような形態にしようと自由である。なぜならば、メモリｍｅｍ１〜ｍｅｍ４は各レベルの演算において独立しており、周波数帯信号を最終的に蓄積しておくストレージであるとともに、水平処理の結果を一時的に蓄積するバッファメモリとしても機能しているためである。
【００５１】
＜第２実施例＞
図６は本発明の第２実施例を示すブロック図である。ただし、符号化部は省略されており、ウェーブレット変換部だけが示されている。
【００５２】
本実施例の構成はＩＣ化に好適なものである。すなわち、前記第１実施例では、メモリｍｅｍ１〜ｍｅｍ４として、最も一般的なメモリ（アドレスが１種類、つまりリードアドレスとライトアドレスが共通なもの）を使用した場合で説明したが、本実施例ではＩＣ内で比較的よく使用されるメモリのような、リードアドレスとライトアドレスが独立しており、読み出しと書き込みを同時に行うことが可能なメモリが使用される。したがって、メモリｍｅｍ１〜ｍｅｍ４は２種類のアドレス入力ｒａ（リード時専用アドレス入力），ｗａ（ライト時専用アドレス入力）、それに対応した２種類のイネーブル入力ｒｅｂ（リード時専用イネーブルバー），ｗｅｂ（ライト時専用イネーブルバー）及びデータ入力ｉ及びデータ出力ｏを持っている。
【００５３】
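The control signal selection unit (s_mux) 111 outputs the signals ra1 to ra4, wa1 to wa4, reb1 to reb4, and web1 to web4 to memories mem1 to mem4. The signals ra, wa, reb, and web from which these are derived are input to the control signal selection unit 111 from the main control unit (main) 118. In addition, so that it can select its signal outputs according to the level, the control signal selection unit 111 receives the signals hb_s_mux (horizontal bar, s_mux) and level_s_mux (level, s_mux) from the main control unit. The encode/decode state is supplied from outside by the dir (direction) signal, and the start signal is likewise supplied from outside.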
【００５４】
データ選択部（d_mux）１１４はメモリｍｅｍ１〜ｍｅｍ４の出力信号ｏ１〜ｏ４、及びフィルタ部（ｆｉｌｔｅｒ）１３０の出力信号ｓｉｎ，ｄｉｎ，ｘｉｎ１，ｘｉｎ２が入力し、また、メモリｍｅｍ１〜ｍｅｍ４への入力信号ｉ１〜ｉ４、及びフィルタ部（filter）１３０への入力信号ｓｏ，ｄｏ，ｘｏ１，ｘｏ２を出力する。
【００５５】
以下、本実施例のエンコード時の動作を図３６のタイミングチャートに基づいて説明する。なお、タイムスケールは、例えば時刻ｔ０からｔ１の間で全てのイメージデータ（ここでは３２画素×３２ライン、トータル１０２４画素）がｄａｔａより入力されるものとして規定している。図中の信号の中には画素単位で変化する信号もあるが、画素単位では描ききれないため、以下のように簡略化している。すなわち、Ｘはdon't care信号であり、ｘ＝ｘはビット深さを持つ信号であり、ｗｅｂやｓｔａｒｔ信号は￣|_|￣のようなカドのある信号（ｘ＝ｘ以外）で、かつハイ／ローの２レベルの信号が同時刻に描かれている信号は１ビットの信号である。
【００５６】
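At time t0 the start signal is input and, in synchronization with it, image data is input from data. The main control unit (main) 118 outputs the signals wa, web, hb_s_mux, level_s_mux, hb_d_mux, and level_d_mux. Based on the signals level_s_mux, hb_d_mux, and level_d_mux, the control signal selection unit (s_mux) 111 and the data selection unit (d_mux) 114 select the input/output signals for the memory to be accessed. In FIG. 36, level_s_mux and level_d_mux, and likewise hb_s_mux and hb_d_mux, are drawn as the same signal, but note that they may be offset from each other by a number of pixels. From time t0 to t1, level-1 horizontal processing is performed on the input image data and the result is written to memory mem1. The memory to be accessed is indicated by the signals level_s_mux and level_d_mux being 1, and horizontal processing is indicated by the signals hb_s_mux and hb_d_mux being low. From time t1 to t2, level-1 vertical processing is performed on the S and D coefficients stored in memory mem1, and the result is written to memory mem1. As in horizontal processing, the memory to be accessed is indicated by level_s_mux and level_d_mux being 1, and vertical processing is indicated by hb_s_mux and hb_d_mux being high. Because this embodiment uses memories whose read and write addresses are independent, so that reading and writing can take place at the same time, the operation shown in the timing chart of FIG. 36 becomes possible.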
【００５７】
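Level-2 processing starts at time t2. This time the level-1 SS coefficient data (1SS) obtained in memory mem1 is the data to be processed; because the address signals of memories mem1 and mem2 are independent, the level-1 processing sequence can be applied to level 2 almost unchanged (the only differences are that the input changes from image data to 1SS coefficient data and that its size becomes 1/4). The level-2 processing, which continues until time t4, should therefore be easy to understand by referring to the operation described for level 1.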
【００５８】
ついで、時刻ｔ４からｔ６の区間でレベル３の処理が、時刻ｔ６からｔ８の区間でレベル４の処理が行われる。前記第１実施例で求めたと同様にして、本実施例についてメモリアクセスの総数を計算すると、
32x32x2(H,V)+16x16x2+8x8x2+4x4x2＝２７２０サイクル
となる。ただし、複数のアクセスを行う場合でも、それが同時刻で可能であれば、それを１回と数えた。この回数は従来技術の１／２であり、前記第１実施例と比べても２／３のアクセス回数で済む。
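The count above and the two ratios quoted with it can be verified as follows. The sketch assumes, as the text does, that accesses which can occur at the same time are counted once, so each horizontal or vertical pass over N data costs N cycles; `second_embodiment_cycles` is a hypothetical name.

```python
from fractions import Fraction


def second_embodiment_cycles(width, height, levels=4):
    """Total cycles when reads and writes overlap (dual-address memories)."""
    total = 0
    n = width * height
    for _ in range(levels):
        total += 2 * n    # horizontal pass + vertical pass, N cycles each
        n //= 4
    return total


second = second_embodiment_cycles(32, 32)
vs_conventional = Fraction(second, 5440)   # 5440: conventional figure
vs_first = Fraction(second, 4080)          # 4080: first embodiment figure
```

The result is 2720 cycles, which is 1/2 of the conventional 5440 and 2/3 of the first embodiment's 4080.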
【００５９】
デコード時の動作は、以上に説明した動作と逆順になるが、これは以上の説明及び前記第１実施例に関連した説明から容易に理解されるであろうから、説明を割愛する。
【００６０】
＜第３実施例＞
図７は本発明の第３実施例を示すブロック図である。本実施例の全体的なブロック構成は基本的に図６に示したものと同様であるが、ウェーブレット変換部のメモリｍｅｍ１〜ｍｅｍ４及びフィルタ部１３０は省略され、制御部１１０のみ描かれている。
【００６１】
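In this embodiment, the control unit 110 is configured with dedicated blocks for encoding and for decoding. That is, the main control unit 118, the control signal selection unit 111, and the data selection unit 114 are divided into an encode-only emain block 118e, es_mux block 111e, and ed_mux block 114e, and a decode-only dmain block 118d, ds_mux block 111d, and dd_mux block 114d; a main selection unit (m_mux) 120 and a start/end control unit (se) 122 are also added. The inputs and outputs of the control unit are the same as in FIG. 6.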
【００６２】
さて、エンコード用のｅｍａｉｎ，es_mux及びed_muxの各ブロックについて注目すれば、その構成は図６の構成と同一であることが理解されよう。同様に、デコード用のｄｍａｉｎ，ds_mux及びdd_muxの各ブロックについても、その構成が図６の構成と同一であることが理解されよう。追加された主選択部（m_mux）１２０は、外部からの入力信号ｄｉｒに従って、es_mux及びed_muxの各ブロックの入力又は出力と、ds_mux及びdd_muxの各ブロックの入力又は出力を単に切り替えるに過ぎない。追加された開始終了制御部（ｓｅ）１２２は、外部からの入力信号ｓｔａｒｔ，ｄｉｒに従って、エンコード用のブロックを動作させるか、デコード用のブロックを動作させるかを選択するに過ぎない。すなわち、これらの２つのブロックの設計は非常に容易で、かつ、回路規模は非常に小さい。したがって、設計者は、メインブロックであるエンコード用の３つのブロック１１８ｅ，１１１ｅ，１１４ｅとデコード用の３つのブロック１１８ｄ，１１１ｄ，１１４ｄの設計に注力すればよい。しかも、エンコード用とデコード用それぞれのブロックの設計を別々に、あるいは独立して行うことができるため、設計の期間及び効率を飛躍的に向上させることが可能になる。
【００６３】
＜第４実施例＞
図８は本発明の第４実施例を示すブロック図であり、ウェーブレット変換部だけを表している。本実施例のウェーブレット変換部のブロック構成は、基本的に前記第１実施例又は第２実施例と同様であるが、本実施例ではメモリｍｅｍ１〜ｍｅｍ４とは独立に、バッファメモリ１０６（以下ｂｍｅｍと表記）を配置していることが異なる。バッファメモリｂｍｅｍは、メモリｍｅｍ１と同じワード数を有し、書き込みアドレスと読み出しアドレスが独立している。制御信号選択部１１１とデータ選択部１１４に、バッファメモリｂｍｅｍとの間の配線が追加されることは当然である。
【００６４】
本実施例における動作は、図９及び図４のタイミングチャートを参照すれば容易に理解されよう。すなわち、図４に示されるように前記第１実施例又は第２実施例では、水平処理の結果であるＳ係数及びＤ係数の書き込みはメモリｍｅｍ１〜ｍｅｍ４に再帰的に行われるが、本実施例では、図９に見られるように、水平処理の結果の書き込みはバッファメモリｂｍｅｍになされる。したがって、本実施例における総サイクル数は、３２画素×３２ラインのイメージデータの場合、前記第１実施例及び第２実施例と同様に２７２０サイクルとなり、従来技術の１／２の処理時間で済むことが理解されよう。
【００６５】
バッファメモリｂｍｅｍは、ウェーブレット処理中に水平処理の結果であるＳ係数及びＤ係数を格納するために用いられるが、ウェーブレット処理が終了した後は他のブロックで使用しても何ら問題がない。換言すれば、符号化部でワークメモリが必要であれば、このバッファメモリｂｍｅｍをワークメモリとして利用することができるのである。通常はワークメモリが用意されるケースが殆どであるので、本実施例の構成は好適である。
【００６６】
さらに、本実施例は、前記第１実施例で説明した、最も一般的なメモリをメモリｍｅｍ１〜ｍｅｍ４に使用した場合にその効果を発揮する。すなわち、ＩＣ内で比較的よく使用される、リードアドレスとライトアドレスが独立しているメモリを使用する必要がなくなるのである。リードアドレスとライトアドレスが独立したメモリは、多機能である故に、両アドレスが独立していない最も一般的なメモリに比べて面積が大きいという欠点がある。本実施例によれば、最小のメモリエリアで前記第３実施例と同等の高速処理を実現できる。
【００６７】
＜第５実施例＞
本発明の第５実施例によれば、前記第４実施例と同様な構成において、図１０に示すように、２つのフィルタ部（filterＨ）１３０ｈとフィルタ部（filterＶ）１３０ｖを用意し、前者を水平処理専用に割り当て、後者を垂直処理専用に割り当てる。なお、図１０では省略されているが、本実施例においては、メモリｍｅｍ１〜ｍｅｍ４及びバッファメモリｂｍｅｍすべてにリードアドレスとライトアドレスが独立しているメモリを用いている点に注意されたい。
【００６８】
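The operation of this embodiment is described with reference to the timing chart of FIG. 11. When input of the image data (data) begins at time t0, the data is fed sequentially to the filter unit 130h, and the resulting S and D coefficients are written to the buffer memory bmem. Horizontal processing of line 1 starts at time t1 (note that lines and pixels are counted from 0), and horizontal processing of line 2 starts at time t2. Once the S or D coefficient for pixel 1 of line 2 has been output, the data needed to compute the D coefficients in vertical processing is available, and vertical processing using the filter unit 130v also starts. The data needed to compute the SS and DS coefficients in vertical processing is already available, and those coefficients are already output, when the S or D coefficient for pixel 1 of line 1 is output; note, however, that this timing chart is drawn with respect to the vertical processing. The image data is input sequentially in raster order; the horizontally processed S and D coefficients are written sequentially to the buffer memory bmem while, at the same time, data is read out in the vertical direction and the vertically processed SS, SD, DS, and DD coefficients are written. This operation is possible because the configuration has two filter units, 130h and 130v. Level-1 processing ends and level-2 processing begins at time t4, level-2 processing ends and level-3 processing begins at time t7, level-3 processing ends and level-4 processing begins at time t10, and all processing ends at time t13.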
【００６９】
以上のような水平処理と垂直処理の同時実行により、３２画素×３２ラインのイメージデータを処理する場合、本実施例における総サイクル数は１４８０サイクルとなり、これは従来技術の総サイクル数の約１／４であるため、非常な高速動作が可能である。
【００７０】
＜第６実施例＞
図１２は本発明の第６実施例を示すブロック図である。本実施例のブロック構成は基本的に前記第２実施例の場合と同様であるが、本実施例では、図１２に見られるように、Ｓ係数格納用の３本のラインメモリ１０７ｓ（line mem（Ｓ０），line mem（Ｓ１），line mem（Ｓ２））及びＤ係数格納用の３本のラインメモリ１０７ｄ（line mem（Ｄ０），line mem（Ｄ１），line mem（Ｄ２））を備え、さらにデータ選択部（d_mux）１１４がこれらラインメモリに対する信号線を備えている。
【００７１】
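The operation of this embodiment is described with reference to the timing chart of FIG. 13. Input of the image data (data) begins at time t0; the data is fed to the filter unit 130 and processed horizontally, and the resulting S coefficients are written to line mem(S0) of the line memories 107s and the D coefficients to line mem(D0) of the line memories 107d. When horizontal processing of line 1 starts at time t1, the S coefficients written in line mem(S0) are shifted sequentially into line mem(S1); the same applies to the D coefficients in line mem(D0). When horizontal processing of line 1 ends at time t2, the data needed to compute each frequency band signal in vertical processing is available. Input of image data is suspended at this point, vertical processing is applied to the S coefficients written in the line memories 107s and the D coefficients written in the line memories 107d, and the SS, SD, DS, and DD coefficients of line 0 are written to memory mem1. Input of image data resumes at time t3, and the S coefficients of line 3 are written to line mem(S0) and the D coefficients to line mem(D0). The S and D coefficients of the preceding line are shifted into line mem(S1) and line mem(D1). This processing is repeated in sequence. For the vertical processing of the last two lines (times t4 to t5 and t5 to t6), all of the required S and D coefficients are already available (mirror processing is applied for the final line), so these are processed consecutively. Level-1 processing ends at time t6. Thereafter, times t6 to t9 are level-2 processing, times t9 to t10 level-3 processing, and times t10 to t11 level-4 processing.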
【００７２】
本実施例で用いられる６本のラインメモリはパラレルで読み出しを行うことができれば、一般的なレジスタ、あるいはシフトレジスタ等、どのようなものを用いてもよい。３２画素×３２ラインのイメージデータを処理する場合、本実施例での総サイクル数は
(32x32x2)+(16x16x2)+(8x8x2)+(4x4x2)=２７２０サイクル
となることが分かる。この処理サイクル数は従来技術の約１／２の処理時間であり、高速動作を小さな回路規模で実現することができる。
【００７３】
＜第７実施例＞
図１４は本発明の第７実施例を示すブロック図である。本実施例のブロック構成は基本的に前記第６実施例（図１２）と同様であるが、本実施例では図１４に見られるように２つのフィルタ部１３０ｈ，１３０ｖを備え、それぞれを水平処理と垂直処理専用に割り当てる。
【００７４】
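The operation of this embodiment is described with reference to the timing chart of FIG. 15. When input of the image data (data) begins at time t0, the S coefficients are written to line mem(S0) and the D coefficients to line mem(D0). When horizontal processing of line 1 starts at time t1, the S coefficients written in line mem(S0) are shifted sequentially into line mem(S1); the same applies to the D coefficients in line mem(D0). When horizontal processing of line 1 ends at time t2, the data needed to compute each frequency band signal in vertical processing is available. From time t2 onward, horizontal processing is performed by the dedicated filter unit 130h and the results are written sequentially to the line memories 107s and 107d. Reads take place at the same time as these writes, vertical processing is performed by the dedicated filter unit 130v, and the results are written to memory mem1. This processing is carried out in sequence, and level-1 processing ends at time t3. Level-2 processing starts at time t3, and its input is the level-1 SS coefficients. Thereafter, times t3 to t6 are level-2 processing, times t6 to t8 level-3 processing, and times t8 to t10 level-4 processing.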
【００７５】
本実施例で用いられる６本のラインメモリはパラレルで読み出しを行うことができれば、一般的なレジスタあるいはシフトレジスタ等、どのようなものを用いてもよい。以上のような動作は２つのフィルタ部を利用するが故に実現できる。３２画素×３２ラインのイメージデータを処理する場合、本実施例での総サイクル数は各レベルで２ライン分の増加となるので、１４８０サイクルとなることは図１５より明かであろう。この処理サイクル数は従来技術の約１／４の処理時間となり、非常な高速動作を小さな回路規模で実現することができる。
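The 1480-cycle total can be reconstructed from the statement that each level costs two extra line times. The sketch below assumes one cycle per input sample at each level plus two lines of that level's width; this reading of the timing chart and the name `seventh_embodiment_cycles` are assumptions.

```python
def seventh_embodiment_cycles(width, height, levels=4):
    """Cycles per level: one per input sample plus two extra line times."""
    per_level = []
    w, h = width, height
    for _ in range(levels):
        per_level.append(w * h + 2 * w)
        w //= 2    # the next level works on the SS band,
        h //= 2    # which is half as wide and half as tall
    return per_level


levels_7 = seventh_embodiment_cycles(32, 32)
total_7 = sum(levels_7)
```

For a 32 x 32 image the per-level counts are 1088, 288, 80, and 24, which sum to 1480.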
【００７６】
＜第８実施例＞
本発明の第８実施例によれば、基本的なブロック構成は前記第１実施例又は第２実施例と同様であるが、図１６に示すように、メモリｍｅｍ１〜ｍｅｍ４それぞれを、ＳＳ，ＳＤ，ＤＳ及びＤＤの各係数データ専用の４つの独立したメモリに分割した構成とし、さらに、図１８及び図１９に示すように、水平処理と垂直処理の両方に利用可能な２つのフィルタ部１３０_1，１３０_2を備え、それぞれを水平処理時と垂直処理時とで入出力を切り替えて用いる点が異なる。
【００７７】
本実施例の動作を図１７のメモリマップ、図１８及び図１９の接続図、及び図２０のタイミングチャートに基づいて説明する。図１７はメモリマップを示す図であり、前処理として、左側に示したようなラスタデータ（外部入力のイメージデータ又は前レベルのＳＳ係数データ）を予め右側に示したように４つの各係数専用メモリ上に再配置しておく必要がある。この再配置は以下のような規則で行う。例えばレベル１の処理の場合、メモリｍｅｍ１内のＳＳメモリの０ライン目の０画素目にはラスタデータ（ｄａｔａ）の０ライン目の０画素目のデータが格納され、ＳＤメモリの０ライン目の０画素目にはラスタデータの０ライン目の１画素目のデータが格納される。また、ＤＳメモリの０ライン目の０画素目にはラスタデータの１ライン目の０画素目のデータが格納され、ＤＤの０ライン目の０画素目にはラスタデータの１ライン目の１画素目のデータが格納される。すなわち、偶数ライン目（０ライン目も偶数と数える）の偶数画素目（０画素目も偶数と数える）がＳＳメモリに格納され、偶数ライン目の奇数画素目がＳＤメモリに格納され、奇数ライン目の偶数画素目がＤＳメモリに格納され、奇数ライン目の奇数画素目がＤＤメモリに格納されるのである。
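The redistribution rule described above (even/odd lines and even/odd pixels into the SS, SD, DS, and DD memories) maps directly onto strided slicing. The following is a minimal sketch in which nested lists stand in for the four memories; `distribute` is a hypothetical name.

```python
def distribute(raster):
    """Split raster data into the four coefficient memories.

    Even line / even pixel -> SS, even line / odd pixel -> SD,
    odd line / even pixel  -> DS, odd line / odd pixel  -> DD.
    """
    even_lines = raster[0::2]
    odd_lines = raster[1::2]
    return {
        "SS": [row[0::2] for row in even_lines],
        "SD": [row[1::2] for row in even_lines],
        "DS": [row[0::2] for row in odd_lines],
        "DD": [row[1::2] for row in odd_lines],
    }


# Label each sample with its (line, pixel) position to make the mapping visible.
raster = [[(line, pixel) for pixel in range(4)] for line in range(4)]
mems = distribute(raster)
```

Position 0 of line 0 in each memory then holds raster samples (0, 0), (0, 1), (1, 0), and (1, 1) for SS, SD, DS, and DD respectively, as in the rule above.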
【００７８】
このような形でメモリｍｅｍ１内の４つのメモリに再配置されたデータに対し、まずデータ選択部１１４によって図１８のような接続とし、２つのフィルタ部１３０_1，１３０_2を同時に利用して水平処理が行われる。メモリｍｅｍ１内のＳＳ〜ＤＤの各メモリはそれぞれ独立しているため、同時刻にリード／ライトを行うことができる。偶数ライン目をフィルタ部（filter1）１３０_1で処理し、奇数ライン目をフィルタ部（filter2）１３０_2で処理し、結果をそれぞれのメモリに同時に書き込むのである。次に、図１９のように接続が変更され、垂直処理が行われる。すなわち、偶数画素目をフィルタ部（filter1）１３０_1で処理し、奇数画素目をフィルタ部（filter２）１３０_2で処理し、結果をそれぞれのメモリに同時に書き込むのである。
【００７９】
以上はレベル１の処理の場合であるが、レベル２以降の処理の場合も、処理対象データが前レベルのＳＳ係数データであり、これを再配置してから処理することと、メモリｍｅｍ２〜ｍｅｍ４を利用することを除けば同じである。
【００８０】
次に、本実施例の動作を図２０のタイミングチャートに基づいて説明する。時刻ｔ０でイメージデータ（ｄａｔａ）の入力が開始されると、まずイメージデータ（ラスタデータ）が図１７に示すようにメモリｍｅｍ１内の４つのメモリに振り分けられて格納される（図中ｔｒ）。時刻ｔ１でイメージデータの入力が終了し、時刻ｔ２まで図１８に示す接続により水平処理が行われ、ついで時刻ｔ３まで図１９に示す接続により垂直処理が行われる。時刻ｔ３でレベル１の処理が終了する。時刻ｔ３からｔ４までの間で、メモリｍｅｍ１内のＳＳメモリに記憶されているレベル１のＳＳ係数データが、メモリｍｅｍ２内の４つのメモリに図１７に示すような規則に従って振り分けて格納される。時刻ｔ４でレベル１のＳＳ係数の入力が終了し、このＳＳ係数データに対する水平処理が時刻ｔ５まで行われ、ついで時刻ｔ６まで垂直処理が行われる。以下同様の処理が時刻ｔ１２まで施され、処理が終了する。
【００８１】
本実施例での総サイクル数は、３２画素×３２ラインのイメージデータを処理する場合、
32x32+(16x16x3)+(8x8x3)+(4x4x3)+(2x2x2)＝１８４３サイクル
となる。この処理サイクル数は従来技術の約１／３の処理時間であり、非常な高速動作を実現することができる。各レベルの処理に先だって行われるデータ振り分けのための時間を除いたサイクル数は僅か８２４サイクルだけで済む。
【００８２】
＜第９実施例＞
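According to the ninth embodiment of the present invention, a buffer memory bmem is provided, as in the fourth embodiment, in the same basic block configuration as the eighth embodiment; like memories mem1 to mem4, this buffer memory bmem is divided into four independent memories dedicated to the SS, SD, DS, and DD coefficients respectively, as shown in FIG. 21. Note that the four memories making up the buffer memory bmem each have the same number of words as memory mem1.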
【００８３】
本実施例の動作を図２２のタイミングチャートに基づいて説明する。時刻ｔ０でイメージデータ（ｄａｔａ）の入力が開始されると、このデータ（ラスタデータ）は前記第８実施例の場合と同様にメモリｍｅｍ１内の４つのメモリに振り分けられて格納される（図中ｔｒ）。時刻ｔ１でイメージデータ（ｄａｔａ）の入力が終了し、時刻ｔ２まで水平処理が行われ、その結果がバッファメモリｂｍｅｍに書き込まれる。このときの動作は、水平処理結果がバッファメモリｂｍｅｍに書き込まれる以外は前記第８実施例の場合と同様である（図１８参照）。
【００８４】
ついで時刻ｔ２からｔ３までバッファメモリｂｍｅｍからデータを読み出して垂直処理が行われる。各係数は、ＳＳ係数のみがメモリｍｅｍ２内の対応メモリに書き込まれ、他の３つの係数ＳＤ，ＤＳ及びＤＤはメモリｍｅｍ１内の対応メモリに書き込まれる。時刻ｔ３でレベル１の処理が終了する。本実施例ではバッファメモリｂｍｅｍからのレベル１のＳＳ係数の読み出しをレベル２の処理と並行して行うことができるため、時刻ｔ３から直ちにレベル２の処理に入れる。以降、時刻ｔ３からｔ５までレベル２の処理が行われ、時刻ｔ５からｔ７までレベル３の処理が行われ、時刻ｔ７からｔ９までレベル４の処理が行われる。
【００８５】
本実施例での総サイクル数は、３２画素×３２ラインのイメージデータを処理する場合
32x32+(16x16x2)+(8x8x2)+(4x4x2)+(2x2x2)＝１７０４サイクル
となることが図２２より分かる。この処理サイクル数は従来技術の１／３未満であり、高速動作を実現できる。最初にデータを振り分ける時間を除いたサイクル数は僅か６８０サイクルだけである。
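The two figures for this embodiment, 1704 cycles in total and 680 cycles without the initial distribution, are consistent with the following breakdown: one cycle per input sample to distribute the image, then two passes per level over four memories accessed in parallel. The sketch assumes that reading of the timing chart; `ninth_embodiment_cycles` is a hypothetical name.

```python
def ninth_embodiment_cycles(width, height, levels=4):
    """Return (distribution cycles, processing cycles)."""
    n = width * height
    distribution = n               # initial split of the image into 4 memories
    processing = 0
    for _ in range(levels):
        processing += 2 * (n // 4)  # H pass + V pass, 4 memories in parallel
        n //= 4
    return distribution, processing


distribution, processing = ninth_embodiment_cycles(32, 32)
total_9 = distribution + processing
```

This gives 1024 distribution cycles and 512 + 128 + 32 + 8 = 680 processing cycles, 1704 in total.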
【００８６】
＜第１０実施例＞
本発明の第１０実施例によれば、基本的ブロック構成は前記第１実施例、第２実施例、第３実施例、第６実施例又は第７実施例と同様とされるが、各レベルに対応したメモリｍｅｍ１〜ｍｅｍ４はそれぞれ、図２３に模式的に示すように、ワード数が１／４にされ、ビット深さがＳＳ，ＳＤ，ＤＳ及びＤＤの各係数に必要なビット深さの和と等しくされる。例えばＳＳ，ＳＤ，ＤＳ及びＤＤの各係数が８ビットであった場合、各メモリのビット深さは３２ビットとされる。もちろん、ＳＳ，ＳＤ，ＤＳ及びＤＤの各係数のビット数が異なっていても、必要なビット深さの和の分が確保されていれば全く問題はない。本実施例の動作は、例えば前記第８実施例（図１６）の構成に適用した場合、図２０と同様のタイミングチャートによって表すことができる。
【００８７】
本実施例によれば、各周波数帯信号毎に分かれていたメモリロケーションを１つのメモリロケーションにまとめることで、メモリサイズの縮小化を図ることができる。一般的に、ＩＣ内のメモリは、ワード数×ビット深さが同じ場合、ワード数が小さい方が面積が小さくなるからである。さらに、アドレス等の制御信号の数が１／４で済むので配線の引き回しが少なくなり、遅延時間が減少し、あるいはノイズが減少する、といった利点がある。
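The wider-word layout of this embodiment, with one memory location holding the SS, SD, DS, and DD coefficients side by side, can be illustrated with simple bit packing. This is a sketch under the assumption that each coefficient is stored as an unsigned bit field; the field order and the names `pack` and `unpack` are hypothetical.

```python
def pack(ss, sd, ds, dd, bits=(8, 8, 8, 8)):
    """Pack four coefficients into one word; SS ends up in the top field."""
    word = 0
    for value, width in zip((ss, sd, ds, dd), bits):
        mask = (1 << width) - 1
        word = (word << width) | (value & mask)
    return word


def unpack(word, bits=(8, 8, 8, 8)):
    """Inverse of pack: return (ss, sd, ds, dd) as unsigned fields."""
    fields = []
    for width in reversed(bits):
        fields.append(word & ((1 << width) - 1))
        word >>= width
    return tuple(reversed(fields))


word = pack(1, 2, 3, 4)                       # four 8-bit fields, 32 bits total
mixed = pack(513, 200, 7, 33, bits=(10, 8, 8, 6))
```

With four 8-bit coefficients the word is 32 bits deep, and a 1024-word level memory becomes 256 words of 32 bits. Unequal field widths work the same way as long as the word depth equals their sum.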
【００８８】
＜第１１実施例＞
本発明の第１１実施例によれば、バッファメモリｂｍｅｍを利用する前記第４実施例や第９実施例と同様な構成において、メモリｍｅｍ１〜ｍｅｍ４のみならずバッファメモリｂｍｅｍについても、前記第１０実施例に述べたように、ワード数が１／４、ビット深さがＳＳ，ＳＤ，ＤＳ及びＤＤの各係数で必要なビット深さの和となるようにする。図２４に、その様子を模式的に示す。ＳＳ，ＳＤ，ＤＳ及びＤＤ係数のビット数がそれぞれ異なっていても、必要なビット深さの和の分が確保されていれば全く問題はない。
【００８９】
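When applied to the configuration of the ninth embodiment (FIG. 21), for example, the operation of this embodiment can be represented by a timing chart similar to FIG. 22. According to this embodiment, by merging the memory locations that were separate for each frequency band signal into a single memory location, the memory size, including that of the buffer memory bmem, can be reduced as described for the tenth embodiment; in addition, the reduced number of control signals such as addresses lowers delay time and noise.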
【００９０】
＜第１２実施例＞
図２５は本発明第１２の実施例を示すブロック図である。ウェーブレット変換部１００はレベル１用のウェーブレット変換単位モジュール１００_level1、レベル２用のウェーブレット変換単位モジュール１００_level2、レベル３用のウェーブレット変換単位モジュール１００_level3、及びレベル４用のウェーブレット変換単位モジュール１００_level4によって構成される。各ウェーブレット変換単位モジュールは、対応した１つのレベルの処理にのみ関わる点を別にすれば、前記各実施例と基本的に同様な構成であり、それぞれ、対応レベル用のメモリ、フィルタ部、それにメモリの制御、モジュール内のデータ転送制御、外部又は隣接レベルのモジュールとのデータ転送制御、及び外部又は隣接レベルのモジュールとの制御信号の授受のための制御部からなる。
【００９１】
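That is, the level-1 wavelet transform unit module 100_level1 consists of memory mem1, control unit 110_level1, and filter unit 130_level1. Memory mem1 has the same number of words as the image data (data) input from outside. The level-2 wavelet transform unit module 100_level2 consists of memory mem2, control unit 110_level2, and filter unit 130_level2, and memory mem2 has 1/4 the number of words of memory mem1. The level-3 wavelet transform unit module 100_level3 consists of memory mem3, control unit 110_level3, and filter unit 130_level3, and memory mem3 has 1/4 the number of words of memory mem2. The level-4 wavelet transform unit module 100_level4 consists of memory mem4, control unit 110_level4, and filter unit 130_level4, and memory mem4 has 1/4 the number of words of memory mem3.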
【００９２】
レベル１の制御部１１０_level1は、入力信号として外部からのｅｓｔａｒｔ信号、及びレベル２からのデコード終了信号ｄｅｎｄ１があり、出力信号として外部へのデコード終了信号ｄｅｎｄ及びレベル２へのエンコード終了信号ｅｅｎｄ１があり、入出力信号としてレベル２に対するデータ入出力信号ｄａｔａ１、メモリｍｅｍ１に対する入出力信号及びフィルタ部１３０_level1に対する入出力信号がある。レベル２の制御部１１０_level2の入力信号は、レベル１からのエンコードスタート信号ｅｅｎｄ１及びレベル３からのデコード終了信号ｄｅｎｄ２であり、出力信号は、レベル１へのデコード終了信号ｄｅｎｄ１及びレベル３へのエンコード終了信号ｅｅｎｄ２であり、入出力信号はレベル３に対するデータ入出力信号ｄａｔａ２、メモリｍｅｍ２に対する入出力信号及びフィルタ部１３０_level2に対する入出力信号である。レベル３の制御部１１０_level3の入力信号は、レベル２からのエンコードスタート信号ｅｅｎｄ２及びレベル４からのデコード終了信号ｄｅｎｄ３であり、出力信号は、レベル２へのデコード終了信号ｄｅｎｄ２及びレベル４へのエンコード終了信号ｅｅｎｄ３であり、入出力信号はレベル４に対するデータ入出力信号ｄａｔａ３、メモリｍｅｍ３に対する入出力信号及びフィルタ１３０_level3に対する入出力信号である。レベル４の制御部１１０_level4の入力信号は、レベル３からのエンコードスタート信号ｅｅｎｄ３及び外部からのデコードスタート信号ｄｓｔａｒｔであり、出力信号は、レベル３へのデコード終了信号ｄｅｎｄ３及び外部へのエンコード終了信号ｅｅｎｄであり、入出力信号はメモリｍｅｍ４に対する入出力信号及びフィルタ部１３０_level4に対する入出力信号である。
【００９３】
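The timing chart for this embodiment is almost the same as FIG. 4 (encoding) or FIG. 5 (decoding). Encoding operates as follows. When the estart signal is input, the wavelet transform unit module 100_level1 performs horizontal processing on the image data (data) input from outside and then performs vertical processing. When level-1 processing ends, this is signaled to control unit 110_level2 by the eend1 signal, and level-2 processing starts in the wavelet transform unit module 100_level2. Processing continues in the same way, and when processing up to level 4 has ended, the eend signal is output to the outside.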
【００９４】
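Decoding operates as follows. When the dstart signal is input, the wavelet transform unit module 100_level4 performs vertical processing and then horizontal processing on the level-4 coefficient data input from the encoding unit 200. When level-4 processing ends and the level-3 SS coefficient data has been reconstructed, this is signaled to control unit 110_level3 by the dend3 signal, and level-3 processing by the wavelet transform unit module 100_level3 begins. When reconstruction of the image data by this inverse wavelet transform processing is complete, the dend signal is output to the outside.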
【００９５】
本実施例によれば、全レベルにわたる動作全体を考慮しながら設計を行う必要がなく、１つの基本モジュールの設計のみを行えばよい。この設計結果を各レベルに適用することで所望レベル数のウェーブレット変換部の設計を行うことができるので、信頼性が向上し、またデータサイズの変更やレベルの追加等の設計変更要求があった場合でも速やかに対応することが可能となる。
【００９６】
＜第１３実施例＞
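FIG. 26 is a block diagram showing the thirteenth embodiment of the present invention. The wavelet transform unit 100 of this embodiment basically consists of four wavelet transform unit modules, as in the twelfth embodiment, but the internal configuration of each module differs in part. Each wavelet transform unit module 100_level1 to 100_level4 is provided with a buffer memory (bmemi) 106_leveli (i = 1 to 4), as in the fourth and fifth embodiments for example, and with two filter units 130h_leveli and 130v_leveli for horizontal and vertical processing, as in the fifth embodiment. The buffer memories bmem1 to bmem4 have the same number of words as memories mem1 to mem4 respectively. In FIG. 26, the internal configurations of the wavelet transform unit modules 100_level3 and 100_level4 are omitted.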
【００９７】
本実施例の動作は、図１１とほぼ同様のタイミングチャートにより表すことができる。エンコード時の動作を説明する。ｅｓｔａｒｔ信号が入力されると、ウェーブレット変換単位モジュール１００_level1において、外部から入力したイメージデータに対する水平処理と垂直処理がほぼ並行して実行される。レベル１の処理が終了すると、ｅｅｎｄ１信号により制御部１１０_level2に通知され、ウェーブレット変換単位モジュール１００_level2においてレベル２の処理が開始される。以下、レベル４までの処理が終了すると、外部にｅｅｎｄ信号が出力されエンコード処理を終了する。
【００９８】
次にデコード時の動作を説明する。ｄｓｔａｒｔ信号が入力されると、ウェーブレット変換単位モジュール１００_level4でウェーブレット逆変換の垂直処理、ついで水平処理が実行される。レベル４の処理が終了しレベル３のＳＳ係数が再生されると、ｄｅｎｄ３信号によって制御部１１０_level3に通知され、ウェーブレット変換単位モジュール１００_level3でレベル３の処理が開始される。このようにしてイメージデータ（ｄａｔａ）の再生処理が終了すると、外部にｄｅｎｄ信号が出力される。
【００９９】
本実施例は、前記第１２実施例よりも処理の高速化が可能になるとともに、前記第１２実施例と同様に設計の容易化、信頼性の向上、設計変更要求への対応の迅速化といった利点がある。
【０１００】
＜第１４実施例＞
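FIG. 27 is a block diagram showing the fourteenth embodiment of the present invention. In this embodiment, the buffer memory bmemi in each wavelet transform unit module 100_leveli of the thirteenth embodiment is replaced by three line memories 107s_leveli for temporarily holding S coefficients and three line memories 107d_leveli for temporarily holding D coefficients. The control unit 110_leveli naturally has input/output signals for these line memories. In FIG. 27, the internal configurations of the wavelet transform unit modules 100_level3 and 100_level4 are the same as those of 100_level1 or 100_level2 and are therefore omitted; only their input/output signals to other modules or the outside are shown.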
【０１０１】
本実施例のエンコード時のタイミングチャートは図１５とほぼ同様である。ｅｓｔａｒｔ信号が入力されると、ウェーブレット変換単位モジュール１００_level1においてレベル１の処理が実行される。レベル１の処理が終了すると、ｅｅｎｄ１信号によりウェーブレット変換単位モジュール１００_level2でレベル２の処理が開始される。レベル４までの処理が終了すると、外部にｅｅｎｄ信号が出力され、処理全体が終了する。デコード時においては、ｄｓｔａｒｔ信号が入力されると、ウェーブレット変換単位モジュール１００_level4で垂直処理と水平処理が行われる。レベル４の処理が終了しレベル３のＳＳ係数が再生されると、ｄｅｎｄ３信号によりウェーブレット変換単位モジュール１００_level3に通知され、レベル３の処理が開始される。イメージデータの再生処理が終了すると、外部にｄｅｎｄ信号が出力される。
【０１０２】
本実施例は、前記第１２実施例と同様な利点のほかに、前記第７実施例と同様の利点を有する。
【０１０３】
【発明の効果】
以上の詳細な説明から明らかなように、以下のような効果を得られる。
【０１０４】
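According to the invention of claim 1, providing a memory for each level of the wavelet transform shortens the time for horizontal processing (or vertical processing, when vertical processing is performed first) and speeds up processing. The per-level memories are independent in the computation of each level; they serve both as the storage in which the frequency band signals are finally accumulated and as buffer memory for temporarily holding the results of horizontal (or vertical) processing, so address mapping can be chosen freely with ease of addressing or ease of design in mind. Conventionally, three filters had to be provided for each level, each with its own control mechanism. By separating the control mechanisms from the filters and consolidating them in a control unit, and by providing a filter unit common to all levels, the filter unit needs only the arithmetic function proper to a filter and loop processing also becomes possible, so the circuit configuration can be simplified and reduced in scale, and design becomes easier.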
【０１０５】
請求項２の発明によれば、各レベルの処理において対応したメモリの読み出しと書き込みを同時に行い、処理の一層の高速化を図ることができる。
【０１０６】
請求項３の発明によれば、制御部を構成する部分は機能的に単純であり回路規模も小さくなるため、制御部の設計もしくは設計変更が一層容易になるとともに信頼性を高めることができる。
【０１０７】
請求項４の発明によれば、制御部のエンコード用部分とデコード用部分を独立に扱うことが可能になり、それぞれの部分は機能的により単純になり、また回路規模もより小さくなるため、制御部の設計が一層容易になるとともに信頼性を高めることができる。
【０１０８】
請求項５の発明によれば、レベル対応のメモリのほかに、その最大のものと同じワード数を少なくとも持つバッファメモリを備え、このバッファメモリは他ブロックのバッファメモリとも共有可能であり、これを各レベルの処理に利用することにより、読み出しアドレスと書き込みアドレスが独立したメモリに比べてメモリエリアの小さな、両アドレスが独立していないメモリをレベル対応メモリとして用い、両アドレスが独立したメモリを用いる場合に匹敵する高速処理が可能になる。
【０１０９】
請求項６又は８の発明によれば、水平処理と垂直処理を同時に実行することにより、さらなる高速処理が可能になる。
【０１１０】
請求項７の発明によれば、一般的なレジスタあるいはシフトレジスタ等で構成可能なラインメモリを利用することにより、回路規模を増大させることなく高速処理を実現できる。
【０１１１】
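According to the invention of claim 9 or 10, horizontal processing and vertical processing can each be parallelized for further speed-up; moreover, ordinary memories whose read and write addresses are not independent can be used as the per-level memories and the buffer memory, so an increase in memory area can be avoided.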
【０１１２】
請求項１１又は１２の発明によれば、レベル対応のメモリ又はレベル対応メモリ及びバッファメモリのメモリサイズを縮小することができるとともに、メモリ又はバッファに関連した配線による信号遅延やノイズを減らすことができる。
【０１１４】
請求項１３によれば、以上に述べたような利点を持つ符号化／復号化装置を実現することができる。
【図面の簡単な説明】
【図１】本発明の第１実施例の構成を示すブロック図である。
【図２】第１実施例におけるＳ係数及びＤ係数のためのメモリマップの一例を示す図である。
【図３】第１実施例におけるＳＳ，ＳＤ，ＤＳ，ＤＤ係数のためのメモリマップの一例を示す図である。
【図４】第１実施例におけるエンコード時のタイミングチャートである。
【図５】第１実施例におけるデコード時のタイミングチャートである。
【図６】本発明の第２実施例の構成を示すブロック図である。
【図７】本発明の第３実施例の構成を示すブロック図である。
【図８】本発明の第４実施例の構成を示すブロック図である。
【図９】第４実施例におけるエンコード時のタイミングチャートである。
【図１０】本発明の第５実施例の構成を示すブロック図である。
【図１１】第５実施例におけるエンコード時のタイミングチャートである。
【図１２】本発明の第６実施例の構成を示すブロック図である。
【図１３】第６実施例におけるエンコード時のタイミングチャートである。
【図１４】本発明の第７実施例の構成を示すブロック図である。
【図１５】第７実施例におけるエンコード時のタイミングチャートである。
【図１６】本発明の第８実施例におけるメモリの構成を示す図である。
【図１７】第８実施例におけるデータ再配置の説明図である。
【図１８】第８実施例における水平処理時の接続を示す図である。
【図１９】第８実施例における垂直処理時の接続を示す図である。
【図２０】第８実施例におけるエンコード時のタイミングチャートである。
【図２１】本発明の第９実施例におけるバッファメモリの構成を示す図である。
【図２２】第９実施例におけるエンコード時のタイミングチャートである。
【図２３】本発明の第１０実施例におけるメモリの構成を示す図である。
【図２４】本発明の第１１実施例におけるメモリ及びバッファメモリの構成を示す図である。
【図２５】本発明の第１２実施例の構成を示すブロック図である。
【図２６】本発明の第１３実施例の構成を示すブロック図である。
【図２７】本発明の第１４実施例の構成を示すブロック図である。
【図２８】従来技術を示すブロック図である。
【図２９】従来技術におけるエンコード時のタイミングチャートである。
【図３０】ウェーブレット変換の水平処理及び垂直処理の演算方法の説明図である。
【図３１】イメージデータのメモリマップの一例を示す図である。
【図３２】レベル１のＳ係数及びＤ係数のためのメモリマップの一例を示す図である。
【図３３】レベル１のＳＳ係数、ＳＤ係数、ＤＳ係数及びＤＤ係数のためのメモリマップの一例を示す図である。
【図３４】レベル２のＳ係数及びＤ係数のためのメモリマップの一例を示す図である。
【図３５】レベル２のＳＳ係数、ＳＤ係数、ＤＳ係数及びＤＤ係数のためのメモリマップの一例を示す図である。
【図３６】本発明の第２実施例におけるエンコード時のタイミングチャートである。
【符号の説明】
１００ウェーブレット変換部
１００_level1〜１００_level4 ウェーブレット変換単位モジュール
１０２_1〜１０２_4 メモリ（ｍｅｍ１〜ｍｅｍ４）
１０６バッファメモリ（ｂｍｅｍ）
１０７ｓ，１０７ｓ_level1，１０７ｓ_level2 Ｓ係数用ラインメモリ
１０７ｄ，１０７ｄ_level1，１０７ｄ_level2 Ｄ係数用ラインメモリ
１１０，１１０_level1〜１１０_level4 制御部
１１１制御信号選択部（ｓ_ｍｕｘ）
１１１ｅエンコード専用制御信号選択部（ｅｓ_ｍｕｘブロック）
１１１ｄデコード専用制御信号選択部（ｄｓ_ｍｕｘブロック）
１１４データ選択部
１１４ｅエンコード専用データ選択部（ｅｄ_ｍｕｘブロック）
１１４ｄデコード専用データ選択部（ｄｄ_ｍｕｘブロック）
１１８主制御部
１１８ｅエンコード専用主制御部（ｅｍａｉｎブロック）
１１８ｄデコード専用主制御部（ｄｍａｉｎブロック）
１２２開始終了制御部（ｓｅブロック）
１３０，１３０_1，１３０_2 フィルタ部
１３０ｈ，１３０ｈ_level1〜１３０ｈ_level4 水平処理用フィルタ部
１３０ｖ，１３０ｖ_level1〜１３０ｖ_level4 垂直処理用フィルタ部
１３０_level1〜１３０_level4 フィルタ部
２００符号化部
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to the field of data compression and decompression, and more particularly to an encoding / decoding device and a wavelet transformation device that use wavelet transformation.
[0002]
[Prior art]
Data compression is a very useful tool for storing and transmitting large amounts of data. For example, the time required to transmit a document by facsimile, or to transmit images as on the World Wide Web, is drastically shortened when compression is used to reduce the number of bits needed to reproduce the image.
[0003]
Various data compression methods already exist. The most widely used is the JPEG (Joint Photographic Experts Group) compression method. In JPEG compression, input symbols or luminance data are quantized and then converted into output codewords. The purpose of quantization is to remove unimportant features of the data while preserving the important ones. A transform is applied before quantization to concentrate the energy, and JPEG uses the DCT (Discrete Cosine Transform) for this purpose. However, various drawbacks of the DCT-based JPEG method have been pointed out, for example block noise and mosquito noise (so called because it looks like mosquitoes flying about). In image signal processing there is therefore interest in an efficient and highly accurate data compression coding method that eliminates these drawbacks. One such method is wavelet pyramid processing.
[0004]
When the wavelet transform is applied to a two-dimensional signal such as an image signal, a horizontal low-pass filter HL (Horizontal Low) and a horizontal high-pass filter HH (Horizontal High) are applied to the input signal to separate it into a horizontal low-frequency signal (S (smooth) coefficients) and a horizontal high-frequency signal (D (detail) coefficients). A vertical low-pass filter VL (Vertical Low) and a vertical high-pass filter VH (Vertical High) are then applied to the S and D coefficients to separate them into a horizontal-low/vertical-low signal (SS coefficients), a horizontal-low/vertical-high signal (SD coefficients), a horizontal-high/vertical-low signal (DS coefficients), and a horizontal-high/vertical-high signal (DD coefficients). This series of processes is called a level, and the output obtained by performing horizontal processing and vertical processing once is called the level 1 output. The four types of signals above are called frequency band signals. When an output of level 2 or higher is desired, this process is applied recursively to the SS coefficients. The level 2 output consists of seven frequency band signals: the SS coefficients, the 1SD and 2SD coefficients, the 1DS and 2DS coefficients, and the 1DD and 2DD coefficients. In the above description the filters are applied first in the horizontal direction and then in the vertical direction, but the order may be reversed.
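The recursive band structure described above can be illustrated with a short sketch. It does not use the filter pair of this document: a simple Haar-style average/difference pair is assumed in place of HL/HH and VL/VH, and the function names (`split_rows`, `one_level`, `decompose`) are hypothetical. The sketch only shows how one level yields the SS, SD, DS, and DD bands and how recursion on SS yields seven bands at level 2.

```python
def split_rows(rows):
    """Split each row into a low band and a high band, halving its length.

    Assumption: Haar-style average/difference filters stand in for the
    low-pass and high-pass filters; the document's filters are different.
    """
    low = [[(r[i] + r[i + 1]) // 2 for i in range(0, len(r), 2)] for r in rows]
    high = [[r[i] - r[i + 1] for i in range(0, len(r), 2)] for r in rows]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def one_level(image):
    """One level: horizontal split into S and D, then vertical split of each."""
    s, d = split_rows(image)
    ss, sd = (transpose(band) for band in split_rows(transpose(s)))
    ds, dd = (transpose(band) for band in split_rows(transpose(d)))
    return ss, sd, ds, dd


def decompose(image, levels):
    """Apply one_level recursively to the SS band and collect all bands."""
    bands = {}
    ss = image
    for n in range(1, levels + 1):
        ss, sd, ds, dd = one_level(ss)
        bands[f"{n}SD"], bands[f"{n}DS"], bands[f"{n}DD"] = sd, ds, dd
    bands["SS"] = ss
    return bands
```

For a 4 × 4 input and two levels, `decompose` returns seven bands, and the remaining SS band holds a single value.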
[0005]
FIG. 28 shows a conventional configuration for processing up to level 4. In the figure, 1000 is a wavelet transform unit and 1100 is an encoding unit. The encoding unit 1100 encodes the output data of the wavelet transform unit 1000 and outputs compressed code, or decodes similarly compressed code input from the outside and expands it into the frequency band signals of each level.
[0006]
In FIG. 28, reference numerals 1001 to 1012 denote filters. Of these, filters 1001, 1004, 1007, and 1010 (hereinafter filter1H, filter2H, filter3H, and filter4H) are horizontal filters, each including a horizontal low-pass filter HL and a horizontal high-pass filter HH. The numbers 1 to 4 in these filter names are level numbers, and H denotes a horizontal filter. Similarly, filters 1002 (filter1V1) and 1003 (filter1V2), 1005 (filter2V1) and 1006 (filter2V2), 1008 (filter3V1) and 1009 (filter3V2), and 1011 (filter4V1) and 1012 (filter4V2) are vertical filters, each including a vertical low-pass filter VL and a vertical high-pass filter VH. In these names V denotes a vertical filter, the number 1 to 4 before V is the level number, the number 1 after V indicates a filter that receives the horizontal low-frequency signal (S coefficients), and the number 2 after V indicates a filter that receives the horizontal high-frequency signal (D coefficients). The filters may have any configuration, but in the following description the horizontal low-pass filter HL and the vertical low-pass filter VL are assumed to be 2-tap filters that compute on two data items. The horizontal high-pass filter HH and the vertical high-pass filter VH are assumed to be 6-tap filters that compute on three sets of data in total, namely the S coefficients output by the low-pass filter HL or VL at the current position, one position before, and one position after.
[0007]
FIG. 30 shows an example of the calculation when such filters are used. Note that the data mapping in this figure is for explaining the calculation method; the actual mapping to memory is as shown in FIGS. 32 to 35, for example. FIG. 30A explains the horizontal filter processing. [00] means the 0th pixel of the 0th line, and [12] means the 2nd pixel of the 1st line (both lines and pixels are counted from 0). The output [S00] for the 0th pixel of the horizontal low-pass filter HL is obtained from data [00] and [01], and the output [S01] for the 1st pixel is obtained from data [02] and [03]. The output [D00] for the 0th pixel of the horizontal high-pass filter HH, on the other hand, is obtained from the two data items immediately before [00] (which do not exist) together with data [00], [01], [02], and [03]. To supply the two nonexistent data items before [00], a process called mirroring is performed, in which the data is folded back in a mirror-image relationship. As a result, the item two positions before becomes data [01] and the item one position before becomes data [00]. In this way [D00] is calculated from six pixels of data.
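The index arithmetic of the mirror process can be sketched as follows. The functions `mirror`, `s_inputs`, and `d_inputs` are hypothetical names. The reflection at the start of the line follows the text ([01] and [00] stand in for the two missing pixels); applying the same reflection at the end of the line is an assumption. The filter coefficients themselves are not modeled, only which pixels feed each output.

```python
def mirror(index, length):
    """Reflect an out-of-range index back into 0..length-1.

    Start of line (from the text): -1 -> 0 and -2 -> 1.
    End of line (assumption): length -> length-1, length+1 -> length-2.
    """
    if index < 0:
        return -index - 1
    if index >= length:
        return 2 * length - index - 1
    return index


def s_inputs(k):
    """Pixel indices feeding the k-th low-pass (S) output: a 2-tap window."""
    return [2 * k, 2 * k + 1]


def d_inputs(k, length):
    """Pixel indices feeding the k-th high-pass (D) output: a 6-tap window
    covering the pixel pairs one before, at, and one after position k."""
    return [mirror(i, length) for i in range(2 * k - 2, 2 * k + 4)]
```

With an 8-pixel line, `d_inputs(0, 8)` gives `[1, 0, 0, 1, 2, 3]`, which matches the six pixels listed for [D00].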
[0008]
FIG. 30B illustrates the vertical filter processing, which applies the same computation in the vertical direction to the S and D coefficients. Coefficients that do not exist are supplied by mirror processing, as in the horizontal filter processing.
[0009]
FIG. 31 to FIG. 35 illustrate how the results of the wavelet processing are stored. FIG. 31 shows image data stored in the frame memory in raster order. Data is read from the frame memory, horizontal processing is performed, and the result is written back to the frame memory. To avoid overwriting unprocessed data, the S and D coefficients are written with a mapping such as that shown in FIG. 32, where [1S00] means the level 1 S coefficient at address 00. FIG. 33 shows an example of the mapping used when the coefficients are written after vertical processing. This is how the level 1 coefficients are stored. FIG. 34 shows an example of how the level 2 horizontal coefficients are stored. The shaded portion of the data is not used, because level 2 processing is performed only on the 1SS coefficients. The level 2 coefficients are then stored with a mapping such as that shown in FIG. 35, which completes level 2 processing. The above processing is repeated up to level 4.
[0010]
FIG. 29 is a timing chart of the wavelet transform unit 1000 configured as shown in FIG. 28. Note that this timing chart is for explaining the processing procedure: the time required for memory access and the like is not considered, and the scale of the horizontal (time) axis is not linear. In the following description, pixels and lines are counted from 0, as in 0th pixel or 0th line. The image data (raster data) input from data is 32 pixels × 32 lines (0 to 31), and one data segment in the chart corresponds to one line.
[0011]
From time t0, the data of the 0th line is input sequentially from the 0th pixel, and when the 1st pixel has been input, the data [1S00] for the 0th pixel is output from filter1H. Next, when data [1S01] is output, the three S coefficients needed to calculate the D coefficient ([1S00], [1S00], [1S01]) are available (the preceding coefficient is obtained by mirror processing), and the D coefficient [1D00] is output. This is repeated for one line. The timing chart is drawn in units of one line time, but if it were enlarged, offsets in units of pixels would be visible.
[0012]
Input of the data of the 1st line starts at time t1, and the S and D coefficients [1S10], [1D10], and so on are output sequentially from filter1H. 2H processing (level 2 horizontal processing) starts. When [1S10] is output, 1V processing (level 1 vertical processing) starts, and [1SS00] is output from filter1V1 and [1DS00] from filter1V2. When [1S11] is output, the three sets of data needed to calculate the D coefficients are available in filter1V1 and filter1V2: [1S10], [1S10], [1S11] in filter1V1 and [1D10], [1D10], [1D11] in filter1V2 (the preceding data is obtained by mirror processing). The level 1 output data [1SS00], [1SD00], [1DS00], and [1DD00] are thus obtained. This is repeated for one line.
[0013]
At time t2, input of the first line for 2V processing (level 2 vertical processing) begins, and 2V processing begins. Thereafter, the processing is repeated until time t9 with the same timing relationship, and each frequency band signal up to level 4 is output.
[0014]
The frequency band signals of each level obtained as described above are encoded and compressed by the encoding unit 1100. Because the encoding is usually bit processing, however, the frequency band signals must first be written to storage. The storage commonly used is a semiconductor memory. The encoding unit 1100 performs bit processing with reference to the frequency band signals written in the storage, and outputs the generated code as compressed code. Decompression from compressed code to image data is performed in the reverse order of the operations described above.
[0015]
For more detailed information about encoding/decoding devices, wavelet transform units, and filters related to the present invention, refer to JP-A-8-116265, JP-A-8-139935, JP-A-9-27752, JP-A-9-27912, and the like. For similar prior art, refer to JP-A-3-27687, JP-A-5-167997, and JP-A-5-183386.
[0016]
Next, the processing time of the wavelet transform will be described. The description assumes that the storage for the frequency band signals generated by the wavelet transform unit 1000 is a general semiconductor memory that can perform only a read or a write at any one time.
[0017]
As described with reference to the timing chart of FIG. 29, the frequency band signals are output in parallel at the same time, so writing to the memory would also have to be performed in parallel; however, only one data item can be read or written at a time. In FIG. 29, "range" at the lower left indicates, with ← →, the span of processing time occupied by each of 1H, 1V, ..., 4V. The "r/w cycles" row below "range" gives the number of memory accesses needed in each span (← →), that is, the sum of the numbers of writes and reads within that span; where different levels are processed simultaneously, the figure for the span is the total over those levels. The numerical values on the right side of FIG. 29 are the numbers of memory accesses (writes plus reads) required for horizontal or vertical processing at each level. At each level, all the data is read once and then rewritten with the filter output data, in both horizontal and vertical processing, so the number of read/write accesses required is twice the total number of pixels.
[0018]
[Problems to be solved by the invention]
As described above, encoding is performed using the frequency band signals of each level, but because encoding is normally bit processing, the output data of the wavelet transform must first be stored in storage, and the processing time becomes several times longer than the time needed simply to input the data. As is apparent from the above description, when the input image data is 32 pixels × 32 lines and the number of levels is 4, for example, the number of cycles needed to input the image data is 1024 (= 32 × 32), whereas the required processing time is 5440 cycles, more than five times as long. The processing time clearly increases significantly as the input data grows. In the case of 64 pixels × 64 lines, for example, 1H processing continues until time t10 as shown by the dotted line in FIG. 29, so the number of sections output in parallel increases and the processing time grows significantly. The processing time likewise grows significantly as the number of levels increases.
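The 5440-cycle figure can be reproduced with a short calculation. This is a sketch under one reading of the access counts given earlier: each horizontal pass and each vertical pass is assumed to cost twice the number of data items at that level (one read and one write per item), and each level is assumed to handle one quarter of the data of the level before. The function name is hypothetical.

```python
def conventional_accesses(pixels, lines, levels):
    """Total memory accesses for the conventional configuration.

    Assumption: every horizontal pass and every vertical pass reads each
    data item once and writes it back once, so each pass costs 2 x count.
    """
    total = 0
    count = pixels * lines       # data items handled at level 1
    for _ in range(levels):
        total += 2 * count       # horizontal pass: read + write
        total += 2 * count       # vertical pass: read + write
        count //= 4              # the next level works on the SS quarter
    return total


input_cycles = 32 * 32
processing_cycles = conventional_accesses(32, 32, 4)
ratio = processing_cycles / input_cycles
```

Under these assumptions `processing_cycles` is 5440 against 1024 input cycles, a ratio slightly above 5.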
[0019]
Moreover, since the frequency band signals of each level are output at the same time, pipeline processing is necessary. That is, because the data input timing differs for every filter, each filter must be designed individually, with a controller tailored to the place where the filter is used. In addition, these controllers can handle only one combination of pixel count and level count, and it is difficult to adapt them when either or both are changed. Furthermore, the sequences of the encoding process (converting image data into frequency band signals, i.e. the forward wavelet transform) and the decoding process (converting frequency band signals into image data, i.e. the inverse wavelet transform) differ, so careful design was necessary to ensure that the pipeline operation as a whole did not fail despite this difference.
[0020]
An object of the present invention is to remedy the above problems in a wavelet transform device and in an encoding/decoding device using the wavelet transform; more specifically, to enable higher-speed operation, to eliminate the need for pipeline processing, to facilitate design and design changes, and to avoid an increase in circuit scale.
[0021]
[Means for Solving the Problems]
In order to achieve the above object, a wavelet transform device according to the invention of claim 1 comprises a plurality of mutually independent memories associated one-to-one with the levels of the wavelet transform, a filter unit common to all levels of the wavelet transform, and a control unit that controls data transfer inside the device, data transfer to and from the outside of the device, and the operation of the plurality of memories. According to the invention of claim 2, each of the plurality of memories associated one-to-one with the levels has independent read and write addresses and can be read and written at the same time.
[0022]
In order to achieve the above object, the invention according to claim 3 is the wavelet transform device according to claim 1 or 2, wherein the control unit is divided into at least three parts: a part that selectively supplies control signals such as addresses and enable signals to the plurality of memories; a part that controls data transfer inside the device and data transfer to and from the outside of the device; and a part that controls these two parts and exchanges control information with the outside of the device. According to the invention of claim 4, the at least three parts constituting the control unit are each divided into an encode-only part and a decode-only part.
[0023]
In order to achieve the above object, a wavelet transform device according to the invention of claim 5 comprises a plurality of mutually independent memories associated one-to-one with the levels of the wavelet transform; a buffer memory, common to all levels of the wavelet transform, having at least as many words as the largest of the plurality of memories; a filter unit common to all levels of the wavelet transform; and a control unit that controls data transfer inside the device, data transfer to and from the outside of the device, and the operation of the plurality of memories and the buffer memory. According to the invention of claim 6, the filter unit of claim 5 consists of two independent filter units, one assigned to horizontal processing and the other to vertical processing.
[0024]
In order to achieve the above object, a wavelet transform device according to the invention of claim 7 comprises a plurality of mutually independent memories associated one-to-one with the levels of the wavelet transform; a line memory common to all levels of the wavelet transform; a filter unit common to all levels of the wavelet transform; and a control unit that controls data transfer inside the device, data transfer to and from the outside of the device, and the operation of the plurality of memories and the line memory. According to the invention of claim 8, a horizontal processing filter unit and a vertical processing filter unit are provided as the filter unit of claim 7.
[0025]
In order to achieve the above object, according to the invention of claim 9, in the wavelet transform device according to claim 1 or 2, each of the plurality of memories consists of as many independent memories as there are types of wavelet transform coefficients, and the filter unit consists of two independent filter units, each used for both horizontal processing and vertical processing.
[0026]
In order to achieve the above object, according to the invention of claim 10, in the wavelet transform device according to claim 5, each of the plurality of memories and the buffer memory consists of as many independent memories as there are types of wavelet transform coefficients, and the filter unit consists of two independent filter units, each used for both horizontal processing and vertical processing.
[0027]
In order to achieve the above object, the invention of claim 11 is the wavelet transform device according to claim 1 or 2, wherein each of the plurality of memories is a memory having at least a bit depth equal to the sum of the bit depths of all types of wavelet transform coefficients.
[0028]
In order to achieve the above object, according to the invention of claim 12, in the wavelet transform device according to claim 5, each of the plurality of memories and the buffer memory is a memory having at least a bit depth equal to the sum of the bit depths of all types of wavelet transform coefficients.
[0032]
In order to achieve the above object, an encoding/decoding device according to the invention of claim 13 comprises the wavelet transform device according to any one of claims 1 to 12, and an encoding unit that is connected to the wavelet transform device and that encodes the output data of the wavelet transform device, or decodes encoded data input from the outside and inputs the result to the wavelet transform device.
[0033]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings. To keep the description simple, the same or similar reference numerals are used for corresponding parts across the drawings. Configurations that are the same or similar across several embodiments are omitted from or simplified in the relevant drawings as appropriate, or the drawings of another embodiment are used.
[0034]
<First embodiment>
FIG. 1 is a block diagram of an encoding/decoding device according to a first embodiment of the present invention. This encoding/decoding device comprises a wavelet transform unit 100 and an encoding unit 200. The wavelet transform unit 100 can process up to level 4. Any number of levels may of course be used; the present invention becomes more effective as the number of levels increases, and likewise as the number of input data items (pixels) increases.
[0035]
The wavelet transform unit 100 comprises memories 102_1 to 102_4 (hereinafter mem1, mem2, mem3, and mem4) for storing the frequency band signal data of levels 1, 2, 3, and 4; a control unit (controller) 110 consisting of a control signal selection unit (s_mux) 111, a data selection unit (d_mux) 114, and a main control unit (main) 118 that controls them; and a filter unit (filter) 130 including a low-pass filter and a high-pass filter.
[0036]
The wavelet transform unit 100 has an input/output data for exchanging image data with the outside, and an input/output for exchanging frequency band signal data with the encoding unit 200. At the time of encoding, image data of no more than the maximum number of items (pixels) that can be processed at one time is input to the wavelet transform unit 100 through data. This maximum number is equal to the number of words W1 of mem1. The number of words W2 of mem2 is set to W2 = W1 × (1/4), the number of words W3 of mem3 to W3 = W2 × (1/4), and the number of words W4 of mem4 to W4 = W3 × (1/4). The total number of words in the memories mem1 to mem4, relative to the total number of input data items, is therefore
1 + (1/4) + (1/4)^2 + (1/4)^3 = 1 + 21/64
that is, an increase of about 33%.
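As a worked check of the memory sizing above, the following sketch computes W1 to W4 for a 32 × 32 input and the resulting overhead relative to the input size. The function name `level_words` is hypothetical; the quarter-size rule per level is taken from the text.

```python
from fractions import Fraction


def level_words(w1, levels=4):
    """Word counts W1..Wn, where each memory is a quarter of the previous one."""
    words = [w1]
    for _ in range(levels - 1):
        words.append(words[-1] // 4)
    return words


words = level_words(32 * 32)                      # W1..W4 for 32 x 32 pixels
overhead = Fraction(sum(words), words[0]) - 1     # extra words beyond W1
```

Here `words` is `[1024, 256, 64, 16]` and `overhead` is exactly 21/64, about 0.33.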
[0037]
The main control unit 118 of the control unit 110 exchanges control signals with the outside and controls the control signal selection unit 111 and the data selection unit 114. The control signal selection unit 111 controls access to the memories mem1 to mem4. The data selection unit 114 controls data transfer inside the wavelet transform unit 100, the transfer of image data to and from the outside, and the transfer of frequency band signal data to and from the encoding unit 200. The main control unit 118 has an external input start for starting the wavelet transform, an external input dir for selecting between encoding (forward wavelet transform) and decoding (inverse wavelet transform), an output end for notifying the outside that the wavelet transform has finished, a control signal output cnt1 to the control signal selection unit 111, and a control signal output cnt2 to the data selection unit 114. The outputs mem_cnt1 to mem_cnt4 of the control signal selection unit 111 are control signals for the memories and consist of an address and a set of enable signals. The data selection unit 114 has outputs mem_out1 to mem_out4 to the memories and inputs mem_in1 to mem_in4 from the memories, as well as an output fil_out to the filter unit 130 and an input fil_in from the filter unit 130.
[0038]
Hereinafter, the operation of this embodiment will be described with reference to the memory maps shown in FIGS. 2 and 3 and the timing charts shown in FIGS. 4 and 5.
[0039]
First, the encoding operation will be described. It is assumed that a start signal is input to the main control unit 118 of the control unit 110 and that image data is input through data after a delay of a few pixel periods. FIG. 4 is a timing chart for encoding. The encoding operation is indicated by the dir signal.
[0040]
When the start signal is input at time t0 in FIG. 4, the registers in the main control unit 118 are reset, and the control signal selection unit 111 and the data selection unit 114 are notified through cnt1 and cnt2 that the operation is level 1 encoding and horizontal processing. In response, the data selection unit 114 first passes the input image data to the filter unit 130 through fil_out, and the filter unit 130 returns the S coefficients from the horizontal low-pass filter and the D coefficients from the high-pass filter to the data selection unit 114 through fil_in. The data received by the data selection unit 114 is written to the memory mem1 through mem_out1. At this time the control signal selection unit 111 supplies the write address and the various enable signals to the memory mem1 through mem_cnt1. The timing of these controls is scheduled by the main control unit 118. These operations are performed in raster order until time t1, at which point the horizontal processing ends.
[0041]
When image data of 32 pixels × 32 lines is input, the total number of read / write cycles necessary for level 1 horizontal processing (1H) is 1024 cycles. For example, as shown in FIG. 2, the S and D coefficient data are written in the left half of the memory mem1 and in the right half, respectively.
[0042]
Subsequently, the main control unit 118 notifies the control signal selection unit 111 and the data selection unit 114 that the vertical processing is performed through cnt1 and cnt2. In response to these signals, the read address and various enable signals are output from the control signal selection unit 111 to the memory mem1 through mem_cnt1. However, it should be noted that the addressing performed at this time is not the raster direction (horizontal direction) but the line direction (vertical direction). First, the S coefficient obtained by the processing of 1H is output from the memory mem1 to mem_in1, and this is input to the filter unit 130 by the data selection unit 114 through fil_out. Then, the SS coefficient from the low-pass filter in the vertical direction and the SD coefficient from the high-pass filter are output from the filter unit 130 to the data selection unit 114 through fil_in. Next, the D coefficient that is the processing result of 1H is similarly output from the memory mem1 to mem_in1, and this is input from the data selection unit 114 to the filter unit 130 through fil_out. Then, the DS coefficient from the vertical low-pass filter and the DD coefficient from the high-pass filter are output from the filter unit 130 to the data selection unit 114 through fil_in. However, although the S coefficient is processed first and then the D coefficient is processed next, the order may be reversed. The SS, SD, DS, and DD coefficient data input from the filter unit 130 to the data selection unit 114 is written to the memory mem1 through mem_out1. At this time, the write address and various enable signals for the memory mem1 are given from the control signal selector 111 through mem_cnt1. The above operation is repeated until time t2, and the level 1 vertical processing (1V) is completed.
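The difference between raster-direction and line-direction addressing can be sketched as below. The sketch assumes that mem1 is laid out with 32 words per line and address = line × 32 + position, which the text does not specify, and the generator name `vertical_read_addresses` is hypothetical. With the S coefficients in the left half, walking the columns from left to right reads the S columns first and the D columns next.

```python
def vertical_read_addresses(width=32, height=32):
    """Yield read addresses for vertical processing, one column at a time.

    Assumption: raster layout, address = line * width + position.
    Columns 0..width/2-1 hold S coefficients and the rest hold D coefficients.
    """
    for col in range(width):
        for line in range(height):
            yield line * width + col


addresses = list(vertical_read_addresses())
```

The first addresses are 0, 32, 64, and so on down column 0, rather than 0, 1, 2 as in raster order.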
[0043]
When image data of 32 pixels × 32 lines is input, the total number of read/write cycles necessary for 1V processing is 2048 cycles, because there are two types of input to the filter unit 130, the S coefficients and the D coefficients.
[0044]
As a result of the 1V processing, as shown in FIG. 3 for example, the SS coefficients are written to the upper left quadrant of the memory mem1, the SD coefficients to the lower left quadrant, the DS coefficients to the upper right quadrant, and the DD coefficients to the lower right quadrant.
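The quadrant layout can be expressed as a small address function. This is a sketch that assumes a raster layout of mem1 (address = line × width + position), which is not stated in the text; the function name `quadrant_address` and its parameters are hypothetical.

```python
def quadrant_address(band, row, col, width=32, height=32):
    """Address of coefficient (row, col) of a band in a quadrant layout.

    band is "SS", "SD", "DS" or "DD": the first letter is the horizontal
    band (D -> right half), the second the vertical band (D -> lower half).
    Assumption: raster layout, address = line * width + position.
    """
    col_offset = width // 2 if band[0] == "D" else 0
    row_offset = height // 2 if band[1] == "D" else 0
    return (row + row_offset) * width + (col + col_offset)
```

For example, the first SS coefficient sits at address 0, the first DS coefficient at 16 (upper right), the first SD coefficient at 512 (lower left), and the first DD coefficient at 528 (lower right).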
[0045]
Next, level 2 processing is performed from time t2 to time t4. The data subjected to horizontal processing is the level 1 SS coefficients. In level 2 horizontal processing (2H), from time t2 to time t3, the control signal selection unit 111 and the data selection unit 114 therefore operate so that the input data is read from the area of the memory mem1 in which the SS coefficients are stored and the output data of the filter unit 130 is written to the level 2 memory mem2; as in 1H processing, reading and writing can proceed at the same time. In level 2 vertical processing (2V), data is read from and written to the memory mem2. The total numbers of read/write cycles required for 2H processing and 2V processing are therefore 256 cycles and 512 cycles, respectively.
[0046]
The processing from time t4 to t6 is level 3 processing. In the horizontal processing (3H), the SS coefficients are read from the memory mem2 while data is written to the memory mem3; in the vertical processing (3V), data is read from and written to the memory mem3. The total numbers of read/write cycles required for 3H processing and 3V processing are therefore 64 cycles and 128 cycles, respectively. Level 4 processing is then performed from time t6 to t8, and the total numbers of read/write cycles required for its horizontal processing (4H) and vertical processing (4V) are 16 cycles and 32 cycles, respectively. The total number of read/write cycles from level 1 to level 4 is therefore 4080.
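The sequence of passes and the 4080-cycle total can be tabulated with a short sketch. It uses the per-pass cycle counts stated in the text (a horizontal pass costs the number of data items at that level, a vertical pass twice that); the function name `encode_schedule` and the tuple layout are hypothetical.

```python
def encode_schedule(pixels=32, lines=32, levels=4):
    """Return (pass, source, destination, cycles) tuples in encoding order."""
    schedule = []
    count = pixels * lines                      # data items at level 1
    for n in range(1, levels + 1):
        source = "external data input" if n == 1 else f"mem{n - 1} (SS area)"
        schedule.append((f"{n}H", source, f"mem{n}", count))
        schedule.append((f"{n}V", f"mem{n}", f"mem{n}", 2 * count))
        count //= 4                             # next level handles the SS quarter
    return schedule


schedule = encode_schedule()
total_cycles = sum(step[3] for step in schedule)
```

The per-pass counts come out as 1024, 2048, 256, 512, 64, 128, 16, and 32, which sum to 4080.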
[0047]
Next, the decoding operation will be described. FIG. 5 is a timing chart for decoding. At the time of decoding, frequency band signal data is input from the encoding unit 200 to the wavelet transform unit 100. The wavelet transform unit 100 operates in the reverse order to encoding and performs the inverse wavelet transform, and the restored image data is output to the outside through data. The decoding operation is indicated by the dir signal.
[0048]
In FIG. 5, when the start signal is input at time t0, level 4 vertical processing is started on the input level 4 coefficient data. The input coefficients are sent from the data selection unit 114 to the filter unit 130 through fil_out, and the output data of the filter unit 130 is returned to the data selection unit 114 through fil_in and written to the memory mem4 under the control of the control signal selection unit 111. This ends at time t1, and level 4 horizontal processing then starts. Data in the memory mem4 is read out and sent to the filter unit 130 via the data selection unit 114, and the output of the filter unit 130, that is, the level 3 SS coefficients, is written to the corresponding area of the memory mem3 via the data selection unit 114. This horizontal processing ends at time t2. Level 3 processing starts at time t2: vertical processing is performed on the input level 3 SD, DS, and DD coefficients and on the SS coefficients read from the memory mem3, and the result is written to the corresponding area of the memory mem3. Level 3 vertical processing ends at time t3, and level 3 horizontal processing is then performed. As a result, the level 2 SS coefficient data is written to the corresponding area of the memory mem2. Level 3 processing ends at time t4, level 2 vertical processing is performed, and level 2 horizontal processing is performed from time t5. Level 1 vertical processing is performed from time t6 to time t7 and level 1 horizontal processing from time t7 to time t8, and the restored image data is output to the outside through data.
[0049]
At each level, the number of output pixels is double the number of input pixels in both the horizontal and vertical directions. Therefore, for example, the time from t2 to t3 (level 3 vertical processing) is four times the time from t0 to t1 (level 4 vertical processing), and the same ratio holds between the corresponding horizontal processing periods. Consequently, the number of cycles necessary for the decoding operation is 4080 cycles, the same as for encoding.
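The decoding order described above can be sketched as a small schedule generator. This is only an illustration: the function name `decode_schedule` and its parameters are hypothetical, and the sketch models nothing beyond the order of the passes and the size of the region each level reconstructs for a 32 × 32 image.

```python
def decode_schedule(levels=4, image_size=32):
    """Return the inverse-transform passes in execution order.

    Each entry is (level, direction, output_size), where output_size is the
    side length of the square region reconstructed by that level.
    """
    schedule = []
    for level in range(levels, 0, -1):           # level 4 first, level 1 last
        output_size = image_size >> (level - 1)  # 4, 8, 16, 32 for a 32x32 image
        schedule.append((level, "vertical", output_size))    # vertical pass first
        schedule.append((level, "horizontal", output_size))  # then horizontal pass
    return schedule


schedule = decode_schedule()
for level, direction, size in schedule:
    print(f"level {level} {direction:10s} -> {size}x{size} region")
```

Because the side length doubles from one level to the next, each level handles four times as many pixels as the level before it, which is the factor of four between the corresponding periods of the timing chart.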
[0050]
Note that the memory maps shown in FIGS. 2 and 3 are merely examples intended to make the processing easy to picture; any layout may be freely adopted in consideration of ease of addressing or design. This is because the memories mem1 to mem4 are independent for each level of computation, serving both as storage for the final frequency band signals and as buffer memories for temporarily storing the results of horizontal processing.
[0051]
<Second embodiment>
FIG. 6 is a block diagram showing a second embodiment of the present invention. However, the encoding unit is omitted, and only the wavelet transform unit is shown.
[0052]
The configuration of this embodiment is suitable for an IC. That is, the first embodiment assumed that the most general type of memory (one with a single address, that is, a common read address and write address) is used as the memories mem1 to mem4. This embodiment instead uses a type of memory that is relatively often used in ICs, in which the read address and the write address are independent so that reading and writing can be performed simultaneously. Accordingly, each of the memories mem1 to mem4 has two address inputs, ra (dedicated read address input) and wa (dedicated write address input), two enable inputs, reb (dedicated read enable bar) and web (dedicated write enable bar), a data input i, and a data output o.
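The behavior of such a memory can be pictured with a toy model. The port names ra, wa, reb, web, i, and o follow the text; the class name `DualPortMemory`, the method `cycle`, and the choice that a read of the address being written in the same cycle returns the old value are assumptions made only for this sketch.

```python
class DualPortMemory:
    """Toy model of a memory with independent read and write addresses."""

    def __init__(self, words):
        self.cells = [0] * words

    def cycle(self, ra, wa, reb, web, i):
        """Perform one clock cycle; reb and web are active-low enables.

        Returns the data output o (None when reading is disabled).
        Assumption: the read port sees the contents before this cycle's write.
        """
        o = self.cells[ra] if reb == 0 else None  # read port
        if web == 0:                              # write port
            self.cells[wa] = i
        return o


mem = DualPortMemory(words=1024)
mem.cycle(ra=0, wa=5, reb=1, web=0, i=42)        # write only
o = mem.cycle(ra=5, wa=6, reb=0, web=0, i=7)     # read and write in the same cycle
print(o)  # 42
```

A memory with a single shared address would need two cycles for the second call, which is the difference this embodiment exploits.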
[0053]
The control signal selection unit (s_mux) 111 outputs the signals ra1 to ra4, wa1 to wa4, reb1 to reb4, and web1 to web4 to the memories mem1 to mem4. The signals ra, wa, reb, and web on which these are based are input from the main control unit (main) 118 to the control signal selection unit 111. Further, the control signal selection unit 111 receives the signals hb_s_mux (horizontal bar, s_mux) and level_s_mux (level, s_mux) from the main control unit in order to select the signal output according to the level. The encoding / decoding state is given from the outside by a dir (direction) signal. Similarly, the start signal is given from the outside.
[0054]
The data selection unit (d_mux) 114 receives the output signals o1 to o4 of the memories mem1 to mem4 and the output signals sin, din, xin1, and xin2 of the filter unit (filter) 130, and outputs the input signals i1 to i4 to the memories mem1 to mem4 and the input signals so, do, xo1, and xo2 to the filter unit (filter) 130.
[0055]
Hereinafter, the operation at the time of encoding of this embodiment will be described based on the timing chart of FIG. The time scale is defined such that, for example, all image data (here, 32 pixels × 32 lines, 1024 pixels in total) is input from data between times t0 and t1. Some of the signals in the figure change in units of pixels, but since they cannot be drawn in units of pixels, they are simplified as follows. That is, X denotes a don't-care signal, and x = x denotes a signal having a bit depth. A signal drawn as a pulse such as ￣ | _ | ￣, like the web or start signal, is a 1-bit signal; more generally, any signal other than x = x for which the high and low levels are drawn at the same time is a 1-bit signal.
[0056]
A start signal is input at time t0, and image data is input from data in synchronization with it. The main control unit (main) 118 outputs the wa, web, hb_s_mux, level_s_mux, hb_d_mux, and level_d_mux signals. The control signal selection unit (s_mux) 111 and the data selection unit (d_mux) 114 select the input / output signals for the memory to be accessed based on the hb_s_mux, level_s_mux, hb_d_mux, and level_d_mux signals. In FIG. 36, level_s_mux and level_d_mux, and likewise hb_s_mux and hb_d_mux, are depicted as the same signal, but it should be noted that there may be a time lag between them in units of pixels. From time t0 to t1, level 1 horizontal processing is performed on the input image data, and the result is written in the memory mem1. The memory to be accessed is indicated by the signals level_s_mux and level_d_mux being 1, and horizontal processing is indicated by the signals hb_s_mux and hb_d_mux being at the low level. From time t1 to t2, level 1 vertical processing is performed on the S coefficient and D coefficient stored in the memory mem1, and the result is written in the memory mem1. As in the horizontal processing, the memory to be accessed is indicated by the signals level_s_mux and level_d_mux being 1, and vertical processing is indicated by the signals hb_s_mux and hb_d_mux being at the high level. Because this embodiment uses a memory in which the read address and the write address are independent, so that reading and writing can be performed at the same time, the operation shown in the timing chart of FIG. 36 is possible.
[0057]
Level 2 processing is started from time t2. This time, the level 1 SS coefficient data (1SS) obtained in the memory mem1 is processed. Since the address signals of the memories mem1 and mem2 are independent, the level 1 processing sequence can be applied almost unchanged (the only differences are that the input changes from image data to 1SS coefficient data and that the size becomes ¼). Therefore, the level 2 processing, which continues until time t4, can be easily understood by referring to the operation described for level 1.
[0058]
Next, level 3 processing is performed in the section from time t4 to t6, and level 4 processing is performed in the section from time t6 to t8. Calculating the total number of memory accesses for this embodiment in the same manner as for the first embodiment gives
32x32x2 (H, V) + 16x16x2 + 8x8x2 + 4x4x2 = 2720 cycles
Here, multiple accesses that can be made at the same time are counted as one access. This number of accesses is ½ that of the prior art and 2/3 that of the first embodiment.
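The count above can be checked with a few lines of arithmetic. The helper name `total_cycles` and its parameters are hypothetical; the only inputs taken from the text are the 32 × 32 image size, the four levels, the two passes (H, V) per level, and the 4080 cycles stated for the first embodiment.

```python
from fractions import Fraction


def total_cycles(image_size=32, levels=4, passes_per_level=2):
    """Sum size*size*passes over all levels; the side length halves per level."""
    total = 0
    size = image_size
    for _ in range(levels):
        total += size * size * passes_per_level
        size //= 2
    return total


second_embodiment = total_cycles()  # 32x32x2 + 16x16x2 + 8x8x2 + 4x4x2
first_embodiment = 4080             # value stated for the first embodiment

print(second_embodiment)                              # 2720
print(Fraction(second_embodiment, first_embodiment))  # 2/3
```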
[0059]
The operation at the time of decoding is in the reverse order to the operation described above, but this will be easily understood from the above description and the description related to the first embodiment, and the description will be omitted.
[0060]
<Third embodiment>
FIG. 7 is a block diagram showing a third embodiment of the present invention. The overall block configuration of the present embodiment is basically the same as that shown in FIG. 6, but the memories mem1 to mem4 and the filter unit 130 of the wavelet transform unit are omitted, and only the control unit 110 is illustrated.
[0061]
In the present embodiment, the control unit 110 has a configuration in which an encoding block and a decoding block are allocated exclusively. That is, the main control unit 118, the control signal selection unit 111, and the data selection unit 114 are divided into an encoding-only emain block 118e, es_mux block 111e, and ed_mux block 114e, and a decoding-only dmain block 118d, ds_mux block 111d, and dd_mux block 114d. In addition, a main selection unit (m_mux) 120 and a start / end control unit (se) 122 are added. Input / output to and from the control unit is the same as in FIG.
[0062]
If attention is paid to the encoding blocks emain, es_mux, and ed_mux, it will be understood that their configuration is the same as the configuration of FIG. Similarly, it will be understood that the configuration of the decoding blocks dmain, ds_mux, and dd_mux is the same as that in FIG. The added main selection unit (m_mux) 120 simply switches between the inputs and outputs of the es_mux and ed_mux blocks and those of the ds_mux and dd_mux blocks in accordance with the external input signal dir. The added start / end control unit (se) 122 merely selects whether to operate the encoding blocks or the decoding blocks in accordance with the external input signals start and dir. That is, these two blocks are very easy to design and their circuit scale is very small. Therefore, the designer only needs to focus on the design of the main blocks, namely the three blocks 118e, 111e, and 114e for encoding and the three blocks 118d, 111d, and 114d for decoding. In addition, since the blocks for encoding and decoding can be designed separately and independently, the design period can be shortened and design efficiency improved considerably.
[0063]
<Fourth embodiment>
FIG. 8 is a block diagram showing a fourth embodiment of the present invention, and shows only the wavelet transform unit. The block configuration of the wavelet transform unit of this embodiment is basically the same as that of the first or second embodiment, but this embodiment differs in that a buffer memory 106 (hereinafter bmem) is provided independently of the memories mem1 to mem4. The buffer memory bmem has the same number of words as the memory mem1, and its write address and read address are independent. Needless to say, wiring to the buffer memory bmem is added to the control signal selection unit 111 and the data selection unit 114.
[0064]
The operation in this embodiment can be easily understood with reference to the timing charts of FIGS. That is, as shown in FIG. 4, in the first embodiment or the second embodiment, the S coefficient and the D coefficient resulting from the horizontal processing are written back into the memories mem1 to mem4, whereas in this embodiment, as shown in FIG. 9, the result of the horizontal processing is written into the buffer memory bmem. Accordingly, calculating in the same manner as for the first and second embodiments, the total number of cycles in the present embodiment is 2720 cycles for image data of 32 pixels × 32 lines, and it will be understood that a processing time ½ that of the prior art suffices.
[0065]
The buffer memory bmem is used for storing the S coefficient and the D coefficient as a result of the horizontal processing during the wavelet processing. However, there is no problem even if the buffer memory bmem is used in another block after the wavelet processing is completed. In other words, if a work memory is required in the encoding unit, this buffer memory bmem can be used as the work memory. Since the work memory is usually prepared in most cases, the configuration of this embodiment is suitable.
[0066]
Furthermore, this embodiment exhibits its effect when the most general memory described in the first embodiment is used for the memories mem1 to mem4. That is, it is not necessary to use the type of memory, relatively often used in ICs, that has an independent read address and write address. A memory having independent read and write addresses has the disadvantage that, because of its additional functionality, its area is larger than that of the most general memory, in which the two addresses are common. According to the present embodiment, high-speed processing equivalent to that of the third embodiment can be realized with a minimum memory area.
[0067]
<Fifth embodiment>
According to the fifth embodiment of the present invention, in the same configuration as the fourth embodiment, two filter units, a filter unit (filterH) 130h and a filter unit (filterV) 130v, are provided as shown in FIG. 10, the former being dedicated to horizontal processing and the latter to vertical processing. Although omitted in FIG. 10, it should be noted that in this embodiment the memories mem1 to mem4 and the buffer memory bmem all use memories having independent read addresses and write addresses.
[0068]
The operation of this embodiment will be described based on the timing chart of FIG. When the input of image data (data) is started at time t0, the data is sequentially input to the filter unit 130h, and the obtained S coefficient and D coefficient are written in the buffer memory bmem. Horizontal processing of the first line (note that lines and pixels are counted from the 0th) is started from time t1, and horizontal processing of the second line is started from time t2. When the S or D coefficient of the first pixel in the second line is output, the data for calculating the D coefficient in the vertical processing is available, and vertical processing using the filter unit 130v is also started. Note that the data for calculating the SS coefficient and the DS coefficient in the vertical processing is already available when the S or D coefficient of the first pixel of the first line is output; the processing described here is that in which the coefficients are output together. The image data is sequentially input in raster order, and the S and D coefficients resulting from the horizontal processing are sequentially written in the buffer memory bmem and simultaneously read out in the vertical direction; vertical processing is performed at the same time, and the SS, SD, DS, and DD coefficients are written. Such an operation is possible because the configuration includes the two filter units 130h and 130v. Level 1 processing ends at time t4 and level 2 processing begins. At time t7, level 2 processing ends and level 3 processing starts. At time t10, level 3 processing ends and level 4 processing starts, and the entire process ends at time t13.
[0069]
When image data of 32 pixels × 32 lines is processed with the horizontal processing and vertical processing executed simultaneously as described above, the total number of cycles in this embodiment is 1480 cycles, which is about 1/4 of the total number of cycles of the prior art, so very high-speed operation is possible.
[0070]
<Sixth embodiment>
FIG. 12 is a block diagram showing a sixth embodiment of the present invention. The block configuration of this embodiment is basically the same as that of the second embodiment, but in this embodiment, as shown in FIG. 12, three line memories 107s (line mem (S0), line mem (S1), line mem (S2)) for storing S coefficients and three line memories 107d (line mem (D0), line mem (D1), line mem (D2)) for storing D coefficients are used, and the data selection unit (d_mux) 114 includes signal lines for these line memories.
[0071]
The operation of this embodiment will be described based on the timing chart of FIG. Input of image data (data) is started at time t0, and the data is input to the filter unit 130 and subjected to horizontal processing. The obtained S coefficient is written to line mem (S0) in the line memory 107s, and the D coefficient is written to line mem (D0) in the line memory 107d. When the horizontal processing of the first line is started from time t1, the S coefficient written in line mem (S0) is sequentially shifted to line mem (S1). The same applies to the D coefficient of line mem (D0). When the horizontal processing of the first line is completed at time t2, the data for calculating each frequency band signal in the vertical processing is available. At this point the input of the image data is temporarily stopped, vertical processing is performed on the S coefficient written in the line memory 107s and the D coefficient written in the line memory 107d, and the SS, SD, DS, and DD coefficients are written into the memory mem1. At time t3, the input of image data is resumed, and the S coefficient of the third line is written in line mem (S0) and the D coefficient in line mem (D0). The S coefficient and D coefficient of the preceding line are shifted to line mem (S1) and line mem (D1). This process is performed sequentially. For the vertical processing of the last two lines (from time t4 to t5 and from time t5 to t6), all the necessary S coefficients and D coefficients are already available (mirror processing is applied to the last line), so they are processed continuously. Level 1 processing ends at time t6. Thereafter, time t6 to t9 is level 2 processing, time t9 to t10 is level 3 processing, and time t10 to t11 is level 4 processing.
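The shifting of the line memories can be pictured with a small model. The class name `LineMemories` and the method `push` are hypothetical, and the sketch shifts whole lines at once, whereas the text describes the coefficients being shifted sequentially as the next line arrives.

```python
from collections import deque


class LineMemories:
    """Three line memories; index 0 is the newest line (S0 or D0), index 2 the oldest."""

    def __init__(self, depth=3):
        self.lines = deque(maxlen=depth)

    def push(self, coefficients):
        """Store a new line at position 0, shifting older lines to positions 1 and 2.

        When all memories are full, the oldest line is discarded.
        """
        self.lines.appendleft(list(coefficients))


s_lines = LineMemories()          # stands in for line mem (S0), (S1), (S2)
s_lines.push([10, 11, 12])        # S coefficients of line 0
s_lines.push([20, 21, 22])        # S coefficients of line 1; line 0 moves to S1
print(list(s_lines.lines))        # [[20, 21, 22], [10, 11, 12]]
```

Because all positions can be read in the same cycle, the vertical filter can take one coefficient from each stored line at a given column, which is the parallel-read requirement placed on the line memories.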
[0072]
Any storage, such as general registers or shift registers, may be used for the six line memories of this embodiment as long as they can be read in parallel. When processing image data of 32 pixels × 32 lines, the total number of cycles in this embodiment is
(32x32x2) + (16x16x2) + (8x8x2) + (4x4x2) = 2720 cycles
The number of processing cycles is about half that of the prior art, and high-speed operation can be realized with a small circuit scale.
[0073]
<Seventh embodiment>
FIG. 14 is a block diagram showing a seventh embodiment of the present invention. The block configuration of this embodiment is basically the same as that of the sixth embodiment (FIG. 12), but, as shown in the figure, this embodiment includes two filter units 130h and 130v, dedicated to horizontal processing and vertical processing, respectively.
[0074]
The operation of this embodiment will be described based on the timing chart of FIG. When input of image data (data) is started at time t0, the S coefficient is written in line mem (S0) and the D coefficient is written in line mem (D0). When horizontal processing of the first line is started from time t1, the S coefficient written in line mem (S0) is sequentially shifted to line mem (S1). The same applies to the D coefficient of line mem (D0). When the horizontal processing of the first line is completed at time t2, the data for calculating each frequency band signal in the vertical processing is available. After time t2, the horizontal processing is performed by the filter unit 130h dedicated to horizontal processing, and the results are sequentially written in the line memories 107s and 107d. Reading is performed simultaneously with this writing, vertical processing is performed by the filter unit 130v dedicated to vertical processing, and the result is written to the memory mem1. This process is executed sequentially, and the level 1 processing ends at time t3. Level 2 processing is performed from time t3 and operates on the level 1 SS coefficients. Thereafter, time t3 to t6 is level 2 processing, time t6 to t8 is level 3 processing, and time t8 to t10 is level 4 processing.
[0075]
Any storage, such as general registers or shift registers, may be used for the six line memories of this embodiment as long as they can be read in parallel. The above operation can be realized because two filter units are used. When processing image data of 32 pixels × 32 lines, the number of cycles at each level increases by the equivalent of 2 lines, so, as will be apparent from FIG. 15, the total number of cycles in this embodiment is 1480 cycles. This number of processing cycles is about ¼ that of the prior art, and very high-speed operation can be realized with a small circuit scale.
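One way to read the figure of 1480 cycles is as one pass over every pixel of each level plus two extra lines per level. The breakdown below is an interpretation that happens to reproduce the stated total, not a formula given in the text, and the function name `pipelined_cycles` is hypothetical.

```python
def pipelined_cycles(image_size=32, levels=4, extra_lines=2):
    """Cycles when horizontal and vertical processing overlap.

    Assumed model: each level costs size*size cycles plus `extra_lines`
    lines of `size` pixels each.
    """
    total = 0
    size = image_size
    for _ in range(levels):
        total += size * size + extra_lines * size
        size //= 2
    return total


print(pipelined_cycles())  # 1480
```

Under this reading, the terms are 1088, 288, 80, and 24 cycles for levels 1 to 4.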
[0076]
<Eighth embodiment>
According to the eighth embodiment of the present invention, the basic block configuration is the same as that of the first embodiment or the second embodiment. The differences are that, as shown in FIG. 16, each of the memories mem1 to mem4 is divided into four independent memories dedicated to SS, SD, DS, and DD coefficient data, and that, as shown in FIGS. 18 and 19, two filter units 130_1 and 130_2, each usable for both horizontal processing and vertical processing, are provided and used by switching their inputs and outputs between horizontal processing and vertical processing.
[0077]
The operation of this embodiment will be described based on the memory map of FIG. 17, the connection diagrams of FIGS. 18 and 19, and the timing chart of FIG. FIG. 17 is a diagram showing a memory map. As preprocessing, the raster data shown on the left side (externally input image data or SS coefficient data of the previous level) must be rearranged in advance into the four memories dedicated to the respective coefficients, as shown on the right side. This rearrangement is performed according to the following rules. For example, in the case of level 1 processing, the 0th pixel of the 0th line of the raster data (data) is stored in the 0th pixel of the 0th line of the SS memory in the memory mem1, and the 1st pixel of the 0th line of the raster data is stored in the 0th pixel of the 0th line of the SD memory. The 0th pixel of the 1st line of the raster data is stored in the 0th pixel of the 0th line of the DS memory, and the 1st pixel of the 1st line of the raster data is stored in the 0th pixel of the 0th line of the DD memory. That is, the even-numbered pixels of the even-numbered lines (0 is counted as even) are stored in the SS memory, the odd-numbered pixels of the even-numbered lines are stored in the SD memory, the even-numbered pixels of the odd-numbered lines are stored in the DS memory, and the odd-numbered pixels of the odd-numbered lines are stored in the DD memory.
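The rearrangement rule can be written compactly with slicing. This is a minimal sketch of the parity rule only; the function name `distribute` and the representation of the raster data as a list of rows are assumptions for illustration.

```python
def distribute(raster):
    """Split raster data (a list of rows) into SS, SD, DS and DD memories by parity."""
    ss = [row[0::2] for row in raster[0::2]]  # even lines, even pixels
    sd = [row[1::2] for row in raster[0::2]]  # even lines, odd pixels
    ds = [row[0::2] for row in raster[1::2]]  # odd lines, even pixels
    dd = [row[1::2] for row in raster[1::2]]  # odd lines, odd pixels
    return ss, sd, ds, dd


# 4x4 test raster in which each value encodes its position as 10*line + pixel.
raster = [[10 * line + pixel for pixel in range(4)] for line in range(4)]
ss, sd, ds, dd = distribute(raster)
print(ss)  # [[0, 2], [20, 22]]
print(dd)  # [[11, 13], [31, 33]]
```

Each of the four memories receives one quarter of the data, so the four values of any 2 × 2 neighborhood sit at the same address in different memories and can be read in the same cycle.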
[0078]
For the data rearranged into the four memories in the memory mem1 in this way, the data selection unit 114 first makes the connections shown in FIG. 18, and horizontal processing is performed using the two filter units 130_1 and 130_2 simultaneously. Since the memories SS to DD in the memory mem1 are independent of one another, they can be read and written at the same time. The even lines are processed by the filter unit (filter1) 130_1, the odd lines are processed by the filter unit (filter2) 130_2, and the results are simultaneously written in the respective memories. Next, the connections are changed as shown in FIG. 19, and vertical processing is performed. That is, even-numbered pixels are processed by the filter unit (filter1) 130_1, odd-numbered pixels are processed by the filter unit (filter2) 130_2, and the results are simultaneously written in the respective memories.
[0079]
The above is the case of level 1 processing. Processing at level 2 and later is the same, except that the data to be processed is the SS coefficient data of the previous level, which is rearranged before being processed, and that the memories mem2 to mem4 are used.
[0080]
Next, the operation of this embodiment will be described based on the timing chart of FIG. When input of image data (data) is started at time t0, first, image data (raster data) is distributed and stored in four memories in the memory mem1 as shown in FIG. 17 (tr in the figure). Input of image data is completed at time t1, horizontal processing is performed by the connection shown in FIG. 18 until time t2, and then vertical processing is performed by the connection shown in FIG. 19 until time t3. Level 1 processing ends at time t3. Between time t3 and t4, the level 1 SS coefficient data stored in the SS memory in the memory mem1 is sorted and stored in the four memories in the memory mem2 according to the rules shown in FIG. At time t4, the input of the SS coefficient of level 1 is completed, horizontal processing is performed on the SS coefficient data until time t5, and then vertical processing is performed until time t6. Thereafter, the same processing is performed until time t12, and the processing ends.
[0081]
The total number of cycles in the present embodiment is as follows when processing image data of 32 pixels × 32 lines.
32x32 + (16x16x3) + (8x8x3) + (4x4x3) + (2x2x2) = 1843 cycles
This number of processing cycles is about 1/3 that of the prior art, and very high-speed operation can be realized. The number of cycles excluding the time for the data distribution performed prior to the processing at each level is only 824 cycles.
[0082]
<Ninth embodiment>
According to the ninth embodiment of the present invention, a buffer memory bmem is added, as in the fourth embodiment, to the same basic block configuration as in the eighth embodiment, and, as shown in FIG. 21, the buffer memory bmem, like the memories mem1 to mem4, is divided into four independent memories dedicated to the SS, SD, DS, and DD coefficients. Each of the four memories constituting the buffer memory bmem has the same number of words as the memory mem1.
[0083]
The operation of this embodiment will be described based on the timing chart of FIG. When input of image data (data) is started at time t0, this data (raster data) is distributed and stored in the four memories in the memory mem1 as in the case of the eighth embodiment (tr in the figure). Input of image data (data) is completed at time t1, horizontal processing is performed until time t2, and the result is written in the buffer memory bmem. The operation at this time is the same as that in the eighth embodiment except that the horizontal processing result is written in the buffer memory bmem (see FIG. 18).
[0084]
Subsequently, data is read from the buffer memory bmem from time t2 to t3, and vertical processing is performed. Of the resulting coefficients, only the SS coefficient is written in the corresponding memory in the memory mem2, and the other three coefficients, SD, DS, and DD, are written in the corresponding memories in the memory mem1. Level 1 processing ends at time t3. In the present embodiment, since reading of the level 1 SS coefficient from the buffer memory bmem can be performed in parallel with the level 2 processing, the level 2 processing is started immediately at time t3. Thereafter, level 2 processing is performed from time t3 to time t5, level 3 processing is performed from time t5 to time t7, and level 4 processing is performed from time t7 to time t9.
[0085]
When processing image data of 32 pixels × 32 lines, the total number of cycles in this embodiment is
32x32 + (16x16x2) + (8x8x2) + (4x4x2) + (2x2x2) = 1704 cycles
as can be seen from FIG. The number of processing cycles is less than 1/3 that of the prior art, and high-speed operation can be realized. The number of cycles excluding the time for the initial data distribution is only 680 cycles.
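The two figures quoted for this embodiment, 1704 cycles in total and 680 cycles without the initial distribution, can be reproduced by separating the first term of the expression from the rest. The function name `ninth_embodiment_cycles` is hypothetical, and the split into a distribution term and per-level processing terms is simply a reading of the expression above.

```python
def ninth_embodiment_cycles(image_size=32, levels=4):
    """Return (distribution, processing) cycle counts for the expression above."""
    distribution = image_size * image_size  # initial distribution: 32x32
    processing = 0
    size = image_size // 2                  # 16, 8, 4, 2 for levels 1 to 4
    for _ in range(levels):
        processing += size * size * 2       # two passes per level
        size //= 2
    return distribution, processing


distribution, processing = ninth_embodiment_cycles()
print(distribution + processing)  # 1704
print(processing)                 # 680
```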
[0086]
<Tenth embodiment>
According to the tenth embodiment of the present invention, the basic block configuration is the same as in the first, second, third, sixth, or seventh embodiment, but, as shown schematically in FIG. 23, each of the memories mem1 to mem4 corresponding to the respective levels has ¼ the number of words, and its bit depth is equal to the sum of the bit depths required for the SS, SD, DS, and DD coefficients. For example, when each of the SS, SD, DS, and DD coefficients is 8 bits, the bit depth of each memory is 32 bits. Of course, even if the numbers of bits of the SS, SD, DS, and DD coefficients differ, there is no problem as long as the sum of the necessary bit depths is secured. The operation of the present embodiment can be represented by the same timing chart as in FIG. 20 when applied, for example, to the configuration of the eighth embodiment (FIG. 16).
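The word layout in the 8-bit example can be illustrated by packing four coefficients into one 32-bit word. The function names `pack` and `unpack`, the field order (SS in the most significant bits), and the treatment of coefficients as raw unsigned bit patterns are all assumptions of this sketch; the text specifies only the total bit depth.

```python
def pack(ss, sd, ds, dd, bits=8):
    """Pack four coefficients into one word of 4*bits bits (SS in the top field)."""
    mask = (1 << bits) - 1
    word = 0
    for value in (ss, sd, ds, dd):
        word = (word << bits) | (value & mask)
    return word


def unpack(word, bits=8):
    """Recover the (ss, sd, ds, dd) fields from a packed word."""
    mask = (1 << bits) - 1
    return tuple((word >> (bits * k)) & mask for k in (3, 2, 1, 0))


word = pack(0x12, 0x34, 0x56, 0x78)
print(hex(word))     # 0x12345678
print(unpack(word))  # (18, 52, 86, 120)
```

With this layout, a level 1 memory for a 32 × 32 image holds 256 words of 32 bits instead of 1024 words of 8 bits, and a single address selects all four coefficients at once.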
[0087]
According to the present embodiment, it is possible to reduce the memory size by collecting the memory locations separated for each frequency band signal into one memory location. This is because, in general, in the memory in the IC, when the number of words × bit depth is the same, the smaller the number of words, the smaller the area. Further, since the number of control signals such as addresses can be reduced to 1/4, there is an advantage that wiring is reduced, delay time is reduced, and noise is reduced.
[0088]
<Eleventh embodiment>
According to the eleventh embodiment of the present invention, in the same configuration as the fourth and ninth embodiments using the buffer memory bmem, not only the memories mem1 to mem4 but also the buffer memory bmem has, as described in the tenth embodiment, ¼ the number of words and a bit depth equal to the sum of the bit depths required for the SS, SD, DS, and DD coefficients. FIG. 24 schematically shows such a state. Even if the numbers of bits of the SS, SD, DS, and DD coefficients differ, there is no problem as long as the sum of the necessary bit depths is secured.
[0089]
The operation of the present embodiment can be represented by the same timing chart as in FIG. 22 when applied to the configuration of the ninth embodiment (FIG. 21), for example. According to the present embodiment, the memory locations divided for each frequency band signal are combined into one memory location, so that the memory size including the buffer memory bmem can be reduced as described in the tenth embodiment. Further, there are advantages such as a reduction in delay time and noise due to a decrease in the number of control signals such as addresses.
[0090]
<Twelfth embodiment>
FIG. 25 is a block diagram showing a twelfth embodiment of the present invention. The wavelet transform unit 100 includes a level 1 wavelet transform unit module 100_level1, a level 2 wavelet transform unit module 100_level2, a level 3 wavelet transform unit module 100_level3, and a level 4 wavelet transform unit module 100_level4. Each wavelet transform unit module has basically the same configuration as in the above embodiments, except that it handles only the one level of processing assigned to it. Each module includes a memory, a filter unit, and a control unit that controls data transfer within the module, controls data transfer with the outside or with a module of an adjacent level, and exchanges control signals with the outside or with a module of an adjacent level.
[0091]
That is, the wavelet transform unit module 100_level1 for level 1 includes a memory mem1, a control unit 110_level1, and a filter unit 130_level1. The memory mem1 has the same number of words as the image data (data) input from the outside. The wavelet transform unit module 100_level2 for level 2 includes a memory mem2, a control unit 110_level2, and a filter unit 130_level2, and the memory mem2 has ¼ the number of words of the memory mem1. The wavelet transform unit module 100_level3 for level 3 includes a memory mem3, a control unit 110_level3, and a filter unit 130_level3, and the memory mem3 has ¼ the number of words of the memory mem2. The wavelet transform unit module 100_level4 for level 4 includes a memory mem4, a control unit 110_level4, and a filter unit 130_level4, and the memory mem4 has ¼ the number of words of the memory mem3.
[0092]
The control unit 110_level1 of level 1 has, as input signals, the external start signal and the decode end signal dend1 from level 2, and, as output signals, the external decode end signal dend and the encode end signal eend1 to level 2. Its input / output signals are the data input / output signal data1 for level 2, the input / output signals for the memory mem1, and the input / output signals for the filter unit 130_level1. The input signals of the control unit 110_level2 of level 2 are the encode start signal eend1 from level 1 and the decode end signal dend2 from level 3, and its output signals are the decode end signal dend1 to level 1 and the encode end signal eend2 to level 3. Its input / output signals are the data input / output signal data2 for level 3, the input / output signals for the memory mem2, and the input / output signals for the filter unit 130_level2. The input signals of the control unit 110_level3 of level 3 are the encode start signal eend2 from level 2 and the decode end signal dend3 from level 4, and its output signals are the decode end signal dend2 to level 2 and the encode end signal eend3 to level 4. Its input / output signals are the data input / output signal data3 for level 4, the input / output signals for the memory mem3, and the input / output signals for the filter unit 130_level3. The input signals of the control unit 110_level4 of level 4 are the encode start signal eend3 from level 3 and the external decode start signal dstart, and its output signals are the decode end signal dend3 to level 3 and the external encode end signal eend. Its input / output signals are the input / output signals for the memory mem4 and the input / output signals for the filter unit 130_level4.
[0093]
The timing chart of the operation of this embodiment is almost the same as that shown in FIG. 4 (for encoding) or FIG. 5 (for decoding). The operation during encoding will be described. When the start signal is input, the wavelet transform unit module 100_level1 performs horizontal processing on the image data (data) input from the outside, and then executes vertical processing. When the level 1 processing is completed, this is notified to the control unit 110_level2 by the eend1 signal, and the level 2 processing is started in the wavelet transform unit module 100_level2. Thereafter, when the processing up to level 4 is completed, an eend signal is output to the outside.
[0094]
Next, the operation during decoding will be described. When the dstart signal is input, the wavelet transform unit module 100_level4 performs vertical processing on the level 4 coefficient data input from the encoding unit 200 and then performs horizontal processing. When the level 4 processing is completed and the level 3 SS coefficient data has been reproduced, the control unit 110_level3 is notified by the dend3 signal, and the level 3 processing is started in the wavelet transform unit module 100_level3. When the reproduction of the image data by this inverse wavelet transform processing is completed, the dend signal is output to the outside.
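The hand-over between levels described above can be sketched as a simple event sequence. This is only an illustrative model, not the circuit itself: the function names are hypothetical, and each level is reduced to its two processing passes followed by the end signal that the text names (eend1 to eend3 and eend for encoding, dend3 to dend1 and dend for decoding).

```python
# Illustrative model of the per-level control chain (function names are hypothetical).
LEVELS = 4

def encode_sequence(levels=LEVELS):
    """Order of events during encoding: level 1 first, level N last."""
    events = ["start"]
    for level in range(1, levels + 1):
        events.append(f"level{level}: horizontal")
        events.append(f"level{level}: vertical")
        # The last level reports to the outside with eend; the others hand over with eend<i>.
        events.append("eend" if level == levels else f"eend{level}")
    return events

def decode_sequence(levels=LEVELS):
    """Order of events during decoding: level N first, level 1 last."""
    events = ["dstart"]
    for level in range(levels, 0, -1):
        events.append(f"level{level}: vertical")
        events.append(f"level{level}: horizontal")
        # Level 1 reports to the outside with dend; level i hands over with dend<i-1>.
        events.append("dend" if level == 1 else f"dend{level - 1}")
    return events

if __name__ == "__main__":
    print(encode_sequence())
    print(decode_sequence())
```

Because every level follows the same pattern, adding a level in this model only changes the loop bound, which mirrors the modularity that the embodiment aims for.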
[0095]
According to this embodiment, there is no need to design while considering the overall operation across all levels; only one basic module needs to be designed. By applying this design to each level, a wavelet transform unit with any desired number of levels can be built. This improves reliability and makes it possible to respond promptly to design change requests such as a change in data size or the addition of levels.
[0096]
<Thirteenth embodiment>
FIG. 26 is a block diagram showing a thirteenth embodiment of the present invention. The wavelet transform unit 100 of this embodiment is basically composed of four wavelet transform unit modules, as in the twelfth embodiment, but the internal configuration of each module is partially different. That is, each of the wavelet transform unit modules 100_level1 to 100_level4 is provided with a buffer memory (bmemi) 106_leveli (i = 1 to 4), for example as in the fourth and fifth embodiments, and is further provided, as in the fourth embodiment, with two filter units 130h_leveli and 130v_leveli for horizontal processing and vertical processing, respectively. The buffer memories bmem1 to bmem4 have the same number of words as the memories mem1 to mem4, respectively. In FIG. 26, the internal configurations of the wavelet transform unit modules 100_level3 and 100_level4 are omitted.
[0097]
The operation of the present embodiment can be represented by a timing chart almost similar to that shown in FIG. The operation during encoding will be described. When the start signal is input, in the wavelet transform unit module 100_level1, horizontal processing and vertical processing for image data input from the outside are executed substantially in parallel. When the level 1 processing is completed, the control unit 110_level2 is notified by the eend1 signal, and the level 2 processing is started in the wavelet transform unit module 100_level2. Thereafter, when the processing up to level 4 is completed, an eend signal is output to the outside and the encoding process is terminated.
[0098]
Next, the operation during decoding will be described. When the dstart signal is input, the wavelet transform unit module 100_level4 executes the wavelet inverse transform vertical processing and then the horizontal processing. When the level 4 processing is completed and the level 3 SS coefficient is reproduced, the control unit 110_level3 is notified by the dend3 signal, and the level 3 processing is started in the wavelet transform unit module 100_level3. When the reproduction process of the image data (data) is completed in this way, a dend signal is output to the outside.
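The order of the passes matters for decoding: the forward transform runs horizontal then vertical, so the inverse undoes the vertical pass first and the horizontal pass second. The following sketch shows this for a single level. The filter pair is an assumption made for illustration (the integer S-transform, smooth = floor((a+b)/2), detail = a-b); this passage does not state which filter the device uses, and all function names are hypothetical.

```python
# One-level forward and inverse transform; the S-transform filter is an assumption.

def s_forward(a, b):
    """Assumed filter pair: returns (smooth, detail)."""
    return (a + b) // 2, a - b

def s_inverse(s, d):
    """Exact inverse of s_forward for integers."""
    a = s + (d + 1) // 2
    return a, a - d

def forward_1d(seq):
    """Split an even-length sequence into its S half followed by its D half."""
    pairs = [s_forward(seq[i], seq[i + 1]) for i in range(0, len(seq), 2)]
    return [s for s, _ in pairs] + [d for _, d in pairs]

def inverse_1d(seq):
    """Rebuild the original sequence from [S half | D half]."""
    half = len(seq) // 2
    out = []
    for s, d in zip(seq[:half], seq[half:]):
        out.extend(s_inverse(s, d))
    return out

def transpose(block):
    return [list(col) for col in zip(*block)]

def forward_level(block):
    """Encoding order: horizontal pass on rows, then vertical pass on columns."""
    rows = [forward_1d(r) for r in block]
    return transpose([forward_1d(c) for c in transpose(rows)])

def inverse_level(coeffs):
    """Decoding order: vertical pass first, then horizontal pass."""
    cols = [inverse_1d(c) for c in transpose(coeffs)]
    return [inverse_1d(r) for r in transpose(cols)]

if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 200]]
    assert inverse_level(forward_level(image)) == image
```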
[0099]
The present embodiment can process faster than the twelfth embodiment and, like the twelfth embodiment, has the advantages of simpler design, improved reliability, and faster response to design change requests.
[0100]
<Fourteenth embodiment>
FIG. 27 is a block diagram showing a fourteenth embodiment of the present invention. In this embodiment, instead of the buffer memory bmemi in each wavelet transform unit module 100_leveli of the thirteenth embodiment, three line memories 107s_leveli for temporarily holding S coefficients and three line memories 107d_leveli for temporarily holding D coefficients are provided. Naturally, the control unit 110_leveli has input/output signals for these line memories. In FIG. 27, the internal configuration of the wavelet transform unit modules 100_level3 and 100_level4 is the same as that of the wavelet transform unit module 100_level1 or 100_level2 and is therefore omitted; only their input/output signals to other modules or to the outside are shown.
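The role of the line memories can be illustrated with a row-streaming sketch: horizontally filtered rows are held in a small ring of lines, and vertical processing starts as soon as enough rows are present instead of waiting for the whole image. Several points here are assumptions, since this passage does not give them: the filter is taken to be a two-row integer S-transform, and the ring of three lines is modelled so that two completed rows can be read while a third is being written. The function names are hypothetical.

```python
from collections import deque

def s_pair(a, b):
    """Assumed filter pair: returns (smooth, detail)."""
    return (a + b) // 2, a - b

def horizontal(row):
    """Horizontal pass on one row: returns (S coefficients, D coefficients)."""
    pairs = [s_pair(row[i], row[i + 1]) for i in range(0, len(row), 2)]
    return [s for s, _ in pairs], [d for _, d in pairs]

def stream_transform(rows):
    """Yield one (SS, SD, DS, DD) row tuple for every two input rows."""
    s_lines = deque(maxlen=3)  # stands in for the S coefficient line memories
    d_lines = deque(maxlen=3)  # stands in for the D coefficient line memories
    pending = 0
    for row in rows:
        s, d = horizontal(row)
        s_lines.append(s)
        d_lines.append(d)
        pending += 1
        if pending == 2:
            # Vertical pass on the two most recent lines of each kind.
            ss, sd = zip(*(s_pair(a, b) for a, b in zip(s_lines[-2], s_lines[-1])))
            ds, dd = zip(*(s_pair(a, b) for a, b in zip(d_lines[-2], d_lines[-1])))
            yield list(ss), list(sd), list(ds), list(dd)
            pending = 0

if __name__ == "__main__":
    for bands in stream_transform([[1, 3, 5, 7], [1, 3, 5, 7]]):
        print(bands)
```

Only a few rows are buffered at any time, which is why line memories built from registers or shift registers can replace a full-size buffer memory.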
[0101]
The encoding timing chart of this embodiment is almost the same as FIG. When the start signal is input, level 1 processing is executed in the wavelet transform unit module 100_level1. When the level 1 processing is completed, the eend1 signal starts the level 2 processing in the wavelet transform unit module 100_level2. When the processing up to level 4 ends, the eend signal is output to the outside and the entire processing ends. At the time of decoding, when the dstart signal is input, vertical processing and horizontal processing are performed by the wavelet transform unit module 100_level4. When the level 4 processing is completed and the level 3 SS coefficient has been reproduced, the wavelet transform unit module 100_level3 is notified by the dend3 signal, and the level 3 processing is started. When the image data reproduction process is completed, the dend signal is output to the outside.
[0102]
The present embodiment has the same advantages as the seventh embodiment in addition to the same advantages as the twelfth embodiment.
[0103]
[Effects of the invention]
As is clear from the above detailed description, the following effects can be obtained.
[0104]
According to the invention of claim 1, providing a memory for each wavelet transform level shortens the time required for horizontal processing (or vertical processing, when vertical processing is performed first) and speeds up the processing. Each level-corresponding memory is independent for the operation of its level, and serves both as storage for the final frequency band signals and as a buffer memory for temporarily storing the result of horizontal processing (or vertical processing). Address mapping can therefore be chosen freely with ease of addressing or design in mind. Conventionally, three filters were prepared for each level, and each required its own control mechanism. In the present invention, the control mechanism is separated from the filters and consolidated into the control unit, and a filter unit common to all levels is provided. As a result, the filter unit needs only the filter's essential calculation function and loop processing becomes possible, which simplifies the circuit configuration, reduces its scale, and facilitates design.
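The arrangement of claim 1 (one memory per level, one filter unit shared by all levels, and a control loop around them) can be sketched in software as follows. This is a minimal sketch under stated assumptions: the filter is taken to be the integer S-transform, which this passage does not specify, the image sides are assumed to be divisible by 2 to the power of the number of levels, and the names filter_1d, transform_level and encode are hypothetical.

```python
# Per-level memories with a single shared filter routine; the filter is an assumption.

def filter_1d(seq):
    """Common filter unit: split an even-length sequence into [S half | D half]."""
    pairs = [((seq[i] + seq[i + 1]) // 2, seq[i] - seq[i + 1])
             for i in range(0, len(seq), 2)]
    return [s for s, _ in pairs] + [d for _, d in pairs]

def transform_level(block):
    """Horizontal pass on every row, then vertical pass on every column."""
    rows = [filter_1d(r) for r in block]
    cols = [filter_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def encode(image, levels):
    """Control loop: the same filter is reused, and each level writes its own memory."""
    mem = {}
    data = image
    for level in range(1, levels + 1):
        mem[level] = transform_level(data)
        half_h, half_w = len(data) // 2, len(data[0]) // 2
        # The SS quadrant (top-left) of this level is the input of the next level.
        data = [row[:half_w] for row in mem[level][:half_h]]
    return mem

if __name__ == "__main__":
    memories = encode([[8] * 4 for _ in range(4)], levels=2)
    print(memories[1])
    print(memories[2])
```

The loop body contains no level-specific logic, which corresponds to the point made above that the filter unit only needs its calculation function while sequencing is left to the control unit.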
[0105]
According to the second aspect of the present invention, it is possible to simultaneously read and write the corresponding memory in each level of processing, thereby further speeding up the processing.
[0106]
According to the third aspect of the present invention, since the part constituting the control unit is functionally simple and the circuit scale is reduced, the design or design change of the control unit is further facilitated and the reliability can be enhanced.
[0107]
According to the invention of claim 4, the encoding part and the decoding part of the control unit can be handled independently, and each part becomes functionally simpler and smaller in circuit scale, so the design of each part becomes easier and reliability can be improved.
[0108]
According to the invention of claim 5, a buffer memory having at least the same number of words as the largest of the level-corresponding memories is provided in addition to those memories, and this buffer memory can be shared with the buffer memories of other blocks. By using this buffer memory in the processing of each level, high-speed processing comparable to the case of using memories with independent read and write addresses becomes possible even when a memory with a smaller area than such a memory is used as the level-corresponding memory.
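The sizing rule for the shared buffer memory can be written as a one-line calculation. The per-level word counts used below are an assumption for illustration (the memory for level i is taken to hold a block whose sides are halved at each level); the only rule taken from the text is that the buffer needs at least as many words as the largest level-corresponding memory.

```python
def level_memory_words(width, height, levels):
    """Assumed sizing: the level-i memory holds a (width / 2^(i-1)) x (height / 2^(i-1)) block."""
    return {i: (width >> (i - 1)) * (height >> (i - 1)) for i in range(1, levels + 1)}

def min_buffer_words(width, height, levels):
    """Shared buffer: at least as many words as the largest level-corresponding memory."""
    return max(level_memory_words(width, height, levels).values())

if __name__ == "__main__":
    print(level_memory_words(64, 64, 4))
    print(min_buffer_words(64, 64, 4))
```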
[0109]
According to the invention of claim 6 or 8, further high speed processing is possible by simultaneously executing the horizontal processing and the vertical processing.
[0110]
According to the invention of claim 7, high-speed processing can be realized without increasing the circuit scale by using a line memory that can be configured by a general register or shift register.
[0111]
According to the invention of claim 9 or 10, horizontal processing and vertical processing can be parallelized to further increase the speed. In addition, a general memory whose read address and write address are not independent can be used as the level-corresponding memory and as the buffer memory, so an increase in memory area can be avoided.
[0112]
According to the invention of claim 11 or 12, the memory size of the level-corresponding memory, or of the level-corresponding memory and the buffer memory, can be reduced, and the signal delay and noise caused by the wiring related to the memory or buffer can be reduced.
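Giving each memory word a bit depth equal to the sum of the bit depths of all coefficient types means that one word can hold one coefficient of each type side by side. The sketch below shows such packing and unpacking. The bit depths (8, 9, 9, 10) and the field order are hypothetical values chosen for illustration, and the functions pack and unpack are not part of the described device.

```python
def pack(coeffs, depths):
    """Pack signed coefficients into one word; depths gives the bits per coefficient type."""
    word = 0
    for value, bits in zip(coeffs, depths):
        # Store each value as a two's complement field of the given width.
        word = (word << bits) | (value & ((1 << bits) - 1))
    return word

def unpack(word, depths):
    """Recover the signed coefficients from a packed word."""
    out = []
    for bits in reversed(depths):
        field = word & ((1 << bits) - 1)
        if field >= 1 << (bits - 1):  # restore the sign of a negative value
            field -= 1 << bits
        out.append(field)
        word >>= bits
    return out[::-1]

if __name__ == "__main__":
    depths = (8, 9, 9, 10)          # hypothetical depths for SS, SD, DS, DD
    word = pack((100, -3, 7, -200), depths)
    assert word < 1 << sum(depths)  # fits in a word of the summed bit depth
    assert unpack(word, depths) == [100, -3, 7, -200]
```

With this layout a single address yields all coefficient types at once, so one wide memory can take the place of several narrower ones.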
[0114]
According to the invention of claim 13, it is possible to realize an encoding/decoding device having the advantages described above.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a first embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of a memory map for an S coefficient and a D coefficient in the first embodiment.
FIG. 3 is a diagram showing an example of a memory map for SS, SD, DS, and DD coefficients in the first embodiment.
FIG. 4 is a timing chart at the time of encoding in the first embodiment.
FIG. 5 is a timing chart at the time of decoding in the first embodiment.
FIG. 6 is a block diagram showing a configuration of a second exemplary embodiment of the present invention.
FIG. 7 is a block diagram showing a configuration of a third exemplary embodiment of the present invention.
FIG. 8 is a block diagram showing a configuration of a fourth exemplary embodiment of the present invention.
FIG. 9 is a timing chart at the time of encoding in the fourth embodiment.
FIG. 10 is a block diagram showing a configuration of a fifth exemplary embodiment of the present invention.
FIG. 11 is a timing chart at the time of encoding in the fifth embodiment.
FIG. 12 is a block diagram showing a configuration of a sixth embodiment of the present invention.
FIG. 13 is a timing chart at the time of encoding in the sixth embodiment.
FIG. 14 is a block diagram showing a configuration of a seventh exemplary embodiment of the present invention.
FIG. 15 is a timing chart at the time of encoding in the seventh embodiment.
FIG. 16 is a diagram showing a configuration of a memory in an eighth embodiment of the present invention.
FIG. 17 is an explanatory diagram of data rearrangement in the eighth embodiment.
FIG. 18 is a diagram showing connections during horizontal processing in the eighth embodiment.
FIG. 19 is a diagram showing connections during vertical processing in the eighth embodiment.
FIG. 20 is a timing chart at the time of encoding in the eighth embodiment.
FIG. 21 is a diagram showing a configuration of a buffer memory in a ninth embodiment of the present invention.
FIG. 22 is a timing chart at the time of encoding in the ninth embodiment.
FIG. 23 is a diagram showing a memory configuration in a tenth embodiment of the present invention.
FIG. 24 is a diagram showing a configuration of a memory and a buffer memory in an eleventh embodiment of the present invention.
FIG. 25 is a block diagram showing a configuration of a twelfth embodiment of the present invention.
FIG. 26 is a block diagram showing a configuration of a thirteenth embodiment of the present invention.
FIG. 27 is a block diagram showing a configuration of a fourteenth embodiment of the present invention.
FIG. 28 is a block diagram showing a conventional technique.
FIG. 29 is a timing chart at the time of encoding in the prior art.
FIG. 30 is an explanatory diagram of a calculation method of horizontal processing and vertical processing of wavelet transform.
FIG. 31 is a diagram illustrating an example of a memory map of image data.
FIG. 32 shows an example of a memory map for level 1 S and D coefficients.
FIG. 33 is a diagram illustrating an example of a memory map for level 1 SS coefficients, SD coefficients, DS coefficients, and DD coefficients.
FIG. 34 shows an example of a memory map for level 2 S and D coefficients.
FIG. 35 is a diagram illustrating an example of a memory map for level 2 SS coefficients, SD coefficients, DS coefficients, and DD coefficients.
FIG. 36 is a timing chart at the time of encoding in the second embodiment of the present invention.
[Explanation of symbols]
100 Wavelet transform unit
100_level1 to 100_level4 Wavelet transform unit module
102_1 to 102_4 Memory (mem1 to mem4)
106 Buffer memory (bmem)
107s, 107s_level1, 107s_level2 S coefficient line memory
107d, 107d_level1, 107d_level2 D coefficient line memory
110, 110_level1 to 110_level4 control unit
111 Control signal selection unit (s_mux)
111e Encoding-dedicated control signal selection unit (es_mux block)
111d Decoding-dedicated control signal selection unit (ds_mux block)
114 Data selection unit
114e Encoding-dedicated data selection unit (ed_mux block)
114d Decoding-dedicated data selection unit (dd_mux block)
118 Main control unit
118e Encoding-dedicated main control unit (emain block)
118d Decoding-dedicated main control unit (dmain block)
122 Start / end control unit (se block)
130, 130_1, 130_2 Filter unit
130h, 130h_level1 to 130h_level4 Horizontal processing filter unit
130v, 130v_level1 to 130v_level4 Vertical processing filter unit
130_level1 to 130_level4 Filter unit
200 Encoding unit

Claims

1. A wavelet transform device that performs a plurality of levels of wavelet transform, comprising: a plurality of independent memories in one-to-one correspondence with the levels of the wavelet transform; a filter unit common to all levels of the wavelet transform; and a control unit that controls data transfer inside the device, data transfer with the outside of the device, and operations of the plurality of memories.

2. A wavelet transform device that performs a plurality of levels of wavelet transform, comprising: a plurality of independent memories in one-to-one correspondence with the levels of the wavelet transform; a filter unit common to all levels of the wavelet transform; and a control unit that controls data transfer inside the device, data transfer with the outside of the device, and operations of the plurality of memories, wherein each of the plurality of memories has independent read and write addresses and is capable of simultaneous reading and writing.

3. The wavelet transform device according to claim 1, wherein the control unit is divided into a part that selectively supplies control signals, including at least an address and an enable signal, to the plurality of memories, a part that controls data transfer inside the device and data transfer with the outside of the device, and a part that controls these two parts and exchanges control information with the outside of the device.

4. The wavelet transform device according to claim 3, wherein each of the at least three parts constituting the control unit is divided into a part dedicated to encoding and a part dedicated to decoding.

5. A wavelet transform device that performs a plurality of levels of wavelet transform, comprising: a plurality of independent memories associated with the respective levels of the wavelet transform; a buffer memory, common to all levels of the wavelet transform, having at least the same number of words as the memory with the largest number of words among the plurality of memories; a filter unit common to all levels of the wavelet transform; and a control unit that controls data transfer inside the device, data transfer with the outside of the device, and operations of the plurality of memories and the buffer memory.

6. The wavelet transform device according to claim 5, wherein the filter unit includes two independent filter units, one of which is assigned to horizontal processing and the other to vertical processing.

7. A wavelet transform device that performs a plurality of levels of wavelet transform, comprising: a plurality of independent memories in one-to-one correspondence with the levels of the wavelet transform; a line memory common to all levels of the wavelet transform; a filter unit common to all levels of the wavelet transform; and a control unit that controls data transfer inside the device, data transfer with the outside of the device, and operations of the plurality of memories and the line memory.

8. The wavelet transform device according to claim 7, further comprising a horizontal processing filter unit and a vertical processing filter unit as the filter unit.

9. The wavelet transform device according to claim 1, wherein each of the plurality of memories includes a number of independent memories equal to the number of types of wavelet transform coefficients, and the filter unit comprises two independent filter units each used for both horizontal processing and vertical processing.

10. The wavelet transform device according to claim 5, wherein each of the plurality of memories and the buffer memory includes a number of independent memories equal to the number of types of wavelet transform coefficients, and the filter unit comprises two independent filter units each used for both horizontal processing and vertical processing.

11. The wavelet transform device according to claim 1, wherein each of the plurality of memories has a bit depth at least equal to the sum of the bit depths of all types of wavelet transform coefficients.

12. The wavelet transform device according to claim 5, wherein each of the plurality of memories and the buffer memory has a bit depth at least equal to the sum of the bit depths of all types of wavelet transform coefficients.

13. An encoding/decoding device comprising: the wavelet transform device according to any one of claims 1 to 12; and an encoding unit that encodes output data of the wavelet transform device, or decodes encoded data input from the outside and inputs the decoded data to the wavelet transform device.