JP4486755B2 - Video memory management for MPEG video decoding and display system

Info

Publication number: JP4486755B2
Application number: JP2000591813A
Authority: JP
Inventors: ギル，アハロン; ローゼンタール，エラン; フランケル，ミリ; オフィール，ラム; アニスマン，デビッド; アイロニ，アロン; アール．ゴールドバーグ，パール
Original assignee: ゾランコーポレイション
Priority date: 1998-12-23
Filing date: 1998-12-23
Publication date: 2010-06-23
Anticipated expiration: 2018-12-23
Also published as: JP2002534865A; EP1142345A1; WO2000040033A1

Description

【０００１】
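[Technical field of the invention]
The present invention relates to MPEG ("Moving Picture Experts Group") decoding and display systems and, more particularly, to reducing the video memory size required by an MPEG decoding and display system for decoding and displaying video images.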
【０００２】
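[Prior art]
In the late 1980s, a need arose to place motion video and its associated audio on first-generation CD-ROMs at 1.4 Mbit/s. To this end, in the late 1980s and early 1990s, the ISO ("International Organization for Standardization") MPEG committee developed a digital compression standard for video and two-channel stereo audio. This standard is commonly known as MPEG-1 and officially as ISO 11172.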
【０００３】
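Following MPEG-1, a need arose to compress entertainment television for transmission media such as satellite, cassette tape, over-the-air broadcast, and CATV. Accordingly, to make a digital compression method available for full-resolution standard-definition television (SDTV) and high-definition television (HDTV) pictures, ISO developed a second standard, commonly known as MPEG-2 and officially as ISO 13818. The bit rates for which MPEG-2 was optimized were 4 Mbit/s and 9 Mbit/s for SDTV and 20 Mbit/s for HDTV.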
【０００４】
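Neither the MPEG-1 nor the MPEG-2 standard prescribes which encoding method to use, the encoding process, or the details of the encoder. The standards specify only a format for representing the data input to the decoder and a set of rules for interpreting that data. The formats for representing data are called the syntax and can be used to construct various valid data streams, called bitstreams. The rules for interpreting the data are called the decoding semantics. An ordered set of decoding semantics is called the decoding process.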
【０００５】
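The MPEG syntax supports a variety of encoding methods that exploit both spatial and temporal redundancy. Spatial redundancy is exploited by block-based discrete cosine transform ("DCT") coding of 8×8 pixel blocks, followed by quantization, zigzag scanning, and variable-length coding of runs of zero-quantized indices and of the amplitudes of those indices. A quantization matrix, which enables perceptually weighted quantization of the DCT coefficients, can be used to discard perceptually irrelevant information and thus further improve coding efficiency. Temporal redundancy, in turn, is exploited by motion-compensated prediction: forward prediction, backward prediction, and bidirectional prediction.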
【０００６】
MPEG provides two types of video data compression: intra-frame coding and inter-frame coding.
【０００７】
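Intra-frame coding exploits spatial redundancy. Many interactive requirements can be met by intra-frame coding alone. For some video signals at low bit rates, however, the picture quality achievable with intra-frame coding alone is insufficient.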
【０００８】
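Temporal redundancy is therefore exploited by the MPEG algorithm, which computes an inter-frame difference signal called the prediction error. In computing the prediction error, motion compensation is used to correct the prediction for motion. As in H.261, MPEG adopts a macroblock (MB) approach to motion compensation. In unidirectional motion estimation, called forward prediction, a target MB in the picture to be encoded is matched against a set of displaced macroblocks of the same size in a past picture, called the reference picture. As in H.261, the macroblock in the reference picture that best matches the target macroblock is used as the prediction MB. The prediction error is then computed as the difference between the target macroblock and the prediction macroblock.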
【０００９】
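I. Picture buffer size
(1) Two reference frames
In summary, MPEG-2 classifies video pictures into three types: intra ("I"), predicted ("P"), and bidirectionally predicted ("B"). By definition, every macroblock in an I picture must be intra coded (like a baseline JPEG picture). Macroblocks in a P picture may be coded as either intra or non-intra. In non-intra coding, a P picture is temporally predicted from a previously reconstructed picture and is thus coded with respect to the immediately preceding I or P picture. Finally, each macroblock in a B (bidirectionally predicted) picture may be independently selected as intra or as non-intra, the latter using forward prediction, backward prediction, or combined forward and backward (interpolated) prediction. In non-intra coding, a B picture is coded with respect to the immediately preceding I or P picture as well as the immediately following I or P picture. In terms of coding order, P pictures are causal, whereas B pictures are non-causal and use two surrounding, causally coded pictures for prediction. In terms of compression efficiency, I pictures are the least efficient, P pictures are somewhat better, and B pictures are the most efficient.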
【００１０】
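Every macroblock header contains an element called the macroblock type, which can turn these modes on and off like a switch. The macroblock type (or motion type, as in MPEG-2) is possibly the single most powerful element in the whole video syntax. The picture types (I, P, and B) merely widen the scope of the semantics to enable the macroblock modes.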
【００１１】
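A sequence of pictures may consist of almost any pattern of I, P, and B pictures. Although a fixed pattern (e.g., IBBPBBPBBPBBPBB) is common in industry, more advanced encoders will try to optimize the placement of the three picture types according to local sequence characteristics in the context of more global characteristics.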
【００１２】
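As explained above, a decoder needs two reference frames (two P frames, one P and one I frame, or two I frames) to reconstruct a B picture, so a video decoding and display system must allocate at least two frames of video memory to store the two reference frames.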
【００１３】
(2) Two half frames
(i) Interlaced video
In addition, MPEG-2 defines that a frame may be coded progressively or interlaced, as signaled by the "progressive frame" variable.
【００１４】
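A progressive frame is the logical choice for video material originating from film, where all "pixels" are integrated, that is, captured at almost the same instant. The optical image of the scene is scanned one line at a time, from left to right and from top to bottom. The detail that can be represented vertically is limited by the number of scan lines, so some vertical-resolution detail is lost as a result of raster scanning. Similarly, some horizontal detail is lost because each scan line is sampled.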
【００１５】
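The choice of scan lines involves a trade-off among the conflicting requirements of bandwidth, flicker, and resolution. Interlaced frame scanning attempts to achieve this trade-off by using frames composed of the lines of two fields sampled at different times; the lines of the two fields are interleaved so that two consecutive lines of the frame belong to alternate fields. This is a vertical-temporal trade-off between spatial and temporal resolution.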
【００１６】
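For interlaced pictures, MPEG-2 offers a choice of two "picture structures". A "field picture" consists of an individual field that is divided into macroblocks and coded separately. In a "frame picture", by contrast, each interlaced field pair is interleaved into one frame, which is then divided into macroblocks and coded. MPEG-2 requires interlaced video to be displayed as alternating top and bottom fields. Within a frame, however, either the top or the bottom field is coded temporally first and sent as the first field picture of the frame. The choice of frame structure is indicated by one of the MPEG-2 parameters.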
【００１７】
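In a conventional decoding and display system that processes interlaced frame pictures, even though the reconstructed data for both the top field and the bottom field are produced by the decoder at the same time, the bottom field is displayed only after display of the top field is complete, or vice versa. Because of this delay, a buffer of one field (half a frame) is needed to hold the delayed field until it is displayed. Note that this additional video memory requirement of one field (half a frame) comes on top of the two frames required to store the two reference frames (the I and/or P frames) described above.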
【００１８】
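(ii) 3:2 pulldown
The repeat-first-field flag was introduced into MPEG-2 to signal that a field or frame of the current frame should be repeated for frame-rate conversion (as in the example below of 30 Hz display versus 24 Hz coding). On average, every other coded frame in a 24 frame/s coded sequence signals the repeat-first-field flag, so a 24 frame/s (48 field/s) coded sequence becomes a 30 frame/s (60 field/s) display sequence. This process has been known for decades as 3:2 pulldown. Since the advent of television, most movies seen on NTSC displays have been shown this way. In the MPEG-2 format, the repeat-first-field flag is determined independently for every frame-structured picture, so the actual pattern may be irregular (it need not be literally every other frame).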
【００１９】
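For MPEG-2 video, the video display and memory controller must determine for itself when to perform 3:2 pulldown by checking the flags that accompany the decoded video data. MPEG-2 provides two flags (repeat-first-field and top-field-first) that explicitly describe whether a frame or field is to be repeated. In progressive sequences, a frame may be repeated two or three times. Simple and Main Profiles, however, are limited to repeated fields only. Moreover, it is a general syntactic restriction that repeat-first-field can be signaled (value = 1) only in progressive frame-structured pictures.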
【００２０】
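For example, in the most common scenario, a film sequence contains 24 frames per second, but the frame-rate element in the sequence header indicates 30 frames/s. On average, every other coded frame signals a repeated field (repeat-first-field = 1) to pad the frame rate from 24 Hz to 30 Hz. That is, (24 coded frames/s) × (2 fields/coded frame) × (5 display fields/4 coded fields) = 60 display fields/s = 30 display frames/s.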
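The field arithmetic above can be checked with a short sketch. It is illustrative only: the function name `pulldown_fields` is hypothetical, and the strictly alternating repeat-first-field pattern is an assumption (the text notes that real streams may use an irregular pattern).

```python
# Sketch of 3:2 pulldown driven by repeat_first_field (illustrative names).

def pulldown_fields(num_coded_frames, top_field_first=True):
    """Expand coded frames into the sequence of displayed fields.

    Assumption: every other coded frame (even indices) sets
    repeat_first_field. Real streams may use an irregular pattern.
    """
    fields = []
    tff = top_field_first
    for n in range(num_coded_frames):
        repeat_first_field = (n % 2 == 0)
        first, second = ("top", "bottom") if tff else ("bottom", "top")
        fields.append((n, first))
        fields.append((n, second))
        if repeat_first_field:
            fields.append((n, first))  # first field is shown a second time
            tff = not tff              # next frame starts on the other parity
    return fields


one_second = pulldown_fields(24)
print(len(one_second))  # 60 display fields from 24 coded frames
```

Flipping the field parity after each three-field frame keeps top and bottom fields strictly alternating on the display, which is what the interlaced output requires.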
【００２１】
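For a system capable of 3:2 pulldown, another extra field (half a frame) of video memory is required to store the first displayed field so that it can be repeated later (under the 3:2 pulldown protocol), because the first displayed field is needed again for display after the second field has been shown. To illustrate: if the top field is displayed first during decoding, the bottom field is displayed after display of the top field is complete. The top field, however, is needed for display again after the system has finished displaying the bottom field (i.e., 3:2 pulldown). Because the top field is needed for display at two different instants (before and after the bottom field), another half frame (one field) of video memory is required to store it. Adding this half frame to the 2.5 frames described above (two frames for the two reference frames plus a half frame for displaying interlaced pictures), a conventional system requires a total of three frames of video memory to display MPEG-2 video pictures.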
【００２２】
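(iii) Freeze frame
Some newer MPEG-2 video decoding and display systems also let the user freeze the frame currently being displayed. Under a "freeze frame" condition, the system repeatedly displays the current picture until the user gives further instructions. Since no further decoding or 3:2 pulldown is needed during the freeze, the two half frames mentioned above (for interlaced display and for 3:2 pulldown) are not required. If the frozen frame is a progressive B frame, however, the entire B frame must be held in the video system, so an extra frame of video memory is needed to store it. If the frozen frame is an I or P frame, no extra video memory is needed, because both reference frames are already stored in video memory (as the forward-prediction and backward-prediction frames). B frames, by contrast, are not normally stored in video memory for reference, so an extra frame of video memory is needed to display them. Hence, displaying an accurate image of a B-frame picture requires another full frame of video memory in addition to the two frames needed to store the reference frames.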
【００２３】
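II. Problems facing conventional video decoding and display systems
MPEG-2 Main Profile at Main Level is specified with sampling limits at the CCIR 601 parameters (720×480×30 Hz for NTSC and 720×576×25 Hz for PAL). The term "profile" is used to limit the syntax (i.e., the algorithms) used in MPEG, and the term "level" to limit the coding parameters (sample rate, frame dimensions, coded bit rate, and so on). Together, Main Profile at Main Level normalizes complexity within the feasible limits of 1994 VLSI technology while still meeting the needs of most applications.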
【００２４】
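For CCIR 601 rate video, the three frames required by a PAL/SECAM decoding and display system push the decoder's video memory requirement beyond the usual 16 Mbit (2 Mbyte) limit to the next memory-design tier of 32 Mbit. This memory constraint is discussed in U.S. Patent No. 5,646,693 (issued to Cismas on July 8, 1997, and assigned to the same assignee as the present application). The Cismas reference is hereby incorporated by reference in its entirety. As is well established in the prior art, increasing the video memory beyond 16 Mbit complicates the memory controller design and adds the manufacturing cost of another 16 Mbit of video memory.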
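A rough calculation shows how little headroom three PAL frames leave in a 16 Mbit device. This is only a sketch: it assumes 4:2:0 chroma sampling at 8 bits per sample (1.5 bytes per pixel), which the text does not state, and it ignores the compressed-bitstream buffer and any other data the decoder must also hold.

```python
# Back-of-the-envelope frame-store sizing for PAL CCIR 601 video.
WIDTH, HEIGHT = 720, 576               # PAL frame size quoted in the text
BYTES_PER_PIXEL = 1.5                  # assumption: 4:2:0, 8 bits per sample
DEVICE_BYTES = 16 * 1024 * 1024 // 8   # a 16 Mbit memory device, in bytes

frame_bytes = int(WIDTH * HEIGHT * BYTES_PER_PIXEL)
three_frames = 3 * frame_bytes
remaining = DEVICE_BYTES - three_frames

print(frame_bytes)    # bytes per decoded frame
print(three_frames)   # bytes for the three frames of a conventional system
print(remaining)      # bytes left over for everything else
```

Under these assumptions the three frames alone take about 89% of the device, leaving roughly 225 KiB for the bitstream buffer and all other storage, which is consistent with the pressure toward a larger memory described above.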
【００２５】
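[Problem to be solved by the invention]
Cismas discloses a system and method for decoding and displaying MPEG video using a frame buffer that is smaller than the usual video memory requirement (i.e., three frames of memory). During decoding and display, instead of storing the complete second field, the Cismas system stores only part of the second field while the first field is being displayed. When the second field is needed for display, the decoder decodes the missing part of the second field again to produce the remaining data for displaying it. By reusing, for the second field, the same storage locations that were allocated to the first displayed field, the decoding and display system saves up to half a frame of memory (depending on the number of partitions). Even though the Cismas system solves the problem of insufficient video memory during display, it requires additional decoding power to decode the missing part of the second field a second time. A new video memory management system is therefore desired that achieves the same goal as Cismas, reducing the video memory requirement, while eliminating the second decoding pass.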
【００２６】
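Additional objects and advantages of the invention are set forth in the description that follows, and in part will be apparent from the description or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the claims.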
【００２７】
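[Summary of the invention]
The present invention is directed to an improved video memory control system for handling the video memory used in decoding and displaying a bitstream of compressed video data. Specifically, the invention is directed to a video memory management system for receiving a bitstream of compressed video data, decoding the compressed video data, and displaying the images contained in it.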
【００２８】
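One aspect of this memory management system is to provide a modular memory management system or apparatus for handling video memory management. More specifically, the video decoding and display system of the invention comprises a split memory manager having two separate memory managers, each of which handles a specific memory management function. Dividing the memory management functions among the memory managers allows video data to be decoded and displayed more efficiently.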
【００２９】
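Another aspect of the invention is to provide a novel method of processing and handling video memory, which enables MPEG-2 data streams to be handled efficiently. In particular, a segmented reusable video memory management system is disclosed.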
【００３０】
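Another aspect of the invention is to provide an intra-frame video data compression system that can further reduce the size of the video memory required by a video decoding and display system.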
【００３１】
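The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate preferred embodiments of the invention and, together with the general description above and the detailed description of the preferred embodiments below, help to explain the principles of the invention.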
【００３２】
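[Embodiments of the invention]
The invention is described in terms of several preferred embodiments, which are apparatus and methods for handling and processing video data. Three aspects of the invention are discussed in detail: (1) a split video memory manager; (2) a segmented reusable memory design (named the "rolling memory design" by the inventors); and (3) a simple, fixed, low-bit-rate auxiliary compression technique that is independent of method and application (named "FlexiRam" by the inventors). All three aspects are disclosed for the efficient handling of video memory data and can be implemented in a single video decoding and display system.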
【００３３】
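1. Split video memory manager
FIG. 1 shows a conventional video decoding and display system comprising a decoder 110, a memory manager 120, a video memory 130, and a display subsystem 140. A continuous bitstream of encoded video data is supplied to the decoder 110 for decoding and display, at a constant or variable bit rate depending on the overall system design. The decoder 110 decompresses the bitstream and supplies the decompressed data to the memory manager 120. The memory manager 120 is responsible for storing the decompressed video data in the video memory 130 and supplying it to the display subsystem 140.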
【００３４】
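The video memory 130 serves two main functions. First, it acts as a video data buffer that absorbs the difference in processing speed between decoding the incoming bitstream of video data and displaying the decoded video images. To prevent any interruption in the display of video images, the decoder 110 is generally required to decompress video data faster than the display subsystem 140 requests it. Second, the video memory 130 must store the reference frame data (the I and P frames) and provide it to the decoder 110 for reconstructing video images from the incoming bitstream. Accordingly, as shown in FIG. 1, the data bus between the decoder 110 and the memory manager 120 is bidirectional, so that decompressed data flows both to and from the memory manager 120.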
【００３５】
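FIG. 2 illustrates the novel split memory manager design disclosed by the invention. Instead of the conventional single memory manager design shown in FIG. 1, the memory manager of the invention consists of two separate memory managers: a first memory manager 210 and a second memory manager 220. Each memory manager is coupled to a video memory (a first memory 230 and a second memory 240, respectively).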
【００３６】
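As discussed in the preceding paragraphs, two groups of video data are essentially stored in video memory: reference frame data (I and P frames) and bidirectional frame data (B frames). Because these two groups are processed differently in both decoding and display, the video memory handling for the two groups is quite different.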
【００３７】
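In a preferred embodiment of the invention, the first memory manager 210 and its associated first memory 230 are assigned to handle and store reference frames (I and P frames), whereas the second memory manager 220 and its associated second memory 240 are assigned to handle and store bidirectional frame (B frame) data. Splitting the video memory controller into two separate memory controllers allows each controller to be specially designed to implement the memory management and/or compression method that is most efficient for the kind of video data it handles. Splitting the conventional memory manager into two separate managers thus gives the invention the advantage over conventional video decoding and display systems of handling video data more efficiently while keeping the system design simple.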
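The routing idea can be sketched in a few lines. This is a minimal illustration, not the patented hardware: the class and method names (`MemoryManager`, `SplitMemoryManager`, `route`, `store`) are hypothetical, and a Python list stands in for each video memory.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryManager:
    """Stand-in for one memory manager with its own video memory."""
    name: str
    stored: list = field(default_factory=list)

    def store(self, picture_type, data):
        # A real manager would apply its own allocation or compression policy.
        self.stored.append((picture_type, data))


class SplitMemoryManager:
    """Routes decoded pictures by type, as in the two-manager embodiment."""

    def __init__(self):
        self.first = MemoryManager("first (reference frames: I, P)")
        self.second = MemoryManager("second (bidirectional frames: B)")

    def route(self, picture_type):
        if picture_type in ("I", "P"):
            return self.first
        if picture_type == "B":
            return self.second
        raise ValueError(f"unknown picture type: {picture_type!r}")

    def store(self, picture_type, data):
        self.route(picture_type).store(picture_type, data)


split = SplitMemoryManager()
for n, ptype in enumerate("IBBPBBP"):
    split.store(ptype, f"picture {n}")
print(len(split.first.stored), len(split.second.stored))
```

Because each manager sees only one group of pictures, its policy can be chosen independently, which is the flexibility the paragraph above describes.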
【００３８】
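For example, in the video memory management system disclosed by Cismas, the memory management method is directed to B frames. In a preferred embodiment using the split memory manager design, the first memory manager 210 or 310 and its associated first memory 230 or 330 can perform conventional video memory management, while only the second memory manager 220 or 320 and its associated second memory 240 or 340 are designed to implement Cismas's memory management technique. The ability of the invention to customize a specific video memory management technique for each kind of video data thus gives video system designers greater design flexibility than conventional designs.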
【００３９】
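In yet another preferred embodiment, the invention comprises three memory managers: a first memory manager handles I frames, a second handles P frames, and a third handles only B frames.
【００４０】
In another preferred embodiment, the invention comprises three memory managers, each of which handles one of the three color components (e.g., Y, U, and V).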
【００４１】
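In a further preferred embodiment having multiple memory managers, each memory manager can handle a different combination of video data types (e.g., color components) and/or groups (e.g., reference frames or bidirectional frames). For example, in one preferred embodiment, the video decoding and display system comprises a first memory manager for handling the Y (luma) component of I pictures, a second memory manager for handling the U and V chroma components of B pictures, a third memory manager for handling the U and V chroma components of I and P pictures, and a fourth memory manager for handling the remaining video data. Each of the four memory managers can use a different memory management technique.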
【００４２】
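Because the design flexibility offered by the split memory manager design is almost unlimited, individual memory managers can be customized to suit a particular kind or definition of video data.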
【００４３】
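FIG. 3 shows another preferred embodiment of the invention. Under the definition of MPEG-2 Main Profile at Main Level, repeat-first-field is not permitted for B frames in the PAL standard. Therefore, in a preferred embodiment designed for an MPEG-2 Main Profile at Main Level PAL system, repeat-first-field need not be supplied to the second memory manager 320, because the second memory manager 320 processes only B-frame pictures. The top-field-first signal, however, is still needed by both the first memory manager 310 and the second memory manager 320 for handling interlaced pictures.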
【００４４】
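2. Segmented reusable video memory design
Another aspect of the invention is directed to a method of handling video memory allocation for a video decoding and display system that can handle both field pictures and frame pictures.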
【００４５】
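As explained earlier, MPEG-2 offers a choice of two picture structures for interlaced pictures. A field picture consists of an individual field that is divided into macroblocks and coded separately. In a frame picture, each interlaced field pair is interleaved into one frame, which is then divided into macroblocks and coded. MPEG-2 requires interlaced video to be displayed as alternating top and bottom fields. Within a frame, however, either the top or the bottom field may be coded temporally first and sent as the first field picture of the frame, according to a predetermined definition. Because the sequence of decoded data differs between the two picture structures, video data in the two formats is read and written in two different orders.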
【００４６】
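FIG. 4 illustrates a preferred embodiment of the invention. As shown, the video memory 410 is partitioned into a number of memory segments, each of which can store eight consecutive lines of video data from one of the two fields. The memory manager creates and maintains two lists: an MS (memory segment) list 420 and a DS (display segment) list 430. Each memory segment is addressed indirectly by an entry of the display segment list 430 through an entry of the memory segment list 420. The MS list 420 has as many entries as there are video memory segments, and each entry corresponds to one memory segment. In the preferred embodiment, the MS list 420 has 48 entries for addressing 48 memory segments in the video memory 410. For PAL, the 48 memory segments in the video memory correspond to two-thirds of a frame. Each entry of the MS list 420 contains a 20-bit address that defines the address of the first cell of a memory segment in the video memory 410. These addresses may be static or dynamic, depending on the system design. In a dedicated system designed specifically for decoding and displaying MPEG-2 video data, for example, the addresses are preferably static rather than dynamic, to reduce system overhead. In a multi-purpose digital decoding and display system, such as one that can display both MPEG-1 and MPEG-2 video data, the addresses are preferably maintained dynamically so that various video memory management methods can be accommodated.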
【００４７】
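In the preferred embodiment shown in FIG. 4, the DS list 430 maps the video memory 410 onto the display screen. The display screen is logically divided into a number of segments, and each entry of the DS list 430 corresponds to one segment of the display screen. In the preferred embodiment, the DS list 430 contains 72 entries, one for each display segment. The number 72 was chosen because the PAL format displays 576 lines and each segment contains eight lines of video data; hence 72 = 576/8.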
【００４８】
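As shown in FIGS. 4 and 5, each entry of the MS list 420 contains one additional bit, the busy bit 510, which indicates whether a particular memory segment of the video memory is busy. Specifically, a memory segment that has been written to but not yet displayed is marked busy. Conversely, a memory segment that is not busy has already been displayed and holds no new data. The busy bit 510 is very important to the invention, because the switching of the busy bit 510 on each entry of the MS list 420 can be characterized as defining a logical sliding window in the video memory 410. With the busy bit 510 set "on" when data is written to a video memory segment and "off" when it is read out, the logical sliding window is defined as all memory segments whose corresponding entries in the MS list have the busy bit "on". The logical sliding window is also maintained by incrementing and updating the two DS write pointers (710 and 720, or 730 and 740) and the read pointer (750 or 760), as shown in FIG. 7.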
【００４９】
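Even though the physical locations of the memory segments in the video memory 410 and of the entries in the MS list 420 are not allocated contiguously as the entries in the DS list 430 are, the logical sliding window is defined by contiguous entries in the DS list 430. The logical sliding window grows as more decoded data is stored in memory segments of the video memory 410 and shrinks as decoded data is read out of them. In addition, as video data is read from and written to the video memory, the logical sliding window shifts position (i.e., moves down the DS list 430). The key concept of the invention is the ability to define a logically contiguous memory space that corresponds to a portion of the display screen whose decoded video data has been written to video memory but not yet displayed, even though that video data may be stored at random locations in physical video memory. An advantage of the invention is that the physical video memory can be smaller than a full picture. In theory, the physical memory size can be reduced to as little as half the size needed to store a full picture (i.e., half a frame).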
【００５０】
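FIG. 4 shows the indirect addressing scheme used by the preferred embodiment of the invention. Each entry of the DS list 430 stores an index that points to one entry of the MS list 420, and each entry of the MS list 420 stores the address of a memory segment of the video memory 410.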
【００５１】
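In the preferred embodiment, the video decoding and display system of the invention handles interlaced pictures in both frame-picture and field-picture format. Accordingly, as shown in FIG. 6, the DS list 430 is logically partitioned into two portions: a top portion 610 corresponding to the top field and a bottom portion 620 corresponding to the bottom field. Entries indexed 0 to 35 belong to the top field, and entries indexed 36 to 71 belong to the bottom field. To associate the DS list 430 with the corresponding memory segments through the MS list 420, each entry of the DS list 430 contains a 6-bit index (0 to 47) into the MS list 420. Each entry of the DS list 430 thus refers to an entry of the MS list 420 and indirectly addresses one segment of the video memory (see FIG. 4).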
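The two-level lookup described above (DS entry, then MS entry, then segment address) can be modeled with plain lists. The sketch below is an assumption-laden illustration: the variable and function names are hypothetical, the segment size assumes 720 one-byte luma samples per line, and the static address layout is just one possible choice.

```python
# Minimal model of the DS list -> MS list -> video memory indirection.
NUM_MS = 48                          # memory segments (two-thirds of a PAL frame)
NUM_DS = 72                          # display segments (576 lines / 8 lines each)
LINES_PER_SEGMENT = 8
SEGMENT_BYTES = 720 * LINES_PER_SEGMENT   # assumption: 720 one-byte samples per line

# MS list: one start address and one busy bit per memory segment.
ms_address = [i * SEGMENT_BYTES for i in range(NUM_MS)]  # static addresses
ms_busy = [False] * NUM_MS

# DS list: each entry holds an index into the MS list (None = not yet bound).
ds_to_ms = [None] * NUM_DS


def field_of(ds_index):
    """Entries 0..35 belong to the top field, 36..71 to the bottom field."""
    return "top" if ds_index < NUM_DS // 2 else "bottom"


def segment_address(ds_index):
    """Resolve a display segment to a physical address through the MS list."""
    ms_index = ds_to_ms[ds_index]
    if ms_index is None:
        raise LookupError(f"display segment {ds_index} has no memory segment")
    return ms_address[ms_index]


ds_to_ms[40] = 5                     # bind display segment 40 to memory segment 5
print(field_of(40), segment_address(40))
```

With these sizes every start address fits in the 20-bit field mentioned in the text, and every MS index fits in 6 bits.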
【００５２】
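As shown in FIG. 7, in the preferred embodiment three pointers are maintained for the DS list 430: one read pointer (750 or 760) for reading and display, and two write pointers (710 and 720, or 730 and 740) for writing to video memory.
【００５３】
These three DS pointers are assigned different positions for field pictures and for frame pictures.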
【００５４】
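FIG. 7(a) shows the positions of the three pointers when the video system is handling a frame picture. To handle a bitstream of frame-picture video data, in which data is decoded in an interleaved fashion between the two fields, the two write pointers 710 and 720 are set to point to two DS entries 36 entries apart, so that each write pointer points to the entry at the same offset in a different field (the top field and the bottom field). The read pointer 750 points to the entry of the DS list 430 to be displayed next. To accommodate both the top field and the bottom field of a frame picture, the logical sliding window further comprises a logical top window and a logical bottom window. The logical top window defines a sliding window within the top field; it is represented by contiguous entries in the top portion of the DS list that indirectly address memory segments in the video memory that have been written but not yet displayed. Similarly, the logical bottom window defines a sliding window within the bottom field; it is represented by contiguous entries in the bottom portion of the DS list that indirectly address memory segments that have been written but not yet displayed.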
【００５５】
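FIG. 7(b) shows the positions of these pointers when the video decoding and display system is handling a field picture. Because the video data of a field picture is decoded and supplied sequentially by the decoder, the two write pointers 730 and 740 are set a fixed interval of 1 apart. The read pointer 760 points to the entry of the DS list 430 to be displayed next. Since the video data of a field picture is decoded and supplied sequentially, there is no need to subdivide the logical sliding window into logical top and bottom windows as for a frame picture. A single logical sliding window, as discussed in the preceding paragraphs, is therefore maintained.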
【００５６】
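FIG. 8 shows a flowchart for the invention.
【００５７】
First, during initialization of the video decoding and display system, the memory manager initializes all busy bits of the MS entries to NOT busy, i.e., sets every busy bit to logic 0 (step 810).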
【００５８】
When the memory manager detects a new picture in the incoming stream of encoded video data, it initializes the DS list 430 as shown in FIG. 9 (step 820).
【００５９】
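FIG. 9 details the step of initializing the DS list 430 for each picture (step 820). First, the memory controller checks the format of the picture in the incoming bitstream, determining from the value of the picture structure parameter whether the picture is a field picture or a frame picture (step 910).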
【００６０】
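If the video data is in frame-picture format, the memory controller then checks the top-field-first variable to determine whether the flag is set (step 930). If the top field is to be decoded first, the DS read pointer 750 is initialized to 36, the first DS write pointer 710 is set to 0, and the second DS write pointer 720 is set to 36 (step 960). If instead the bottom field is to be decoded first, the DS read pointer 750 is set to 0, the first DS write pointer 710 is set to 0, and the second DS write pointer 720 is set to 36 (step 970).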
【００６１】
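As shown in FIG. 9, if the video data is in field-picture format, the memory controller checks the picture structure parameter to determine whether the incoming video data is a top field picture or a bottom field picture (step 920). If the top field is to be decoded, the three DS pointers are set as follows: the DS read pointer 760 is set to 36, the first DS write pointer 730 is set to 0, and the second DS write pointer 740 is set to 1 (step 940). If instead the bottom field is to be decoded, the DS read pointer 760 is set to 0, the first DS write pointer 730 is set to 36, and the second DS write pointer 740 is set to 37 (step 950).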
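The four initialization cases of FIG. 9 can be collected into one small function. This is a sketch only: the function name, the string labels for the picture structure, and the dictionary keys are hypothetical, while the pointer values are taken directly from the description above.

```python
def init_ds_pointers(picture_structure, top_field_first=True):
    """Return initial DS pointer positions for a new picture (FIG. 9).

    picture_structure: "frame", "top_field", or "bottom_field"
    (hypothetical labels). Pointer values follow the text.
    """
    if picture_structure == "frame":
        read = 36 if top_field_first else 0        # steps 960 / 970
        return {"read": read, "write1": 0, "write2": 36}
    if picture_structure == "top_field":           # step 940
        return {"read": 36, "write1": 0, "write2": 1}
    if picture_structure == "bottom_field":        # step 950
        return {"read": 0, "write1": 36, "write2": 37}
    raise ValueError(f"unknown picture structure: {picture_structure!r}")


print(init_ds_pointers("frame", top_field_first=True))
print(init_ds_pointers("bottom_field"))
```

Note that the write pointers are 36 entries apart for a frame picture (same offset in each field) and adjacent for a field picture, matching FIG. 7(a) and FIG. 7(b).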
【００６２】
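As shown in FIG. 8, after the DS list has been initialized, the memory manager scans the MS list 420 and checks the busy bit 510 of each entry to look for available space in the video memory 410 (step 830). The memory manager scans the MS list 420 in order from index 0 to index 47 and finds the first two entries whose busy bit 510 is not set. If there is no such "not busy" segment, or only one, decoding is suspended until the condition is met; the decoder enters a wait state until two "not busy" video memory segments are available. When the data in a video memory segment has been read out and displayed, the memory manager releases the segment for storing newly decoded video data by setting its busy bit 510 to logic 0.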
【００６３】
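When two not-busy memory segments have been found, the MS list 420 index (0, 1, 2, ..., or 47) of the first available memory segment is written into the DS list 430 entry pointed to by the first DS write pointer (710 or 730), and the index of the second available memory segment is written into the DS entry pointed to by the second DS write pointer (720 or 740), so that the two DS write pointers indirectly address two "not busy" memory segments in the video memory 410.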
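The scan-and-bind step can be sketched as follows. The function name and list representation are hypothetical, and marking the segments busy at allocation time (rather than when the data is actually written) is a simplifying assumption of this sketch.

```python
def allocate_pair(ms_busy, ds_to_ms, write1, write2):
    """Bind the first two free memory segments to the two DS write pointers.

    Scans the MS list from index 0 upward, as in step 830. Returns False
    when fewer than two segments are free, in which case the decoder waits.
    Assumption: segments are marked busy here, at allocation time.
    """
    free = [i for i, busy in enumerate(ms_busy) if not busy][:2]
    if len(free) < 2:
        return False
    for ds_index, ms_index in zip((write1, write2), free):
        ds_to_ms[ds_index] = ms_index   # DS entry now points at the MS entry
        ms_busy[ms_index] = True
    return True


ms_busy = [True, False, True, False] + [False] * 44
ds_to_ms = [None] * 72
ok = allocate_pair(ms_busy, ds_to_ms, write1=0, write2=36)
print(ok, ds_to_ms[0], ds_to_ms[36])
```

In this usage example segments 1 and 3 are the first two free entries, so display segments 0 and 36 are bound to them.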
【００６４】
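After the DS entries have been updated, the decoder decodes a new macroblock row comprising 16 lines of video data (step 830).
【００６５】
After the macroblock row of video data has been decoded, the memory manager saves the decoded video data in the video memory in step 840. FIG. 10 shows step 840 in detail.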
【００６６】
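As shown in FIG. 10, if the picture is a field picture, the memory manager writes the top eight lines of video data of each macroblock row to the memory segment addressed by the MS list entry that is indirectly addressed by the first DS write pointer 730 (step 1030). The memory manager then writes the bottom eight lines of video data of each macroblock row to the memory segment addressed by the MS list entry indirectly addressed by the second DS write pointer 740 (step 1030).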
【００６７】
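If instead the picture is a frame picture, then after the decoder has decoded 16 lines of video data (one macroblock row) from the incoming bitstream, the memory manager writes the eight lines of video data that belong to the top field of the frame being decoded to the memory segment addressed by the first MS list entry, which is indirectly addressed by the first DS write pointer 710. The memory manager then writes the eight lines of video data that belong to the bottom field of the frame being decoded to the memory segment addressed by the second MS list entry, which is indirectly addressed by the second DS write pointer 720 (step 1020).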
【００６８】
As shown in FIG. 8, after the video data has been written to the corresponding memory segments, the DS pointers (710, 720, and 750, or 730, 740, and 760) are updated by the step shown in FIG. 11 (step 860).
【００６９】
FIG. 11 details the update of the DS pointers (710, 720, and 750, or 730, 740, and 760) in step 860. The memory manager first checks whether the video data is in field-picture or frame-picture format (step 1110) and then updates the two DS write pointers accordingly (steps 1120 and 1130).
【００７０】
If the video data is in frame-picture format, both DS write pointers 710 and 720 are advanced by 1 (step 1120). If the video data is in field-picture format, both DS write pointers 730 and 740 are advanced by 2 (step 1130).
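A short sketch shows why the step sizes differ: with a step of 1 the frame-picture pointers sweep both halves of the DS list in parallel, while with a step of 2 the adjacent field-picture pointers sweep one half. The function name and structure labels are hypothetical; the step sizes and starting positions come from the text.

```python
def advance_write_pointers(write1, write2, picture_structure):
    """Advance both DS write pointers after one macroblock row (FIG. 11)."""
    step = 1 if picture_structure == "frame" else 2   # steps 1120 / 1130
    return write1 + step, write2 + step


def visited_entries(write1, write2, picture_structure, rows):
    """Collect every DS entry written while decoding `rows` macroblock rows."""
    seen = []
    for _ in range(rows):
        seen += [write1, write2]
        write1, write2 = advance_write_pointers(write1, write2, picture_structure)
    return sorted(seen)


# Frame picture: 36 macroblock rows of 16 lines cover all 72 display segments.
print(visited_entries(0, 36, "frame", rows=36) == list(range(72)))
# Top field picture: 18 macroblock rows cover display segments 0..35.
print(visited_entries(0, 1, "top_field", rows=18) == list(range(36)))
```

Both checks print True, since each macroblock row always fills exactly two eight-line segments.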
【００７１】
After the DS pointers have been updated, the memory manager checks whether there is any more incoming video data for the current picture. If so, operation returns to the MS list scanning step shown in FIG. 8 (step 870).
【００７２】
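If the incoming bitstream for the current picture has ended, the video system checks whether there are more pictures to be decoded and displayed (step 880). If so, the whole operation returns to step 820. The operation is complete when there are no more incoming pictures to display (step 890).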
【００７３】
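When the system needs more video data for display, the next eight lines of data are fetched from a memory segment of the video memory 410 using the MS index addressed by the DS read pointer (750 or 760). The address stored in the MS list entry is used to locate, in the video memory 410, the memory segment to be displayed next. After a memory segment has been read out for display, the busy bit 510 of its MS entry is set to not busy and the DS read pointer is incremented by 1 (modulo 72); that is, after reaching 71 the DS read pointer wraps around to position 0.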
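The display-side half of the scheme can be sketched in the same list-based model. The function name and the returned tuple are hypothetical; the release of the busy bit and the modulo-72 increment follow the paragraph above.

```python
NUM_DS = 72


def display_next_segment(read_ptr, ds_to_ms, ms_address, ms_busy):
    """Fetch the segment under the DS read pointer and release it.

    Returns (address of the eight lines to display, next read pointer).
    """
    ms_index = ds_to_ms[read_ptr]        # DS entry -> MS index
    address = ms_address[ms_index]       # MS entry -> physical address
    ms_busy[ms_index] = False            # segment may now be reused by the decoder
    return address, (read_ptr + 1) % NUM_DS


# Usage example with hypothetical static addresses (5760 bytes per segment).
ms_address = [i * 5760 for i in range(48)]
ms_busy = [False] * 48
ds_to_ms = [None] * NUM_DS
ds_to_ms[71], ms_busy[5] = 5, True       # display segment 71 lives in memory segment 5
address, read_ptr = display_next_segment(71, ds_to_ms, ms_address, ms_busy)
print(address, read_ptr, ms_busy[5])
```

Releasing the segment as soon as it has been displayed is what lets 48 physical segments serve all 72 display segments of a picture.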
【００７４】
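In one embodiment of the invention, if the memory controller is not fast enough to process the "indirect addressing" scheme described above, each entry of the DS list 430 is widened from 6 bits to 26 bits, so that when a memory segment is selected in the MS list, its 20-bit address is copied into the DS entry together with the 6-bit MS list 420 index. Writing the memory segment address into the DS list together with the MS list 420 index avoids the extra step of addressing the MS list 420 and fetching data from it.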
【００７５】
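3. FlexiRam design
Another aspect of the invention is directed to an intra-frame video memory compression/decompression method.
【００７６】
In the invention, video data is compressed, lossily or losslessly, before being stored in video memory, and the compressed data is decompressed when it is read from video memory. The inventors have named this video data compression/decompression apparatus "FlexiRam".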
【００７７】
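FIG. 12 shows three different embodiments of the invention.
【００７８】
In the preferred embodiment shown in FIG. 12(a), the FlexiRam 1220a of the invention is incorporated into the memory manager 1210a, and all compression and decompression of video data is performed within the memory manager 1210a. The advantages of this embodiment are that the video memory 1230a has no special hardware requirements and that both compression and decompression of video data are transparent to the video memory 1230a.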
【００７９】
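FIG. 12(b) shows another preferred embodiment of the invention, in which the FlexiRam 1220b is separate from both the memory manager 1210b and the video memory 1230b. The advantage of this embodiment is that the invention can be implemented without substantially modifying the overall video decoding and display system.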
【００８０】
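FIG. 12(c) shows another embodiment of the invention, in which the FlexiRam 1220c is incorporated into the video memory 1230c. The advantage of this embodiment is that the entire compression and decompression operation is transparent to the memory manager 1210c and adds no overhead to it.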
【００８１】
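Note that the invention works with all picture types (I, P, and B) and is not limited to any particular type of picture. If desired, the invention can therefore be implemented in the split memory manager design with (a) the second memory manager 220 only, or (b) both the first memory manager 210 and the second memory manager 220 (see FIG. 2).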
【００８２】
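FIGS. 13a, 13b, and 13c show three different embodiments of the invention that combine the FlexiRam design with the split memory manager design. FIG. 13a shows one of two preferred embodiments of the invention. In this embodiment, two FlexiRam compressors/decompressors are used. A first FlexiRam 1370a is placed between the first memory manager 1310a and the first memory 1330a. A second FlexiRam 1380a is placed between the second memory manager 1320a and the second memory 1340a, so that all stored video memory data is compressed and decompressed using the FlexiRam technique. FIG. 13b shows another embodiment of the invention. The FlexiRam 1370b is placed between the second memory manager 1320b and the second memory 1340b, and only the video data for bidirectional (B) frames is compressed when stored in video memory. Finally, FIG. 13c shows yet another embodiment of the invention. The FlexiRam 1370c is placed between the first memory manager 1310c and the first memory 1330c, and only the video data for the prediction (reference) frames (i.e., two I frames; one I and one P frame; or two P frames) is compressed when stored in video memory. Note that the invention does not limit the number of memory managers to two or three. The memory manager can be split into many (i.e., more than three) separate memory managers to process various types and/or groups of video data.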
【００８３】
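The most important factors when considering which embodiment to implement for a particular system include the amount of video memory available, the processing speed available, and the desired picture quality.
【００８４】
For example, in addition to the embodiments above, the FlexiRam design can be applied to compressing and decompressing all three color components, or only the two chroma components.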
【００８５】
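The design can be further refined by combining all of the embodiments and aspects disclosed above. For example, in one embodiment only the U and V color components of B pictures are compressed by the FlexiRam design. In another embodiment, all three color components of both picture groups (reference frames and bidirectional frames) are compressed by the FlexiRam design.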
【００８６】
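The design flexibility of the invention in combining the split memory manager, FlexiRam, and the separation of color components gives system designers considerable freedom to select the most appropriate and effective combination for a given set of video system requirements, video memory size, desired picture quality, and so on.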
【００８７】
FIG. 14 outlines a preferred embodiment of the FlexiRam design. As shown, data compression consists of two stages: (1) error diffusion (step 1410) and (2) dropping one pixel per quartet (step 1420).
【００８８】
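First, the video data is compressed using an error diffusion algorithm (step 1410). In the preferred embodiment, the basic compression unit is a "quartet" of four consecutive horizontal pixels. The error diffusion stage (step 1410) compresses each pixel of the quartet individually. Using a 1-bit error diffusion algorithm, a quartet of pixels that originally has 32 bits (4 pixels × 8 bits/pixel) is compressed to 28 bits (4 pixels × 7 bits/pixel). Note that the invention is not limited to a 1-bit error diffusion algorithm; a multi-bit error diffusion algorithm can be used as well, at the cost of perceptible picture quality degradation.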
【００８９】
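The second stage of video data compression is "dropping one pixel per quartet" (step 1420). This algorithm compresses a quartet of four 7-bit pixels into a 24-bit quartet. The data compressor determines the best pixel of the quartet to drop and stores the reconstruction method in the last 3 bits of the quartet as a reconstruction descriptor ("RD"). After this stage, the compressed quartet has 24 bits (3 pixels × 7 bits/pixel + 3-bit RD). The algorithm is described in detail below.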
【００９０】
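When video data is needed from video memory, FlexiRam decompresses the compressed data read from memory. The decompression stages are the inverse of the compression stages shown in FIG. 14.
【００９１】
First, one pixel is added back to the 24-bit quartet by the one-pixel-per-quartet reconstruction algorithm, which is the inverse of the one-pixel-per-quartet dropping algorithm (step 1430). This decompression stage reconstructs the dropped 7-bit pixel according to the reconstruction method given by the RD stored as the last 3 bits of the 24 bits that represent the quartet. After this stage (step 1430), the quartet is re-formed and has four 7-bit pixels. The preferred reconstruction algorithm is described in detail below.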
【００９２】
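Next, a "0" is appended to each 7-bit pixel to return to a quartet of four 8-bit pixels, as before compression (step 1440). This decompression stage appends a single "0" to every pixel as its least significant bit, re-forming four 8-bit pixels.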
【００９３】
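FIG. 15 shows the data format of a quartet of video data (i.e., four 8-bit pixels) at the various stages of compression and decompression.
【００９４】
First, each 8-bit pixel of the quartet 1510 is compressed individually using the error diffusion algorithm 1520. The intermediate compressed data consists of four 7-bit pixels 1530. These four 7-bit pixels 1530 are then further compressed by the one-pixel-per-quartet dropping algorithm 1540. The final compressed data consists of three 7-bit pixels 1551 and one 3-bit RD 1552. The total length of the compressed data is 24 bits.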
【００９５】
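When video data is needed from video memory, the compressed data is decompressed in the following two consecutive stages.
【００９６】
First, one additional pixel is generated by the one-pixel-per-quartet reconstruction algorithm 1560. This algorithm is the inverse of the one-pixel-per-quartet dropping algorithm 1540 and is driven by the 3-bit RD 1552 of the 24-bit compressed data 1550. The intermediate decompressed data 1570 consists of four 7-bit pixels. Next, each of the four 7-bit pixels has a "0" appended as its least significant bit 1580, re-forming the four 8-bit pixels that existed before compression.
【００９７】
The error diffusion and one-pixel-per-quartet dropping algorithms are described in detail below.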
【００９８】
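A. Error diffusion
Note that various error diffusion algorithms can be used for this compression stage. The preferred embodiment disclosed below uses a 1-bit error diffusion algorithm, but the invention is not limited to any particular algorithm for error diffusion. Moreover, as discussed earlier, more than one bit can be removed by the error diffusion algorithm if more picture quality degradation is acceptable. The specific 1-bit error diffusion algorithm below is therefore disclosed for illustration only.
【００９９】
In this preferred embodiment of the invention, the error diffusion (ED) algorithm operates independently on every line of the picture. At the start of a line, a 1-bit register denoted "e" is set to 1. "e" stores the current running error and is updated on a pixel basis. The 1-bit ED algorithm is described by the following two equations:
$I_{out}(j) = 2 \cdot \lfloor (I_{in}(j) + e(j)) / 2 \rfloor$
except that $I_{out}(j) = 255$ if $I_{in}(j) = 255$ and $e(j) = 1$
$e(j+1) = I_{in}(j) + e(j) - I_{out}(j)$
where
$j$: index of the current pixel in the row
$I_{in}(j)$: original value of the pixel
$I_{out}(j)$: new value of the pixel (after ED)
$e(j)$: error accumulator at the time of pixel $j$
$\lfloor x \rfloor$ (floor): the nearest integer less than or equal to $x$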
【０１００】
After $I_{out}(j)$ has been computed, its least significant bit is truncated. The algorithm thereby quantizes an 8-bit pixel to a 7-bit pixel while eliminating false contouring.
【０１０１】
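The algorithm can be illustrated as follows. Suppose that $I_{in}(j)$ is 151 (binary 10010111) and $e(j)$ is 1. Then, in binary,
$I_{out}(j) = 2 \cdot \lfloor (10010111 + 1) / 2 \rfloor$
$= 2 \cdot \lfloor 10011000 / 2 \rfloor$
$= 2 \cdot 01001100$
$= 10011000$
$e(j+1) = 10010111 + 1 - 10011000$
$= 0$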
【０１０２】
Truncating the least significant bit gives the stored quantized pixel, 1001100.
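The 1-bit error diffusion stage, including the worked example above, can be sketched as follows. The function name is hypothetical; the initial error of 1 and the special case for a pixel value of 255 follow the description, and the error term is carried forward as input plus error minus output.

```python
def error_diffuse_line(pixels, e=1):
    """1-bit error diffusion over one line of 8-bit pixels.

    Returns (list of 7-bit quantized pixels, final error bit). The error
    register starts at 1 at the beginning of a line, as in the text; pass
    the stored value instead when resuming a line in the next block.
    """
    quantized = []
    for i_in in pixels:
        if i_in == 255 and e == 1:
            i_out = 255                      # avoid overflowing 8 bits
        else:
            i_out = 2 * ((i_in + e) // 2)    # round (pixel + error) down to even
        e = i_in + e - i_out                 # running error, always 0 or 1
        quantized.append(i_out >> 1)         # truncate the least significant bit
    return quantized, e


stored, carry = error_diffuse_line([0b10010111])
print(format(stored[0], "07b"), carry)       # 1001100 0
```

The returned error bit is the value that has to be latched at the end of a block line so that diffusion can continue with the next pixel of the same line.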
【０１０３】
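The value of "e" is then propagated to the end of the line. Because the ED algorithm of the invention operates on a raster structure while pictures are written to memory in a block structure, e(j) must be stored at the end of each block line. This value is used when ED continues with the next pixel on the same line, which belongs to the next block. An 8-bit latch must therefore be used for every "interrupted" block (since e can only be 0 or 1, one bit per interrupted line suffices).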
【０１０４】
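The decompression stage 1580 is the inverse of the compression stage above. In the preferred embodiment, a "0" is appended to every pixel, re-forming four 8-bit pixels.
【０１０５】
Each 7-bit pixel has a "0" appended as its least significant bit. For example, "1101101" is re-formed as "11011010". The resulting data is thus four 8-bit pixels, as in the original uncompressed data.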
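The corresponding decompression step is a single left shift per pixel. The sketch below uses a hypothetical function name and reproduces the example from the text.

```python
def restore_pixels(pixels7):
    """Append a 0 as the least significant bit of each 7-bit pixel."""
    return [(p << 1) & 0xFF for p in pixels7]


restored = restore_pixels([0b1101101])
print(format(restored[0], "08b"))            # 11011010
```

The restored value is always even, so at most one grey level of precision is lost per pixel relative to the error-diffused value.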
【０１０６】
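After the data has been compressed with the 1-bit error diffusion algorithm, 4 bits have been truncated, so that a quartet of four 8-bit pixels has been compressed to four 7-bit pixels. It is true that a higher compression ratio could be achieved with a 2-bit error diffusion algorithm, which compresses a quartet of four 8-bit pixels to four 6-bit pixels. The compression ratio with a 2-bit error diffusion algorithm is 24/32 × 100% (i.e., 75%). However, perceptible picture quality degradation is found after compression and decompression with the 2-bit error diffusion technique. Instead of a 2-bit error diffusion technique, the invention therefore discloses a two-stage compression/decompression process: (1) 1-bit error diffusion and (2) dropping one pixel per quartet. Even though the two techniques have the same compression ratio (75%), the picture quality obtained with the two-stage process disclosed here is perceptibly better than that of the 2-bit error diffusion technique.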
【０１０７】
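B. Dropping one pixel per quartet
The second compression stage of the preferred embodiment of the invention is dropping one pixel per quartet. The one-pixel-per-quartet dropping algorithm 1540 compresses four 7-bit pixels into three 7-bit pixels plus a 3-bit reconstruction descriptor ("RD"), for a total of 24 bits.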
【０１０８】
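Note that various pixel dropping algorithms can be used to reduce the number of pixels stored in video memory. The invention is not limited to any particular pixel dropping algorithm or equations used in the compression and decompression process. Moreover, more than one pixel can be dropped, at the cost of picture quality degradation. The one-pixel-per-quartet dropping described below is disclosed as the preferred embodiment of the invention because it offers the best balance between memory savings and an almost imperceptible effect on picture quality.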
【０１０９】
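In the preferred embodiment, the four pixels of a quartet are named P0, P1, P2, and P3 in order. Pixels P1 and P2 are the two candidates for dropping. The algorithm determines (1) which of the two pixels is better suited for dropping and (2) which reconstruction method should be used in decompression to estimate the dropped pixel. Five reconstruction methods are possible for each candidate:
1. Copy the left neighbor.
2. Copy the right neighbor.
3. Average the left and right neighbors.
4. Sum 1/4 of the left neighbor and 3/4 of the right neighbor.
5. Sum 3/4 of the left neighbor and 1/4 of the right neighbor.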
【０１１０】
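In theory, dropping P1 and estimating it by copying P2 is equivalent to dropping P2 and estimating it by copying P1. For the two candidates (P1 and P2) there are therefore only nine possible estimators, not ten. Furthermore, after one pixel has been dropped from the quartet, 21 bits are left to represent the remaining three pixels. In the preferred embodiment, the target size of the final compressed data for each quartet is 24 bits (25% compression). This means that only 3 bits can be allocated to describe the selected estimator, so in practice only eight estimators can be considered. The estimator omitted from consideration is the reconstruction of P2 from 3/4 of its left neighbor and 1/4 of its right neighbor. This particular estimator is omitted because the statistics of MPEG pictures show that it is the one least often selected as the best (the choice among the four 1/4, 3/4 estimators is arbitrary).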
【０１１１】
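First, eight estimators are computed, where $P_i^k$ denotes the estimate of $P_i$ using method $k$:
$P_1^0 = P_0$
$P_1^1 = P_2$
$P_1^2 = 0.5 P_0 + 0.5 P_2$
$P_1^3 = 0.25 P_0 + 0.75 P_2$
$P_1^4 = 0.75 P_0 + 0.25 P_2$
$P_2^1 = P_3$
$P_2^2 = 0.5 P_1 + 0.5 P_3$
$P_2^3 = 0.25 P_1 + 0.75 P_3$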
【０１１２】
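For every estimator, the absolute estimation error is defined as $E_i^k = |P_i^k - P_i|$. The minimum error is $\min_{i,k}\{E_i^k\}$. If more than one estimator attains the minimum error, the choice is arbitrary (the decision is left to the implementation). Finally, according to the minimum error, pixel $P_{i_{min}}$ is dropped and the 3-bit reconstruction descriptor (RD) is set to indicate the selected estimator, $P_{i_{min}}^{k_{min}}$. The 3-bit RD is concatenated with the remaining three 7-bit pixels to form the 24-bit compressed quartet. The method of concatenation is left to the implementation.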
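The selection procedure can be sketched as below. Several details are assumptions of this sketch because the text leaves them to the implementation: the assignment of RD codes 0 to 7 to the eight estimators, the rounding of the weighted sums to the nearest integer, tie-breaking in favor of the lowest code, and the bit layout (three pixels followed by the RD in the low 3 bits). All names are hypothetical.

```python
# Each estimator: (index of dropped pixel, left weight, right weight), in quarters.
# Assumption: RD codes 0..7 follow the order in which the estimators are listed.
ESTIMATORS = [
    (1, 4, 0),  # P1 <- P0
    (1, 0, 4),  # P1 <- P2
    (1, 2, 2),  # P1 <- 0.5 P0 + 0.5 P2
    (1, 1, 3),  # P1 <- 0.25 P0 + 0.75 P2
    (1, 3, 1),  # P1 <- 0.75 P0 + 0.25 P2
    (2, 0, 4),  # P2 <- P3
    (2, 2, 2),  # P2 <- 0.5 P1 + 0.5 P3
    (2, 1, 3),  # P2 <- 0.25 P1 + 0.75 P3
]


def estimate(quartet, code):
    """Estimate the dropped pixel from its two neighbors (rounded to nearest)."""
    i, w_left, w_right = ESTIMATORS[code]
    return (w_left * quartet[i - 1] + w_right * quartet[i + 1] + 2) // 4


def compress_quartet(quartet):
    """Pack four 7-bit pixels into 24 bits: three pixels plus a 3-bit RD."""
    errors = [abs(estimate(quartet, c) - quartet[ESTIMATORS[c][0]])
              for c in range(len(ESTIMATORS))]
    code = errors.index(min(errors))         # ties: lowest code wins (assumption)
    dropped = ESTIMATORS[code][0]
    word = 0
    for i, pixel in enumerate(quartet):
        if i != dropped:
            word = (word << 7) | pixel       # keep the other three pixels
    return (word << 3) | code                # RD occupies the last 3 bits


word = compress_quartet([10, 10, 50, 50])
print(word < 2 ** 24, word & 0b111)
```

In the usage example P1 equals its left neighbor, so the first estimator already has zero error and is selected.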
【０１１３】
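Decompressing the compressed data is simply the inverse of the compression algorithm. The missing 7-bit dropped pixel is reconstructed using the RD at the end of each quartet, which indicates (1) which pixel (P1 or P2) is to be reconstructed and (2) which reconstruction method is to be used.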
【０１１４】
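As shown in FIG. 15, the data decompressor obtains the reconstruction method from the 3 bits of the RD. The pixel to be reconstructed (P1 or P2) is then determined, and the pixel is reconstructed by applying the corresponding one of the eight estimator equations above.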
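Reconstruction can be sketched in the same way. As before, the RD code assignment, the rounding, and the 24-bit layout (three 7-bit pixels followed by the RD in the low 3 bits) are assumptions of this sketch rather than details given in the text, and the names are hypothetical.

```python
# (index of dropped pixel, left weight, right weight), weights in quarters.
# Assumption: RD codes 0..7 follow the order in which the estimators are listed.
ESTIMATORS = [
    (1, 4, 0), (1, 0, 4), (1, 2, 2), (1, 1, 3), (1, 3, 1),   # rebuild P1
    (2, 0, 4), (2, 2, 2), (2, 1, 3),                          # rebuild P2
]


def decompress_quartet(word):
    """Unpack a 24-bit compressed quartet into four 7-bit pixels."""
    code = word & 0b111
    a = (word >> 17) & 0x7F                  # first stored pixel (always P0)
    b = (word >> 10) & 0x7F                  # second stored pixel (P2 or P1)
    c = (word >> 3) & 0x7F                   # third stored pixel (always P3)
    dropped, w_left, w_right = ESTIMATORS[code]
    if dropped == 1:                         # stored pixels are P0, P2, P3
        rebuilt = (w_left * a + w_right * b + 2) // 4
        return [a, rebuilt, b, c]
    rebuilt = (w_left * b + w_right * c + 2) // 4   # stored pixels are P0, P1, P3
    return [a, b, rebuilt, c]


word = (((20 << 14) | (30 << 7) | 50) << 3) | 6     # P2 was dropped, RD = 6
print(decompress_quartet(word))
```

With RD code 6 the missing P2 is rebuilt as the average of P1 and P3, giving 40 in this usage example.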
【０１１５】
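It should be pointed out that the invention is not limited to any particular method or equations used in the video data compression stages of (1) error diffusion and (2) pixel dropping. The important feature of the FlexiRam design disclosed here is the specific sequence of these two compression/decompression stages: the video data is first compressed with an error diffusion algorithm that reduces the number of bits per pixel, and the intermediate compressed video data is then further compressed with a pixel dropping algorithm that drops one pixel from a given group of pixels. As discussed earlier, the decompression process is exactly the inverse of the compression process.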
【０１１６】
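While the invention has been described above in conjunction with specific preferred embodiments, it should be understood that the description and examples are intended to be illustrative and not to limit the scope of the invention, which is defined by the appended claims.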
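[Brief description of the drawings]
[FIG. 1] A conventional video decoding and display system.
[FIG. 2] A preferred embodiment of the split memory manager design of the invention.
[FIG. 3] Another preferred embodiment of the split memory manager design of the invention.
[FIG. 4] A method of indirectly addressing the video memory using the display segment list and the memory segment list.
[FIG. 5] Details of the memory segment list.
[FIG. 6] Details of the display segment list.
[FIG. 7] Positions of the two write pointers and the read pointer on the display segment list for (a) frame-picture format and (b) field-picture format.
[FIG. 8] A block diagram detailing the segmented reusable video memory design (named the rolling memory design by the inventors).
[FIG. 9] Details of the initialization of the display segment list for each picture.
[FIG. 10] Writing of the decoded video data of each macroblock row to the video memory.
[FIG. 11] Details of the update of the display segment list.
[FIG. 12] Three preferred embodiments of the FlexiRam design.
[FIG. 13a] One of three further preferred embodiments of the FlexiRam design.
[FIG. 13b] One of three further preferred embodiments of the FlexiRam design.
[FIG. 13c] One of three further preferred embodiments of the FlexiRam design.
[FIG. 14] A block diagram showing the stages of video data compression and decompression.
[FIG. 15] The video data format of a quartet during the various stages of compression and decompression.
[0001]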
BACKGROUND OF THE INVENTION
The present invention relates to MPEG ("Moving Picture Experts Group") decoding and display systems, and more particularly to reducing the video memory size required by an MPEG decoding and display system for decoding and displaying video images.
[0002]
[Prior art]
In the late 1980's, it became necessary to load motion video and related audio on a first generation CD-ROM at 1.4 Mbit / s. To this end, in the late 1980s and early 1990s, the ISO (“International Organization for Standardization”) MPEG committee developed a digital compression standard for video and two-channel stereo audio. This standard is commonly known as MPEG-1, officially known as ISO11172.
[0003]
Following MPEG-1, a need arose to compress entertainment television for transmission media such as satellite, cassette tape, broadcast, and CATV. Therefore, to make a digital compression method available for full-resolution standard-definition television (SDTV) or high-definition television (HDTV) images, ISO developed a second standard, commonly known as MPEG-2 and officially as ISO 13818. The bit rates chosen to optimize MPEG-2 were 4 Mbit/s and 9 Mbit/s for SDTV and 20 Mbit/s for HDTV.
[0004]
Neither the MPEG-1 nor the MPEG-2 standard defines which encoding method to use, the encoding process, or details of the encoder. These standards only specify a format for representing the data input to the decoder and a rule set for interpreting these data. These formats for representing data are called syntaxes and can be used to construct various valid data streams called bitstreams. The rules for interpreting the data are called decoding semantics. The ordered set of decoding semantics is called the decoding process.
[0005]
The MPEG syntax supports a variety of encoding methods that exploit both spatial and temporal redundancy. Spatial redundancy is exploited using block-based discrete cosine transform ("DCT") coding of 8 × 8 pixel blocks, followed by quantization, zigzag scanning, and variable-length coding of runs of zero-quantized indices and of the amplitudes of those indices. A quantization matrix that allows perceptually weighted quantization of the DCT coefficients can be used to discard perceptually irrelevant information, thus further improving coding efficiency. Temporal redundancy, on the other hand, is exploited using motion-compensated prediction: forward prediction, backward prediction, and bidirectional prediction.
[0006]
MPEG provides two types of video data compression methods: intraframe coding and interframe coding.
[0007]
Intraframe coding is for using spatial redundancy. Many of the interactive requirements can be met with intraframe coding alone. However, for some video signals with low bit rates, the image quality that can be achieved with only intra-frame coding is not sufficient.
[0008]
Therefore, temporal redundancy is exploited by MPEG algorithms that compute an interframe difference signal called the prediction error. In computing the prediction error, motion compensation is used to correct the prediction for motion. As in H.261, the macroblock (MB) approach is adopted for motion compensation in MPEG. In unidirectional motion estimation, called forward prediction, a target MB in the picture to be encoded is matched against a set of displaced macroblocks of the same size in a past picture, called the reference picture. As in H.261, the macroblock in the reference picture that best matches the target macroblock is used as the prediction MB. The prediction error is then computed as the difference between the target macroblock and the prediction macroblock.
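By way of illustration only, forward prediction of a single block may be sketched as follows. The sum of absolute differences is assumed here as the matching criterion, and an exhaustive search over a small window is assumed as the search strategy; the specification states only that the best-matching macroblock is used. The names `sad` and `best_match` are hypothetical.

```python
def sad(target, reference, top, left):
    """Sum of absolute differences between target and a same-size region of reference."""
    return sum(
        abs(value - reference[top + i][left + j])
        for i, row in enumerate(target)
        for j, value in enumerate(row)
    )


def best_match(target, reference, origin, search=2):
    """Search displacements within +/-search of origin (assumed to be in bounds).

    Returns the motion vector (dy, dx) and the prediction error block.
    """
    size = len(target)
    oy, ox = origin
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            top, left = oy + dy, ox + dx
            inside = (0 <= top and 0 <= left
                      and top + size <= len(reference)
                      and left + size <= len(reference[0]))
            if not inside:
                continue  # candidate falls outside the reference picture
            cost = sad(target, reference, top, left)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    _, dy, dx = best
    # Prediction error = target block minus the displaced reference block.
    error = [[target[i][j] - reference[oy + dy + i][ox + dx + j]
              for j in range(size)] for i in range(size)]
    return (dy, dx), error
```

A real encoder would operate on 16 × 16 macroblocks and typically use a faster search; the small block size here only keeps the example short.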
[0009]
I. Picture buffer size
(1) Two reference frames
In summary, MPEG-2 classifies video pictures into three types (i.e., intra "I", predicted "P", and bidirectionally predicted "B"). By definition, all macroblocks in an I picture must be coded intra (like a baseline JPEG picture). Macroblocks in a P picture can be coded as either intra or non-intra. During non-intra coding of a P picture, the picture is temporally predicted from a previously reconstructed picture and is thus coded with respect to the immediately preceding I or P picture. Finally, macroblocks within a B (i.e., bidirectionally predicted) picture may each be independently selected as intra or as non-intra, the latter being forward predicted, backward predicted, or both forward and backward (interpolated) predicted. During non-intra coding of a B picture, the picture is coded with reference to the immediately preceding I or P picture as well as the immediately following I or P picture. In terms of coding order, P pictures are causal, whereas B pictures are not causal and use two surrounding, causally coded pictures for prediction. In terms of compression efficiency, I pictures are the least efficient, P pictures are somewhat better, and B pictures are the most efficient.
[0010]
Every macroblock header contains an element called the macroblock type, which can switch these modes on and off. The macroblock type (or the motion type, as in MPEG-2) is perhaps the single most powerful element in the entire video syntax. The picture types (I, P and B) merely enable macroblock modes by widening the scope of the semantics.
[0011]
A sequence of pictures can consist of almost any pattern of I, P and B pictures. Although a fixed pattern (e.g., IBBPBBPBBPBBPBB) is common in the industry, more advanced encoders will attempt to optimize the placement of the three picture types according to local sequence characteristics in the context of more global characteristics.
[0012]
As explained above, in order to reconstruct a B picture, the decoder needs two reference frames (i.e., two P frames, one I frame and one P frame, or two I frames). The video decoding and display system must therefore allocate at least two frames of video memory to store the two reference frames.
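By way of illustration only, the reason both reference frames must be held in memory can be seen by reordering a display-order sequence into coding order: each B picture is transmitted after the later reference picture it depends on. The function name `coding_order` and the picture labels are hypothetical, and the input is assumed to end on a reference picture.

```python
def coding_order(display_order):
    """Reorder picture labels from display order to coding (bitstream) order.

    Labels beginning with "B" are B pictures; all others are reference pictures.
    """
    ordered, pending_b = [], []
    for picture in display_order:
        if picture.startswith("B"):
            pending_b.append(picture)  # must wait for the following reference
        else:
            ordered.append(picture)    # the reference picture is sent first
            ordered.extend(pending_b)  # then the B pictures that depend on it
            pending_b = []
    return ordered + pending_b
```

For the display sequence I0 B1 B2 P3, the decoder receives I0 and P3 before B1 and B2, so both I0 and P3 must remain stored while the B pictures are reconstructed.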
[0013]
(2) Two half frames
(I) Interlace video
In addition, MPEG-2 specifies that a frame may be coded as progressive or as interlaced, as signaled by the "progressive frame" variable.
[0014]
Progressive frames are a logical choice for video material organized from film, where all “pixels” are integrated, that is, captured almost at the same moment. The optical image of the scene on the picture is scanned one line at a time, left to right and top to bottom. The details that can be represented in the vertical direction are limited by the number of scan lines. Thus, some of the details of the vertical resolution are lost as a result of raster scan degradation. Similarly, because of the sampling of each scan line, some of the horizontal details are lost.
[0015]
The choice of the number of scan lines involves a trade-off among the conflicting requirements of bandwidth, flicker and resolution. Interlaced frame scanning attempts to achieve this trade-off by using frames composed of two fields sampled at different times, with the lines of the two fields interleaved so that two consecutive lines of the frame belong to alternate fields. This represents a vertical-temporal trade-off between spatial and temporal resolution.
[0016]
For interlaced pictures, MPEG-2 offers a choice of two "picture structures". A "field picture" consists of individual fields, each of which is divided into macroblocks and coded separately. In a "frame picture", on the other hand, each interlaced field pair is interleaved together into one frame, which is then divided into macroblocks and coded. MPEG-2 requires interlaced video to be displayed as alternating top and bottom fields. Within a frame, however, either the top or the bottom field may be temporally coded first and sent as the first field picture of the frame. The choice of picture structure is indicated by one of the MPEG-2 parameters.
[0017]
In conventional decoding and display systems that process interlaced frame pictures, even though the reconstructed data for the top field and for the bottom field are generated simultaneously by the decoder, the bottom field is displayed only after the display of the top field has been completed, or vice versa. Because of this delay in displaying the bottom field, a buffer of one field (half a frame) is required to store the bottom field for delayed display. Note that this additional field (half frame) of video memory is required in addition to the two frames needed to store the two reference frames (i.e., I and/or P frames) described above.
[0018]
(Ii) 3:2 pulldown
The repeat first field flag was introduced in MPEG-2 to signal that a field or frame from the current frame is to be repeated for the purpose of frame rate conversion (as in the 30 Hz display vs. 24 Hz coding example below). On average, every other coded frame in a 24 frames/second coded sequence signals the repeat first field flag. Thus, a coded sequence of 24 frames/second (or 48 fields/second) becomes a display sequence of 30 frames/second (60 fields/second). This process has been known for decades as 3:2 pulldown. Since the advent of television, most movies seen on NTSC displays have been displayed this way. In the MPEG-2 format, the repeat first field flag is determined independently in every frame-structured picture, so the actual pattern can be irregular (it need not literally be every other frame).
[0019]
For MPEG-2 video, the video display and memory controller must itself determine when to perform 3:2 pulldown by checking the flags that accompany the decoded video data. MPEG-2 provides two flags (repeat first field and top field first) that explicitly describe whether a frame or field is to be repeated. In a progressive sequence, a frame can be repeated two or three times. The Simple and Main Profiles, on the other hand, are limited to repeated fields only. Furthermore, it is a general syntactic restriction that the repeat first field flag can be signaled (value = 1) only in progressive frame-structured pictures.
[0020]
For example, in the most common scenario, a film sequence contains 24 frames per second, yet the frame rate element in the sequence header indicates 30 frames/second. On average, every other coded frame signals a repeated field (repeat first field = 1) to pad the frame rate from 24 Hz to 30 Hz. That is, (24 coded frames/second) × (2 fields/coded frame) × (5 display fields/4 coded fields) = 60 display fields/second = 30 display frames/second.
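By way of illustration only, the field count in this example can be checked with the following sketch. It assumes the regular pattern in which exactly every other coded frame sets the repeat first field flag, and it assumes that field parity alternates on display so that the first field of the next frame has the opposite parity after a three-field frame. The function name `pulldown_fields` is hypothetical.

```python
def pulldown_fields(num_frames, top_field_first=True):
    """Expand coded frames into a list of (frame_index, field_parity) display fields."""
    fields = []
    tff = top_field_first
    for n in range(num_frames):
        repeat_first_field = (n % 2 == 0)  # assumed regular every-other-frame pattern
        first, second = ("T", "B") if tff else ("B", "T")
        fields.append((n, first))
        fields.append((n, second))
        if repeat_first_field:
            fields.append((n, first))  # the first field is displayed a second time
            tff = not tff              # parity of the next frame's first field flips
    return fields


one_second = pulldown_fields(24)  # 24 coded frames expand to 60 display fields
```

Each frame with the flag set contributes three display fields and each frame without it contributes two, which gives the 3:2 cadence.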
[0021]
For systems with 3:2 pulldown capability, another extra field (half frame) of video memory is needed to store the first displayed field for later repetition (according to the 3:2 pulldown protocol). This is because the first displayed field is needed again for display after the second field has been displayed. To illustrate this requirement, if the top field is displayed during decoding, the bottom field is displayed after the display of the top field is complete. However, the top field is needed for display again after the system has finished displaying the bottom field (i.e., 3:2 pulldown). Since the top field is required for display at two different moments (i.e., before and after the bottom field is displayed), another half frame (one field) of video memory is required to store the top field. To display MPEG-2 video pictures, a conventional system therefore requires a total of three frames of video memory: the 2.5 frames described above (two reference frames plus a half frame for displaying interlaced pictures) plus this additional half frame.
[0022]
(Iii) Still frame
Some newly designed MPEG-2 video decoding and display systems also allow the user to freeze the currently displayed frame. Under this "still frame" condition, the video decoding and display system repeatedly displays the current picture until further instruction from the user. Since no further decoding and no 3:2 pulldown are required during the pause, the two half frames described above (i.e., for displaying interlaced pictures and for 3:2 pulldown) are not required. However, if the frozen frame being displayed is a progressive B frame, the entire B frame must be stored in the video system, so an extra frame of video memory is needed to store the currently displayed B frame. No extra video memory is required if the frozen frame is an I or P frame, because these reference frames are already stored in the video memory (as the forward prediction frame and the backward prediction frame). B frames, however, are not normally stored in the video memory for reference, so an extra frame of video memory is needed to display a B frame. Thus, in order to display an accurate image of a B-frame picture, another full frame of video memory is required in addition to the two frames needed to store the reference frames.
[0023]
II. Problems faced by traditional video decoding and display systems
The MPEG-2 Main Profile at Main Level system is defined as having sampling limits at CCIR 601 parameters (720 × 480 × 30 Hz for NTSC and 720 × 576 × 24 Hz for PAL). The term "profile" is used to limit the syntax (i.e., the algorithms) used in MPEG, and the term "level" is used to limit the coding parameters (sample rate, frame size, coded bit rate, etc.). Together, the video Main Profile at Main Level normalizes complexity within the feasible limits of 1994 VLSI technology while still meeting the requirements of most applications.
[0024]
For CCIR 601 rate video, the three frames required for a PAL/SECAM decoding and display system push the decoder's video memory requirement past the usual 16 Mbit (2 Mbyte) limit and into the next level of memory design, 32 Mbit. This memory constraint is discussed in US Pat. No. 5,646,693 (issued July 8, 1997 to Cismas and assigned to the same assignee as the present application). The Cismas reference is hereby fully incorporated by reference. As is well documented in the prior art, increasing the video memory beyond 16 Mbit would complicate the design of the memory controller and add the manufacturing cost of another 16 Mbit of video memory.
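By way of illustration only, the scale of this constraint can be estimated as follows. The specification does not state the sample format or the size of the bitstream buffer; 4:2:0 sampling at 8 bits per sample (12 bits per pixel) and a bitstream buffer of 1,835,008 bits are assumptions made here solely for the estimate, and the name `headroom` is hypothetical.

```python
WIDTH, HEIGHT = 720, 576       # PAL frame size given in the text
BITS_PER_PIXEL = 12            # assumption: 4:2:0 sampling, 8 bits per sample
BITSTREAM_BUFFER = 1_835_008   # assumption: size of the coded-data buffer, in bits
LIMIT = 16 * 1024 * 1024       # 16 Mbit of video memory

FRAME_BITS = WIDTH * HEIGHT * BITS_PER_PIXEL  # bits needed for one stored frame


def headroom(frames):
    """Bits left in a 16 Mbit memory after `frames` picture buffers and the bitstream buffer."""
    return LIMIT - BITSTREAM_BUFFER - int(frames * FRAME_BITS)
```

Under these assumptions, three full frames leave only about 12 kbit of the 16 Mbit for everything else the system must store, whereas 2.5 frames leave roughly 2.5 Mbit, which is why even a half-frame saving matters.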
[0025]
[Problems to be solved by the invention]
Cismas discloses a system and method for decoding and displaying MPEG video using a frame buffer that is smaller than the normal video memory requirement (i.e., three frames of memory). During decoding and display, the Cismas system stores only a portion of the second field while the first field is being displayed, instead of storing the complete second field. When the second field is required for display, the decoder decodes the missing portion of the second field again to generate the remaining data for displaying the second field. By reusing, for display of the second field, the storage locations allocated to the first displayed field, the decoding and display system saves up to half a frame of memory (depending on the number of partitions). However, even though the Cismas system solves the problem of insufficient video memory during display, it requires additional decoding power to decode the missing portion of the second field a second time. Therefore, a new video memory management system is desired that achieves the same objective as Cismas, namely reducing the video memory requirement, while eliminating the second decoding pass.
[0026]
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
[0027]
SUMMARY OF THE INVENTION
The present invention is directed to an improved video memory control system for handling video memory used for decoding and displaying a bit stream of compressed video data. In particular, the present invention is directed to a video memory management system for receiving a bitstream of compressed video data, decoding the compressed video data, and displaying the images contained therein.
[0028]
One aspect of this memory management system is to provide a modular memory management system or apparatus for handling video memory management. More particularly, the video decoding and display system of the present invention consists of a split memory manager having two separate memory managers. Each of these memory managers handles a specific memory management function. By dividing the memory management function among the various memory managers, decoding and display of video data can be performed more efficiently.
[0029]
Another aspect of the present invention provides a novel method for processing and handling video memory, thereby providing an efficient method for handling MPEG-2 data streams. In particular, a segmented reusable video memory management system is disclosed.
[0030]
Another aspect of the present invention is to provide an intra-frame video data compression system that can further reduce the size of video memory required for video decoding and display systems.
[0031]
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate preferred embodiments of the present invention and, together with the general description given above and the detailed description of the preferred embodiments given below, serve to explain the principles of the present invention.
[0032]
DETAILED DESCRIPTION OF THE INVENTION
The invention will be described with reference to several preferred embodiments. The preferred embodiments are apparatus and methods for handling and processing video data. The following three aspects of the present invention are discussed in detail: (1) a split memory manager design, (2) a segmented reusable memory design (named the "rolling memory design" by the inventors), and (3) a simple, fixed, low bit rate auxiliary compression technique that is independent of method and application (named "FlexiRam" by the inventors). All three aspects of the present invention are disclosed for efficient handling of video memory data and can be implemented in a single video decoding and display system.
[0033]
1. Split video memory manager
FIG. 1 shows a conventional video decoding / display system, which comprises a decoder 110, a memory manager 120, a video memory 130, and a display subsystem 140. A continuous bit stream of encoded video data is provided to decoder 110 for decoding and display. A continuous bit stream of video data is supplied at a constant or variable bit rate, depending on the overall system design. Decoder 110 restores the bitstream and then provides the restored data to memory manager 120. The memory manager 120 serves to store the restored video data in the video memory 130 and supply the video data to the display subsystem 140.
[0034]
The video memory 130 basically performs two main functions. First, it serves as a video data buffer that absorbs the difference in processing rates between decoding the incoming bitstream of video data and displaying the decoded video image. To prevent any interruption of the video image display, the decoder 110 is generally required to decode the video data faster than it is requested by the display subsystem 140. Second, video memory 130 needs to store reference frame data (i.e., I and P frames) and provide it to decoder 110 for reconstructing video images from the incoming bitstream. Thus, as shown in FIG. 1, the data bus between the decoder 110 and the memory manager 120 is bidirectional, so that decoded data flows both to and from the memory manager 120.
[0035]
FIG. 2 illustrates the novel split memory manager design disclosed by the present invention. Instead of the conventional single memory manager design shown in FIG. 1, the memory manager of the present invention consists of two separate memory managers, namely a first memory manager 210 and a second memory manager 220. Each memory manager is coupled to its own video memory (i.e., first memory 230 and second memory 240, respectively).
[0036]
As discussed above, there are basically two groups of video data stored in video memory: reference frame data (i.e., I and P frames) and bidirectional frame data (i.e., B frames). Since these two groups of video data are treated differently in both decoding and display, the video memory handling required for each group is quite different.
[0037]
In a preferred embodiment of the present invention, the first memory manager 210 and its associated first memory 230 are allocated to handle and store reference frames (i.e., I and P frames), whereas the second memory manager 220 and its associated second memory 240 are allocated to handle and store bidirectional frame (i.e., B frame) data. By dividing the video memory controller into two separate and distinct memory controllers, each of the two controllers can be specially designed to implement the most efficient memory management and/or compression method for the type of video data it handles. By dividing the conventional memory manager into two separate managers, the present invention handles video data more efficiently than conventional video decoding and display systems while retaining the advantage of a simple system design.
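By way of illustration only, the routing of decoded pictures to the two managers may be sketched as follows. The class and method names (`MemoryManager`, `SplitMemoryManager`, `manager_for`, `store`) are hypothetical, and each manager is reduced to a simple list; in an actual system each manager would apply its own memory management or compression method.

```python
class MemoryManager:
    """Hypothetical stand-in for one memory manager and its attached memory."""

    def __init__(self, name):
        self.name = name
        self.stored = []  # stands in for the attached video memory

    def store(self, picture_type, data):
        self.stored.append((picture_type, data))


class SplitMemoryManager:
    """Routes each decoded picture to the manager responsible for its type."""

    def __init__(self):
        self.first = MemoryManager("first: reference frames (I and P)")
        self.second = MemoryManager("second: bidirectional frames (B)")

    def manager_for(self, picture_type):
        if picture_type in ("I", "P"):
            return self.first
        if picture_type == "B":
            return self.second
        raise ValueError(f"unknown picture type: {picture_type!r}")

    def store(self, picture_type, data):
        self.manager_for(picture_type).store(picture_type, data)
```

Because the selection is confined to one lookup, the same arrangement extends naturally to three or more managers keyed on picture type, color component, or a combination of both.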
[0038]
For example, in the video memory management system disclosed by Cismas, the memory management method is directed to B frames. In a preferred embodiment using the split memory manager design, the first memory manager 210 or 310 and its attached first memory 230 or 330 perform conventional video memory management functions, whereas only the second memory manager 220 or 320 and its attached second memory 240 or 340 need be designed to implement the Cismas memory management technique. Thus, the ability of the present invention to tailor specific video memory management techniques to individual types of video data gives video system designers greater design flexibility than conventional designs.
[0039]
Furthermore, in another preferred embodiment, the present invention comprises three memory managers. The first memory manager handles I frames. The second manager handles P frames. Furthermore, the third memory manager handles only B frames.
[0040]
In another preferred embodiment, the present invention comprises three memory managers. Each of the three memory managers handles one of the three color components (eg, Y, U, and V).
[0041]
In a further preferred embodiment having multiple memory managers, each of the memory managers can handle a different combination of video data types (e.g., color components) and/or groups (e.g., reference frames or bidirectional frames). For example, in a preferred embodiment, the video decoding and display system includes a first memory manager for handling the Y color component of I pictures, a second memory manager for handling the U and V chroma components of B pictures, a third memory manager for handling the U and V chroma components of I and P pictures, and a fourth memory manager for handling the remaining video data. Each of the four memory managers can use a different memory management technique.
[0042]
Since the design flexibility provided by the split memory manager design of the present invention is almost unlimited, individual memory managers can be customized to suit a particular type or definition of video data.
[0043]
FIG. 3 shows another preferred embodiment of the present invention. Under the MPEG-2 main profile main level definition, the repeat first field is not allowed for B frames for the PAL standard. Thus, in the preferred embodiment designed for the MPEG-2 main profile main level PAL system, the repeat first field need not be provided to the second memory manager 320. This is because the second memory manager 320 processes only B frame pictures. However, the top field first signal is still needed for both the first memory manager 310 and the second memory manager 320 for handling interlaced pictures.
[0044]
2. Segmented reusable video memory design
Another aspect of the invention is directed to a method of handling video memory allocation for a video decoding and display system that can handle both field pictures and frame pictures.
[0045]
As previously described, MPEG-2 provides a choice of two picture structures for interlaced pictures. A field picture consists of individual fields, each of which is divided into macroblocks and coded separately. In a frame picture, on the other hand, each interlaced field pair is interleaved together into one frame, which is then divided into macroblocks and coded. MPEG-2 requires interlaced video to be displayed as alternating top and bottom fields. Within a frame, however, either the top or the bottom field can be temporally coded first and sent as the first field picture of the frame, according to a predefined setting. Because the order of the decoded data differs between the two picture structures, video data in these two formats is read and written in two different orders.
[0046]
FIG. 4 illustrates a preferred embodiment of the present invention. As shown, the video memory 410 is partitioned into a number of memory segments. Each memory segment can store eight consecutive lines of video data of one of the two fields. The memory manager creates and maintains two lists, the MS (memory segment) list 420 and the DS (display segment) list 430. Each memory segment is indirectly addressed by an entry in the DS list 430 via an entry in the MS list 420. The MS list 420 has as many entries as there are video memory segments, and each entry in the MS list corresponds to one memory segment. In the preferred embodiment, MS list 420 has 48 entries for addressing 48 memory segments in video memory 410. In the PAL case, 48 memory segments of video memory correspond to 2/3 of a frame. Each entry in the MS list 420 includes a 20-bit address that defines the first cell address of a memory segment in the video memory 410. These addresses may be static or dynamic, depending on the system design. For example, in a dedicated system designed specifically for decoding and displaying MPEG-2 video data, the addresses are preferably static rather than dynamic in order to reduce system overhead. In a multi-purpose digital decoding and display system, however, such as a system capable of displaying both MPEG-1 and MPEG-2 video data, these addresses are preferably maintained dynamically so that various video memory management methods can be accommodated.
[0047]
In the preferred embodiment shown in FIG. 4, DS list 430 maps video memory 410 to the display screen. The display screen is logically divided into a number of segments, and each entry in the DS list 430 corresponds to one segment of the display screen. In the preferred embodiment, the DS list 430 includes 72 entries, one for each display segment. The number 72 was chosen because the PAL format displays 576 lines and each segment contains 8 lines of video data, so that 576/8 = 72.
[0048]
As shown in FIGS. 4 and 5, each entry in the MS list 420 includes one additional bit 510 (the "busy bit") that indicates whether a particular memory segment of video memory is busy. Specifically, a memory segment that has been written to but not yet displayed is marked busy. Conversely, a non-busy memory segment is one whose contents have already been displayed and to which no new data has been written. The use of the busy bit 510 is very important to the present invention, because toggling the busy bit 510 of each entry in the MS list 420 can be characterized as defining a logical sliding window in the video memory 410. With the busy bit 510 set "on" and "off" as data is written to and read from the video memory segments, the logical sliding window is defined as all memory segments whose corresponding entries in the MS list have the busy bit "on". In addition, the logical sliding window is maintained by incrementing and updating the two DS write pointers (i.e., 710 and 720, or 730 and 740) and the read pointer (i.e., 750 or 760), as shown in the drawings.
[0049]
Even though the physical locations of the memory segments in the video memory 410 and of the entries in the MS list 420 are not necessarily assigned consecutively in the way the entries in the DS list 430 are, the logical sliding window is defined by contiguous entries in the DS list 430. The logical sliding window grows as more decoded data is stored in memory segments of the video memory 410 and shrinks as decoded data is read from memory segments of the video memory 410. Furthermore, as video data is read from and written to video memory, the logical sliding window shifts in position (i.e., moves down the DS list 430). The key concept of the present invention is that a logically contiguous memory space can be defined that corresponds to the portion of the display screen that has been written to video memory but not yet displayed, even though the corresponding decoded video data may be stored at arbitrary locations in physical video memory. An advantage of the present invention is that the physical video memory can be made smaller than a full picture. In theory, the physical memory can be as small as half the size required to store a full picture (i.e., half a frame).
[0050]
FIG. 4 illustrates the indirect addressing scheme used by the preferred embodiment of the present invention. Each entry in the DS list 430 stores an index that points to one entry in the MS list 420, whereas each entry in the MS list 420 stores the address of a memory segment in the video memory 410.
[0051]
In a preferred embodiment, the video decoding and display system of the present invention handles interlaced pictures in both frame picture format and field picture format. Accordingly, as shown in FIG. 6, the DS list 430 is logically divided into two parts: a top portion 610 corresponding to the top field and a bottom portion 620 corresponding to the bottom field. Entries indexed from 0 to 35 belong to the top field, and entries indexed from 36 to 71 belong to the bottom field. To map the DS list 430 onto the MS list 420, each entry in the DS list 430 contains a 6-bit index (0 to 47) into the MS list 420. Thus, each entry in the DS list 430 refers to an entry in the MS list 420 and indirectly addresses one segment of video memory (see FIG. 4).
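By way of illustration only, the two lists and the indirect addressing between them may be sketched as follows, using the PAL figures given in the text. The names `MSEntry`, `make_lists`, `segment_address` and `field_of` are hypothetical, as is the spacing between segment addresses, which the specification does not give.

```python
from dataclasses import dataclass

LINES_PER_SEGMENT = 8
DISPLAY_LINES = 576                               # PAL
NUM_DS = DISPLAY_LINES // LINES_PER_SEGMENT       # 72 display segments
NUM_MS = 48                                       # 48 memory segments (2/3 of a frame)
BOTTOM_BASE = NUM_DS // 2                         # DS entries 36..71 are the bottom field


@dataclass
class MSEntry:
    address: int        # first cell address of the memory segment (fits in 20 bits)
    busy: bool = False  # True once written and not yet displayed


def make_lists(segment_stride=0x1000):
    """Build the MS and DS lists; segment_stride is a hypothetical address spacing."""
    ms_list = [MSEntry(address=i * segment_stride) for i in range(NUM_MS)]
    ds_list = [0] * NUM_DS  # each DS entry holds an index (0..47) into the MS list
    return ms_list, ds_list


def segment_address(ds_list, ms_list, ds_index):
    """Resolve a display segment to a memory address via the MS list."""
    return ms_list[ds_list[ds_index]].address


def field_of(ds_index):
    """Return the field to which a DS entry belongs."""
    return "top" if ds_index < BOTTOM_BASE else "bottom"
```

Because display segments refer to memory segments only through an index, any free memory segment can back any display segment, which is what allows the physical memory to be smaller than a full frame.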
[0052]
As shown in FIG. 7, in the preferred embodiment, three pointers are maintained for the DS list 430. That is, one read pointer (ie, 750 or 760) for reading and displaying, and two write pointers (ie, 710, 720 or 730, 740) for writing video memory.
[0053]
For field pictures and frame pictures, these three DS pointers are assigned to different positions.
[0054]
FIG. 7 (a) shows the positions of these three pointers when the video system is handling frame pictures. To handle a bitstream of frame picture video data, in which the data of the two fields is interleaved, the two write pointers 710, 720 are set to point to two DS entries that are 36 entries apart, so that the two write pointers point to entries at the same offset within different fields (i.e., the top field and the bottom field). Further, the read pointer 750 points to the entry of the DS list 430 to be displayed next. To accommodate both the top and bottom fields of the frame picture, the logical sliding window further comprises a logical top window and a logical bottom window. The logical top window defines a sliding window within the top field and is indicated by consecutive entries in the top portion of the DS list that indirectly address memory segments in the video memory that have been written but not yet displayed. Similarly, the logical bottom window defines a sliding window within the bottom field and is indicated by consecutive entries in the bottom portion of the DS list that indirectly address memory segments in video memory that have been written but not yet displayed.
[0055]
FIG. 7B shows the positions of these pointers when the video decoding / display system is handling field pictures. Since the video data of the field picture is sequentially decoded and supplied by the decoder, the two write pointers 730, 740 are set to have a fixed interval of 1 between them. Further, the read pointer 760 points to an entry of the DS list 430 to be displayed next. In field pictures, the video data is decoded and supplied sequentially, and there is no need to further divide the logical sliding window into logical top and bottom windows as in frame pictures. Thus, a logical sliding window as discussed in the previous paragraph is maintained.
[0056]
FIG. 8 shows a flowchart for the present invention.
[0057]
First, during initialization of the video decoding and display system, the memory manager initializes the busy bits of all MS entries to not busy (i.e., sets the busy bit of every MS entry to logic 0) (stage 810).
[0058]
When the memory manager detects a new picture in the incoming stream of encoded video data, the memory manager initializes the DS list 430, as shown in FIG. 9 (stage 820).
[0059]
FIG. 9 shows in detail the step of initializing the DS list 430 for each picture (step 820). Initially, the memory controller checks the format of the picture in the incoming bitstream and determines whether the picture is a field picture or a frame picture by checking the picture structure parameter value (step 910).
[0060]
If the video data is in frame picture format, the memory controller then checks the top field first variable to determine whether the flag is set (step 930). If the top field is to be decoded first, the DS read pointer 750 is initialized to 36, the first DS write pointer 710 is set to 0, and the second DS write pointer 720 is set to 36 (step 960). On the other hand, if the bottom field is to be decoded first, the DS read pointer 750 is set to 0, the first DS write pointer 710 is set to 0, and the second DS write pointer 720 is set to 36 (step 970).
[0061]
As shown in FIG. 9, if the video data is in field picture format, the memory controller checks the picture structure parameter to determine whether the incoming video data is a top field picture or a bottom field picture (step 920). If the top field is to be decoded, the three DS pointers are set as follows: the DS read pointer 760 is set to 36, the first DS write pointer 730 is set to 0, and the second DS write pointer 740 is set to 1 (step 940). On the other hand, if the bottom field is to be decoded, the DS read pointer 760 is set to 0, the first DS write pointer 730 is set to 36, and the second DS write pointer 740 is set to 37 (step 950).
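By way of illustration only, the per-picture initialization of the three DS pointers may be sketched as follows. The pointer values are those stated in the text for steps 940 through 970; the function name `init_ds_pointers` and the string labels for the picture structure are hypothetical.

```python
def init_ds_pointers(picture_structure, top_field_first=True):
    """Return (read, first_write, second_write) DS pointer values for a new picture.

    picture_structure is one of "frame", "top_field" or "bottom_field"
    (hypothetical labels); top_field_first is consulted only for frame pictures.
    """
    if picture_structure == "frame":
        # Write pointers sit 36 entries apart: same offset, different fields.
        read = 36 if top_field_first else 0   # steps 960 / 970
        return read, 0, 36
    if picture_structure == "top_field":
        return 36, 0, 1                       # step 940: adjacent write pointers
    if picture_structure == "bottom_field":
        return 0, 36, 37                      # step 950
    raise ValueError(f"unknown picture structure: {picture_structure!r}")
```

In every case the difference between the two write pointers encodes the picture structure: 36 for frame pictures, where consecutive lines alternate between fields, and 1 for field pictures, where lines arrive sequentially.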
[0062]
As shown in FIG. 8, after the DS list has been initialized, the memory manager scans the MS list 420 and examines the busy bit 510 of each entry to check for available space in the video memory 410 (stage 830). The memory manager scans the MS list 420 in order from index 0 to index 47 and finds the first two entries whose busy bit 510 is not set. If there is no such "not busy" segment, or only one, decoding is suspended until the condition is met. The decoder thereby enters a wait state until two "not busy" video memory segments are available. Once the data in a video memory segment has been read and displayed, the memory manager sets the corresponding busy bit 510 of that memory segment to logic 0, thereby freeing the segment to store newly decoded video data.
[0063]
When two non-busy memory segments are found, the MS list 420 index (ie, 0, 1, 2, ... or 47) of the first available memory segment is written to the DS list 430 entry pointed to by the first DS write pointer (ie, 710 or 730), and the index of the second available memory segment is written to the DS entry pointed to by the second DS write pointer (ie, 720 or 740), so that the two DS write pointers indirectly address two “non-busy” memory segments in the video memory 410.
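The scan and assignment described above can be sketched as follows. This is a minimal illustration, not the patented controller: the function names are hypothetical, the lists stand in for the MS list busy bits and the DS list, and marking a segment busy at assignment time is an assumption that the text does not state explicitly.

```python
NUM_SEGMENTS = 48  # MS list indices 0..47, as stated in the text


def find_two_free_segments(busy):
    """Scan the MS list in index order and return the first two non-busy
    indices, or None when fewer than two segments are free."""
    free = [i for i in range(NUM_SEGMENTS) if not busy[i]]
    if len(free) < 2:
        return None  # the decoder has to wait
    return free[0], free[1]


def assign_segments(busy, ds_list, write1, write2):
    """Write the two free MS indices into the DS entries addressed by the
    two DS write pointers. Returns False when decoding must stall."""
    found = find_two_free_segments(busy)
    if found is None:
        return False
    first, second = found
    ds_list[write1] = first
    ds_list[write2] = second
    # Assumption: the segments are marked busy as soon as they are assigned.
    busy[first] = True
    busy[second] = True
    return True
```

A caller would retry `assign_segments` after the display side has released a segment.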
[0064]
After the DS entry has been updated, the decoder decodes a new macroblock row containing 16 lines of video data (stage 830).
[0065]
After the macroblock row of video data is decoded, the memory manager stores the decoded video data in video memory at step 840. FIG. 10 shows details of step 840.
[0066]
As shown in FIG. 10, if the picture is a field picture, the memory manager writes the top eight lines of video data of each macroblock row to the memory segment addressed by the MS list entry that is indirectly addressed by the first DS write pointer 730 (step 1030). The memory manager then writes the bottom eight lines of video data of each macroblock row to the memory segment addressed by the MS list entry that is indirectly addressed by the second DS write pointer 740 (step 1030).
[0067]
On the other hand, if the picture is a frame picture, after the decoder decodes 16 lines of video data of the incoming bitstream (ie, one macroblock row), the memory manager writes the eight lines of video data corresponding to the top field of the currently decoded frame to the memory segment addressed by the first MS list entry, which is indirectly addressed by the first DS write pointer 710. The memory manager then writes the eight lines of video data corresponding to the bottom field of the currently decoded frame to the memory segment addressed by the second MS list entry, which is indirectly addressed by the second DS write pointer 720 (step 1020).
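How one macroblock row is divided between the two segments can be sketched as below. The function name is hypothetical. For a frame picture the sketch assumes the usual interlaced convention that the even-numbered lines of the frame (starting from line 0) belong to the top field and the odd-numbered lines to the bottom field.

```python
def split_macroblock_row(lines, frame_picture):
    """Split the 16 decoded lines of one macroblock row into the two
    8-line groups written through the first and second DS write pointers.

    lines: sequence of 16 lines, in decoding order.
    Returns (group_for_first_pointer, group_for_second_pointer).
    """
    if len(lines) != 16:
        raise ValueError("a macroblock row holds 16 lines")
    if frame_picture:
        # Frame picture: top field lines and bottom field lines alternate.
        return list(lines[0::2]), list(lines[1::2])
    # Field picture: top eight lines, then bottom eight lines.
    return list(lines[:8]), list(lines[8:])
```

Each returned group fills exactly one 8-line memory segment.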
[0068]
As shown in FIG. 8, following the writing of the video data to the corresponding memory segments, the DS pointers (ie, 710, 720, and 750 or 730, 740, and 760) are then updated in a separate stage (stage 860).
[0069]
FIG. 11 shows details of the update operation (stage 860) of the DS pointers (ie, 710, 720, and 750 or 730, 740, and 760). The memory manager first checks whether the video data is in field picture format or frame picture format (stage 1110), and then updates the two DS write pointers accordingly (stages 1120 and 1130).
[0070]
If the video data is a frame picture format, both of the two DS write pointers 710 and 720 are advanced by 1 (step 1120). On the other hand, if the video data is in the field picture format, both of the two DS write pointers 730 and 740 are advanced by 2 (step 1130).
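The update rule is small enough to state as code. This is a sketch with a hypothetical function name. A plausible reading, not stated in the text, is that the step of 2 in field picture format follows from the two write pointers sitting on adjacent DS entries, so each macroblock row consumes two consecutive entries.

```python
def advance_write_pointers(write1, write2, frame_picture):
    """Advance both DS write pointers after one macroblock row is stored.

    Frame picture format: step 1 (stage 1120).
    Field picture format: step 2 (stage 1130).
    """
    step = 1 if frame_picture else 2
    return write1 + step, write2 + step
```

Starting from the frame picture setting (0, 36), successive calls walk the first pointer through entries 0, 1, 2, ... and the second through 36, 37, 38, ...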
[0071]
After the DS pointer is updated, the memory manager checks whether there is more incoming video data for the current picture. If there is still an incoming video bitstream, the operation returns to the MS list scanning stage as shown in FIG. 8 (stage 870).
[0072]
If the incoming bitstream for the current picture ends, the video system then checks whether there are more pictures to be decoded and displayed (step 880). If there are more incoming pictures for decoding and display, the overall operation returns to step 820. The entire operation is completed when there are no more incoming pictures for display (step 890).
[0073]
When the system needs additional video data for display, the next 8 lines of data are retrieved from a memory segment in the video memory 410 using the MS index addressed by the DS read pointer (ie, 750 or 760). The address stored in the MS list entry is used to locate the next memory segment to be displayed from the video memory 410. After a particular memory segment is read for display, the busy bit 510 of that MS entry 420 is set to non-busy and the DS read pointer is then incremented by 1 (modulo 72). That is, after the pointer reaches 71, the DS read pointer is reset to the “0” position.
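The display-side read can be sketched as follows. The function and parameter names are hypothetical, and plain Python lists stand in for the DS list, the addresses held in the MS list, and the busy bits.

```python
DS_SIZE = 72  # the DS read pointer wraps modulo 72


def read_for_display(ds_list, ms_addresses, busy, read_ptr):
    """Fetch the address of the next 8-line segment to display.

    Returns (segment_address, next_read_ptr). The segment is released
    (busy bit cleared) once it has been handed to the display.
    """
    ms_index = ds_list[read_ptr]       # DS entry holds an MS list index
    address = ms_addresses[ms_index]   # MS entry holds the segment address
    busy[ms_index] = False             # segment may now be reused
    return address, (read_ptr + 1) % DS_SIZE
```

The two lookups per read are the “indirect addressing” cost that the wider DS entry described in the next paragraph avoids.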
[0074]
In one embodiment of the invention, if the memory controller is not fast enough to handle the “indirect addressing” scheme described above, the DS list 430 entry is increased from 6 bits to 26 bits. Thereby, when the address of the memory segment is selected in the MS entry, it is copied to the DS entry along with the MS list 420 index. By writing the address of the memory segment to the DS list along with the MS list 420 entry, an extra step of addressing and retrieving data from the MS list 420 can be avoided.
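A possible layout for such a widened DS entry is sketched below. The text only gives the total widths (6 bits and 26 bits); that the extra 20 bits hold the segment address, and that the address occupies the upper bits, are assumptions made for illustration, as are the function names.

```python
INDEX_BITS = 6                    # MS list index, 0..47
ADDRESS_BITS = 26 - INDEX_BITS    # assumption: remaining 20 bits hold the address


def pack_ds_entry(ms_index, address):
    """Pack an MS index and a segment address into one 26-bit DS entry."""
    if not 0 <= ms_index < 48:
        raise ValueError("MS index out of range")
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in the entry")
    return (address << INDEX_BITS) | ms_index


def unpack_ds_entry(entry):
    """Return (ms_index, address) from a packed DS entry."""
    return entry & ((1 << INDEX_BITS) - 1), entry >> INDEX_BITS
```

With this layout the display side obtains the address from the DS entry alone, without a second lookup in the MS list.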
[0075]
3. FLEXIRAM design
Another aspect of the present invention is directed to an intra-frame video memory compression / decompression method.
[0076]
In the present invention, the video data is compressed, either lossily or losslessly, before being stored in the video memory, and the compressed data is decompressed when it is read from the video memory. This video data compression / decompression apparatus is named “FlexiRam” by the present inventors.
[0077]
FIG. 12 shows three different embodiments of the present invention.
[0078]
In a preferred embodiment as shown in FIG. 12 (a), the FlexiRam 1220a of the present invention is incorporated into the memory manager 1210a, and all compression and decompression of video data is performed within the memory manager 1210a. The advantage of this embodiment is that there are no special hardware requirements for video memory 1230a and that both compression and decompression of video data are transparent to video memory 1230a.
[0079]
FIG. 12 (b) illustrates another preferred embodiment of the present invention, where FlexiRam 1220b is separate from both memory manager 1210b and video memory 1230b. An advantage of this embodiment is that the present invention can be implemented without substantial modification of the entire video decoding and display system.
[0080]
FIG. 12 (c) shows another embodiment of the present invention, where the FlexiRam 1220c is incorporated into the video memory 1230c. The advantage of this embodiment is that the entire data compression and decompression operation is transparent to the memory manager 1210c and does not add any overhead to the memory manager 1210c.
[0081]
It should be noted that the present invention works with all picture types (I, P and B) and is not limited to any type of picture. Thus, if desired, the present invention can be implemented in a split memory manager design using (a) the second memory manager 240 or (b) both the first memory manager 230 and the second memory manager 240 (see FIG. 2).
[0082]
Figures 13a, 13b and 13c illustrate three different embodiments of the present invention combining a FlexiRam design and a split memory manager design. FIG. 13a shows one of two preferred embodiments of the present invention. In this embodiment, two FlexiRam compressor / decompressors are used. The first FlexiRam 1370a is placed between the first memory manager 1310a and the first memory 1330a. The second FlexiRam 1380a is placed between the second memory manager 1320a and the second memory 1340a, and all stored video memory data is compressed and decompressed using the FlexiRam technology. FIG. 13b shows another embodiment of the present invention. FlexiRam 1370b is placed between the second memory manager 1320b and the second memory 1340b, and only the video data for the bi-directional frames (ie, B) is compressed when stored in the video memory. Finally, FIG. 13c shows yet another embodiment of the present invention. The FlexiRam 1370c is placed between the first memory manager 1310c and the first memory 1330c, and only the video data for the prediction frames (ie, 2I; 1I and 1P; or 2P) is compressed when stored in the video memory. It should be noted that the present invention does not limit the number of memory managers to two or three. The memory manager may be divided into a larger number (ie, more than 3) of separate memory managers to handle various types and / or groups of video data.
[0083]
The most important factors in choosing which embodiment to implement for a particular system are the amount of video memory available, the available processing speed, the desired image quality, and so on.
[0084]
For example, in addition to the embodiments described above, the FlexiRam design can be applied to compression and decompression of all three color components or only two chroma color components.
[0085]
The design can be further improved by combining all the embodiments and aspects disclosed above. For example, in one embodiment, only the U and V color components in a B picture are compressed with the FlexiRam design. Alternatively, in another embodiment, all three color components for both picture groups (ie, reference frames and bi-directional frames) are compressed by the FlexiRam design.
[0086]
The flexibility of the present invention in combining the split memory manager, FlexiRam, and color component separation gives system designers considerable design freedom, allowing them to choose the most appropriate and effective combination according to the various video system requirements, the video memory size, the desired image quality, and so on.
[0087]
FIG. 14 outlines a preferred embodiment of the FlexiRam design. As shown in the figure, data compression consists of two stages. That is, (1) error diffusion (stage 1410) and (2) truncation of one pixel per quartet (stage 1420).
[0088]
Initially, the video data is compressed according to an error diffusion algorithm (stage 1410). In the preferred embodiment, a basic compression unit is defined as a “quartet” that includes four consecutive horizontal pixels. The error diffusion compression stage (stage 1410) compresses each pixel of the quartet individually. By using the 1-bit error diffusion algorithm, the quartet of pixels originally having 32 bits (4 pixels * 8 bits / pixel) is compressed to 28 bits (4 pixels * 7 bits / pixel). It should be noted that the present invention is not limited to a 1-bit error diffusion algorithm and that a multi-bit error diffusion algorithm can be implemented as well, but with a perceptible image quality degradation tradeoff.
[0089]
The second stage of video data compression is “1 pixel truncation per quartet” (stage 1420). The one-pixel truncation algorithm per quartet compresses a quartet of four 7-bit pixels into a 24-bit quartet. The data compressor calculates the best pixel of the quartet to be truncated and stores the reconstruction method in the last 3 bits of the quartet as a reconstruction descriptor (“RD”). After this compression stage, the compressed quartet of pixels has 24 bits (3 pixels * 7 bits / pixel + 3 bits RD). A detailed algorithm for truncating one pixel per quartet will be described later.
[0090]
When video data is needed from the video memory, FlexiRam performs decompression on the compressed data read from the memory. The decompression stage is the reverse of the compression stage shown in FIG.
[0091]
Initially, one pixel is added back to the 24-bit quartet by the 1-pixel reconstruction algorithm per quartet, which is the inverse of the 1-pixel truncation algorithm per quartet (stage 1430). This restoration phase reconstructs the 7-bit truncated pixel according to the reconstruction method supplied by the RD stored as the last 3 bits of 24 bits representing the quartet. After this restoration stage (stage 1430), the quartet is reshaped and has four 7-bit pixels. A preferred reconstruction algorithm will be described in detail later.
[0092]
Next, a “0” is concatenated to each 7-bit pixel, restoring the quartet to four 8-bit pixels as before compression (step 1440). This restoration step (step 1440) reconstructs four 8-bit pixels by concatenating one “0” as the least significant bit of every pixel.
[0093]
FIG. 15 shows the data format of the quartet of video data (ie, four 8-bit pixels) during various processes in the compression / decompression stage.
[0094]
Initially, each 8-bit pixel of quartet 1510 is individually compressed using error diffusion algorithm 1520. The intermediate compressed data consists of four 7-bit pixels 1530. These four 7-bit pixels 1530 are then further compressed by a 1 pixel truncation algorithm 1540 per quartet. The final compressed data consists of three 7-bit pixels 1551 and one 3-bit RD 1552. The total length of the compressed data is 24 bits.
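The 24-bit format can be illustrated with a short packing sketch. The text fixes only that the RD occupies the last 3 bits; the order of the three retained pixels within the upper 21 bits, and the function names, are assumptions made for illustration.

```python
def pack_quartet(kept_pixels, rd):
    """Pack three 7-bit pixels and a 3-bit RD into one 24-bit word.

    kept_pixels: the three retained 7-bit pixels, in left-to-right order.
    rd: 3-bit reconstruction descriptor, stored in the last 3 bits.
    """
    a, b, c = kept_pixels
    for value in (a, b, c):
        if not 0 <= value < 128:
            raise ValueError("pixels must be 7-bit values")
    if not 0 <= rd < 8:
        raise ValueError("RD must be a 3-bit value")
    return (a << 17) | (b << 10) | (c << 3) | rd


def unpack_quartet(word):
    """Return ([a, b, c], rd) from a 24-bit compressed quartet."""
    pixels = [(word >> 17) & 0x7F, (word >> 10) & 0x7F, (word >> 3) & 0x7F]
    return pixels, word & 0x7
```

Three such words occupy the space of two uncompressed 32-bit quartets plus one byte, which is where the 25% saving comes from.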
[0095]
When video data is needed from the video memory, the compressed data is decompressed in two successive stages:
[0096]
Initially, one additional pixel is generated according to a one pixel reconstruction algorithm 1560 per quartet. This algorithm is the inverse of the 1 pixel truncation algorithm 1540 and is based on the 3-bit RD 1552 of the 24-bit compressed data 1550. Intermediate restoration data 1570 consists of four 7-bit pixels. Next, each of the four 7-bit pixels is concatenated with “0” as the least significant bit 1580 to recreate the four 8-bit pixels before compression.
[0097]
Details of the error diffusion and truncation of one pixel per quartet algorithm are described below.
[0098]
A. Error diffusion
It should be noted that various error diffusion algorithms can be used for this compression stage. The preferred embodiment disclosed below utilizes a 1-bit error diffusion algorithm. However, the present invention is not limited to any particular algorithm used for error diffusion processing. Further, as discussed previously, if more image quality degradation is allowed, multiple bits (greater than 1) can be removed by an error diffusion algorithm. Accordingly, the following specific 1-bit error diffusion algorithms are disclosed for illustrative purposes only.
[0099]
In this preferred embodiment of the present invention, the error diffusion (ED) algorithm operates independently on every line in the picture. At the beginning of the line, a 1-bit register labeled “e” is set to 1. The register “e” stores the current running error and is updated on a pixel basis. The 1-bit ED algorithm is described by the following two equations.
$$I_{out}(j) = 2 \cdot \mathrm{floor}\left[\left(I_{in}(j) + e(j)\right) / 2\right]$$
except that if $I_{in}(j) = 255$ and $e(j) = 1$, then $I_{out}(j) = 255$
$$e(j+1) = I_{in}(j) + e(j) - I_{out}(j)$$
where
$j$: index of the current pixel in the row
$I_{in}(j)$: original value of the pixel
$I_{out}(j)$: new value of the pixel (after ED)
$e(j)$: error accumulator at the time of pixel $j$
$\mathrm{floor}(x)$: closest integer less than or equal to $x$
[0100]
After $I_{out}(j)$ is calculated, the least significant bit of $I_{out}(j)$ is truncated. In this way the algorithm quantizes 8-bit pixels to 7-bit pixels while removing false contour effects.
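The 1-bit error diffusion of one line can be sketched as follows. The function name is hypothetical. The error update uses the difference I_in(j) + e(j) - I_out(j), consistent with the worked example in this section, and the register starts at 1 for every line as described above.

```python
def error_diffuse_line(pixels):
    """Quantize one line of 8-bit pixels to 7-bit pixels with 1-bit
    error diffusion. Returns the list of stored 7-bit values."""
    e = 1  # 1-bit error register, set to 1 at the beginning of the line
    stored = []
    for p in pixels:
        if p == 255 and e == 1:
            i_out = 255                 # special case: avoids overflow to 256
        else:
            i_out = 2 * ((p + e) // 2)  # round (p + e) down to an even value
        e = p + e - i_out               # running error, always 0 or 1
        stored.append(i_out >> 1)       # drop the least significant bit
    return stored
```

On a flat run of the odd value 151 the stored pixels alternate between 76 and 75, so the restored values alternate between 152 and 150 and average back to 151; this dithering is what suppresses contouring.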
[0101]
This algorithm can be illustrated by the following example. Assume that $I_{in}(j)$ is 151 (binary 10010111) and $e(j)$ is 1. Then, in binary,
$$\begin{aligned}
I_{out}(j) &= 2 \cdot \mathrm{floor}\left[(10010111 + 1) / 2\right] \\
&= 2 \cdot \mathrm{floor}\left[10011000 / 2\right] \\
&= 2 \cdot 01001100 \\
&= 10011000 \\
e(j+1) &= 10010111 + 1 - 10011000 \\
&= 0
\end{aligned}$$
[0102]
By truncating the least significant bit, the stored quantized pixel is 1001100.
[0103]
The “e” value is then propagated to the end of the line. Since the ED algorithm of the present invention operates on the raster structure while the picture is written into the memory in a block structure, it is necessary to store e (j) at the end of each block line. This value is used when the ED continues with the next pixel in the same line (which belongs to the next block). Therefore, an 8-bit latch must be used for the “interrupted” lines of a block. (Because e can only be 0 or 1, 1 bit is sufficient for every interrupted line.)
[0104]
The decompression stage 1580 is the reverse of the compression stage described above. In the preferred embodiment, a “0” is concatenated to every pixel to recreate four 8-bit pixels.
[0105]
Each 7-bit pixel is then concatenated with “0” as the least significant bit. For example, “1101101” is reformed to “11011010”. Thus, the resulting data is four 8-bit pixels, similar to the original uncompressed data.
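The bit concatenation step amounts to a left shift by one position. A minimal sketch, with a hypothetical function name:

```python
def restore_pixels(stored):
    """Rebuild 8-bit pixels from 7-bit stored pixels by appending a 0
    as the least significant bit."""
    return [p << 1 for p in stored]
```

For instance, the stored value 1101101 becomes 11011010, matching the example in the text.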
[0106]
After the data is compressed using the 1-bit error diffusion algorithm, 4 bits have been removed, so that a quartet of four 8-bit pixels is compressed into four 7-bit pixels. A higher compression ratio could be achieved by executing a 2-bit error diffusion algorithm, which compresses a quartet of four 8-bit pixels into four 6-bit pixels. The compression ratio when using the 2-bit error diffusion algorithm is 24/32 × 100% (ie, 75%). However, perceptible image quality degradation is found after compression and decompression using the 2-bit error diffusion technique. Thus, instead of using a 2-bit error diffusion technique, the present invention discloses a two-stage compression / decompression process: (1) 1-bit error diffusion, and (2) truncation of one pixel per quartet. Even though the two techniques have the same compression ratio (ie, 75%), the two-stage compression / decompression process disclosed in the present invention yields a perceptible improvement in image quality over the 2-bit error diffusion technique.
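The bit budgets compared in this paragraph can be checked directly; each size is relative to the original 32-bit quartet (4 pixels × 8 bits):

$$\text{1-bit ED alone: } \frac{4 \times 7}{32} = \frac{28}{32} = 87.5\%$$

$$\text{2-bit ED alone: } \frac{4 \times 6}{32} = \frac{24}{32} = 75\%$$

$$\text{1-bit ED + truncation of one pixel: } \frac{3 \times 7 + 3}{32} = \frac{24}{32} = 75\%$$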
[0107]
B. Truncation of one pixel per quartet
The second compression stage of the preferred embodiment of the present invention is truncation of one pixel per quartet. The one pixel truncation algorithm 1540 compresses four 7-bit pixels into three 7-bit pixels plus a 3-bit reconstruction descriptor (“RD”) (ie, a total of 24 bits).
[0108]
It should be noted that various pixel truncation algorithms can be used to reduce the number of pixels stored in the video memory. The present invention is not limited to any particular pixel truncation algorithm and equation used during the compression and decompression process. Furthermore, multiple pixels (greater than 1) can be truncated at a trade-off with image quality degradation. The following truncation of one pixel per quartet is disclosed as a preferred embodiment of the present invention with the best balance of memory savings and little perceptible impact on image quality.
[0109]
In the preferred embodiment, the four pixels of the quartet are named consecutively P0, P1, P2, and P3. Pixels P1 and P2 are the two candidates for truncation. The algorithm determines (1) which pixel is more suitable for truncation, and (2) which reconstruction method should be used during reconstruction to estimate the truncated pixel. There are five possible reconstruction methods for each candidate. That is,
1. Copy the left neighbor.
2. Copy the right neighbor.
3. Average the left neighbor and the right neighbor.
4. Add 1/4 of the left neighbor and 3/4 of the right neighbor.
5. Add 3/4 of the left neighbor and 1/4 of the right neighbor.
[0110]
Theoretically, truncating P1 and estimating it with a copy of P2 is equivalent to truncating P2 and estimating it with a copy of P1. Thus, for the two candidates (ie, P1 and P2), there are only 9 (not 10) possible estimators. Further, after truncating one pixel from the quartet, 21 bits are needed to represent the remaining three pixels. In the preferred embodiment, the final compressed data target size for each quartet is 24 bits (25% compression). This means that only 3 bits can be assigned to describe the selected estimator, so in practice only 8 estimators can be considered. The estimator omitted from consideration is the one that reconstructs P2 from 3/4 of the left neighbor and 1/4 of the right neighbor. The reason for omitting this particular estimator is that the statistics of MPEG pictures show that it is the least likely to be chosen as the best. (Among the four quarter / three-quarter estimators, the choice is arbitrary.)
[0111]
Initially, the eight estimators are calculated ($\hat{P}_i^k$ denotes the estimate of $P_i$ using method $k$):
$$\hat{P}_1^0 = P_0$$
$$\hat{P}_1^1 = P_2$$
$$\hat{P}_1^2 = 0.5 P_0 + 0.5 P_2$$
$$\hat{P}_1^3 = 0.25 P_0 + 0.75 P_2$$
$$\hat{P}_1^4 = 0.75 P_0 + 0.25 P_2$$
$$\hat{P}_2^1 = P_3$$
$$\hat{P}_2^2 = 0.5 P_1 + 0.5 P_3$$
$$\hat{P}_2^3 = 0.25 P_1 + 0.75 P_3$$
[0112]
For every estimator, the absolute value of the estimation error is $E_i^k = |\hat{P}_i^k - P_i|$. The minimum error is $\min_{i,k}\{E_i^k\}$. If more than one estimator attains the minimum error, the selection is arbitrary (the decision is left to the implementation). Finally, according to the minimum error, the pixel $P_{i_{min}}$ is truncated and the 3-bit reconstruction descriptor (RD) is set to indicate the selected estimator $\hat{P}_{i_{min}}^{k_{min}}$. The 3-bit RD is concatenated with the remaining three 7-bit pixels, and together they form a 24-bit compressed quartet. The concatenation method is left to the implementation.
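The selection of the pixel to truncate can be sketched as below. The table holds the eight estimators as (truncated pixel index, weight of left neighbor, weight of right neighbor), with the weights in quarters so that only integer arithmetic is needed. The numbering of the RD codes (here simply the position in the table), the tie-breaking rule (first minimum wins), and the function name are assumptions; the text leaves these to the implementation.

```python
# (index of truncated pixel, left weight, right weight), weights in quarters
ESTIMATORS = [
    (1, 4, 0),  # P1 <- P0
    (1, 0, 4),  # P1 <- P2
    (1, 2, 2),  # P1 <- 0.5 P0 + 0.5 P2
    (1, 1, 3),  # P1 <- 0.25 P0 + 0.75 P2
    (1, 3, 1),  # P1 <- 0.75 P0 + 0.25 P2
    (2, 0, 4),  # P2 <- P3
    (2, 2, 2),  # P2 <- 0.5 P1 + 0.5 P3
    (2, 1, 3),  # P2 <- 0.25 P1 + 0.75 P3
]


def select_estimator(quartet):
    """Choose the estimator with the smallest absolute error.

    quartet: [P0, P1, P2, P3] as 7-bit values.
    Returns (rd, kept_pixels), where kept_pixels are the three retained pixels.
    """
    best_rd, best_err = 0, None
    for rd, (i, w_left, w_right) in enumerate(ESTIMATORS):
        # Error scaled by 4 so the comparison stays in integers.
        err = abs(w_left * quartet[i - 1] + w_right * quartet[i + 1] - 4 * quartet[i])
        if best_err is None or err < best_err:
            best_rd, best_err = rd, err
    truncated = ESTIMATORS[best_rd][0]
    kept = [p for idx, p in enumerate(quartet) if idx != truncated]
    return best_rd, kept
```

For the quartet [10, 20, 30, 90], pixel P1 is exactly the average of its neighbors, so the averaging estimator for P1 is chosen and P0, P2 and P3 are kept.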
[0113]
The decompression of compressed data is simply the reverse of the compression algorithm. The missing truncated 7-bit pixel is reconstructed using the RD stored in the last 3 bits, which indicates (1) which pixel (ie, P1 or P2) to reconstruct, and (2) which reconstruction method to use for the reconstruction.
[0114]
As shown in FIG. 15, the data decompressor obtains the reconstruction method provided by the 3 bits of the RD. Next, the pixel to be reconstructed (P1 or P2) is determined, and pixel reconstruction is performed according to the corresponding one of the above eight equations.
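Reconstruction from the RD can be sketched in the same style. As before, the RD numbering (position in the table), the rounding of fractional estimates to the nearest integer, and the function name are assumptions made for illustration; the text does not fix them.

```python
# (index of truncated pixel, left weight, right weight), weights in quarters;
# the RD value is assumed to be the position in this table.
ESTIMATORS = [
    (1, 4, 0), (1, 0, 4), (1, 2, 2), (1, 1, 3), (1, 3, 1),
    (2, 0, 4), (2, 2, 2), (2, 1, 3),
]


def reconstruct_quartet(kept_pixels, rd):
    """Rebuild [P0, P1, P2, P3] from three kept 7-bit pixels and the RD."""
    i, w_left, w_right = ESTIMATORS[rd]
    # After truncating pixel i, its left neighbor sits at position i - 1
    # and its right neighbor at position i of the kept list.
    left, right = kept_pixels[i - 1], kept_pixels[i]
    estimate = (w_left * left + w_right * right + 2) // 4  # round to nearest
    return kept_pixels[:i] + [estimate] + kept_pixels[i:]
```

The result is again a quartet of four 7-bit pixels, ready for the bit concatenation step.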
[0115]
It should be pointed out that the present invention is not limited to any particular method and equation used in the video data compression stage of (1) error diffusion and (2) pixel truncation. An important feature of the FlexiRam design disclosed as the present invention is the specific sequence of these two data compression / decompression stages. That is, the video data is first compressed using an error diffusion algorithm with a reduced number of bits per pixel. The intermediate compressed video data is then further compressed using a pixel truncation algorithm that truncates one pixel from a particular group of pixels. As discussed previously, the decompression process is just the opposite of the compression process.
[0116]
While the invention has been described in conjunction with the preferred embodiments described above, it should be understood that the description and examples are intended to be illustrative and not to limit the scope of the invention, which is defined by the appended claims.
[Brief description of the drawings]
FIG. 1 shows a conventional video decoding / display system.
FIG. 2 illustrates a preferred embodiment of the split memory manager design of the present invention.
FIG. 3 illustrates another preferred embodiment of the split memory manager design of the present invention.
FIG. 4 illustrates a method of indirectly addressing video memory using a display segment list and a memory segment list.
FIG. 5 shows details of a memory segment list.
FIG. 6 shows details of a display segment list.
FIG. 7 shows the positioning of two write and read pointers on the display segment list for (a) frame picture format and (b) field picture format.
FIG. 8 is a block diagram showing details of a segmented reusable video memory design (named rolling memory design by the inventor).
FIG. 9 shows details of initialization of a display segment list for each picture.
FIG. 10 shows the writing of the decoded video data of each macroblock row to the video memory.
FIG. 11 shows details of updating the display segment list.
FIG. 12 shows three preferred embodiments of the FlexiRam design.
FIG. 13a shows another three preferred embodiments of the FlexiRam design.
FIG. 13b shows another three preferred embodiments of the FlexiRam design.
FIG. 13c shows another three preferred embodiments of the FlexiRam design.
FIG. 14 is a block diagram showing stages of video data compression / decompression.
FIG. 15 shows the quartet video data format during various stages of compression and decompression.

Claims

A system for decoding and displaying a bit stream of video data representing an encoded video image, the system comprising:
a decoder for decoding the bit stream of the video data;
a plurality of video memory managers coupled to the decoder, wherein each of the plurality of video memory managers handles a different group of the decoded video data, the groups comprising reference frame data and bidirectional frame data, using a memory management policy customized for each video data group, the plurality of video memory managers including a first video memory manager for handling the reference frame data of the decoded video data and a second video memory manager for handling the bidirectional frame data of the decoded video data;
a corresponding plurality of video memories, each coupled to a corresponding one of the plurality of video memory managers so that different memory management is performed for each group of the decoded video data, the corresponding plurality of video memories comprising a first video memory, coupled to the first video memory manager, for storing the reference frame data, and a second video memory, coupled to the second video memory manager, for storing the bidirectional frame data, wherein the storage capacity of the first video memory is sufficient to store decoded video image information for two complete frames, and the storage capacity of the second video memory is less than the storage capacity sufficient to store decoded video image information for one complete frame; and
a display subsystem, coupled to the plurality of video memory managers, for displaying the video image.

The system of claim 1, wherein
a top field first signal and a repeat top field signal are supplied to the first video memory manager.

The system of claim 1, wherein
a top field first signal is supplied to the second video memory manager.

The system of claim 1, wherein
a top field first signal and a repeat top field signal are supplied to the first video memory manager, and the top field first signal is supplied to the second video memory manager.

The system of claim 1, further comprising:
a data compressor, coupled to the second video memory manager and the second video memory, for compressing the bidirectional frame data to be stored in the second video memory; and
a data decompressor, coupled to the second video memory manager and the second video memory, for decompressing the compressed bidirectional frame data read from the second video memory.

The system of claim 5, wherein
the data compressor comprises an error diffusion mechanism and a pixel truncation mechanism for compressing the video data, and the data decompressor comprises a pixel reconstruction mechanism and a bit concatenation mechanism for decompressing the compressed video data stored in the second video memory.

The system of claim 1, further comprising:
a data compressor, coupled to the first video memory manager and the first video memory, for compressing the reference frame data to be stored in the first video memory; and
a data decompressor, coupled to the first video memory manager and the first video memory, for decompressing the compressed reference frame data read from the first video memory.

The system of claim 1, further comprising:
a first data compressor, coupled to the first video memory manager and the first video memory, for compressing the reference frame data to be stored in the first video memory;
a first data decompressor, coupled to the first video memory manager and the first video memory, for decompressing the compressed reference frame data read from the first video memory;
a second data compressor, coupled to the second video memory manager and the second video memory, for compressing the bidirectional frame data to be stored in the second video memory; and
a second data decompressor, coupled to the second video memory manager and the second video memory, for decompressing the compressed bidirectional frame data read from the second video memory.

The system of claim 7, wherein
the data compressor comprises an error diffusion mechanism and a pixel truncation mechanism for compressing the video data, and the data decompressor comprises a pixel reconstruction mechanism and a bit concatenation mechanism for decompressing the compressed video data stored in the first video memory.

The system of claim 1, further comprising:
at least one data compressor coupled to at least one video memory manager of the plurality of video memory managers and to the corresponding coupled video memory, wherein the at least one data compressor compresses the decoded video data to be stored in the corresponding coupled video memory; and
at least one data decompressor coupled to the at least one video memory manager of the plurality of video memory managers and to the corresponding coupled video memory, wherein the at least one data decompressor decompresses the video data read from the corresponding coupled video memory.

The system of claim 10, wherein
the at least one data compressor is lossy.

The system of claim 10, wherein
the at least one data compressor is lossless.

The system of claim 10, wherein
the at least one data compressor comprises an error diffusion mechanism and a pixel truncation mechanism for compressing the video data, and the at least one data decompressor comprises a pixel reconstruction mechanism and a bit concatenation mechanism for decompressing the compressed video data stored in the corresponding coupled video memory.