JPH0973299A

JPH0973299A - MPEG audio reproducing device and MPEG reproducing device

Info

Publication number: JPH0973299A
Application number: JP16945496A
Authority: JP
Inventors: Hideki Yamauchi; 英樹山内; Shigeyuki Okada; 茂之岡田; Masayuki Iida; 正幸飯田; Koji Tanaka; 浩司田中
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1995-06-30
Filing date: 1996-06-28
Publication date: 1997-03-18
Anticipated expiration: 2016-06-28
Also published as: JP3594409B2

Abstract

PROBLEM TO BE SOLVED: To provide an MPEG audio reproducing device that reproduces easily understood audio signals during variable-speed reproduction. SOLUTION: An MPEG audio reproducing device 1 consists of a reproducing speed detecting circuit 2, an MPEG audio decoder 3, a speech speed conversion processing circuit 4, a D/A converter 5 and an audio amplifier 6. An MPEG reproducing device is provided with an audio-video parser (AV parser) and an MPEG video decoder 12 in addition to the device 1. The circuit 4 consists of a DSP 31, a ring memory 32 and an up-down counter 33. During high-speed reproduction, the circuit 4 expands the time length of each input voice segment and reduces the time length of each silence interval. During low-speed reproduction, either the time length of each voice segment is expanded and the time length of each silence interval is reduced, or each silence interval is deleted, the voice segments are joined together, and a silence interval is then inserted.

Description

Detailed Description of the Invention

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to an MPEG (Moving Picture Experts Group) audio reproducing device and an MPEG reproducing device, and more particularly to an MPEG audio reproducing device and an MPEG reproducing device having a speech speed conversion function.

[0002]

2. Description of the Related Art The information handled by multimedia is enormous in volume and highly diverse, and processing it at high speed is necessary to put multimedia into practical use. Processing information at high speed requires data compression and decompression technology. One such technology is the "MPEG" system. The MPEG system is being standardized by the MPEG committee (ISO/IEC JTC1/SC29/WG11) under the ISO (International Organization for Standardization) / IEC (International Electrotechnical Commission).

[0003] MPEG is composed of three parts. Part 1, the "MPEG system part" (ISO/IEC IS 11172 Part 1: Systems), defines the multiplex structure of video data and audio data and the synchronization method. Part 2, the "MPEG video part" (ISO/IEC IS 11172 Part 2: Video), defines the high-efficiency encoding method for video data and the format of the video data. Part 3, the "MPEG audio part" (ISO/IEC IS 11172 Part 3: Audio), defines the high-efficiency encoding method for audio data and the format of the audio data.

[0004] The video data handled by the MPEG video part relates to moving images, and a moving image is composed of several tens of frames (still images) per second, for example 30. Video data has a six-layer hierarchical structure consisting, in order, of the sequence, GOP (Group Of Pictures), picture, slice, macroblock, and block.

[0005] At present there are two MPEG systems, MPEG-1 and MPEG-2, which differ mainly in encoding rate. In MPEG-1, a frame corresponds to a picture. In MPEG-2, either a frame or a field can be made to correspond to a picture. Two fields constitute one frame. A structure in which a frame corresponds to a picture is called a frame structure, and a structure in which a field corresponds to a picture is called a field structure.

[0006] MPEG uses a compression technique called inter-frame prediction, which compresses data between frames on the basis of temporal correlation. Inter-frame prediction uses bidirectional prediction, that is, the combined use of forward prediction, which predicts the current reproduced image from a past reproduced image (or picture), and backward prediction, which predicts the current reproduced image from a future reproduced image.

[0007] This bidirectional prediction defines three types of pictures, called the I picture (Intra-Picture), the P picture (Predictive-Picture), and the B picture (Bidirectionally predictive-Picture). An I picture is generated independently, without reference to past or future reproduced images. A P picture is generated by forward prediction (prediction from a past I picture or P picture). A B picture is generated by bidirectional prediction, using one of the following three predictions: forward prediction, from a past I picture or P picture; backward prediction, from a future I picture or P picture; or bidirectional prediction, from both past and future I pictures or P pictures. The I, P, and B pictures are then each encoded. In other words, an I picture can be generated without any past or future picture, whereas a P picture cannot be generated without a past picture and a B picture cannot be generated without a past or future picture.
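
The dependency rules above can be summarized in a few lines of Python. This is only a sketch: the function name `can_generate` and its boolean arguments are hypothetical and do not come from the MPEG specification.

```python
def can_generate(picture_type, has_past_ref, has_future_ref):
    """Return True if a picture of the given type can be generated
    from the reference pictures that are available.

    has_past_ref / has_future_ref: whether a past / future I or P
    picture is available as a reference (hypothetical flags).
    """
    if picture_type == "I":
        # Intra picture: needs no reference at all.
        return True
    if picture_type == "P":
        # Forward prediction only: needs a past reference.
        return has_past_ref
    if picture_type == "B":
        # Forward, backward, or bidirectional: needs at least one side.
        return has_past_ref or has_future_ref
    raise ValueError(f"unknown picture type: {picture_type!r}")
```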

[0008] In inter-frame prediction, I pictures are first generated periodically. Next, a frame several frames ahead of the I picture is generated as a P picture. This P picture is generated by one-way (forward) prediction from the past to the present. Then the frames located between the I picture and the P picture are generated as B pictures. When a B picture is generated, the optimal prediction method is selected from forward prediction, backward prediction, and bidirectional prediction. In general, in a continuous moving image, the current image closely resembles the images before and after it and differs from them only in part. It is therefore assumed that the previous frame (for example, an I picture) and the next frame (for example, a P picture) are the same, and if there is a change between the two frames, only the difference (the B picture) is extracted and compressed. In this way, data between frames can be compressed on the basis of temporal correlation.

[0009] A data string (bit stream) of video data encoded in accordance with the MPEG video part is called an MPEG video stream (hereinafter abbreviated as video stream). A data string of audio data encoded in accordance with the MPEG audio part is called an MPEG audio stream (hereinafter abbreviated as audio stream). The video stream and the audio stream are time-division multiplexed in accordance with the MPEG system part into a single data string, the MPEG system stream (hereinafter abbreviated as system stream). The system stream is also called a multiplex stream.

[0010] The flow from encoding to decoding in the MPEG parts is as follows. An MPEG system encoder (hereinafter abbreviated as system encoder) encodes the video data and the audio data separately while keeping them linked, and generates a video stream and an audio stream. Next, a multiplexer (MUX) provided in the MPEG system encoder multiplexes the video stream and the audio stream so as to match the format of the transmission medium or recording medium, and generates a system stream. The system stream is transmitted from the MUX via a transmission medium or recorded on a recording medium.

[0011] A demultiplexer (DMUX) provided in an MPEG system decoder (hereinafter abbreviated as system decoder) separates the system stream into a video stream and an audio stream. The system decoder then decodes each stream individually to generate a decoded video output (hereinafter referred to as video output) and a decoded audio output (hereinafter referred to as audio output). The video output is sent to a display, on which the moving image is reproduced. The audio output is sent to a speaker via a D/A (Digital/Analog) converter and an audio amplifier, and sound is reproduced from the speaker.
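
As a rough illustration of the multiplexing and demultiplexing described above, the following Python sketch interleaves tagged packets into one stream and then separates them again. The packet layout, the `"V"`/`"A"` tags, and the function names `multiplex` and `demultiplex` are assumptions made for illustration; this is not the actual MPEG system stream format.

```python
from itertools import zip_longest


def multiplex(video_packets, audio_packets):
    """Interleave video and audio packets into one tagged stream."""
    stream = []
    for v, a in zip_longest(video_packets, audio_packets):
        if v is not None:
            stream.append(("V", v))
        if a is not None:
            stream.append(("A", a))
    return stream


def demultiplex(stream):
    """Split a tagged stream back into separate video and audio lists."""
    video, audio = [], []
    for tag, payload in stream:
        (video if tag == "V" else audio).append(payload)
    return video, audio
```

In this toy model, `demultiplex(multiplex(v, a))` returns the original two packet lists, which mirrors the roles of the MUX in the system encoder and the DMUX in the system decoder.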

[0012] MPEG-1 is mainly intended for storage media using recording media such as video CD (Compact Disc), CD-ROM (CD-Read Only Memory), and DVD (Digital Video Disc), while MPEG-2 supports a wide range of applications, including those of MPEG-1.

[0013] Storage media require the following two kinds of variable-speed reproduction. One is a function for reproducing a moving image at a speed higher than the normal (standard) reproduction speed (hereinafter referred to as high-speed reproduction). The other is a function for reproducing a moving image at a speed lower than the normal reproduction speed (hereinafter referred to as low-speed reproduction). The high-speed reproduction function is used, for example, when the user fast-forwards to view a moving image in a short time, or fast-forwards or fast-reverses to search for a desired moving image. The low-speed reproduction function is used, for example, when the user wants to view a moving image carefully.

[0014] The bit rate of the system stream read from the recording medium corresponds to the read speed. Accordingly, the system stream is read from the recording medium at high speed for high-speed reproduction and at low speed for low-speed reproduction. For example, when a video CD or DVD is used as the recording medium, the system stream is read at the desired speed by making the rotation speed of the video CD or DVD faster or slower than in normal (standard) reproduction.

[0015]

PROBLEMS TO BE SOLVED BY THE INVENTION Conventionally, in MPEG, variable-speed reproduction of moving images as described above has been studied, but no consideration has been given to variable-speed reproduction of audio.

[0016] The bit rate of the audio stream is the same as that of the system stream. Therefore, during high-speed reproduction of a moving image, the bit rate of the audio stream also increases, so that the pitch of the reproduced sound rises and, in addition, the speech speed becomes faster. During low-speed reproduction of a moving image, the bit rate of the audio stream also decreases, and although the pitch of the reproduced sound does not change, the sound becomes choppy. Thus, during variable-speed reproduction of a moving image, the sound becomes hard to listen to.
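
To put a number on the pitch rise, assume that high-speed reproduction simply plays the decoded samples k times faster, so that every frequency is multiplied by k. The sketch below converts that factor to semitones. The function name and the choice of semitones as the unit are illustrative assumptions and do not appear in the source.

```python
import math


def pitch_shift_semitones(speed_factor):
    """Pitch change, in semitones, when audio is simply played
    speed_factor times faster (all frequencies scale by that factor)."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return 12.0 * math.log2(speed_factor)
```

Under this assumption, double-speed reproduction (`speed_factor = 2.0`) raises the pitch by one octave, that is, 12 semitones.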

[0017] In recent years, speech speed conversion technology that controls speech speed arbitrarily without changing pitch has been under development, and the present applicant has already developed a speech speed conversion processing LSI usable in VTRs and tape recorders (see Japanese Patent Laid-Open No. 7-192392 (G11B20/02) and Nikkei Electronics, November 21, 1994 issue (No. 622), pp. 93-98). However, no attempt has been made to apply speech speed conversion technology to MPEG.

[0018] In generating sound and moving images (video) in synchronization, "lip sync" must also be considered. Lip sync means that the movement of the mouth of a person shown on the display is synchronized with the sound emitted from the speaker. A state in which the sound leads or lags the mouth movement is called a lip-sync error. When the lip-sync error exceeds the range tolerated by human hearing, the viewer feels that something is wrong. In general, the tolerable lip-sync error caused by the sound lagging the moving image is said to be about 50 to 250 ms.

[0019] The present invention has been made to satisfy the above requirements and has the following objects. [1] To provide an MPEG audio reproducing device capable of reproducing natural, easy-to-listen sound even during variable-speed reproduction.

[0020] [2] To provide an MPEG reproducing device comprising the MPEG audio reproducing device of [1] above and an MPEG video decoder. [3] To provide an MPEG reproducing device that comprises the MPEG audio reproducing device of [1] above and an MPEG video decoder and is capable of reducing the time lag between sound and moving image.

[0021]

MEANS FOR SOLVING THE PROBLEMS The gist of the invention according to claim 1 is that it comprises an MPEG audio decoder (3) and speech speed conversion processing means (2, 4) for performing speech speed conversion processing on the output of the decoder.

[0022] The gist of the invention according to claim 2 is that it comprises an MPEG audio decoder (3), speech speed conversion processing means (2, 4) for performing speech speed conversion processing on the output of the audio decoder, and an MPEG video decoder (12).

[0023] The gist of the invention according to claim 3 is that it comprises an MPEG audio decoder (3) that decodes an MPEG audio stream read from a recording medium (21) in accordance with the MPEG audio part to generate an audio signal, and speech speed conversion processing means (2, 4) for performing speech speed conversion processing on the audio signal, wherein, when the bit rate of the audio stream is higher than normal, the speech speed conversion processing means performs the processing so that the pitch of the reproduced sound is substantially the same as in normal reproduction and the reproduced speech speed approaches that of normal reproduction, and, when the bit rate of the audio stream is lower than normal, performs the processing so that breaks between voice sections become inconspicuous.

[0024] The gist of the invention according to claim 4 is that it comprises an MPEG audio decoder (3) that decodes an MPEG audio stream read from a recording medium (21) in accordance with the MPEG audio part to generate an audio signal, and speech speed conversion processing means (2, 4) for performing speech speed conversion processing on the audio signal, wherein, when the bit rate of the audio stream is higher than normal, the speech speed conversion processing means performs the processing by extending the time length of each reproduced voice section and shortening the time length of each silent section, and, when the bit rate of the audio stream is lower than normal, performs the processing either by extending the time length of each reproduced voice section and shortening the time length of each silent section, or by deleting the silent sections, joining the voice sections together, and then inserting a silent section.
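
The two strategies named in this claim can be sketched at the level of section durations. The following Python example is only an illustration: the `(kind, duration)` representation, the function names, the scale factors, and the placement of the inserted silent section after the joined voice are all assumptions, and the code manipulates durations rather than actual audio samples.

```python
def stretch_voice_shrink_silence(segments, voice_scale, silence_scale):
    """Extend voice sections and shorten silent sections.

    segments: list of (kind, duration_ms), kind is "voice" or "silence".
    voice_scale >= 1 extends voice; 0 <= silence_scale <= 1 shortens silence.
    """
    out = []
    for kind, duration in segments:
        scale = voice_scale if kind == "voice" else silence_scale
        out.append((kind, duration * scale))
    return out


def join_voice_then_insert_silence(segments, silence_ms):
    """Delete silent sections, join the voice sections, then insert
    one silent section after the joined voice (assumed placement)."""
    voice_total = sum(d for kind, d in segments if kind == "voice")
    out = []
    if voice_total > 0:
        out.append(("voice", voice_total))
    if silence_ms > 0:
        out.append(("silence", silence_ms))
    return out
```

For example, with sections of 400 ms voice, 200 ms silence, 300 ms voice, and 100 ms silence, the second function with `silence_ms=300` yields a single 700 ms voice section followed by 300 ms of silence.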

[0025] The gist of the invention according to claim 5 is that, in the MPEG audio reproducing device according to claim 3 or claim 4, the speech speed conversion processing means (2, 4) comprises a ring memory (32) that accumulates the audio signal and detection means (33) that detects the amount accumulated in the ring memory, and adjusts the compression/expansion ratio of the time length of the voice sections according to the amount accumulated in the ring memory.

[0026] The gist of the invention according to claim 6 is that, in the MPEG audio reproducing device according to claim 5, the speech speed conversion processing means (2, 4) comprises a voice discrimination unit (41) that discriminates between voice sections and silent sections of the audio signal, a silence deletion/insertion unit (42) that deletes or inserts silent sections, and a time-axis compression/expansion unit (43) that adjusts the compression/expansion ratio by performing compression/expansion processing of the voice sections based on the amount accumulated in the ring memory (32).

[0027] The gist of the invention according to claim 7 is that it comprises the MPEG audio reproducing device (1) according to any one of claims 3 to 6, and an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal.

[0028] The gist of the invention according to claim 8 is that it comprises: the MPEG audio reproducing device (1) according to claim 5 or claim 6; an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal; an index adding circuit (51) that adds an index signal, as time-related information, to the audio signal before it is written into the ring memory (32); and an index detection circuit (52) that detects the index signal attached to the audio signal read from the ring memory (32), detects the signal delay time in the speech speed conversion processing means (2, 4) from the time information obtained from the index signal and the current time information, and supplies a signal indicating the detected delay time to the MPEG video decoder (12), wherein the MPEG video decoder (12) controls the timing of its own operation based on the signal indicating the delay time.
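
The index-based delay detection in this claim can be modeled as a FIFO whose entries carry their write time. The sketch below is an assumption-laden illustration: the class name `IndexedDelayLine`, the millisecond time base, and the use of `collections.deque` in place of the ring memory are choices made here and are not taken from the source.

```python
from collections import deque


class IndexedDelayLine:
    """Audio FIFO whose entries carry the time at which they were written."""

    def __init__(self):
        self.fifo = deque()

    def write(self, sample, now_ms):
        # Index adding: attach the current time to the sample.
        self.fifo.append((now_ms, sample))

    def read(self, now_ms):
        # Index detection: recover the write time and compute the delay.
        written_ms, sample = self.fifo.popleft()
        return sample, now_ms - written_ms
```

In a full reproducing device, the delay returned by `read` would be passed to the video decoder so that it can shift its own output timing by the same amount.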

[0029] The gist of the invention according to claim 9 is that it comprises: the MPEG audio reproducing device (1) according to claim 6; an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal; and a delay time detection circuit (53) that detects the signal delay time in the speech speed conversion processing means (2, 4) based on the processing result of the voice discrimination unit (41) and the bit rate of the audio stream, and supplies a signal indicating the detected delay time to the MPEG video decoder (12), wherein the MPEG video decoder (12) controls the timing of its own operation based on the signal indicating the delay time.

[0030] The gist of the invention according to claim 10 is that it comprises: the MPEG audio reproducing device (1) according to claim 6; an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal; and a control circuit (54) that generates, based on the amount accumulated in the ring memory (32), a control signal for synchronizing the speech-speed-converted audio signal with the video signal, and supplies the control signal to the MPEG video decoder (12), wherein the MPEG video decoder (12) controls the timing of its own operation based on the control signal.
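
One plausible way to turn the accumulated amount into a timing value is to estimate how long the stored samples take to drain at the read rate. This is an assumption for illustration only: the source does not give a formula, and the function name and the 44100 Hz figure in the usage note are hypothetical.

```python
def ring_memory_delay_ms(stored_samples, read_rate_hz):
    """Time, in ms, needed to drain the samples currently stored
    when they are read out at read_rate_hz samples per second."""
    if read_rate_hz <= 0:
        raise ValueError("read_rate_hz must be positive")
    return 1000.0 * stored_samples / read_rate_hz
```

For instance, 4410 stored samples read at 44100 samples per second correspond to a delay of 100 ms, which a control circuit could use to hold back the video output by the same amount.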

[0031] The gist of the invention according to claim 11 is that it comprises: the MPEG audio reproducing device (1) according to claim 6; an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal; and a delay time detection circuit (55) that detects the signal delay time in the speech speed conversion processing means (2, 4) based on the processing results of the voice discrimination unit (41) and the time-axis compression/expansion unit (43), and supplies a signal indicating the detected delay time to the MPEG video decoder (12), wherein the MPEG video decoder (12) controls the timing of its own operation based on the signal indicating the delay time.

[0032]

BEST MODE FOR CARRYING OUT THE INVENTION

(First Embodiment) A first embodiment embodying the present invention will be described below with reference to the drawings.

[0033] FIG. 1 shows a block circuit diagram of this embodiment. The MPEG audio reproducing device 1 of this embodiment comprises a reproduction speed detection circuit 2, an MPEG audio decoder 3, a speech speed conversion processing circuit 4, a D/A converter 5, and an audio amplifier 6. The circuits 2 to 6 may also be mounted on a single-chip LSI.

[0034] The MPEG reproducing device 23 of this embodiment comprises, in addition to the MPEG audio reproducing device 1, an audio-video parser (AV parser) 11 and an MPEG video decoder 12.

[0035] The speech speed conversion processing circuit 4 comprises, for example, a DSP (Digital Signal Processor) 31, a ring memory 32, an up/down counter 33, and a read clock generation circuit 36. The operation of the speech speed conversion processing circuit 4 is described in detail in the above-mentioned document (Nikkei Electronics, November 21, 1994 issue (No. 622), pp. 93-98).

[0036] The reproduction speed detection circuit 2 generates a decode clock corresponding to the bit rate of the MPEG system stream read from a recording medium 21 such as a video CD or DVD. The decode clock is output to the circuits 12, 3, and 4.

[0037] The AV parser 11 includes a demultiplexer (DMUX) 13 and receives the MPEG system stream read from the recording medium 21. The DMUX 13 separates the system stream into an MPEG video stream and an MPEG audio stream. The video stream is output to the video decoder 12, and the audio stream is output to the audio decoder 3.

[0038] The video decoder 12 decodes the video stream in accordance with the MPEG video part and generates a video output (hereinafter referred to as a video signal). The video signal is output to a display 22, on which the moving image is reproduced.

[0039] The audio decoder 3 decodes the audio stream in accordance with the MPEG audio part and generates a digital audio output (hereinafter referred to as an audio signal). The audio signal is output to the speech speed conversion processing circuit 4. The audio signal processed in the speech speed conversion processing circuit 4 is D/A converted by the D/A converter 5, amplified by the audio amplifier 6, and sent to a speaker 23, from which sound is reproduced.

[0040] The bit rate of the system stream read from the recording medium 21 corresponds to the read speed. The operation of the circuits 3, 4, and 12 is governed by the decode clock.

[0041] Accordingly, the video decoder 12 generates a video signal corresponding to the bit rate of the system stream. That is, if the bit rate of the system stream is higher than in normal (standard) reproduction, the moving image is reproduced at high speed on the display 22, and if it is lower than in normal reproduction, the moving image is reproduced at low speed on the display 22.

[0042] Likewise, the audio decoder 3 generates an audio signal corresponding to the bit rate of the system stream. That is, if the bit rate of the system stream is higher than in normal reproduction, the bit rate of the audio signal is also higher, and if it is lower than in normal reproduction, the bit rate of the audio signal is also lower.

[0043] The video signal and the audio signal are generated in synchronization during normal reproduction. The DSP 31 comprises a frame memory 34 and a speech speed conversion unit 35. The frame memory 34 stores audio signals for an appropriate number of frames (for example, two frames). The speech speed conversion unit 35 performs speech speed conversion processing frame by frame on the audio signal stored in the frame memory 34 and generates a speech-speed-converted audio signal (hereinafter referred to as data). One frame consists of an appropriate number (for example, 200) of sampling data.

【００４４】フレームメモリ３４の内部は、２つの領域
（以下、Ａ領域、Ｂ領域と記載して区別する）に分けら
れている。オーディオデコーダ３から出力されたオーデ
ィオ信号がＢ領域に書き込まれるのと同時に、Ａ領域に
蓄積されている１フレーム分のオーディオ信号が読み出
されて話速変換部３５へ転送される。そして、Ｂ領域に
１フレーム分のオーディオ信号が蓄積されると、今度
は、Ｂ領域に蓄積された１フレーム分のオーディオ信号
が読み出されて話速変換部３５へ転送され、それと同時
に、オーディオデコーダ３から出力されたオーディオ信
号がＡ領域に書き込まれる。The inside of the frame memory 34 is divided into two areas (hereinafter referred to as the A area and the B area). While the audio signal output from the audio decoder 3 is written to the B area, the one frame of audio signal accumulated in the A area is read out and transferred to the speech speed conversion unit 35. Then, once one frame of audio signal has accumulated in the B area, that frame is read out and transferred to the speech speed conversion unit 35 while, at the same time, the audio signal output from the audio decoder 3 is written to the A area.
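
The alternating use of the A and B areas described above can be sketched as a two-area (ping-pong) buffer. This is only an illustrative model, not the circuit itself: the class name `PingPongFrameMemory`, the method `write`, and the four-sample frame length are assumptions made for the example.

```python
class PingPongFrameMemory:
    """Two-area frame memory: one area is filled while the other is read out."""

    def __init__(self, frame_len):
        self.frame_len = frame_len
        self.areas = {"A": [0] * frame_len, "B": [0] * frame_len}
        self.write_area = "B"  # area currently being written (assumed start)
        self.pos = 0

    def write(self, sample):
        """Store one decoded sample; return a full frame when an area fills."""
        self.areas[self.write_area][self.pos] = sample
        self.pos += 1
        if self.pos < self.frame_len:
            return None
        full = self.write_area
        # Swap roles: the full area is handed to the conversion unit,
        # and subsequent samples go to the other area.
        self.write_area = "A" if full == "B" else "B"
        self.pos = 0
        return list(self.areas[full])


mem = PingPongFrameMemory(frame_len=4)
frames = []
for s in range(10):
    frame = mem.write(s)
    if frame is not None:
        frames.append(frame)
```

With ten input samples and a frame length of four, two complete frames are handed over and the remaining two samples wait in the area currently being written.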

【００４５】話速変換部３５の生成したデータは、話速
変換部３５が生成した書き込みクロックに従ってリング
メモリ３２に書き込まれる。リングメモリ３２は、例え
ば、ＦＩＦＯ（First-In-First-Out）構成のＲＡＭ（Ra
ndom Access Memory）から成る。The data generated by the speech speed conversion unit 35 is written to the ring memory 32 in accordance with the write clock generated by the speech speed conversion unit 35. The ring memory 32 is, for example, a RAM (Random Access Memory) with a FIFO (First-In-First-Out) configuration.

【００４６】読み出しクロック生成回路３６は、デコー
ドクロックに従って読み出しクロックを生成する。リン
グメモリ３２に蓄積されたデータは、読み出しクロック
に従って読み出され、その読み出されたデータはＤ／Ａ
コンバータ５へ出力される。Ｄ／Ａコンバータ５は、読
み出しクロックをサンプリング周波数として用いる。The read clock generation circuit 36 generates a read clock according to the decode clock. The data stored in the ring memory 32 is read out according to the read clock and output to the D/A converter 5. The D/A converter 5 uses the read clock as its sampling frequency.

【００４７】書き込みクロックはアップダウンカウンタ
３３のアップカウント入力端子UPに入力され、読み出し
クロックはアップダウンカウンタ３３のダウンカウント
入力端子DOWNに入力される。アップダウンカウンタ３３
は、書き込みクロックの総数と読み出しクロックの総数
との差をカウントする。そのカウント値は、リングメモ
リ３２の蓄積量に対応する。つまり、アップダウンカウ
ンタ３３は、書き込みクロックと読み出しクロックとに
基づいて、リングメモリ３２の蓄積量を検出する。その
リングメモリ３２の蓄積量は話速変換部３５へ出力され
る。The write clock is input to the up-count input terminal UP of the up-down counter 33, and the read clock is input to the down-count input terminal DOWN of the up-down counter 33. The up-down counter 33 counts the difference between the total number of write clocks and the total number of read clocks. The count value corresponds to the amount of data stored in the ring memory 32. That is, the up-down counter 33 detects the amount stored in the ring memory 32 based on the write clock and the read clock. The detected storage amount is output to the speech speed conversion unit 35.
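
The relationship between the two clocks and the storage amount can be sketched as follows. The class name `UpDownCounter`, the `capacity` parameter, and the overflow and underflow errors are assumptions added for illustration; the text only states that the count is the difference between write clocks and read clocks.

```python
class UpDownCounter:
    """Tracks ring-memory occupancy as (write clocks) - (read clocks)."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed ring-memory size, in samples
        self.count = 0

    def write_clock(self):
        # UP input: one sample has been written into the ring memory.
        if self.count >= self.capacity:
            raise OverflowError("ring memory overflow")
        self.count += 1

    def read_clock(self):
        # DOWN input: one sample has been read out toward the D/A converter.
        if self.count == 0:
            raise RuntimeError("ring memory underflow")
        self.count -= 1


counter = UpDownCounter(capacity=8)
for _ in range(5):
    counter.write_clock()
for _ in range(2):
    counter.read_clock()
fill = counter.count  # five writes minus two reads
```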

【００４８】図２に、話速変換部３５に内部構成を示
す。話速変換部３５は、音声判別部４１、無音削除挿入
部４２、時間軸圧縮伸長部４３から構成されている。FIG. 2 shows the internal configuration of the speech speed conversion unit 35. The speech speed conversion unit 35 includes a voice discrimination unit 41, a silence deletion/insertion unit 42, and a time axis compression/expansion unit 43.

【００４９】音声判別部４１は、フレームメモリ３４か
ら読み出されたオーディオ信号が、音声区間（音声が存
在している区間）か、または、無音区間（音声が存在し
ていない区間）かを判別する。尚、人間が発声する音声
以外の背景雑音は無音区間として取り扱う。The voice discrimination unit 41 determines whether the audio signal read from the frame memory 34 is a voice section (a section in which voice is present) or a silent section (a section in which no voice is present). Background noise other than human speech is treated as a silent section.

【００５０】無音削除挿入部４２は、音声判別部４１の
判別した無音区間に対して、その無音区間の削除処理、
または、新たな無音区間の挿入処理を行う。時間軸圧縮
伸長部４３は、音声判別部４１の判別した音声区間に対
して、リングメモリ３２の蓄積量に基づいて圧縮処理ま
たは伸長処理を行う。The silence deletion/insertion unit 42 either deletes a silent section identified by the voice discrimination unit 41 or inserts a new silent section. The time axis compression/expansion unit 43 performs compression or expansion on a voice section identified by the voice discrimination unit 41, based on the amount stored in the ring memory 32.

【００５１】また、各部４２，４３は、その処理内容に
対応した書き込みクロックを生成する。次に、高速再生
時における話速変換部３５の動作について説明する。Further, the respective units 42 and 43 generate a write clock corresponding to the processing content. Next, the operation of the speech speed conversion unit 35 during high speed reproduction will be described.

【００５２】オーディオデコーダ３から出力されるオー
ディオ信号のビットレートは、オーディオストリームの
それと同一になる。従って、高速再生時には、通常の再
生時に比べて、オーディオ信号のビットレートが大きく
なる。通常の再生時よりもビットレートの大きなオーデ
ィオ信号をそのままＤ／Ａコンバータ５へ送った場合、
通常の再生時に比べて、スピーカ２３から再生される音
声のピッチは上がり話速は速くなる。The bit rate of the audio signal output from the audio decoder 3 is the same as that of the audio stream. Therefore, during high-speed reproduction the bit rate of the audio signal is higher than during normal reproduction. If an audio signal with a bit rate higher than that of normal reproduction were sent to the D/A converter 5 as is, the voice reproduced from the speaker 23 would have a higher pitch and a faster speech speed than during normal reproduction.

【００５３】そこで、話速変換部３５において、スピー
カ２３から再生される音声のピッチを通常の再生時とほ
ぼ同一にし、且つ、スピーカ２３から再生される話速を
通常の再生時に近づけるように話速変換処理を行う。Therefore, the speech speed conversion unit 35 performs speech speed conversion so that the pitch of the voice reproduced from the speaker 23 is almost the same as during normal reproduction and the speech speed reproduced from the speaker 23 approaches that of normal reproduction.

【００５４】すなわち、無音削除挿入部４２は、音声判
別部４１の判別した無音区間の継続長を算出し、その継
続長が所定長以上の場合は無音区間を削除する。また、
時間軸圧縮伸長部４３は、音声判別部４１の判別した音
声区間に対して、例えば、自己相関法を用いてピッチ抽
出を行い、抽出したピッチ波形に対して圧縮処理を行
う。その結果、高速再生時において、オーディオ信号の
ビットレートが大きくなった場合に、スピーカ２３から
再生される音声区間の時間長さは伸長される。That is, the silence deletion/insertion unit 42 calculates the duration of each silent section identified by the voice discrimination unit 41 and deletes the silent section when its duration is equal to or longer than a predetermined length. The time axis compression/expansion unit 43 extracts the pitch of each voice section identified by the voice discrimination unit 41, for example by the autocorrelation method, and applies compression processing to the extracted pitch waveform. As a result, when the bit rate of the audio signal increases during high-speed reproduction, the time length of the voice section reproduced from the speaker 23 is extended.
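
As a rough illustration of pitch extraction by autocorrelation, the sketch below searches a lag range for the lag with the largest unnormalized autocorrelation and takes it as the pitch period. The function name `estimate_pitch_period`, the lag bounds, and the choice of an unnormalized score are assumptions for the example; the text does not specify these details.

```python
def estimate_pitch_period(samples, min_lag, max_lag):
    """Return the lag (in samples) with the largest autocorrelation score."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# A synthetic voiced segment that repeats every 8 samples.
cycle = [0, 3, 5, 3, 0, -3, -5, -3]
segment = cycle * 6
period = estimate_pitch_period(segment, min_lag=4, max_lag=12)
```

A practical detector would normally normalize the score and guard against octave errors; those refinements are omitted here.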

【００５５】尚、時間軸圧縮伸長部４３における圧縮処
理に際しては、無音区間の状態とリングメモリ３２の蓄
積量とに応じて動的に圧縮率を変化させる。例えば、同
一のピッチ周期をもつ３周期波形を２周期波形に圧縮す
ることで、２／３倍の圧縮（圧縮率；２／３）を得る。
具体的には、３周期波形から、時間軸方向で前にある２
周期波形と、後ろにある２周期波形とをそれぞれ切り出
す。そして、前の２周期波形に単調減少する三角窓関数
を、後ろの２周期波形に単調増加する三角窓関数をそれ
ぞれ乗じる。この二つの波形を加算することで出力波形
を得る。In the compression processing in the time axis compression/expansion unit 43, the compression rate is changed dynamically according to the state of the silent sections and the amount stored in the ring memory 32. For example, compressing a three-period waveform with a constant pitch period into a two-period waveform gives 2/3x compression (compression rate: 2/3). Specifically, from the three-period waveform, the earlier two-period waveform and the later two-period waveform along the time axis are each cut out. The earlier two-period waveform is multiplied by a monotonically decreasing triangular window function, and the later two-period waveform by a monotonically increasing triangular window function. Adding the two resulting waveforms gives the output waveform.
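
The three-period to two-period compression can be sketched as a windowed overlap-add. The function name `compress_three_to_two` is hypothetical, and the choice of complementary linear ramps (weights summing to 1 at every sample) is an assumption; the text only says that the windows are monotonically decreasing and increasing triangular functions.

```python
def compress_three_to_two(wave, period):
    """Compress three pitch periods into two by a triangular cross-fade."""
    if len(wave) != 3 * period:
        raise ValueError("expected exactly three pitch periods")
    n = 2 * period
    front = wave[:n]       # earlier two periods
    back = wave[period:]   # later two periods
    out = []
    for i in range(n):
        rise = i / (n - 1)  # increasing window applied to the later part
        out.append(front[i] * (1.0 - rise) + back[i] * rise)
    return out


cycle = [0.0, 1.0, 0.0, -1.0]
three_periods = cycle * 3
two_periods = compress_three_to_two(three_periods, period=4)
```

For a perfectly periodic input the two cut-out waveforms are identical, so the cross-fade returns two clean periods: the output is 2/3 of the input length while the pitch period is unchanged.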

【００５６】また、０．９倍の圧縮（圧縮率；０．９）
を得るには、例えば、１０周期波形から９周期波形に圧
縮する。この場合は、先頭の３周期波形に対して同様の
処理を施す。つまり、入力の１０周期波形のうち、先頭
の３周期波形を除いた７周期波形は処理に使わない。To obtain 0.9x compression (compression rate: 0.9), for example, a 10-period waveform is compressed into a 9-period waveform. In this case, the same processing is applied to the leading three periods. That is, of the 10 input periods, the seven periods other than the leading three are not subjected to the processing and are left as they are, giving 2 + 7 = 9 periods.
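
The 10-period to 9-period case can be sketched in the same way: only the leading three periods are cross-faded into two, and the remaining periods are copied unchanged. The names `crossfade` and `compress_drop_one_period` are hypothetical, and the linear complementary windows are an assumption of this sketch.

```python
def crossfade(front, back):
    """Fade out `front` while fading in `back` with linear ramps."""
    n = len(front)
    return [front[i] * (1 - i / (n - 1)) + back[i] * (i / (n - 1))
            for i in range(n)]


def compress_drop_one_period(wave, period, periods):
    """Compress `periods` pitch periods into `periods - 1`."""
    if periods < 3 or len(wave) != periods * period:
        raise ValueError("need at least three whole pitch periods")
    head = wave[:3 * period]   # the only part that is processed
    tail = wave[3 * period:]   # passed through untouched
    squeezed = crossfade(head[:2 * period], head[period:])
    return squeezed + tail


wave = [float(i) for i in range(50)]  # 10 periods of 5 samples each
out = compress_drop_one_period(wave, period=5, periods=10)
ratio = len(out) / len(wave)
```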

【００５７】このＭ周期波形からＮ周期波形に圧縮する
組み合わせを色々と用意しておくことで、多種類の圧縮
率を得る。ところで、無音区間が短い場合、圧縮率が低
い（圧縮の度合いが大きい）とリングメモリ３２がオー
バーフローする恐れがある。これを防ぐためには、リン
グメモリ３２の蓄積量に応じて、時間軸圧縮伸長部４３
における圧縮率を動的に変化させればよい。また、背景
雑音が存在する場合、音声区間やピッチの抽出誤りが生
じる。これを防ぐためには、音声判別部４１における音
声区間の検出レベルを雑音信号に応じて変化させればよ
い。Preparing various combinations that compress an M-period waveform into an N-period waveform yields many different compression rates. When the silent sections are short, the ring memory 32 may overflow if the compression rate is low (that is, the degree of compression is high). To prevent this, the compression rate in the time axis compression/expansion unit 43 may be changed dynamically according to the amount stored in the ring memory 32. In addition, when background noise is present, errors occur in extracting voice sections and pitch. To prevent this, the voice-section detection level in the voice discrimination unit 41 may be changed according to the noise signal.
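
How a set of M-to-N combinations gives a choice of compression rates can be sketched with a small lookup table. The sketch assumes N = M - 1, as in the two examples in the text (3 to 2 and 10 to 9); the function names `rate_table` and `closest_combination` and the upper bound on M are illustrative only, and how the target rate is derived from the ring memory storage amount is left open.

```python
from fractions import Fraction


def rate_table(max_m):
    """Map each compression rate N/M (with N = M - 1) to its (M, N) pair."""
    return {Fraction(m - 1, m): (m, m - 1) for m in range(3, max_m + 1)}


def closest_combination(target, table):
    """Pick the prepared (M, N) pair whose rate is nearest to the target."""
    rate = min(table, key=lambda r: abs(r - target))
    return table[rate], rate


table = rate_table(10)
mild = closest_combination(Fraction(9, 10), table)
strong = closest_combination(Fraction(7, 10), table)
```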

【００５８】次に、低速再生時における話速変換部３５
の動作について、図３および図４に従って説明する。図
３に、通常の再生時および０．５倍速再生時において再
生される音声の例を示す。Next, the operation of the speech speed conversion unit 35 during low-speed reproduction will be described with reference to FIGS. 3 and 4. FIG. 3 shows an example of the voice reproduced during normal reproduction and during 0.5x speed reproduction.

【００５９】低速再生時には、通常の再生時に比べて、
オーディオ信号のビットレートが小さくなる。そのた
め、方法１に示すように、通常の再生時よりもビットレ
ートの小さなオーディオ信号をそのままＤ／Ａコンバー
タ５へ送った場合、通常の再生時に比べて、スピーカ２
３から再生される音声のピッチは変化しないものの、音
声が途切れ途切れになる。つまり、各音声区間（「あ」
「い」「う」「え」）の時間長さは通常の再生時のそれ
と変わらず、全く音の存在していない無音区間が各音声
区間の間に挿入されるため、音声が途切れ途切れにな
り、ユーザは聴感上違和感を覚える。During low-speed reproduction, the bit rate of the audio signal is lower than during normal reproduction. Therefore, as shown in method 1, if an audio signal with a bit rate lower than that of normal reproduction is sent to the D/A converter 5 as is, the pitch of the voice reproduced from the speaker 23 does not change compared with normal reproduction, but the voice becomes choppy. That is, the time length of each voice section ("a", "i", "u", "e") is the same as during normal reproduction, and a silent section containing no sound at all is inserted between the voice sections, so the voice is broken up and sounds unnatural to the user.

【００６０】そこで、話速変換部３５において、方法２
または方法３に示すように話速変換処理を行う。尚、Ｍ
ＰＥＧオーディオでは、低速再生時に音声のピッチが変
化しないため、高速再生時のように時間軸圧縮伸長部４
３においてピッチを変える処理を行う必要はない。Therefore, the speech speed conversion unit 35 performs speech speed conversion as shown in method 2 or method 3. Note that with MPEG audio the pitch of the voice does not change during low-speed reproduction, so, unlike in high-speed reproduction, the time axis compression/expansion unit 43 does not need to perform pitch-changing processing.

【００６１】（方法２）方法２では、時間軸圧縮伸長部
４３において各音声区間の長さを伸長させ、それと共
に、無音削除挿入部４２において各無音区間の長さを短
くすることで、音声の途切れを目立たなくする。(Method 2) In method 2, the time axis compression/expansion unit 43 extends the length of each voice section while the silence deletion/insertion unit 42 shortens each silent section, making the breaks in the voice less noticeable.

【００６２】尚、時間軸圧縮伸長部４３において音声区
間の長さを伸長させるには、音声判別部４１の判別した
音声区間に対して、例えば、自己相関法を用いてピッチ
抽出を行い、抽出したピッチ波形に対して伸長処理を行
う。例えば、同一のピッチ周期をもつ２周期波形を３周
期波形に伸長することで、３／２倍の伸長（伸長率；３
／２）を得る。また、同一のピッチ周期をもつ３周期波
形を４周期波形に伸長することで、４／３倍の伸長（伸
長率；４／３）を得る。その結果、低速再生時におい
て、オーディオ信号のビットレートが小さくなった場合
に、スピーカ２３から再生される音声区間の時間長さは
伸長される。To extend the length of a voice section in the time axis compression/expansion unit 43, the pitch of each voice section identified by the voice discrimination unit 41 is extracted, for example by the autocorrelation method, and expansion processing is applied to the extracted pitch waveform. For example, expanding a two-period waveform with a constant pitch period into a three-period waveform gives 3/2x expansion (expansion rate: 3/2). Likewise, expanding a three-period waveform with a constant pitch period into a four-period waveform gives 4/3x expansion (expansion rate: 4/3). As a result, when the bit rate of the audio signal decreases during low-speed reproduction, the time length of the voice section reproduced from the speaker 23 is extended.
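
The text does not spell out how the two-period to three-period expansion is carried out. One possible realization, offered purely as an assumption, overlaps two copies of the input one pitch period apart and cross-fades them over the middle period; the function name `expand_two_to_three` is hypothetical.

```python
def expand_two_to_three(wave, period):
    """Expand two pitch periods into three (assumed overlap-add scheme)."""
    if len(wave) != 2 * period:
        raise ValueError("expected exactly two pitch periods")
    out = []
    for i in range(3 * period):
        if i < period:
            out.append(wave[i])            # first period copied as is
        elif i < 2 * period:
            w = (i - period) / period      # weight of the delayed copy
            out.append(wave[i] * (1.0 - w) + wave[i - period] * w)
        else:
            out.append(wave[i - period])   # last period from the delayed copy
    return out


cycle = [0.0, 1.0, 0.0, -1.0]
three_periods = expand_two_to_three(cycle * 2, period=4)
```

For a perfectly periodic input this yields three identical periods, that is, an expansion rate of 3/2 with the pitch period unchanged.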

【００６３】このとき、音声区間を伸長し過ぎると、音
声区間が間延びして聞こえるため、音声の途切れは目立
たなくなるものの、やはり不自然になる。これを防止す
るには、通常の再生時における音声区間の長さＬ１に対
して、低速再生時における音声区間の長さＬ２を、例え
ば、以下の式に示すように設定する。If the voice section is extended too much, however, it sounds drawn out, so although the breaks become less noticeable the voice is still unnatural. To prevent this, the length L2 of a voice section during low-speed reproduction is set relative to its length L1 during normal reproduction, for example, as shown in the following expression.

【００６４】Ｌ２／Ｌ１≦１．４尚、上記式は０．５倍速再生時だけでなく、あらゆる倍
率の低速再生時に適用できる。ここで、時間軸圧縮伸長
部４３における音声区間の伸長率は一定値にしてもよ
く、以下のに示すように可変にしてもよい。L2/L1 ≤ 1.4. This expression applies not only to 0.5x speed reproduction but to low-speed reproduction at any ratio. The expansion rate of the voice section in the time axis compression/expansion unit 43 may be a constant value or may be varied as described below.
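
The bound L2/L1 ≤ 1.4 can be applied as a simple clamp. In this sketch the unclamped target length L1 divided by the playback speed is an assumption (it would fill the whole slowed-down interval with voice); only the 1.4 limit comes from the text, and the function name `stretched_length` is hypothetical.

```python
MAX_STRETCH = 1.4  # upper bound on L2 / L1 given in the text


def stretched_length(l1, playback_speed, max_stretch=MAX_STRETCH):
    """Return the voice-section length L2 for low-speed playback."""
    if not 0 < playback_speed <= 1:
        raise ValueError("playback_speed must be in (0, 1]")
    wanted = l1 / playback_speed          # assumed unclamped target
    return min(wanted, l1 * max_stretch)  # enforce L2 / L1 <= 1.4


l2_half_speed = stretched_length(1000, 0.5)  # limited by the 1.4 bound
l2_mild_slow = stretched_length(1000, 0.8)   # below the bound, not clamped
```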

【００６５】リングメモリ３２の蓄積量に対応して音
声区間の伸長率を動的に変化させる。無音区間が短い場
合、音声区間の伸長率が大きい（伸長の度合いが大き
い）とリングメモリ３２がオーバーフローする恐れがあ
る。これを防ぐためには、音声区間の伸長率を小さくす
ればよい。The expansion rate of the voice section is dynamically changed according to the storage amount of the ring memory 32. When the silent section is short, the ring memory 32 may overflow if the expansion rate of the speech section is large (the degree of expansion is large). In order to prevent this, the expansion rate of the voice section may be reduced.

【００６６】音声のピッチ変化に対応して音声区間の
伸長率を動的に変化させる。つまり、図４に示すよう
に、音声のピッチ変化に対応して音声区間の伸長率を変
化させることで、話速を変化させる。この場合、音声の
聞き易さをさらに向上させることができる。尚、音声の
ピッチ変化に対応して音声区間の伸長率を変化させるこ
とで話速を変化させる技術は公知である（信学技報 SP9
2-56,HC92-33(1992-09),P.49〜56. 参照）。The expansion rate of the voice section is changed dynamically according to changes in voice pitch. That is, as shown in FIG. 4, the speech speed is varied by changing the expansion rate of the voice section in response to changes in the pitch of the voice. This further improves how easy the voice is to hear. The technique of varying the speech speed by changing the expansion rate of the voice section in response to pitch changes is publicly known (see IEICE Technical Report SP92-56, HC92-33 (1992-09), pp. 49-56).

【００６７】（方法３）方法３では、無音削除挿入部４
２において、各無音区間を削除して各音声区間をつなぎ
合わせた後で、音声区間に続いて新たに無音区間を挿入
することで、音声の途切れを目立たなくする。尚、挿入
する無音区間は、以下の〜のいずれであってもよ
い。(Method 3) In method 3, the silence deletion/insertion unit 42 deletes each silent section and joins the voice sections together, and then inserts a new silent section after the voice section, making the breaks in the voice less noticeable. The inserted silent section may be any of the following.

【００６８】全く音の存在しない無音区間。視聴者が違和感を覚えないような白色雑音を含む無音
区間。尚、そのような白色雑音は、予め作成して別メモ
リ（図示略）に記憶しておく。A silent section in which no sound exists. A silent section that contains white noise so that the viewer does not feel uncomfortable. Note that such white noise is created in advance and stored in another memory (not shown).

【００６９】音声判別部４１において無音区間と判別
したオーディオ信号を別メモリ（図示略）に保持してお
き、それを無音区間として挿入する。このように、本実施形態によれば、以下の作用および効
果を得ることができる。The audio signal discriminated as a silent section in the voice discriminating section 41 is held in another memory (not shown) and is inserted as a silent section. As described above, according to this embodiment, the following actions and effects can be obtained.

【００７０】（１）話速変換処理回路４を設けること
で、高速再生時において、スピーカ２３から再生される
音声のピッチを通常の再生時とほぼ同一にし、且つ、ス
ピーカ２３から再生される話速を通常の再生時に近づけ
ることが可能になり、自然で聞き易い音声を再生するこ
とができる。(1) By providing the speech speed conversion processing circuit 4, during high speed reproduction, the pitch of the sound reproduced from the speaker 23 is made substantially the same as that during normal reproduction, and the sound reproduced from the speaker 23 is reproduced. The speed can be brought closer to the speed during normal reproduction, and natural and easy-to-listen sound can be reproduced.

【００７１】ところで、ｍ倍速再生時（ｍ＞１）には、
オーディオストリームおよびデコードクロックのビット
レートは通常の再生時のｍ倍になる。このとき、話速変
換部３５から出力されるデータのビットレートを通常の
再生時とほぼ同一になるようにすれば、再生される音声
のピッチを通常の再生時とほぼ同一にすることができ
る。すなわち、話速変換部３５においてビットレートを
ｍ→１に変換すれば、再生される音声のピッチは通常の
再生時とほぼ同一になる。Incidentally, during m-times speed reproduction (m > 1), the bit rates of the audio stream and the decode clock are m times those of normal reproduction. If the bit rate of the data output from the speech speed conversion unit 35 is made almost the same as during normal reproduction, the pitch of the reproduced voice can be made almost the same as during normal reproduction. That is, if the speech speed conversion unit 35 converts the bit rate from m to 1, the pitch of the reproduced voice becomes almost the same as during normal reproduction.

【００７２】（２）話速変換処理回路４を設けること
で、低速再生時において再生される音声の途切れを目立
たなくすることが可能になり、自然で聞き易い音声を再
生することができる。(2) By providing the speech speed conversion processing circuit 4, it is possible to make the interruption of the sound reproduced at the low speed reproduction inconspicuous, and to reproduce the natural and easily heard sound.

【００７３】ところで、上記方法２と方法３とを、以下
の(1)(2)に示すように併用してもよい。 (1) ＭＰＥＧオーディオ再生装置１のユーザが、方法２
と方法３とを任意に切り換え選択できるようにする。こ
のようにすれば、個々のユーザの聴覚特性に合わせるこ
とが可能になり、ユーザにとって聞き易い音声を再生す
ることができる。 (2) 低速再生の倍率に対応して方法２と方法３とが自動
的に切り換え選択されるようにする。例えば、１〜０．
５倍速再生時には方法３が選択され、０．５倍速以下の
再生時には方法２が選択されるようにする。このように
すれば、再生速度に応じて、自然な音声を再生すること
ができる。Methods 2 and 3 may also be used together, as in (1) and (2) below. (1) The user of the MPEG audio reproducing device 1 can freely switch between method 2 and method 3. This makes it possible to match the hearing characteristics of each individual user and to reproduce a voice that is easy for that user to hear. (2) Method 2 and method 3 are switched automatically according to the low-speed reproduction ratio. For example, method 3 is selected for reproduction at 1x to 0.5x speed, and method 2 for reproduction at 0.5x speed or below. In this way, a natural voice can be reproduced according to the reproduction speed.
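
The two ways of combining the methods can be sketched as a single selection function. The function name `select_method` and the `user_choice` parameter are hypothetical. The text assigns 0.5x speed to both ranges, so treating exactly 0.5x as method 2 is an assumption of this sketch.

```python
def select_method(speed, user_choice=None):
    """Choose method 2 or 3 for low-speed playback at `speed` (0 < speed < 1)."""
    if not 0 < speed < 1:
        raise ValueError("low-speed playback requires 0 < speed < 1")
    if user_choice in (2, 3):
        return user_choice          # (1) manual selection by the user
    return 3 if speed > 0.5 else 2  # (2) automatic selection by speed


auto_mild = select_method(0.75)
auto_half = select_method(0.5)
auto_slow = select_method(0.25)
manual = select_method(0.75, user_choice=2)
```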

【００７４】（３）各回路２〜６を１チップのＬＳＩに
搭載した場合には、ＭＰＥＧオーディオ再生装置１を小
型化することができる。（第２実施形態）以下、本発明を具体化した第２実施形
態を図面に従って説明する。尚、本実施形態において、
第１実施形態と同じ構成部材については符号を等しくし
てその詳細な説明を省略する。(3) When the circuits 2 to 6 are mounted on a one-chip LSI, the MPEG audio reproducing device 1 can be made smaller. (Second Embodiment) A second embodiment of the present invention will be described below with reference to the drawings. In this embodiment, components identical to those of the first embodiment are given the same reference numerals, and their detailed description is omitted.

【００７５】図５に、本実施形態の要部ブロック回路図
を示す。本実施形態において、第１実施形態と異なるの
は、インデックス付加回路５１およびインデックス検出
回路５２が設けられている点だけである。FIG. 5 shows a block circuit diagram of a main part of this embodiment. The present embodiment is different from the first embodiment only in that an index addition circuit 51 and an index detection circuit 52 are provided.

【００７６】インデックス付加回路５１は、フレームメ
モリ３４の前段（すなわち、ＭＰＥＧオーディオデコー
ダ３と話速変換処理回路４の間）に設けられている。イ
ンデックス付加回路５１は、デコードクロックに従っ
て、オーディオデコーダ３の生成したオーディオ信号に
一定周期でインデックス信号を付加する。そのインデッ
クス信号が付加されたオーディオ信号は、フレームメモ
リ３４へ出力される。The index addition circuit 51 is provided in the stage preceding the frame memory 34 (that is, between the MPEG audio decoder 3 and the speech speed conversion processing circuit 4). The index addition circuit 51 adds an index signal at a constant period, according to the decode clock, to the audio signal generated by the audio decoder 3. The audio signal with the index signal added is output to the frame memory 34.

【００７７】インデックス検出回路５２は、リングメモ
リ３２から読み出されたデータに付加されているインデ
ックス信号を検出し、そのインデックス信号から得られ
る時刻情報と現在時刻とから、話速変換処理回路４が信
号処理に要する時間Δｔを算出し、その時間Δｔに関す
る検出信号をビデオデコーダ１２へ供給する。ビデオデ
コーダ１２は、その時間Δｔに関する検出信号に従っ
て、自己の動作のタイミングを制御する。The index detection circuit 52 detects the index signal added to the data read from the ring memory 32, calculates the time Δt that the speech speed conversion processing circuit 4 requires for signal processing from the time information obtained from the index signal and the current time, and supplies a detection signal relating to the time Δt to the video decoder 12. The video decoder 12 controls the timing of its own operation according to that detection signal.

【００７８】このように、本実施形態によれば、第１実
施形態の作用および効果に加えて、以下の作用および効
果を得ることができる。（１）前記したように、ビデオデコーダ１２の生成する
ビデオ信号と、オーディオデコーダ３の生成するオーデ
ィオ信号とは、通常の再生時において同期生成されるよ
うになっている。そのため、オーディオデコーダ３とＤ
／Ａコンバータ５の間に話速変換処理回路４を設ける
と、話速変換処理回路４における信号処理に要する時間
分（すなわち、話速変換処理回路４における遅延時間
分）だけ、オーディオ信号が遅延することになる。As described above, according to this embodiment, the following operations and effects are obtained in addition to those of the first embodiment. (1) As mentioned earlier, the video signal generated by the video decoder 12 and the audio signal generated by the audio decoder 3 are generated in synchronization during normal reproduction. Consequently, when the speech speed conversion processing circuit 4 is placed between the audio decoder 3 and the D/A converter 5, the audio signal is delayed by the time required for signal processing in the speech speed conversion processing circuit 4 (that is, by the delay time of the speech speed conversion processing circuit 4).

【００７９】そこで、インデックス付加回路５１を用い
て、フレームメモリ３４へ入力されるオーディオ信号に
予め一定周期でインデックス信号を付加する。インデッ
クス検出回路５２は、リングメモリ３２から読み出され
たデータに付加されているインデックス信号を検出し、
話速変換処理回路４が信号処理に要する時間Δｔを算出
し、その時間Δｔに関する検出信号をビデオデコーダ１
２へ供給する。ビデオデコーダ１２は、その時間Δｔに
関する検出信号に従って、自己の動作のタイミングを制
御する。また、インデックス検出回路５２が次にインデ
ックス信号を検出したとき、ビデオデコーダ１２は、そ
のときに算出された時間と前回算出された時間との差だ
け、自己の動作のタイミングを遅らせたり早めたりす
る。Therefore, the index addition circuit 51 adds an index signal at a constant period to the audio signal input to the frame memory 34. The index detection circuit 52 detects the index signal added to the data read from the ring memory 32, calculates the time Δt that the speech speed conversion processing circuit 4 requires for signal processing, and supplies a detection signal relating to the time Δt to the video decoder 12. The video decoder 12 controls the timing of its own operation according to that detection signal. When the index detection circuit 52 next detects an index signal, the video decoder 12 delays or advances the timing of its own operation by the difference between the time calculated then and the time calculated previously.
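
The delay measurement and the incremental timing correction can be sketched as follows. The sketch assumes that each index signal carries the time at which it was added, in the same units as the current time, and that the first correction equals the full measured delay; the class name `IndexDelayTracker` and its method are hypothetical.

```python
class IndexDelayTracker:
    """Derives the processing delay and video timing corrections from indexes."""

    def __init__(self):
        self.last_delay = None

    def on_index_detected(self, index_time, now):
        """Return (delay, adjustment) for an index stamped at `index_time`."""
        delay = now - index_time  # time spent in the conversion circuit
        if self.last_delay is None:
            adjustment = delay                    # first index: full delay
        else:
            adjustment = delay - self.last_delay  # positive: delay video more
        self.last_delay = delay
        return delay, adjustment


tracker = IndexDelayTracker()
first = tracker.on_index_detected(index_time=100, now=130)
second = tracker.on_index_detected(index_time=200, now=240)
third = tracker.on_index_detected(index_time=300, now=335)
```

A positive adjustment means the video decoder should delay its timing further, and a negative one means it should advance it.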

【００８０】その結果、話速変換処理回路４における遅
延時間に関係なく、リングメモリ３２から読み出された
データ（すなわち、話速変換処理済みのオーディオ信
号）とビデオ信号との同期をとることができる。As a result, the data read from the ring memory 32 (that is, the speech-speed-converted audio signal) can be synchronized with the video signal regardless of the delay time in the speech speed conversion processing circuit 4.

【００８１】（２）上記（１）より、スピーカ２３で再
生される音声と、ディスプレイ２２で再生される動画と
の時間ずれを低減することが可能になり、リップシンク
のずれを人間の聴覚の許容範囲内にすることができる。(2) From (1) above, the time lag between the voice reproduced by the speaker 23 and the moving image reproduced on the display 22 can be reduced, and lip-sync deviation can be kept within the range tolerable to human hearing.

【００８２】（３）オーディオ信号に付加されたインデ
ックス信号は、無音削除挿入部４２によって削除される
ことがある。しかし、インデックス信号を付加する周期
を短くして、オーディオ信号に十分な数のインデックス
信号を付加しておけば、そのインデックス信号の内のい
くつかが無音削除挿入部４２によって削除されたとして
も、リングメモリ３２から読み出されたデータには一定
数以上のインデックス信号が残ることになる。その残っ
たインデックス信号により、上記（１）の作用および効
果を得ることができる。(3) An index signal added to the audio signal may be deleted by the silence deletion/insertion unit 42. However, if the period at which index signals are added is shortened so that a sufficient number of index signals are added to the audio signal, then even if some of them are deleted by the silence deletion/insertion unit 42, at least a certain number of index signals remain in the data read from the ring memory 32. The remaining index signals provide the operation and effect of (1) above.

【００８３】（第３実施形態）以下、本発明を具体化し
た第３実施形態を図面に従って説明する。尚、本実施形
態において、第２実施形態と同じ構成部材については符
号を等しくしてその詳細な説明を省略する。(Third Embodiment) A third embodiment of the present invention will be described below with reference to the drawings. In the present embodiment, the same components as those in the second embodiment are designated by the same reference numerals, and detailed description thereof will be omitted.

【００８４】図６に、本実施形態の要部ブロック回路図
を示す。本実施形態において、第２実施形態と異なるの
は、インデックス付加回路５１が、フレームメモリ３４
と音声判別部４１の間に設けられている点だけである。
インデックス付加回路５１は、デコードクロックに従っ
て、フレームメモリ３４から読み出されたオーディオ信
号に一定周期でインデックス信号を付加する。そのイン
デックス信号が付加されたオーディオ信号は、音声判別
部４１へ出力される。FIG. 6 shows a block circuit diagram of the main part of this embodiment. This embodiment differs from the second embodiment only in that the index addition circuit 51 is provided between the frame memory 34 and the voice discrimination unit 41. The index addition circuit 51 adds an index signal at a constant period, according to the decode clock, to the audio signal read from the frame memory 34. The audio signal with the index signal added is output to the voice discrimination unit 41.

【００８５】前記したように、フレームメモリ３４が２
フレーム分のオーディオ信号を蓄積する場合、フレーム
メモリ３４の記憶容量は、例えば、０．８Ｋバイト程度
あれば十分である。このように、フレームメモリ３４の
記憶容量が小さい場合には、話速変換処理回路４におけ
る遅延時間に比べて、フレームメモリ３４における書き
込み動作および読み出し動作に要する時間（すなわち、
フレームメモリ３４における遅延時間）は僅かであり、
無視しても差し支えない。As mentioned earlier, when the frame memory 34 stores two frames of audio signal, a storage capacity of, for example, about 0.8 Kbytes is sufficient. When the storage capacity of the frame memory 34 is this small, the time required for write and read operations in the frame memory 34 (that is, the delay time of the frame memory 34) is slight compared with the delay time of the speech speed conversion processing circuit 4 and can be ignored.

【００８６】従って、本実施形態によれば、第２実施形
態と同様の作用および効果を得ることができる。（第４実施形態）以下、本発明を具体化した第４実施形
態を図面に従って説明する。尚、本実施形態において、
第２実施形態と同じ構成部材については符号を等しくし
てその詳細な説明を省略する。Therefore, this embodiment provides the same operations and effects as the second embodiment. (Fourth Embodiment) A fourth embodiment of the present invention will be described below with reference to the drawings. In this embodiment, components identical to those of the second embodiment are given the same reference numerals, and their detailed description is omitted.

【００８７】図７に、本実施形態の要部ブロック回路図
を示す。本実施形態において、第２実施形態と異なるの
は、インデックス付加回路５１が、音声判別部４１と無
音削除挿入部４２および時間軸圧縮伸長部４３との間に
それぞれ設けられている点だけである。インデックス付
加回路５１は、デコードクロックに従って、音声判別部
４１における信号処理が済んだオーディオ信号に一定周
期でインデックス信号を付加する。そのインデックス信
号が付加されたオーディオ信号は、無音削除挿入部４２
および時間軸圧縮伸長部４３へ出力される。FIG. 7 shows a block circuit diagram of the main part of this embodiment. This embodiment differs from the second embodiment only in that an index addition circuit 51 is provided between the voice discrimination unit 41 and each of the silence deletion/insertion unit 42 and the time axis compression/expansion unit 43. The index addition circuit 51 adds an index signal at a constant period, according to the decode clock, to the audio signal that has been processed by the voice discrimination unit 41. The audio signal with the index signal added is output to the silence deletion/insertion unit 42 and the time axis compression/expansion unit 43.

【００８８】前記したように、フレームメモリ３４の記
憶容量が小さい場合には、話速変換処理回路４における
遅延時間に比べて、フレームメモリ３４における遅延時
間は僅かであり、無視しても差し支えない。As mentioned earlier, when the storage capacity of the frame memory 34 is small, the delay time of the frame memory 34 is slight compared with the delay time of the speech speed conversion processing circuit 4 and can be ignored.

【００８９】また、音声判別部４１における信号処理に
要する時間（すなわち、音声判別部４１における遅延時
間）は、話速変換処理回路４における遅延時間に比べて
僅かであり、無視しても差し支えない。The time required for signal processing in the voice discrimination unit 41 (that is, the delay time of the voice discrimination unit 41) is also slight compared with the delay time of the speech speed conversion processing circuit 4 and can be ignored.

【００９０】従って、本実施形態によれば、第２実施形
態と同様の作用および効果を得ることができる。（第５実施形態）以下、本発明を具体化した第５実施形
態を図面に従って説明する。尚、本実施形態において、
第２実施形態と同じ構成部材については符号を等しくし
てその詳細な説明を省略する。Therefore, this embodiment provides the same operations and effects as the second embodiment. (Fifth Embodiment) A fifth embodiment of the present invention will be described below with reference to the drawings. In this embodiment, components identical to those of the second embodiment are given the same reference numerals, and their detailed description is omitted.

【００９１】図８に、本実施形態の要部ブロック回路図
を示す。本実施形態において、第２実施形態と異なるの
は、インデックス付加回路５１が、無音削除挿入部４２
および時間軸圧縮伸長部４３とリングメモリ３２との間
に設けられている点だけである。インデックス付加回路
５１は、デコードクロックに従って、各部４２，４３に
おける信号処理が済んだオーディオ信号に一定周期でイ
ンデックス信号を付加する。そのインデックス信号が付
加されたオーディオ信号は、リングメモリ３２へ出力さ
れる。FIG. 8 shows a block circuit diagram of the main part of this embodiment. This embodiment differs from the second embodiment only in that the index addition circuit 51 is provided between the ring memory 32 and both the silence deletion/insertion unit 42 and the time axis compression/expansion unit 43. The index addition circuit 51 adds an index signal at a constant period, according to the decode clock, to the audio signal that has been processed by the units 42 and 43. The audio signal with the index signal added is output to the ring memory 32.

【００９２】前記したように、フレームメモリ３４の記
憶容量が小さい場合には、話速変換処理回路４における
遅延時間に比べて、フレームメモリ３４における遅延時
間は僅かであり、無視しても差し支えない。As mentioned earlier, when the storage capacity of the frame memory 34 is small, the delay time of the frame memory 34 is slight compared with the delay time of the speech speed conversion processing circuit 4 and can be ignored.

【００９３】また、各部４１〜４３における信号処理に
要する時間（すなわち、各部４１〜４３における遅延時
間）は、話速変換処理回路４における遅延時間に比べて
僅かであり、無視しても差し支えない。The time required for signal processing in each of the units 41 to 43 (that is, the delay time of each of the units 41 to 43) is also slight compared with the delay time of the speech speed conversion processing circuit 4 and can be ignored.

【００９４】つまり、話速変換処理回路４における遅延
時間は、主に、リングメモリ３２における書き込み動作
および読み出し動作に要する時間（すなわち、リングメ
モリ３２における遅延時間）によって決定される。That is, the delay time in the speech speed conversion processing circuit 4 is mainly determined by the time required for the write operation and the read operation in the ring memory 32 (that is, the delay time in the ring memory 32).

【００９５】従って、本実施形態によれば、第２実施形
態と同様の作用および効果を得ることができる。また、
本実施形態によれば、第２実施形態のようにオーディオ
信号に付加されたインデックス信号が無音削除挿入部４
２によって削除されることがない。そのため、付加した
インデックス信号が全て活用され、インデックス信号の
数を減らすことが可能になることから、インデックス付
加回路５１の回路規模を小さくすることができる。Therefore, this embodiment provides the same operations and effects as the second embodiment. In addition, unlike in the second embodiment, the index signals added to the audio signal are never deleted by the silence deletion/insertion unit 42. All of the added index signals are therefore used, so the number of index signals can be reduced and the circuit scale of the index addition circuit 51 can be made smaller.

【００９６】（第６実施形態）以下、本発明を具体化し
た第６実施形態を図面に従って説明する。尚、本実施形
態において、第１実施形態と同じ構成部材については符
号を等しくしてその詳細な説明を省略する。(Sixth Embodiment) A sixth embodiment of the present invention will be described below with reference to the drawings. In this embodiment, the same components as those in the first embodiment have the same reference numerals, and a detailed description thereof will be omitted.

【００９７】図９に、本実施形態の要部ブロック回路図
を示す。本実施形態において、第１実施形態と異なるの
は、遅延時間検出回路５３が設けられている点だけであ
る。前記したように、音声判別部４１は、フレームメモ
リ３４から読み出されたオーディオ信号が、音声区間か
又は無音区間かを判別する。つまり、音声判別部４１の
処理結果には、オーディオ信号に音声が含まれているか
否かという情報が含まれている。FIG. 9 shows a block circuit diagram of a main part of this embodiment. The present embodiment differs from the first embodiment only in that a delay time detection circuit 53 is provided. As described above, the voice discriminating unit 41 discriminates whether the audio signal read from the frame memory 34 is a voice section or a silent section. That is, the processing result of the voice discriminating unit 41 includes information indicating whether or not the audio signal includes voice.

【００９８】また、デコードクロックは、システムスト
リームのビットレートに対応している。つまり、デコー
ドクロックには、予めオーディオ信号の圧縮伸長率の情
報が含まれている。The decode clock corresponds to the bit rate of the system stream. That is, the decode clock includes information on the compression / decompression rate of the audio signal in advance.

【００９９】そこで、遅延時間検出回路５３は、オーデ
ィオ信号に音声が含まれているか否かという情報と圧縮
伸長率の情報とに基づいて、話速変換処理回路４におけ
る遅延時間を検出し、その検出信号をビデオデコーダ１
２へ供給する。ビデオデコーダ１２は、遅延時間検出回
路５３の検出信号に基づいて、自己の動作のタイミング
を制御する。その結果、話速変換処理回路４における遅
延時間に関係なく、リングメモリ３２から読み出された
データ（すなわち、話速変換処理済みのオーディオ信
号）とビデオ信号との同期をとることができる。Therefore, the delay time detection circuit 53 detects the delay time in the speech speed conversion processing circuit 4 based on the information indicating whether or not the audio signal contains voice and the information on the compression/expansion rate, and supplies the detection signal to the video decoder 12. The video decoder 12 controls the timing of its own operation based on the detection signal of the delay time detection circuit 53. As a result, the data read from the ring memory 32 (that is, the audio signal that has undergone the speech speed conversion processing) can be synchronized with the video signal regardless of the delay time in the speech speed conversion processing circuit 4.

【０１００】このように、本実施形態によれば、第２実
施形態と同様の効果を得ることができる。（第７実施形態）以下、本発明を具体化した第７実施形
態を図面に従って説明する。尚、本実施形態において、
第１実施形態と同じ構成部材については符号を等しくし
てその詳細な説明を省略する。As described above, according to this embodiment, the same effects as those of the second embodiment can be obtained. (Seventh Embodiment) A seventh embodiment of the present invention will be described below with reference to the drawings. In this embodiment, components identical to those of the first embodiment are given the same reference numerals, and their detailed description is omitted.

【０１０１】図１０に、本実施形態の要部ブロック回路
図を示す。本実施形態において、第１実施形態と異なる
のは、制御回路５４が設けられている点だけである。制
御回路５４は、アップダウンカウンタ３３の検出したリ
ングメモリ３２の蓄積量に基づいて、ビデオデコーダ１
２の動作速度を制御するための制御信号を生成し、その
制御信号をビデオデコーダ１２へ供給する。ビデオデコ
ーダ１２は、制御回路５４の制御信号に基づいて、自己
の動作のタイミングを制御する。その結果、リングメモ
リ３２から読み出されたデータと、ビデオデコーダ１２
の生成するビデオ信号との同期をとることができる。FIG. 10 shows a block circuit diagram of the main part of this embodiment. The present embodiment differs from the first embodiment only in that a control circuit 54 is provided. Based on the amount of data stored in the ring memory 32 as detected by the up/down counter 33, the control circuit 54 generates a control signal for controlling the operating speed of the video decoder 12 and supplies that control signal to the video decoder 12. The video decoder 12 controls the timing of its own operation based on the control signal from the control circuit 54. As a result, the data read from the ring memory 32 can be synchronized with the video signal generated by the video decoder 12.

【０１０２】前記したように、話速変換処理回路４にお
ける遅延時間は、主にリングメモリ３２における遅延時
間によって決定される。リングメモリ３２における遅延
時間は、その蓄積量と相関関係があり、蓄積量が大きく
なるほど遅延時間も大きくなる。従って、リングメモリ
３２の蓄積量に基づいてビデオデコーダ１２の動作速度
を制御すれば、リングメモリ３２から読み出されたデー
タ（すなわち、話速変換処理済みのオーディオ信号）と
ビデオ信号との同期をとることができる。As described above, the delay time in the speech speed conversion processing circuit 4 is determined mainly by the delay time in the ring memory 32. The delay time in the ring memory 32 correlates with the amount of data stored in it: the larger the stored amount, the longer the delay time. Therefore, if the operating speed of the video decoder 12 is controlled based on the amount stored in the ring memory 32, the data read from the ring memory 32 (that is, the audio signal that has undergone the speech speed conversion processing) can be synchronized with the video signal.
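
The relation between the stored amount and the delay can be illustrated with a minimal sketch. The text states only that the two are correlated; the linear model below (delay equals stored amount divided by read rate) and the names `ring_memory_delay_s`, `video_delay_frames`, `stored_samples`, `read_rate_hz`, and `picture_rate_hz` are assumptions made for illustration, not part of the described circuit.

```python
def ring_memory_delay_s(stored_samples: int, read_rate_hz: float) -> float:
    """Estimate the ring-memory delay in seconds.

    Assumption: samples leave the memory at a constant read rate, so the
    delay is simply the stored amount divided by that rate.
    """
    if read_rate_hz <= 0:
        raise ValueError("read_rate_hz must be positive")
    return stored_samples / read_rate_hz


def video_delay_frames(stored_samples: int, read_rate_hz: float,
                       picture_rate_hz: float) -> int:
    """Convert the estimated audio delay into a whole number of video frames."""
    delay_s = ring_memory_delay_s(stored_samples, read_rate_hz)
    return round(delay_s * picture_rate_hz)


if __name__ == "__main__":
    # Hypothetical figures: 22050 stored samples read out at 44.1 kHz.
    print(ring_memory_delay_s(22050, 44100.0))
    print(video_delay_frames(22050, 44100.0, 30.0))
```

Under this assumed model, a control circuit could hold back the video decoder by the returned number of frames so that the audio and video stay aligned.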

【０１０３】このように、本実施形態によれば、第２実
施形態と同様の効果を得ることができる。（第８実施形態）以下、本発明を具体化した第８実施形
態を図面に従って説明する。尚、本実施形態において、
第１実施形態と同じ構成部材については符号を等しくし
てその詳細な説明を省略する。As described above, according to this embodiment, the same effects as those of the second embodiment can be obtained. (Eighth Embodiment) An eighth embodiment of the present invention will be described below with reference to the drawings. In this embodiment, components identical to those of the first embodiment are given the same reference numerals, and their detailed description is omitted.

【０１０４】図１１に、本実施形態の要部ブロック回路
図を示す。本実施形態において、第１実施形態と異なる
のは、遅延時間検出回路５５が設けられている点だけで
ある。FIG. 11 shows a block circuit diagram of a main part of this embodiment. The present embodiment differs from the first embodiment only in that a delay time detection circuit 55 is provided.

【０１０５】前記したように、音声判別部４１の処理結
果には、オーディオ信号に音声が含まれているか否かと
いう情報が含まれている。また、時間軸圧縮伸長部４３
の処理結果には、オーディオ信号の圧縮伸長率の情報が
含まれている。As described above, the processing result of the voice discriminating unit 41 includes information as to whether or not the audio signal contains voice. In addition, the processing result of the time axis compression/expansion unit 43 includes information on the compression/expansion rate of the audio signal.

【０１０６】そこで、遅延時間検出回路５５は、オーデ
ィオ信号に音声が含まれているか否かという情報と圧縮
伸長率の情報とに基づいて、話速変換処理回路４におけ
る遅延時間を検出し、その検出信号をビデオデコーダ１
２へ供給する。ビデオデコーダ１２は、遅延時間検出回
路５５の検出信号に基づいて、自己の動作のタイミング
を制御する。その結果、話速変換処理回路４における遅
延時間に関係なく、リングメモリ３２から読み出された
データ（すなわち、話速変換処理済みのオーディオ信
号）とビデオ信号との同期をとることができる。Therefore, the delay time detection circuit 55 detects the delay time in the speech speed conversion processing circuit 4 based on the information indicating whether or not the audio signal contains voice and the information on the compression/expansion rate, and supplies the detection signal to the video decoder 12. The video decoder 12 controls the timing of its own operation based on the detection signal of the delay time detection circuit 55. As a result, the data read from the ring memory 32 (that is, the audio signal that has undergone the speech speed conversion processing) can be synchronized with the video signal regardless of the delay time in the speech speed conversion processing circuit 4.

【０１０７】このように、本実施形態によれば、第２実
施形態と同様の効果を得ることができる。図１２に、可
変速再生機能を備えたＭＰＥＧビデオデコーダ１２の要
部ブロック回路を示す。As described above, according to this embodiment, the same effect as that of the second embodiment can be obtained. FIG. 12 shows a block circuit of a main part of the MPEG video decoder 12 having a variable speed reproduction function.

【０１０８】ＭＰＥＧビデオデコーダ１２は、ビットバ
ッファ２０２、ピクチャヘッダ検出回路２０３、ＭＰＥ
Ｇビデオデコードコア回路（以下、デコードコア回路と
略す）２０４、可変閾値オーバーフロー判定回路（以
下、判定回路と略す）２０５、ピクチャスキップ回路２
０６、制御コア回路２０７から構成されている。尚、各
回路２０３〜２０７は１チップのＬＳＩに搭載すること
もできる。The MPEG video decoder 12 includes a bit buffer 202, a picture header detection circuit 203, an MPEG video decode core circuit (hereinafter abbreviated as the decode core circuit) 204, a variable threshold overflow determination circuit (hereinafter abbreviated as the determination circuit) 205, a picture skip circuit 206, and a control core circuit 207. The circuits 203 to 207 can also be mounted on a single-chip LSI.

【０１０９】制御コア回路２０７は各回路２〜６を制御
する。ＡＶパーサ１１から転送されてきたＭＰＥＧビデ
オストリームはビットバッファ２０２へ入力される。The control core circuit 207 controls the circuits 202 to 206. The MPEG video stream transferred from the AV parser 11 is input to the bit buffer 202.

【０１１０】ビットバッファ２０２はＦＩＦＯ構成のＲ
ＡＭから成るリングメモリによって構成され、転送され
てくるビデオストリームをそのまま順次蓄積する。ピク
チャヘッダ検出回路２０３は、ビットバッファ２０２に
蓄積されたビデオストリームの各ピクチャの先頭に付く
ピクチャヘッダを検出し、その各ピクチャヘッダに規定
されているピクチャのタイプ（Ｉ，Ｐ，Ｂ）を検出す
る。The bit buffer 202 is a ring memory made up of FIFO-structured RAM and stores the incoming video stream sequentially as it arrives. The picture header detection circuit 203 detects the picture header at the beginning of each picture in the video stream stored in the bit buffer 202, and detects the picture type (I, P, or B) specified in each picture header.

【０１１１】制御コア回路２０７は、ピクチャヘッダ検
出回路２０３の検出結果と後記する判定回路２０５の判
定結果とに基づいて、ビットバッファ２０２から１フレ
ーム期間毎に適宜なピクチャ分のビデオストリームを読
み出す。尚、ビットバッファ２０２から読み出されたビ
デオストリームは、読み出された後もビットバッファ２
０２にそのまま残される。Based on the detection result of the picture header detection circuit 203 and the determination result of the determination circuit 205 described later, the control core circuit 207 reads a video stream for an appropriate number of pictures from the bit buffer 202 in each frame period. The video stream read from the bit buffer 202 remains in the bit buffer 202 as is even after it has been read.

【０１１２】ビットバッファ２０２から読み出された各
ピクチャは、ピクチャスキップ回路２０６を介してデコ
ードコア回路２０４へ転送される。デコードコア回路２
０４は、各ピクチャをＭＰＥＧビデオパートに準拠して
デコードし、各ピクチャ毎のビデオ信号を生成する。Each picture read from the bit buffer 202 is transferred to the decode core circuit 204 via the picture skip circuit 206. The decode core circuit 204 decodes each picture in accordance with the MPEG video part and generates a video signal for each picture.

【０１１３】ピクチャスキップ回路２０６は、制御コア
回路２０７の制御に従って各ノード２０６ａ，２０６ｂ
側への接続が切り換えられる。そして、ピクチャスキッ
プ回路２０６がノード２０６ａ側に接続されると、ビッ
トバッファ２０２から読み出されたピクチャはそのまま
デコードコア回路２０４へ転送される。また、ノード２
０６ｂ側に接続されると、ビットバッファ２０２から読
み出されたピクチャはビットバッファ２０２へ転送され
ずにスキップされる。その結果、デコードコア回路２０
４へ転送されるピクチャは、ピクチャスキップ回路２０
６によってスキップされた分だけピクチャ単位で間引か
れる。The picture skip circuit 206 is switched between connection to node 206a and connection to node 206b under the control of the control core circuit 207. When the picture skip circuit 206 is connected to the node 206a side, the picture read from the bit buffer 202 is transferred to the decode core circuit 204 as is. When it is connected to the node 206b side, the picture read from the bit buffer 202 is skipped without being transferred to the decode core circuit 204. As a result, the pictures transferred to the decode core circuit 204 are thinned out, picture by picture, by the number skipped by the picture skip circuit 206.

【０１１４】判定回路２０５は、再生速度検出回路２の
生成したデコードクロックに基づいてビットバッファ２
０２の占有量Ｂm の閾値Ｂthn を設定し、ビットバッフ
ァ２０２の占有量Ｂm と閾値Ｂthn とを比較する。尚、
判定回路２０５では、再生速度検出回路２の生成した実
際のデコードクロックの周波数と、通常の再生時のデコ
ードクロックの周波数との比を求め、その比を再生速度
の倍率ｎとする。従って、２倍速再生時には倍率ｎ＝２
となり、閾値Ｂthn ＝Ｂth2 となる。また、通常の再生
時には倍率ｎ＝１となり、閾値Ｂthn ＝Ｂth1 となる。The determination circuit 205 sets a threshold Bthn for the occupancy Bm of the bit buffer 202 based on the decode clock generated by the reproduction speed detection circuit 2, and compares the occupancy Bm of the bit buffer 202 with the threshold Bthn. The determination circuit 205 obtains the ratio of the frequency of the actual decode clock generated by the reproduction speed detection circuit 2 to the frequency of the decode clock during normal reproduction, and uses that ratio as the reproduction speed multiplier n. Accordingly, during 2x speed reproduction, n = 2 and Bthn = Bth2; during normal reproduction, n = 1 and Bthn = Bth1.

【０１１５】そして、判定回路２０５は、ビットバッフ
ァ２０２の占有量Ｂm が閾値Ｂthnを越えない場合に
は、ビットバッファ２０２がオーバーフローする恐れが
なく正常であると判定する。この場合、制御コア回路２
０７は、ビットバッファ２０２から１ピクチャ分のビデ
オストリームを読み出す。そして、制御コア回路２０７
は、ピクチャスキップ回路２０６をノード２０６ａ側に
接続し、そのビットバッファ２０２から読み出されたピ
クチャをデコードコア回路２０４へ転送させる。When the occupancy Bm of the bit buffer 202 does not exceed the threshold Bthn, the determination circuit 205 determines that the bit buffer 202 is in a normal state with no risk of overflow. In this case, the control core circuit 207 reads a video stream for one picture from the bit buffer 202. The control core circuit 207 then connects the picture skip circuit 206 to the node 206a side so that the picture read from the bit buffer 202 is transferred to the decode core circuit 204.

【０１１６】また、判定回路２０５は、ビットバッファ
２０２の占有量Ｂm が閾値Ｂthn を越えた場合には、ビ
ットバッファ２０２がオーバーフローする恐れがあると
判定する。この場合、制御コア回路２０７は、ビットバ
ッファ２０２の占有量Ｂm が閾値Ｂthn を下回るまで、
ビットバッファ２０２から適宜なピクチャ分のビデオス
トリームを読み出す。そして、制御コア回路２０７は、
ピクチャスキップ回路２０６をノード２０６ｂ側に接続
し、そのビットバッファ２０２から読み出された適宜な
ピクチャ分のビデオストリームを全てスキップさせる。When the occupancy Bm of the bit buffer 202 exceeds the threshold Bthn, the determination circuit 205 determines that the bit buffer 202 may overflow. In this case, the control core circuit 207 reads a video stream for an appropriate number of pictures from the bit buffer 202 until the occupancy Bm falls below the threshold Bthn. The control core circuit 207 then connects the picture skip circuit 206 to the node 206b side so that all of the video stream read from the bit buffer 202 for those pictures is skipped.

【０１１７】図１３に、ビットバッファ２０２の占有量
Ｂm の変化を示す。ビットバッファ２０２の占有量Ｂm
はビットレートＲB をグラフの傾きとして上昇する。ビ
ットレートＲB は、シーケンスの先頭に付くシーケンス
ヘッダのＢＲ（Bit Rate）に従って式（１）に示すよう
に規定される。また、ＡＶパーサ１１から転送されてく
るビデオストリームのピクチャレートＲP はシーケンス
ヘッダのＰＲ（Picture Rate）によって規定される。そ
して、ビットバッファ２０２の容量Ｂは、シーケンスヘ
ッダのＶＢＶ（Vbv[Video Bufferring Verifier] Buffe
r Size）に従って式（２）に示すように規定される。そ
して、１フレーム期間毎に、デコードコア回路２０４が
そのときデコードしようとする１ピクチャ分のビデオス
トリームが、ビットバッファ２０２から一気に読み出さ
れる。ここで、１フレーム期間にビットバッファ２０２
に入力されるビデオストリームのデータ量Ｘは、ビット
レートＲB およびピクチャレートＲP に従って式（３）
に示すように規定される。従って、ビットバッファ２０
２から１ピクチャ分のビデオストリームが一気に読み出
された直後のビットバッファ２０２の占有量Ｂm （＝Ｂ
0〜Ｂ6 ）は、データ量Ｘとビットバッファ２０２の容
量Ｂとに基づいて、式（４）に示す条件を満たすように
規定される。FIG. 13 shows changes in the occupancy Bm of the bit buffer 202. The occupancy Bm rises with the bit rate RB as the slope of the graph. The bit rate RB is defined as shown in equation (1) according to the BR (Bit Rate) field of the sequence header placed at the beginning of the sequence. The picture rate RP of the video stream transferred from the AV parser 11 is defined by the PR (Picture Rate) field of the sequence header. The capacity B of the bit buffer 202 is defined as shown in equation (2) according to the VBV (VBV [Video Buffering Verifier] Buffer Size) field of the sequence header. In each frame period, the video stream for the one picture that the decode core circuit 204 is about to decode is read from the bit buffer 202 all at once. Here, the data amount X of the video stream input to the bit buffer 202 in one frame period is defined as shown in equation (3) according to the bit rate RB and the picture rate RP. Therefore, the occupancy Bm (= B0 to B6) of the bit buffer 202 immediately after the video stream for one picture has been read all at once is defined, based on the data amount X and the capacity B of the bit buffer 202, so as to satisfy the condition shown in equation (4).

【０１１８】
$$R_B = 400 \times BR \qquad (1)$$
$$B = 16 \times 1024 \times VBV \qquad (2)$$
$$X = R_B / R_P \qquad (3)$$
$$0 < B_m < B - X = B - (R_B / R_P) \qquad (4)$$
式（４）に示す条件を満たすようにビットバッファ２０
２の占有量Ｂm が規定されていれば、ビットバッファ２
０２がオーバーフローしたりアンダーフローしたりする
ことはない。逆に言えば、ビットバッファ２０２の占有
量Ｂm が閾値（Ｂ−Ｘ）を越えると、次の１フレーム期
間にビットバッファ２０２に入力されるビデオストリー
ムによってビットバッファ２０２がオーバーフローする
可能性が極めて高くなる。If the occupancy Bm of the bit buffer 202 is defined so as to satisfy the condition shown in equation (4), the bit buffer 202 neither overflows nor underflows. Conversely, if the occupancy Bm of the bit buffer 202 exceeds the threshold (B - X), it becomes highly likely that the bit buffer 202 will overflow because of the video stream input to it during the next frame period.
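
Equations (1) to (4) can be checked numerically with a short sketch. The function names and the sample header values (BR = 3000, VBV = 20, a picture rate of 30 pictures per second) are assumptions chosen for illustration; only the formulas themselves come from the text.

```python
def bit_rate(br: int) -> int:
    """Equation (1): RB = 400 x BR, in bits per second."""
    return 400 * br


def buffer_capacity(vbv: int) -> int:
    """Equation (2): B = 16 x 1024 x VBV, in bits."""
    return 16 * 1024 * vbv


def input_per_frame(rb: float, rp: float) -> float:
    """Equation (3): X = RB / RP, the data entering the buffer per frame period."""
    return rb / rp


def occupancy_ok(bm: float, b: float, x: float) -> bool:
    """Equation (4): 0 < Bm < B - X."""
    return 0 < bm < b - x


if __name__ == "__main__":
    rb = bit_rate(3000)            # hypothetical BR field value
    b = buffer_capacity(20)        # hypothetical VBV field value
    x = input_per_frame(rb, 30.0)  # hypothetical picture rate
    print(rb, b, x, b - x)
    print(occupancy_ok(200_000, b, x), occupancy_ok(300_000, b, x))
```

With these sample values the upper bound B - X works out to 287,680 bits, so an occupancy of 200,000 bits satisfies equation (4) while 300,000 bits does not.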

【０１１９】ビデオデコーダ１２では、通常の再生時に
おいて、式（４）が満たされるように、ビットレートＲ
B 、ピクチャレートＲP 、容量Ｂの各値が規定されてい
る。つまり、式（２）に示すようにビットバッファ２０
２の容量Ｂを設定しておけば、ピクチャスキップ回路２
０６の接続をノード２０６ａ側に固定しておいたとして
も、理想的な状態ではビットバッファ２０２がオーバー
フローしたりアンダーフローしたりすることはない。In the video decoder 12, the values of the bit rate RB, the picture rate RP, and the capacity B are specified so that equation (4) is satisfied during normal reproduction. That is, if the capacity B of the bit buffer 202 is set as shown in equation (2), then in the ideal case the bit buffer 202 neither overflows nor underflows even if the picture skip circuit 206 is kept fixed to the node 206a side.

【０１２０】従って、通常の再生時において、ビットバ
ッファ２０２から１ピクチャ分のデータが一気に読み出
された直後の占有量Ｂm （＝Ｂ0 〜Ｂ4 ）は、閾値Ｂth
1 に基づいて、式（５）に示す条件を満たすように規定
される。尚、閾値Ｂth1 は、式（４）に基づいて、式
（６）に示すように設定される。Therefore, during normal reproduction, the occupancy Bm (= B0 to B4) immediately after the data for one picture has been read from the bit buffer 202 all at once is defined, based on the threshold Bth1, so as to satisfy the condition shown in equation (5). The threshold Bth1 is set as shown in equation (6), based on equation (4).

【０１２１】
$$0 < B_m < B_{th1} < B \qquad (5)$$
$$B_{th1} = B - X = B - (R_B / R_P) \qquad (6)$$
ところで、実際の状態では、式（２）に示すようにビッ
トバッファ２０２の容量Ｂを設定しておいても、ピクチ
ャスキップ回路２０６の接続をノード２０６ａ側に固定
しておくと、ビットバッファ２０２がオーバーフローす
る恐れがある。In practice, however, even if the capacity B of the bit buffer 202 is set as shown in equation (2), the bit buffer 202 may overflow if the picture skip circuit 206 is kept fixed to the node 206a side.

【０１２２】しかし、ビデオデコーダ１２では、通常の
再生時において、ビットバッファ２０２の占有量Ｂm が
閾値Ｂth1 を越えた場合、ビットバッファ２０２がオー
バーフローする恐れがあると判定される。すると、ビッ
トバッファ２０２の占有量Ｂm が閾値Ｂth1 を下回るま
で、ビットバッファ２０２から適宜なピクチャ分のビデ
オストリームが読み出される。そして、ピクチャスキッ
プ回路２０６はノード２０６ｂ側に接続され、そのビッ
トバッファ２０２から読み出された適宜なピクチャ分の
ビデオストリームは全てスキップされる。従って、ビデ
オデコーダ１２によれば、通常の再生時において、ビッ
トバッファ２０２がオーバーフローすることはない。However, in the video decoder 12, when the occupancy Bm of the bit buffer 202 exceeds the threshold Bth1 during normal reproduction, it is determined that the bit buffer 202 may overflow. A video stream for an appropriate number of pictures is then read from the bit buffer 202 until the occupancy Bm falls below the threshold Bth1. The picture skip circuit 206 is connected to the node 206b side, and all of that video stream is skipped. Therefore, with the video decoder 12, the bit buffer 202 does not overflow during normal reproduction.

【０１２３】高速再生時におけるビットバッファ２０２
の占有量Ｂm はビットレートｎ×ＲB をグラフの傾きと
して上昇する。例えば、２倍速再生時におけるビットバ
ッファ２０２の占有量Ｂm はビットレート２×ＲB をグ
ラフの傾きとして上昇する。During high-speed reproduction, the occupancy Bm of the bit buffer 202 rises with the bit rate n × RB as the slope of the graph. For example, during 2x speed reproduction, the occupancy Bm of the bit buffer 202 rises with the bit rate 2 × RB as the slope of the graph.

【０１２４】従って、高速再生時において、ビットバッ
ファ２０２から１ピクチャ分のデータが一気に読み出さ
れた直後の占有量Ｂm （＝Ｂ0 〜Ｂ4 ）は、閾値Ｂthn
に基づいて、式（７）に示す条件を満たすように規定さ
れる。尚、閾値Ｂthn は式（８）に示すように設定され
る。Therefore, during high-speed reproduction, the occupancy Bm (= B0 to B4) immediately after the data for one picture has been read from the bit buffer 202 all at once is defined, based on the threshold Bthn, so as to satisfy the condition shown in equation (7). The threshold Bthn is set as shown in equation (8).

【０１２５】
$$0 < B_m < B_{thn} \qquad (7)$$
$$B_{thn} = B - n \times X = B - (n \times R_B / R_P) \qquad (8)$$
高速再生時においては、ビットバッファ２０２の占有量
Ｂm が閾値Ｂthn を越えた場合、ビットバッファ２０２
がオーバーフローする恐れがあると判定される。例え
ば、２倍速再生時には占有量Ｂm が閾値Ｂth2 （＝Ｂ−
（２×ＲB ／ＲP））を越えた場合、３倍速再生時には
占有量Ｂm が閾値Ｂth3 （＝Ｂ−（３×ＲB ／ＲP ））
を越えた場合に、ビットバッファ２０２がオーバーフロ
ーする恐れがあると判定される。すると、ビットバッフ
ァ２０２の占有量Ｂm が閾値Ｂthnを下回るまでビット
バッファ２０２から適宜なピクチャ分のビデオストリー
ムが読み出され、そのビデオストリームは全てスキップ
される。従って、ビデオデコーダ１２によれば、高速再
生時において、ビットバッファ２０２がオーバーフロー
することはない。During high-speed reproduction, when the occupancy Bm of the bit buffer 202 exceeds the threshold Bthn, it is determined that the bit buffer 202 may overflow. For example, this determination is made when the occupancy Bm exceeds the threshold Bth2 (= B - (2 × RB / RP)) during 2x speed reproduction, or the threshold Bth3 (= B - (3 × RB / RP)) during 3x speed reproduction. A video stream for an appropriate number of pictures is then read from the bit buffer 202 until the occupancy Bm falls below the threshold Bthn, and all of that video stream is skipped. Therefore, with the video decoder 12, the bit buffer 202 does not overflow during high-speed reproduction.
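
The speed-dependent threshold of equation (8) can be sketched as follows. The function names, the clock frequencies, and the sample values for B, RB, and RP are assumptions for illustration; the multiplier n is taken as the ratio of the actual decode clock to the normal decode clock, as the description of the determination circuit 205 states.

```python
def speed_multiplier(actual_clock_hz: float, normal_clock_hz: float) -> float:
    """Reproduction speed multiplier n = actual decode clock / normal decode clock."""
    return actual_clock_hz / normal_clock_hz


def overflow_threshold(b: float, rb: float, rp: float, n: float) -> float:
    """Equation (8): Bthn = B - n x RB / RP."""
    return b - n * rb / rp


def may_overflow(bm: float, b: float, rb: float, rp: float, n: float) -> bool:
    """True when the occupancy Bm exceeds the threshold Bthn."""
    return bm > overflow_threshold(b, rb, rp, n)


if __name__ == "__main__":
    B, RB, RP = 327_680, 1_200_000, 30.0   # hypothetical stream parameters
    n = speed_multiplier(54e6, 27e6)       # hypothetical clocks, 2x speed
    print(n, overflow_threshold(B, RB, RP, n))
    print(may_overflow(250_000, B, RB, RP, 1.0),
          may_overflow(250_000, B, RB, RP, n))
```

The same occupancy can be safe at normal speed yet above the threshold at 2x speed, because the threshold drops by one frame period of input for each unit increase in n.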

【０１２６】デコードコア回路２０４において任意のピ
クチャをデコードしている途中でビットバッファ２０２
がオーバーフローすると、デコード処理中のピクチャの
ビットバッファ２０２に残っている部分に対して、新た
に入力されたビデオストリームが上書きされる。その結
果、デコード処理中のピクチャのビットバッファ２０２
に残っている部分が破壊されて失われる。すると、デコ
ードコア回路２０４では、そのピクチャのデコードを完
了することが不可能になり、そのピクチャのビデオ信号
を生成することができなくなる。従って、デコードコア
回路２０４において任意のピクチャをデコードしている
途中でビットバッファ２０２がオーバーフローすること
は絶対に避けなければならない。If the bit buffer 202 overflows while the decode core circuit 204 is decoding a picture, newly input video stream data overwrites the portion of that picture still remaining in the bit buffer 202. As a result, the remaining portion of the picture being decoded is destroyed and lost. The decode core circuit 204 then cannot complete the decoding of that picture and cannot generate its video signal. An overflow of the bit buffer 202 while the decode core circuit 204 is decoding a picture must therefore be avoided at all costs.

【０１２７】そのため、ビットバッファ２０２がオーバ
ーフローする恐れがあるかどうかの判定は、デコードコ
ア回路２０４において任意のピクチャのデコードを開始
する前に行う必要がある。より正確には、ピクチャヘッ
ダ検出回路２０３がピクチャヘッダを検出した時点で、
ビットバッファ２０２がオーバーフローする恐れがある
かどうかを判定し、そのピクチャをピクチャスキップ回
路２０６を介してスキップするかどうかを決定する必要
がある。Therefore, whether the bit buffer 202 may overflow must be determined before the decode core circuit 204 starts decoding a picture. More precisely, at the point when the picture header detection circuit 203 detects a picture header, it is necessary to determine whether the bit buffer 202 may overflow and to decide whether to skip that picture via the picture skip circuit 206.

【０１２８】ところで、１つのピクチャのデータ量は０
〜４０バイトであるが、そのデータ量はデコードコア回
路２０４においてデコードが終了した時点でないとわか
らない。また、１つのピクチャのデコード処理時間は、
そのピクチャのデータ量やデコードコア回路２０４の動
作速度によって異なるが、通常、１フレーム期間の１／
３〜３／４程度である。The data amount of one picture is 0 to 40 bytes, but that amount is not known until the decode core circuit 204 has finished decoding the picture. The decoding time for one picture depends on the data amount of the picture and the operating speed of the decode core circuit 204, but is usually about 1/3 to 3/4 of one frame period.

【０１２９】ビットバッファ２０２から読み出されたピ
クチャのデータ量が０バイトの場合、そのピクチャの読
み出し前後でビットバッファ２０２の占有量Ｂm は変化
しないため、そのピクチャをスキップしたとしてもオー
バーフローを回避することはできない。逆に言えば、ビ
ットバッファ２０２から読み出されたピクチャのデータ
量が０バイトの場合でも、ビットバッファ２０２に十分
な空き容量があればオーバーフローすることはない。When the data amount of the picture read from the bit buffer 202 is 0 bytes, the occupancy Bm of the bit buffer 202 does not change before and after the picture is read, so overflow cannot be avoided even if that picture is skipped. Conversely, even when the data amount of the picture read from the bit buffer 202 is 0 bytes, no overflow occurs as long as the bit buffer 202 has sufficient free space.

【０１３０】そこで、１フレーム期間にビットバッファ
２０２に入力されるビデオストリームのデータ量分の空
き容量を、ビットバッファ２０２に確保しておく。そう
すれば、ビットバッファ２０２から読み出されたピクチ
ャのデータ量が０バイトの場合でもオーバーフローする
ことはない。Therefore, a free space corresponding to the data amount of the video stream input to the bit buffer 202 in one frame period is secured in the bit buffer 202. Then, even if the data amount of the picture read from the bit buffer 202 is 0 bytes, no overflow occurs.

【０１３１】１フレーム期間にビットバッファ２０２に
入力されるビデオストリームのデータ量は、（ｎ×Ｘ＝
ｎ×ＲB ／ＲP ）になる。ビットバッファ２０２の空き
容量がこのデータ量以上であればオーバーフローするこ
とはない。従って、式（８）に示すように閾値Ｂthn を
設定しておけば、ビットバッファ２０２のオーバーフロ
ーを確実に回避することができる。The data amount of the video stream input to the bit buffer 202 in one frame period is n × X = n × RB / RP. If the free space in the bit buffer 202 is at least this amount, no overflow occurs. Therefore, setting the threshold Bthn as shown in equation (8) reliably avoids overflow of the bit buffer 202.

【０１３２】すなわち、判定回路２０５は、ピクチャヘ
ッダ検出回路２０３がピクチャヘッダを検出した時点で
ビットバッファ２０２の空き容量をチェックし、十分な
空き容量（ｎ×Ｘ＝ｎ×ＲB ／ＲP ）が確保されている
かどうかを判定する。十分な空き容量が確保されていな
ければ、そのピクチャヘッダに基づいて制御コア回路２
０７がビットバッファ２０２から読み出したピクチャ
を、ピクチャスキップ回路２０６を介してスキップす
る。続いて、判定回路２０５は、ピクチャヘッダ検出回
路２０３が次のピクチャヘッダを検出した時点で、再び
ビットバッファ２０２の空き容量をチェックする。これ
らの処理に要する時間は、デコードコア回路２０４のデ
コード処理時間に比べてはるかに短いため、ビットバッ
ファ２０２に十分な空き容量が確保できてからデコード
コア回路２０４のデコード処理を開始しても十分に間に
合う。That is, at the point when the picture header detection circuit 203 detects a picture header, the determination circuit 205 checks the free space in the bit buffer 202 and determines whether sufficient free space (n × X = n × RB / RP) is available. If it is not, the picture that the control core circuit 207 reads from the bit buffer 202 on the basis of that picture header is skipped via the picture skip circuit 206. Then, when the picture header detection circuit 203 detects the next picture header, the determination circuit 205 checks the free space in the bit buffer 202 again. Because the time required for these operations is far shorter than the decoding time of the decode core circuit 204, there is still ample time to start decoding in the decode core circuit 204 after sufficient free space has been secured in the bit buffer 202.
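
The header-time check described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the circuit itself: picture sizes are taken as known in advance (the text notes they are not known until decoding ends), a skipped picture consumes no frame period, and each decoded picture is followed by exactly one frame period of input. The name `schedule_pictures` and the numeric values are hypothetical.

```python
from typing import List, Tuple


def schedule_pictures(picture_sizes: List[float], occupancy: float,
                      capacity: float, input_per_frame: float
                      ) -> Tuple[List[str], float]:
    """Decide, picture by picture, whether to decode or skip.

    input_per_frame stands for n x X = n x RB / RP, the data entering the
    buffer during one frame period at the current reproduction speed.
    """
    actions = []
    for size in picture_sizes:
        free_space = capacity - occupancy
        if free_space < input_per_frame:
            # Not enough room for the next frame period of input: skip.
            actions.append("skip")
            occupancy -= size  # reading the picture out frees its space
        else:
            actions.append("decode")
            occupancy -= size              # picture is read out at once
            occupancy += input_per_frame   # one frame period of new input
    return actions, occupancy


if __name__ == "__main__":
    print(schedule_pictures([20, 10, 40], occupancy=80,
                            capacity=100, input_per_frame=30))
```

In this model a picture is decoded only when the buffer can absorb a full frame period of input, which mirrors the condition that the free space be at least n × X.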

【０１３３】ところで、ピクチャヘッダ検出回路２０３
がピクチャヘッダを検出した時点や、デコードコア回路
２０４がデコードを開始した後に、ビットバッファ２０
２がアンダーフローすることがある。この場合は、ビデ
オストリームがビットバッファ２０２に入力され次第、
ビットバッファ２０２から１ピクチャ分のビデオストリ
ームを逐次読み出せばよいため、特に問題とはならな
い。Incidentally, the bit buffer 202 may underflow at the point when the picture header detection circuit 203 detects a picture header, or after the decode core circuit 204 has started decoding. In this case, the video stream for one picture can simply be read from the bit buffer 202 sequentially as soon as it is input to the bit buffer 202, so this poses no particular problem.

【０１３４】以上詳述したように、ビデオデコーダ１２
によれば、以下に示す効果を得ることができる。通常の再生時において、ビットバッファ２０２のオー
バーフローを回避することができる。As described in detail above, the video decoder 12 provides the following effects. During normal reproduction, overflow of the bit buffer 202 can be avoided.

【０１３５】高速再生時において、ビットバッファ２
０２のオーバーフローを回避することができる。判定回路２０５およびピクチャスキップ回路２０６を
設けることにより、ビットバッファ２０２のオーバーフ
ローを回避することができる。上記したように判定回路
２０５およびピクチャスキップ回路２０６の制御は簡単
であるため、制御コア回路２０７はマイクロコンピュー
タを用いて構成する必要がない。そして、各回路２０３
〜２０７を１チップのＬＳＩに搭載した場合には、ビデ
オデコーダ１２を小型化することができる。During high-speed reproduction, overflow of the bit buffer 202 can be avoided. Providing the determination circuit 205 and the picture skip circuit 206 makes it possible to avoid overflow of the bit buffer 202. As described above, control of the determination circuit 205 and the picture skip circuit 206 is simple, so the control core circuit 207 does not need to be built around a microcomputer. Moreover, when the circuits 203 to 207 are mounted on a single-chip LSI, the video decoder 12 can be made smaller.

【０１３６】ピクチャスキップ回路２０６のノード２
０６ｂ側からスキップされるビデオストリームは、ピク
チャ単位となる。そのため、デコードコア回路２０４へ
転送されるピクチャの途中でデータが途切れることはな
い。従って、デコードコア回路２０４では、Ｉピクチャ
だけでなくＰピクチャやＢピクチャについてもデコード
可能になる。その結果、ディスプレイ２２で再生される
動画に生じるコマ落ちが少なくなる。そのため、２〜４
倍という比較的遅い高速再生時において、数コマ／秒の
表示が可能になる。従って、高速再生時における動画の
動きを滑らかにして画質を大幅に向上させることができ
る。The video stream skipped from the node 206b side of the picture skip circuit 206 is skipped in units of pictures. The data of a picture transferred to the decode core circuit 204 is therefore never cut off partway through, so the decode core circuit 204 can decode P pictures and B pictures as well as I pictures. As a result, fewer frames are dropped in the moving image reproduced on the display 22, and several frames per second can be displayed during relatively slow high-speed reproduction at 2x to 4x speed. The motion of the moving image during high-speed reproduction thus becomes smoother, and the image quality is greatly improved.

【０１３７】ところで、上記したビデオデコーダ１２に
おいて、式（９）に示す規定を満たすように、２つの閾
値Ｂ2thn，Ｂ3thnを設定してもよい。尚、各閾値Ｂ2th
n，Ｂ3thnの値は、上記のように再生速度に応じて設定
されると共に、ディスプレイ２２で再生される動画の画
質を実際に検討して適宜に設定すればよい。In the video decoder 12 described above, two thresholds B2thn and B3thn may also be set so as to satisfy the relation shown in equation (9). The values of the thresholds B2thn and B3thn are set according to the reproduction speed as described above, and may be adjusted as appropriate by actually examining the image quality of the moving image reproduced on the display 22.

【０１３８】
$$0 < B_{3thn} < B_{2thn} < B \qquad (9)$$
判定回路２０５は、ビットバッファ２０２の占有量Ｂm
と各閾値Ｂthn ，Ｂ2thnとを比較し、占有量Ｂm が式
（１０）〜（１２）に示すどの領域に含まれるかを判定
する。The determination circuit 205 compares the occupancy Bm of the bit buffer 202 with the thresholds Bthn and B2thn, and determines which of the regions shown in equations (10) to (12) contains the occupancy Bm.

【０１３９】
$$B_m < B_{3thn} \qquad (10)$$
$$B_{3thn} < B_m < B_{2thn} \qquad (11)$$
$$B_{2thn} < B_m \qquad (12)$$
判定回路２０５は、式（１０）に示すように、ビットバ
ッファ２０２の占有量Ｂm が閾値Ｂ3thnを越えない場合
には、ビットバッファ２０２がオーバーフローする恐れ
がなく正常であると判定する。この場合、制御コア回路
２０７は、ビットバッファ２０２から１ピクチャ分のビ
デオストリームを読み出す。そして、制御コア回路２０
７は、ピクチャスキップ回路２０６をノード２０６ａ側
に接続し、そのビットバッファ２０２から読み出された
ピクチャをデコードコア回路２０４へ転送させる。As shown in equation (10), when the occupancy Bm of the bit buffer 202 does not exceed the threshold B3thn, the determination circuit 205 determines that the bit buffer 202 is in a normal state with no risk of overflow. In this case, the control core circuit 207 reads a video stream for one picture from the bit buffer 202. The control core circuit 207 then connects the picture skip circuit 206 to the node 206a side so that the picture read from the bit buffer 202 is transferred to the decode core circuit 204.

【０１４０】判定回路２０５は、式（１２）に示すよう
に、ビットバッファ２０２の占有量Ｂm が閾値Ｂ2thnを
越え且つ閾値Ｂthn を越えない場合に、ビットバッファ
２０２から読み出されたピクチャがＩピクチャまたはＰ
ピクチャならば、第１のフラグを立てる。また、式（１
１）に示すように、ビットバッファ２０２の占有量Ｂm
が閾値Ｂ3thnを越え且つ閾値Ｂ2thnを越えない場合に、
ビットバッファ２０２から読み出されたピクチャがＰピ
クチャならば、第２のフラグを立てる。第１または第２
のフラグが立っている場合、式（１０）に示す場合で
も、制御コア回路２０７は、ビットバッファ２０２から
読み出されたピクチャがＢピクチャならば、ピクチャス
キップ回路２０６をノード２０６ｂ側に接続し、そのピ
クチャをスキップさせる。As shown in equation (12), when the occupancy Bm of the bit buffer 202 exceeds the threshold B2thn but does not exceed the threshold Bthn, the determination circuit 205 sets a first flag if the picture read from the bit buffer 202 is an I picture or a P picture. As shown in equation (11), when the occupancy Bm exceeds the threshold B3thn but does not exceed the threshold B2thn, it sets a second flag if the picture read from the bit buffer 202 is a P picture. When the first or second flag is set, the control core circuit 207 connects the picture skip circuit 206 to the node 206b side and skips the picture if the picture read from the bit buffer 202 is a B picture, even in the case shown in equation (10).

【０１４１】図１３に、２つの閾値Ｂ2thn，Ｂ3thnを設
定した場合におけるビットバッファ２０２の占有量Ｂm
の変化を示す。占有量Ｂm が閾値Ｂ3thnを越えた場合、
ビットバッファ２０２から読み出されたピクチャがＢピ
クチャであればデコードせずにスキップする（図示※
１）。ここで、Ｂピクチャのスキップ後に占有量Ｂm が
まだ閾値Ｂ3thnを越えていても、ビットバッファ２０２
から次に読み出されたピクチャがＩピクチャまたはＰピ
クチャであればデコードする（図示※２）。FIG. 13 shows changes in the occupancy Bm of the bit buffer 202 when the two thresholds B2thn and B3thn are set. When the occupancy Bm exceeds the threshold B3thn, the picture read from the bit buffer 202 is skipped without being decoded if it is a B picture (*1 in the figure). Even if the occupancy Bm still exceeds the threshold B3thn after the B picture has been skipped, the next picture read from the bit buffer 202 is decoded if it is an I picture or a P picture (*2 in the figure).

【０１４２】占有量Ｂm が閾値Ｂ3thnを越えた場合で
も、ビットバッファ２０２から読み出されたピクチャが
ＩピクチャまたはＰピクチャであればデコードする（図
示※３）。ここで、ＩピクチャまたはＰピクチャのデコ
ード後に占有量Ｂm がまだ閾値Ｂ3thnを越えている場
合、ビットバッファ２０２から次に読み出されたピクチ
ャがＢピクチャであればデコードせずにスキップする
（図示※４）。このＢピクチャのスキップは、占有量Ｂ
m が閾値Ｂ3thnを下回るまで繰り返し行う（図示※
５）。Even when the occupancy Bm exceeds the threshold B3thn, the picture read from the bit buffer 202 is decoded if it is an I picture or a P picture (*3 in the figure). If the occupancy Bm still exceeds the threshold B3thn after the I picture or P picture has been decoded, the next picture read from the bit buffer 202 is skipped without being decoded if it is a B picture (*4 in the figure). B pictures continue to be skipped in this way until the occupancy Bm falls below the threshold B3thn (*5 in the figure).

【０１４３】When the occupancy Bm exceeds the threshold B2thn and the picture read from the bit buffer 202 is an I picture or a P picture, the decision circuit 205 sets the first flag (*6 in the figure). While the first flag is set, if the next picture read from the bit buffer 202 is a B picture, that B picture is skipped even if the occupancy Bm is below the threshold B3thn (*7 in the figure).

【０１４４】When the occupancy Bm exceeds the threshold B3thn but does not exceed the threshold B2thn, and the picture read from the bit buffer 202 is a P picture, the decision circuit 205 sets the second flag (*8 in the figure). While the second flag is set, if the next picture read from the bit buffer 202 is a B picture, that B picture is skipped even if the occupancy Bm is below the threshold B3thn (*9 in the figure).

【０１４５】When the occupancy Bm exceeds the threshold B3thn but does not exceed the threshold B2thn, and the picture read from the bit buffer 202 is an I picture, the decision circuit 205 does not set the second flag (*10 in the figure). While the second flag is not set, if the occupancy Bm is below the threshold B3thn, the next picture read from the bit buffer 202 is decoded even if it is a B picture.
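The decision rules of paragraphs 【０１４１】 to 【０１４５】 can be summarized in a short sketch. This is only an illustration under stated assumptions, not the circuit of the specification: the class name `SkipController` and the method `decide` are hypothetical, the occupancy is assumed to be sampled at the moment each picture is read, and a flag is assumed to affect only the picture read immediately after it is set (the text does not say when the flags are cleared).

```python
from dataclasses import dataclass


@dataclass
class SkipController:
    """Hypothetical model of the two-threshold B-picture skip decision."""

    b2thn: int  # higher threshold B2thn (assumed unit: bits)
    b3thn: int  # lower threshold B3thn, with b3thn < b2thn
    first_flag: bool = False
    second_flag: bool = False

    def decide(self, picture_type: str, occupancy: int) -> str:
        """Return "decode" or "skip" for a picture read at occupancy Bm."""
        if picture_type == "B":
            # A B picture is skipped when Bm exceeds B3thn (*1, *4, *5)
            # or when a flag was set by the preceding I/P picture (*7, *9).
            skip = occupancy > self.b3thn or self.first_flag or self.second_flag
            # Assumption: a flag is consumed by the next picture read.
            self.first_flag = False
            self.second_flag = False
            return "skip" if skip else "decode"

        # I and P pictures are always decoded (*2, *3).
        # First flag: Bm exceeds B2thn and the picture is I or P (*6).
        self.first_flag = occupancy > self.b2thn
        # Second flag: B3thn < Bm <= B2thn and the picture is P (*8);
        # an I picture in this band sets no flag (*10).
        self.second_flag = (
            picture_type == "P" and self.b3thn < occupancy <= self.b2thn
        )
        return "decode"
```

With hypothetical thresholds `b2thn=800` and `b3thn=500`, a P picture read at occupancy 600 sets the second flag, so a B picture read next at occupancy 400 is skipped. After an I picture read at occupancy 600, the same B picture would be decoded.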

【０１４６】As described above, when the two thresholds B2thn and B3thn are set, the following effects are obtained in addition to the effects of the video decoder 12 described above.
When the occupancy Bm of the bit buffer 202 exceeds the threshold B3thn but does not exceed the threshold Bthn, I pictures and P pictures are decoded as far as possible, and B pictures are skipped preferentially.

【０１４７】Because a B picture is generated by bidirectional prediction, it is less important than an I picture or a P picture. Skipping the less important B pictures preferentially therefore further reduces the dropped frames in the moving image reproduced on the display 22. As a result, motion during high-speed playback becomes smoother and the image quality improves further.

【０１４８】Setting the first flag makes it possible to skip in advance, as a safety margin, the B picture read next from the bit buffer 202, even if the occupancy Bm of the bit buffer 202 has fallen below the threshold B3thn after an I picture or a P picture is decoded. Likewise, setting the second flag makes it possible to skip in advance, as a safety margin, the B picture read next from the bit buffer 202, even if the occupancy Bm has fallen below the threshold B3thn after a P picture is decoded.

【０１４９】Skipping a B picture in advance in this way amounts to taking a preventive measure against the next overflow of the bit buffer 202. Overflow of the bit buffer 202 can therefore be avoided more reliably.

【０１５０】The data amount of an I picture is two to three times that of a P picture. The occupancy Bm of the bit buffer 202 therefore decreases more when an I picture is read than when a P picture is read, and the bit buffer 202 is less likely to overflow after an I picture is read than after a P picture is read. The first and second flags are therefore used to differentiate the preventive measure between I pictures and P pictures. That is, by setting the threshold B2thn of the preventive measure for I pictures higher than the threshold B3thn of the preventive measure for P pictures, the preventive measure for I pictures can be made less strict than that for P pictures. As a result, needless skipping of B pictures is reduced.

【０１５１】A simulation was run for the case where video streams with the GOP structures (sequences of picture types) shown in a) and b) below are transferred from the AV parser 11, and the following results were obtained.

【０１５２】
a) IBPBPBPBP...
b) IBBPBBPBBPBBPBBIBP...
[1] At double-speed playback: in case a), all I pictures and P pictures can be decoded, so the video can be displayed at the full rate of 30 frames/second. In case b), all I pictures and P pictures and some of the B pictures can be decoded, so the video can be displayed at 25 frames/second or more.

【０１５３】[2] At quadruple-speed playback: in both a) and b), the I picture and the 3 to 4 P pictures that follow it can be decoded, so the video can be displayed at 15 frames/second or more.
Incidentally, in the second to third embodiments, the operation speed of the video decoder 12 can be controlled by controlling the speed of the decoding process in the decode core circuit 204.
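The double-speed figures in [1] can be checked with rough arithmetic. The sketch below counts the share of I and P pictures in each pattern and scales it by the playback speed. It assumes a nominal stream rate of 30 pictures/second and uses only the pattern prefixes printed above as a stand-in, since the full GOP layouts are elided in the text. The function name `reference_picture_rate` is hypothetical.

```python
def reference_picture_rate(pattern: str, speed: float,
                           nominal_fps: float = 30.0) -> float:
    """Rate (pictures/second) at which I and P pictures arrive.

    Assumes the stream nominally carries `nominal_fps` pictures per
    second and is read `speed` times faster than normal.
    """
    reference = sum(1 for picture in pattern if picture in "IP")
    return speed * nominal_fps * reference / len(pattern)


# Prefixes of the patterns a) and b) as printed; the rest is elided.
PATTERN_A = "IBPBPBPBP"
PATTERN_B = "IBBPBBPBBPBBPBBIBP"

rate_a = reference_picture_rate(PATTERN_A, speed=2)
rate_b = reference_picture_rate(PATTERN_B, speed=2)
```

Under these assumptions, `rate_a` is above 30, so the I and P pictures of pattern a) alone can fill a 30 frames/second display. `rate_b` falls between 20 and 25, which is consistent with the statement that some B pictures must also be decoded to reach 25 frames/second or more in case b).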

【０１５４】The above embodiments may be modified as follows, and the same operations and effects can still be obtained.
(1) The ring memory 32 is provided not after the DSP 31 but before it (that is, between the MPEG audio decoder 3 and the DSP 31).

【０１５５】(2) The circuits 1, 11, and 12 constituting the MPEG reproducing device 23 are mounted on a single-chip LSI. This makes it possible to downsize the MPEG reproducing device 23.

【０１５６】(3) In the second to eighth embodiments, instead of controlling the operation speed of the video decoder 12, a delay circuit is inserted between the video decoder 12 and the display 22, and the delay time of that delay circuit is controlled.

【０１５７】(4) Any two or more of the second to eighth embodiments are combined as appropriate. The synergy of the combined embodiments then yields even better effects.

【０１５８】(5) The first to eighth embodiments are replaced with software processing using a CPU. That is, the signal processing in each circuit (1 to 55) is replaced with software signal processing using a CPU.

【０１５９】(6) In the MPEG video decoder 12 shown in FIG. 12, for ease of explanation, the picture skip circuit 206 has the nodes 206a and 206b, and the connection of the nodes 206a and 206b is switched under the control of the control core circuit 207. Instead of this configuration, the picture skip circuit 206 may be a logic circuit that, under the control of the control core circuit 207, passes only the pictures to be decoded by the decode core circuit 204.
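Modification (3) above replaces speed control of the video decoder 12 with a controllable delay between the decoder and the display 22. A minimal software sketch of such a delay element follows. It assumes decoded frames arrive with integer millisecond timestamps, and the class `FrameDelayLine` and its methods are hypothetical names that do not appear in the specification.

```python
from collections import deque


class FrameDelayLine:
    """Holds decoded frames and releases each after a controllable delay."""

    def __init__(self, delay_ms: int = 0):
        self.delay_ms = delay_ms
        self._frames = deque()  # entries are (arrival time in ms, frame)

    def set_delay(self, delay_ms: int) -> None:
        """Change the delay time; negative values are clamped to zero."""
        self.delay_ms = max(0, delay_ms)

    def push(self, now_ms: int, frame) -> None:
        """Store a frame output by the decoder at time now_ms."""
        self._frames.append((now_ms, frame))

    def pop_ready(self, now_ms: int) -> list:
        """Return, oldest first, the frames that have waited long enough."""
        ready = []
        while self._frames and now_ms - self._frames[0][0] >= self.delay_ms:
            ready.append(self._frames.popleft()[1])
        return ready
```

In use, the delay would be set from whatever signal indicates the audio-side delay, so that frames reach the display only after the matching audio has passed through the speech rate conversion.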

【０１６０】The embodiments of the present invention have been described above. Technical ideas other than the claims that can be grasped from the above embodiments are described below together with their effects.
(A) The MPEG audio reproducing device according to any one of claims 3 to 6, comprising a D/A converter (5) that D/A-converts the audio signal and an audio amplifier (6) that amplifies the output of the D/A converter.

【０１６１】This makes it possible to generate, from the digital audio signal, an analog signal for driving a speaker.
(B) The MPEG reproducing device according to any one of claims 2 and 7 to 11, comprising a demultiplexer (13) that separates the MPEG system stream read from the recording medium (21) into an MPEG audio stream and an MPEG video stream.

【０１６２】This makes it possible to transfer the audio stream to the audio decoder and the video stream to the video decoder.

【０１６３】[0163]

[Effects of the Invention] According to the invention described in any one of claims 1 and 3 to 6, an MPEG audio reproducing device capable of reproducing natural, easy-to-hear audio even during variable-speed playback can be provided.

【０１６４】According to the invention described in claim 2 or claim 7, an MPEG reproducing device can be provided that comprises an MPEG video decoder and an MPEG audio reproducing device capable of reproducing natural, easy-to-hear audio even during variable-speed playback.

【０１６５】According to the invention described in any one of claims 8 to 11, an MPEG reproducing device can be provided that comprises an MPEG video decoder and an MPEG audio reproducing device capable of reproducing natural, easy-to-hear audio even during variable-speed playback, and that can reduce the time lag between audio and video.

[Brief description of drawings]

FIG. 1 is a block circuit diagram of the first embodiment.

FIG. 2 is a block circuit diagram of a main part of the first embodiment.

FIG. 3 is a schematic diagram for explaining the operation of the first embodiment.

FIG. 4 is a schematic diagram for explaining the operation of the first embodiment.

FIG. 5 is a block circuit diagram of a main part of the second embodiment.

FIG. 6 is a block circuit diagram of a main part of the third embodiment.

FIG. 7 is a block circuit diagram of a main part of the fourth embodiment.

FIG. 8 is a block circuit diagram of a main part of the fifth embodiment.

FIG. 9 is a block circuit diagram of a main part of the sixth embodiment.

FIG. 10 is a block circuit diagram of a main part of the seventh embodiment.

FIG. 11 is a block circuit diagram of a main part of the eighth embodiment.

FIG. 12 is a block circuit diagram of a main part of the MPEG video decoder.

FIG. 13 is a graph for explaining the operation of the MPEG video decoder.

FIG. 14 is a graph for explaining the operation of the MPEG video decoder.

[Explanation of symbols]

1 ... MPEG audio reproducing device
2 ... Reproduction speed detection circuit as speech rate conversion means
3 ... MPEG audio decoder
4 ... Speech rate conversion processing circuit as speech rate conversion means
12 ... MPEG video decoder
21 ... Recording medium
32 ... Ring memory
33 ... Up-down counter as detection means
41 ... Voice discrimination unit
42 ... Silence deletion/insertion unit
43 ... Time axis compression/expansion unit
51 ... Index adding circuit
52 ... Index detection circuit
53, 55 ... Delay time detection circuit
54 ... Control circuit

Continuation of front page
(51) Int.Cl.⁶: H04N 7/24; FI: H04N 7/13 Z
(72) Inventor: Koji Tanaka, c/o Sanyo Electric Co., Ltd., 2-5-5 Keihan-Hondori, Moriguchi-shi, Osaka

Claims

[Claims]

1. An MPEG audio reproducing device comprising an MPEG audio decoder (3) and speech rate conversion processing means (2, 4) for performing speech rate conversion processing on the output of the MPEG audio decoder.

2. An MPEG reproducing device comprising an MPEG audio decoder (3), speech rate conversion processing means (2, 4) for performing speech rate conversion processing on the output of the MPEG audio decoder, and an MPEG video decoder (12).

3. An MPEG audio reproducing device comprising: an MPEG audio decoder (3) that decodes an MPEG audio stream read from a recording medium (21) in accordance with the MPEG audio part to generate an audio signal; and speech rate conversion processing means (2, 4) for performing speech rate conversion processing on the audio signal, wherein, when the bit rate of the audio stream is higher than in normal playback, the speech rate conversion processing means performs the speech rate conversion processing so that the pitch of the reproduced voice is almost the same as in normal playback and the reproduced speech rate is close to that in normal playback, and, when the bit rate of the audio stream is lower than in normal playback, performs the speech rate conversion processing so that breaks between voice sections become inconspicuous.

4. An MPEG audio reproducing device comprising: an MPEG audio decoder (3) that decodes an MPEG audio stream read from a recording medium (21) in accordance with the MPEG audio part to generate an audio signal; and speech rate conversion processing means (2, 4) for performing speech rate conversion processing on the audio signal, wherein, when the bit rate of the audio stream is higher than in normal playback, the speech rate conversion processing means performs the speech rate conversion processing by extending the time length of each reproduced voice section and shortening the time length of each silent section, and, when the bit rate of the audio stream is lower than in normal playback, performs the speech rate conversion processing by extending the time length of each reproduced voice section and shortening the time length of each silent section, or by deleting each silent section, joining the voice sections together, and then inserting a silent section.

5. The MPEG audio reproducing device according to claim 3 or 4, wherein the speech rate conversion processing means (2, 4) comprises a ring memory (32) that accumulates the audio signal and detecting means (33) that detects the accumulated amount of the ring memory, and adjusts the compression/expansion rate of the time length of voice sections according to the accumulated amount.

6. The MPEG audio reproducing device according to claim 5, wherein the speech rate conversion processing means (2, 4) comprises: a voice discrimination unit (41) that discriminates between voice sections and silent sections of the audio signal; a silence deletion/insertion unit (42) that deletes or inserts silent sections; and a time axis compression/expansion unit (43) that compresses or expands voice sections based on the accumulated amount of the ring memory (32), thereby adjusting the compression/expansion rate.

7. An MPEG reproducing device comprising: the MPEG audio reproducing device (1) according to any one of claims 3 to 6; and an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal.

8. An MPEG reproducing device comprising: the MPEG audio reproducing device (1) according to claim 5 or claim 6; an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal; an index adding circuit (51) that adds an index signal, as time-related information, to the audio signal before it is written to the ring memory (32); and an index detection circuit (52) that detects the index signal from the audio signal read from the ring memory (32), detects the signal delay time in the speech rate conversion processing means (2, 4) from the time information obtained from the index signal and the current time information, and supplies a signal indicating the detected delay time to the MPEG video decoder (12), wherein the MPEG video decoder (12) controls the timing of its own operation based on the signal indicating the delay time.

9. An MPEG reproducing device comprising: the MPEG audio reproducing device (1) according to claim 6; an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal; and a delay time detection circuit (53) that detects the signal delay time in the speech rate conversion processing means (2, 4) based on the processing result of the voice discrimination unit (41) and the bit rate of the audio stream, and supplies a signal indicating the detected delay time to the MPEG video decoder (12), wherein the MPEG video decoder (12) controls the timing of its own operation based on the signal indicating the delay time.

10. An MPEG reproducing device comprising: the MPEG audio reproducing device (1) according to claim 6; an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal; and a control circuit (54) that generates, based on the accumulated amount of the ring memory (32), a control signal for synchronizing the audio signal subjected to the speech rate conversion processing with the video signal, and supplies the control signal to the MPEG video decoder (12), wherein the MPEG video decoder (12) controls the timing of its own operation based on the control signal.

11. An MPEG reproducing device comprising: the MPEG audio reproducing device (1) according to claim 6; an MPEG video decoder (12) that decodes an MPEG video stream read from the recording medium (21) in accordance with the MPEG video part to generate a video signal; and a delay time detection circuit (55) that detects the signal delay time in the speech rate conversion processing means (2, 4) based on the processing results of the voice discrimination unit (41) and the time axis compression/expansion unit (43), and supplies a signal indicating the detected delay time to the MPEG video decoder (12), wherein the MPEG video decoder (12) controls the timing of its own operation based on the signal indicating the delay time.