JP4826034B2

JP4826034B2 - Content receiving method, content reproducing method, content receiving apparatus and content reproducing apparatus

Info

Publication number: JP4826034B2
Application number: JP2001190663A
Authority: JP
Inventors: 雅美三浦
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2001-06-25
Filing date: 2001-06-25
Publication date: 2011-11-30
Anticipated expiration: 2021-06-25
Also published as: JP2003009099A

Description

【０００１】
【発明の属する技術分野】
この発明は、放送やネットワークによって配信された、少なくとも音声情報を含む一連の情報であるコンテンツを、受信する方法および装置、および、光ディスクなどの記録媒体から、少なくとも音声情報を含む一連の情報であるコンテンツを、再生する方法および装置に関する。
【０００２】
なお、この発明では、映像情報（画像情報）と音声情報（音響情報）、または音声情報のみなど、少なくとも音声情報を含む一連の情報をコンテンツと定義する。音声情報は、人の話声（発話音声）、音楽の音響、自然音や物音の音響など、人が聴覚上認識できる全ての音声（音響）を含むものである。
【０００３】
【従来の技術】
コンテンツ、例えば映像情報と音声情報を含むコンテンツを、受信または再生する装置は、テレビ受信装置やＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｃ）再生装置などに見られるように、音量調整機能や、音量が所定レベルを超えた場合に対する保護機能を備え、適度な音量に調整でき、大音量から機器や聴覚を保護するように構成されている。
【０００４】
【発明が解決しようとする課題】
コンテンツ中の音声情報には、人の話声、音楽の音響、自然音や物音の音響など、様々な音声（音響）が含まれ、また、話声にも、遠くからの小さな声や、近くからの大きな声など、様々な声がある。
【０００５】
しかしながら、このように様々な種類やレベルの音声情報を含むコンテンツを受信または再生する場合であっても、従来の受信装置または再生装置は、音量レベルを適当なレベルに固定してコンテンツを最初から最後まで出力するのが、ほとんどであり、視聴者（ユーザ）が、音量が小さすぎると、または大きすぎると、感じたときにのみ、音量設定を変えることができるように構成されている。
【０００６】
そのため、十分に静寂な視聴環境でないと、または若年で広いダイナミックレンジに対応できる聴覚をもつ視聴者でないと、コンテンツ中の音声情報を十分に楽しむことができない。
【０００７】
すなわち、騒音のある視聴環境であると、騒音によって小さな音が妨害されてしまい、特に、小さな声では、何を言っているのかが分からなくなる。それに対処しようとして、音量を大きくすると、声は大きく、聞きやすくなるが、音楽が流れている期間など、その他の時には、うるさくなる。
【０００８】
また、視聴者が高齢者で、小さい音が聞こえにくく、さらに、大きい音が聴覚的に歪んでしまうような老人性難聴である場合には、音声レベルに応じて常に音量調整をしないと、話声が聞こえにくいという不便さがある。
【０００９】
音声の種類に応じて音声を選択的に増幅することも考えられているが、話声とその他の音声とを、正確に区別することが難しく、誤って認識することがあるとともに、老人性難聴のような聴覚の視聴者にとっては、話声を単純に増幅するだけでは聞き取りやすさが改善されないため、十分な対応を期待することができない。
【００１０】
そこで、この発明は、少なくとも音声情報を含む一連の情報であるコンテンツを受信または再生する場合に、コンテンツ中の音声情報自体から話声とその他の音声を区別しなくても、話声を聞き取りやすくすることができるようにしたものである。
【００１１】
【課題を解決するための手段】
この発明のコンテンツ受信方法では、少なくとも音声情報を含む一連の情報であるコンテンツに、このコンテンツ中の音声情報の話声が主体または話声のみの期間である話声期間を示す話声期間識別情報が多重化された多重化信号を受信し、当該受信した多重化信号から、コンテンツおよび話声期間識別情報を分離し、当該多重化信号から分離したコンテンツを再生して、音声信号を再生すると共に、多重化信号から分離した話声期間識別情報によって特定される話声期間において、再生した音声信号を、低入力レベル領域での入出力レベル変換を定める設定値と高入力レベル領域での入出力レベル変換を定める設定値とで設定されるレベル変換特性によってレベル変換し、操作入力部を介して音量調整の指示が行われると、当該指示に従ってレベル変換特性における低入力レベル領域と高入力レベル領域の境界点の入出力レベルを変更して、操作入力部を介して音量の調整状態を保存する指示が行われると、そのときの境界点の入出力レベルと、そのときのコンテンツを示すコンテンツ識別コード及びそのときのシーンを示すシーン識別コードとを対応付けて記憶手段に記録しておき、多重化信号を再び受信してコンテンツおよび話声期間識別情報を分離し、当該多重化信号から分離したコンテンツを再生して、音声信号を再生すると、当該再生しているコンテンツ及びシーンと対応付けられている境界点の入出力レベルを記憶手段から読み出し、多重化信号から分離した話声期間識別情報によって特定される話声期間において、再生した音声信号を、読み出した境界点の入出力レベルによって定められるレベル変換特性によってレベル変換するようにした。
【００１２】
この発明のコンテンツ再生方法では、少なくとも音声情報を含む一連の情報であるコンテンツに、このコンテンツ中の音声情報の話声が主体または話声のみの期間である話声期間を示す話声期間識別情報が多重化された多重化信号を記録媒体から読み取り、当該読み取った多重化信号から、コンテンツおよび話声期間識別情報を分離し、当該多重化信号から分離したコンテンツを再生して、音声信号を再生すると共に、多重化信号から分離した話声期間識別情報によって特定される話声期間において、再生した音声信号を、低入力レベル領域での入出力レベル変換を定める設定値と高入力レベル領域での入出力レベル変換を定める設定値とで設定されるレベル変換特性によってレベル変換し、操作入力部を介して音量調整の指示が行われると、当該指示に従って、レベル変換特性における低入力レベル領域と高入力レベル領域の境界点の入出力レベルを変更して、操作入力部を介して音量の調整状態を保存する指示が行われると、そのときの境界点の入出力レベルと、そのときのコンテンツを示すコンテンツ識別コード及びそのときのシーンを示すシーン識別コードとを対応付けて記憶手段に記録しておき、多重化信号を記録媒体から再び読み取ってコンテンツおよび話声期間識別情報を分離しし、当該多重化信号から分離したコンテンツを再生して、音声信号を再生すると、当該再生しているコンテンツ及びシーンと対応付けられている境界点の入出力レベルを記憶手段から読み出し、多重化信号から分離した話声期間識別情報によって特定される話声期間において、再生した音声信号を、読み出した境界点の入出力レベルによって定められるレベル変換特性によってレベル変換するようにした。
【００１３】
【発明の実施の形態】
〔受信装置および再生装置のシステム構成…図１および図２〕
（受信装置のシステム構成…図１）
図１は、この発明のコンテンツ受信装置の一実施形態を示し、デジタルテレビ放送を受信する放送受信装置の場合である。
【００１４】
この場合の放送は、コンテンツが番組の映像情報および音声情報を含むものであり、そのコンテンツに付加情報が多重化されたものである。付加情報は、コンテンツ（番組）を識別する情報であるコンテンツ識別コード、番組の各シーンまたは特定のシーンを識別する情報であるシーン識別コード、および番組の音声情報の話声が主体または話声のみの期間である話声期間を示す話声期間識別情報などである。
【００１５】
具体的に、映像データおよび音声データが、ＭＰＥＧ（ＭｏｖｉｎｇＰｉｃｔｕｒｅＥｘｐｅｒｔｓＧｒｏｕｐ）方式などによって圧縮符号化され、多重化されるとともに、付加情報データが、符号化され、映像音声データストリームとは別に多重化されて、全体が変調されて放送される。
【００１６】
付加情報データは、付加情報がコンテンツ識別コードであるか、シーン識別コードであるか、話声期間識別情報であるかなど、付加情報の種別を示すコードを有するヘッダ部と、これに続く、コンテンツ識別コード、シーン識別コード、話声期間識別情報などのデータ部とからなるものとされる。
【００１７】
また、放送信号には、デコード・タイムスタンプなどのデコード時刻情報、およびプレゼンテーション・タイムスタンプなどのコンテンツ呈示時刻情報が多重化される。
【００１８】
選局受信部１９では、操作入力部１６での視聴者の選局操作に基づくシステムコントローラ１７の選局制御によって、放送信号が選局受信される。その選局受信された信号は、復調エラー訂正部２１で復調され、エラー訂正された後、バッファ２２に書き込まれ、バッファ２２から読み出される。
【００１９】
バッファ２２から読み出された信号は、デマルチプレクサ２３に供給され、デマルチプレクサ２３から、それぞれ符号化された映像データ、字幕データ、音声データおよび付加情報データが、分離されて得られる。
【００２０】
その映像データ、字幕データおよび音声データは、それぞれ、ビデオコードバッファ３１、字幕コードバッファ４１およびオーディオコードバッファ５１に書き込まれ、ビデオコードバッファ３１、字幕コードバッファ４１およびオーディオコードバッファ５１から読み出された後、それぞれ、ビデオデコーダ３２、字幕デコーダ４２およびオーディオデコーダ５２でデコードされる。
【００２１】
システムコントローラ１７は、上記のデコード・タイムスタンプのようなタイミング情報に基づいて、各デコーダ３２，４２，５２におけるデコードタイミングを制御し、上記のプレゼンテーション・タイムスタンプのようなタイミング情報に基づいて、各デコーダ３２，４２，５２からのデータの時系列を整合させるように、各デコーダ３２，４２，５２における出力タイミングを制御する。
【００２２】
ビデオデコーダ３２からの映像データ、および字幕デコーダ４２からの字幕データは、映像処理部３３で処理され、映像処理部３３において、映像信号中に字幕信号がスーパーインポーズされる。
【００２３】
映像処理部３３の出力の映像信号は、映像出力端子３４に導出され、映像出力端子３４から、ＣＲＴディスプレイや液晶ビデオプロジェクタなどの映像表示装置３５に送出される。
【００２４】
映像信号は、映像処理部３３でアナログ映像信号に変換されることなくデジタル映像データのまま、Ｄ／Ａ（ＤｉｇｉｔａｌｔｏＡｎａｌｏｇ）変換部を備える映像表示装置３５に送出され、または映像処理部３３でアナログ映像信号に変換されて、映像表示装置３５に送出される。
【００２５】
オーディオデコーダ５２からの音声データは、音声処理部５３で処理され、音声処理部５３の出力の音声信号は、音声出力端子５４に導出され、音声出力端子５４から、スピーカやヘッドホンなどの音声出力装置５５に送出される。
【００２６】
音声信号も、音声処理部５３でアナログ音声信号に変換されることなくデジタル音声データのまま、Ｄ／Ａ変換部を備える音声出力装置５５に送出され、または音声処理部５３でアナログ音声信号に変換されて、音声出力装置５５に送出される。
【００２７】
デマルチプレクサ２３で分離された付加情報データは、付加情報コードバッファ６１に書き込まれ、付加情報コードバッファ６１から読み出された後、話声期間識別情報検出部６２および識別コード検出部６３に送出される。
【００２８】
話声期間識別情報検出部６２では、付加情報データ中のヘッダ部の種別コードによって、話声期間識別情報が検出され、その検出された話声期間識別情報は、システムコントローラ１７に取り込まれて、デコードされる。
【００２９】
識別コード検出部６３では、付加情報データ中のヘッダ部の種別コードによって、コンテンツ識別コードおよびシーン識別コードが検出され、その検出されたコンテンツ識別コードおよびシーン識別コードは、システムコントローラ１７に取り込まれて、デコードされる。
【００３０】
システムコントローラ１７は、話声期間識別情報が検出されたとき、映像処理部３３を制御して、そのシーンが話声期間であることを、映像表示装置３５の表示画面上に呈示する。例えば、そのシーンの映像中に、話声期間であることを示すマークまたは文字をスーパーインポーズする。
【００３１】
視聴者は、その表示を見て、音声を聞き取りやすくしたいときには、操作入力部１６での操作によって、後述のように音声を聞き取りやすくするような音量調整をする。これによって、システムコントローラ１７は、音声処理部５３を制御して、後述のように音声が聞き取りやすくなるような音声処理を行わせる。
【００３２】
話声期間であることの呈示は、視聴者が操作入力部１６での設定操作によってオン・オフを切り替えられるように、受信装置を構成してもよい。さらに、話声期間であることが呈示されないように、受信装置を構成してもよい。
【００３３】
呈示オフに切り替えられている状態でも、または話声期間であることが呈示されない構成とされた場合でも、視聴者が、音声出力装置５５から出力される音声を聞いて、音声を聞き取りやすくするような音量調整をしたときには、システムコントローラ１７は、音声処理部５３を制御して、音声が聞き取りやすくなるような音声処理を行わせる。
【００３４】
話声期間において、視聴者が特に音量調整をしない場合には、システムコントローラ１７は、後述のように、あらかじめ設定された音声処理パラメータによって音声処理を行うように音声処理部５３を制御する。
【００３５】
システムコントローラ１７には、後述のように視聴者の音量調整による音声処理パラメータが書き込まれる記憶装置１８が接続される。記憶装置１８は、受信装置に内蔵されたフラッシュメモリやＥＥＰＲＯＭなどの不揮発性メモリ、またはメモリカード、磁気ディスク、光ディスク、光磁気ディスクなどの外部記憶媒体とされる。
【００３６】
なお、図１では、話声期間識別情報検出部６２、識別コード検出部６３およびシステムコントローラ１７を、機能的に分離して示しているが、話声期間識別情報検出部６２および識別コード検出部６３の機能は、システムコントローラ１７の一部の機能として構成することもできる。
【００３７】
（再生装置のシステム構成…図２）
図２は、この発明のコンテンツ再生装置の一実施形態を示し、光ディスク再生装置の場合である。
【００３８】
光ディスク１１には、コンテンツに多重化されて、付加情報が記録されている。この場合のコンテンツは、映像情報および音声情報を含むものであり、付加情報は、上述したコンテンツ識別コード、シーン識別コードおよび話声期間識別情報などである。
【００３９】
具体的に、映像データおよび音声データが、ＭＰＥＧ方式などによって圧縮符号化され、多重化されるとともに、付加情報データが、符号化され、映像音声データストリームとは別に多重化されて、全体が変調されて、光ディスク１１に記録されている。
【００４０】
付加情報データは、上述したように、付加情報の種別を示すコードを有するヘッダ部と、これに続く、コンテンツ識別コード、シーン識別コード、話声期間識別情報などのデータ部とからなるものとされる。
【００４１】
また、光ディスク１１には、デコード・タイムスタンプなどのデコード時刻情報、およびプレゼンテーション・タイムスタンプなどのコンテンツ呈示時刻情報が記録されている。
【００４２】
光ディスク１１は、ディスクモータ１３によって駆動される。光ヘッド（ピックアップ）１４は、送りモータとトラッキング用およびフォーカシング用の２軸アクチュエータを含むドライブユニット１５によって駆動される。
【００４３】
操作入力部１６での視聴者の再生操作によって、システムコントローラ１７は、ドライブユニット１５に光ディスク１１の再生を指示し、光ヘッド１４によって、光ディスク１１から信号が読み出される。その読み出された信号は、復調エラー訂正部２１で復調され、エラー訂正された後、バッファ２２に書き込まれ、バッファ２２から読み出される。
【００４４】
バッファ２２から読み出された信号は、デマルチプレクサ２３に供給され、デマルチプレクサ２３から、それぞれ符号化された映像データ、字幕データ、音声データおよび付加情報データが、分離されて得られる。その他は、図１の放送受信装置と同じである。
【００４５】
〔音声処理および音量調整…図３〜図９〕
上述したように、図１の放送受信装置または図２の光ディスク再生装置では、視聴者が特に音量調整をしない場合、話声期間識別情報によって特定される話声期間では、あらかじめ設定された音声処理パラメータによって音声処理が行われる。
【００４６】
図３に、この場合の音声処理特性、すなわち音声処理部５３の入出力特性を示す。この入出力特性は、点Ａ０（Ａｘ０，Ａｙ０）および点Ｂ０（Ｂｘ０，Ｂｙ０）で屈折（屈曲）する非線形のレベル変換特性とし、Ａｘ０からＢｘ０の間の入力レベルをＡｙ０からＢｙ０の間の出力レベルに変換するとともに、入力音声信号がＢｘ０以上のレベルになるときには、出力音声信号をＢｙ０のレベルにクリップするものである。
【００４７】
以下では、点Ａ０のような低レベル側の屈折点Ａをニーポイント、点Ｂ０のような高レベル側の屈折点Ｂを上限ポイントとする。
【００４８】
このようなレベル変換特性によれば、後述のように、視聴者が音量調整をしたとき、それに応じてシステムコントローラ１７がニーポイントＡを、すなわちニーポイントＡの入出力レベルを変更することによって、視聴者は単に音量の大小を指示するだけで、音声を聞き取りやすくすることができる。
【００４９】
例えば、ニーポイントＡを点Ａ０から低入力レベル方向の点Ａ１に移動させれば、低入力レベル領域での増幅率を、より大きくすることができる。
【００５０】
逆に、ニーポイントＡを点Ａ０から高入力レベル方向の点Ａ２に移動させると、低入力レベル領域での増幅率が小さくなり、出力レベルがＡｙ０に達しない入力レベル領域が広がることになる。
【００５１】
また、ニーポイントＡを点Ａ０から高出力レベル方向の点Ａ３に移動させれば、低入力レベル領域での増幅率を、より大きくすることができる。
【００５２】
他方で、図４に示すように、上限ポイントＢを点Ｂ０から高出力レベル方向の点Ｂ１に移動させれば、高入力レベル領域での増幅率を、大きくすることができる。また、上限ポイントＢを点Ｂ０から高入力レベル方向の点Ｂ２に移動させれば、高入力レベル領域での増幅率を、より小さくすることができ、より大きい入力レベルの音声信号でも、クリップされないようにすることができる。
【００５３】
以上のように、ニーポイントＡと上限ポイントＢの間で出力レベルが好ましい範囲（Ａｙ０からＢｙ０の間）に変換されるので、通常、音声信号レベルは、ニーポイントＡと上限ポイントＢの間になるようにする。
【００５４】
音声信号レベルを信号の実効値とすると、音声信号のピーク値と音声信号レベルとの差は、音声信号のピークファクタと呼ばれ、通常、１５デシベル程度ある。音声を明瞭に聞き取るためには、３０〜４０デシベル程度のダイナミックレンジが必要であるので、このピークファクタを考慮して、ニーポイントＡの入力レベルは、入力音声信号レベルより少なくても１５〜２５デシベル程度小さいレベルであることが望ましい。以下の例は、ニーポイントＡの入力レベルを、入力音声信号レベルより２０デシベル小さいレベルに設定する場合である。
【００５５】
操作入力部１６で音量を大きくする操作がなされたとき、システムコントローラ１７は、音声処理部５３での入力音声信号レベルを算出する。この場合の音声信号レベルは、瞬時レベルではなく、コンマ数秒から数秒程度というような短い時間の平均レベルである。
【００５６】
そして、入力音声信号レベルが、図５のレベルＶＬ１で示すように、点Ａ０より２０デシベル以上大きいときには、システムコントローラ１７は、ニーポイントＡを、点Ａ０から、音量を大きくする指示がなくなるまで、点Ａ３で示すように高出力レベル方向に移動させ、あるいは、点Ａ０から、そのときの特性曲線上の、算出された入力音声信号レベルＶＬ１より２０デシベル小さい点Ａ４に移動させた上で、音量を大きくする指示がなくなるまで、点Ａ５で示すように高出力レベル方向に移動させる。
【００５７】
入力音声信号レベルが、図６のレベルＶＬ２で示すように、点Ａ０より２０デシベル以上大きくないときには、システムコントローラ１７は、ニーポイントＡを、点Ａ０から、低入力レベル方向に、算出された入力音声信号レベルＶＬ２より２０デシベル小さい点Ａ６に移動させた上で、音量を大きくする指示がなくなるまで、点Ａ７で示すように高出力レベル方向に移動させる。
【００５８】
操作入力部１６で音量を小さくする操作がなされたときも、システムコントローラ１７は、音声処理部５３での入力音声信号レベルを算出する。
【００５９】
そして、入力音声信号レベルが、図７のレベルＶＬ１で示すように、点Ａ０より２０デシベル以上大きいときには、システムコントローラ１７は、ニーポイントＡを、点Ａ０から、そのときの特性曲線上の、算出された入力音声信号レベルＶＬ１より２０デシベル小さい点Ａ４に移動させた上で、音量を小さくする指示がなくなるまで、点Ａ８で示すように低出力レベル方向に移動させ、あるいは、点Ａ０から、音量を小さくする指示がなくなるまで、点Ａ９で示すように低出力レベル方向に移動させる。
【００６０】
入力音声信号レベルが、図８のレベルＶＬ２で示すように、点Ａ０より２０デシベル以上大きくないときには、システムコントローラ１７は、ニーポイントＡを、点Ａ０から、そのときの特性曲線上の、算出された入力音声信号レベルＶＬ２より２０デシベル小さい点Ａ１０に移動させた上で、音量を小さくする指示がなくなるまで、点Ａ１１で示すように低出力レベル方向に移動させ、あるいは、点Ａ０から、音量を小さくする指示がなくなるまで、点Ａ１２で示すように低出力レベル方向に移動させる。
【００６１】
音声信号は、放送前または記録時、あらかじめ音量調整されているので、音声レベルが大きすぎることや、小さすぎることは少ない。したがって、ニーポイントＡの移動は、限られた範囲にすることができ、あらかじめ、その範囲を設定しておくことができる。
【００６２】
また、このニーポイントＡの移動範囲が、受信または再生された音声信号の基準信号レベルに応じて変更されるように、受信装置または再生装置を構成することもできる。音声信号の基準信号レベルは、付加情報として放送信号または記録信号に多重化されてもよく、あるいは受信装置または再生装置で１０秒程度というような長時間の平均レベルとして検出してもよい。
【００６３】
また、視聴者が自分の聴力に応じてニーポイントＡの移動範囲を変更できるように、受信装置または再生装置を構成することもできる。
【００６４】
同様に、上限ポイントＢも、受信または再生された音声信号の基準信号レベルや視聴者の聴力に応じて変更できるように、受信装置または再生装置を構成することができる。
【００６５】
視聴者は、音量調整をしたとき、操作入力部１６での操作によって、その調整状態を保存すべきことを、システムコントローラ１７に指示することができる。システムコントローラ１７は、保存を指示されたとき、そのときのニーポイントＡ（Ａｘ，Ａｙ）および上限ポイントＢ（Ｂｘ，Ｂｙ）を、音声処理パラメータとして記憶装置１８に記録する。
【００６６】
これによれば、別の放送番組を受信し、別のディスクを再生するなど、別のコンテンツを受信または再生するとき、システムコントローラ１７が、記憶装置１８から、その音声処理パラメータを読み出して、音声処理部５３での音声処理を制御することによって、視聴者は、別のコンテンツを視聴する際、改めて音量調整をしなくても、聞き取りやすい音量で音声を聞き取ることができる。
【００６７】
また、記憶装置１８としてメモリカードを用い、その音声処理パラメータが記録されたメモリカードを、別の受信装置または再生装置に装着して、別の受信装置または再生装置のシステムコントローラに、その音声処理パラメータを読み込ませることによって、別の受信装置または再生装置でも、視聴者は、音量調整をすることなく、聞き取りやすい音量で音声を聞き取ることができる。
【００６８】
さらに、図９に示すように、音声処理パラメータとしてのニーポイントＡ（Ａｘ，Ａｙ）および上限ポイントＢ（Ｂｘ，Ｂｙ）が、コンテンツ識別コードと対応づけられて、さらに同一コンテンツ内でシーンや期間によって異なる音量調整がなされたときには、そのシーンや期間を特定するシーン識別コードやプレゼンテーションタイムスタンプと対応づけられて、記憶装置１８に記録されるように構成することもできる。
【００６９】
これによれば、同じコンテンツやシーンを受信または再生するとき、システムコントローラ１７が、記憶装置１８から、そのコンテンツやシーンに対応する音声処理パラメータを読み出して、音声処理部５３での音声処理を制御することによって、視聴者は、同じコンテンツやシーンを視聴する際、改めて音量調整をしなくても、そのコンテンツやシーンに最適な音量で音声を聞き取ることができる。
【００７０】
上述した例は、特に話声期間で、音声処理を制御し、音量を調整する場合であるが、話声期間以外の期間、例えば、音楽が主体または音楽のみの期間についても、例えば、話声期間で得られた音声処理パラメータによって、話声期間と同様に、音声処理が制御され、音量が調整されるようにすることができる。この場合には、話声期間以外の期間、例えば、音楽が主体または音楽のみの期間についても、視聴者は聞きやすい音量で音声を聞くことができる。
【００７１】
〔他の実施形態〕
放送側で、または光ディスクなどの記録媒体への記録時、音声情報については、例えば、話声が主体または話声のみの期間である話声期間、音楽が主体または音楽のみの期間である音楽期間、および話声に対してＢＧＭ（ＢａｃｋｇｒｏｕｎｄＭｕｓｉｃ）が合成された期間であるＢＧＭ期間を区別し、それぞれの期間を示す識別情報を多重化して、コンテンツを放送または記録し、受信装置または再生装置では、視聴者によって音量調整がなされない状態では、話声期間、音楽期間およびＢＧＭ期間で、あらかじめ設定された、それぞれの期間に適する音声処理パラメータによって音声処理が制御されるように、システムを構成することもできる。
【００７２】
また、上述した実施形態は、コンテンツが映像情報と音声情報を含む場合であるが、この発明は、コンテンツが映像情報を含まない場合にも適用することができる。
【００７３】
【発明の効果】
上述したように、この発明によれば、コンテンツ受信方法において、少なくとも音声情報を含む一連の情報であるコンテンツに、このコンテンツ中の音声情報の話声が主体または話声のみの期間である話声期間を示す話声期間識別情報が多重化された多重化信号を受信し、当該受信した多重化信号から、コンテンツおよび話声期間識別情報を分離し、当該多重化信号から分離したコンテンツを再生して、音声信号を再生すると共に、多重化信号から分離した話声期間識別情報によって特定される話声期間において、再生した音声信号を、低入力レベル領域での入出力レベル変換を定める設定値と高入力レベル領域での入出力レベル変換を定める設定値とで設定されるレベル変換特性によってレベル変換し、操作入力部を介して音量調整の指示が行われると、当該指示に従ってレベル変換特性における低入力レベル領域と高入力レベル領域の境界点の入出力レベルを変更して、操作入力部を介して音量の調整状態を保存する指示が行われると、そのときの境界点の入出力レベルと、そのときのコンテンツを示すコンテンツ識別コード及びそのときのシーンを示すシーン識別コードとを対応付けて記憶手段に記録しておき、多重化信号を再び受信してコンテンツおよび話声期間識別情報を分離し、当該多重化信号から分離したコンテンツを再生して、音声信号を再生すると、当該再生しているコンテンツ及びシーンと対応付けられている境界点の入出力レベルを記憶手段から読み出し、多重化信号から分離した話声期間識別情報によって特定される話声期間において、再生した音声信号を、読み出した境界点の入出力レベルによって定められるレベル変換特性によってレベル変換するようにしたことにより、コンテンツ中の音声情報自体から話声とその他の音声を区別しなくても、話声を聞き取りやすくすることができると共に、視聴者が音量調整をしたときの調整結果を音声処理パラメータとして保存しておくことで、同じコンテンツを再び視聴するときの音声処理パラメータとして使用することができ、視聴者は一回の音量調整によって、常に聞き取りやすい音量で音声を聞き取ることができる。
【００７４】
また、この発明によれば、コンテンツ再生方法において、少なくとも音声情報を含む一連の情報であるコンテンツに、このコンテンツ中の音声情報の話声が主体または話声のみの期間である話声期間を示す話声期間識別情報が多重化された多重化信号を記録媒体から読み取り、当該読み取った多重化信号から、コンテンツおよび話声期間識別情報を分離し、当該多重化信号から分離したコンテンツを再生して、音声信号を再生すると共に、多重化信号から分離した話声期間識別情報によって特定される話声期間において、再生した音声信号を、低入力レベル領域での入出力レベル変換を定める設定値と高入力レベル領域での入出力レベル変換を定める設定値とで設定されるレベル変換特性によってレベル変換し、操作入力部を介して音量調整の指示が行われると、当該指示に従って、レベル変換特性における低入力レベル領域と高入力レベル領域の境界点の入出力レベルを変更して、操作入力部を介して音量の調整状態を保存する指示が行われると、そのときの境界点の入出力レベルと、そのときのコンテンツを示すコンテンツ識別コード及びそのときのシーンを示すシーン識別コードとを対応付けて記憶手段に記録しておき、多重化信号を記録媒体から再び読み取ってコンテンツおよび話声期間識別情報を分離しし、当該多重化信号から分離したコンテンツを再生して、音声信号を再生すると、当該再生しているコンテンツ及びシーンと対応付けられている境界点の入出力レベルを記憶手段から読み出し、多重化信号から分離した話声期間識別情報によって特定される話声期間において、再生した音声信号を、読み出した境界点の入出力レベルによって定められるレベル変換特性によってレベル変換するようにしたことにより、コンテンツ中の音声情報自体から話声とその他の音声を区別しなくても、話声を聞き取りやすくすることができると共に、視聴者が音量調整をしたときの調整結果を音声処理パラメータとして保存しておくことで、同じコンテンツを再び視聴するときの音声処理パラメータとして使用することができ、視聴者は一回の音量調整によって、常に聞き取りやすい音量で音声を聞き取ることができる。
【００７５】
さらに、視聴者が音量調整をしたときには、音声信号レベルを算出し、その算出結果に応じて音声処理パラメータを変更することによって、聞き取りやすい音量で音声を聞き取ることができる。
【００７６】
さらに、調整結果の音声処理パラメータを、外部記憶媒体に保存して、別の受信装置または再生装置に読み込ませることによって、別の受信装置または再生装置でも、同じ特性で音声信号を処理して、音声を聞き取りやすくすることができる。
【図面の簡単な説明】
【図１】この発明のコンテンツ受信装置の一実施形態を示す図である。
【図２】この発明のコンテンツ再生装置の一実施形態を示す図である。
【図３】音声処理部の入出力特性の一例を示す図である。
【図４】音声処理部の入出力特性の一例を示す図である。
【図５】音量を大きくする場合の音声処理部の入出力特性の一例を示す図である。
【図６】音量を大きくする場合の音声処理部の入出力特性の一例を示す図である。
【図７】音量を小さくする場合の音声処理部の入出力特性の一例を示す図である。
【図８】音量を小さくする場合の音声処理部の入出力特性の一例を示す図である。
【図９】音声処理パラメータを保存する場合の説明に供する図である。
【符号の説明】
主要部については図中に全て記述したので、ここでは省略する。
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a method and apparatus for receiving content, that is, a series of information including at least audio information, distributed by broadcasting or over a network, and to a method and apparatus for reproducing such content from a recording medium such as an optical disc.
[0002]
In the present invention, content is defined as a series of information including at least audio information, such as video information (image information) together with audio information (acoustic information), or audio information alone. Audio information covers every sound that a person can perceive by hearing, such as human speech (spoken voice), music, natural sounds, and noises made by objects.
[0003]
[Prior art]
Apparatuses that receive or reproduce content, for example content including video information and audio information, such as television receivers and DVD (Digital Versatile Disc) players, have a volume adjustment function and a protection function for the case where the volume exceeds a predetermined level, so that the volume can be set to a moderate level and the equipment and the listener's hearing are protected from excessive volume.
[0004]
[Problems to be solved by the invention]
The audio information in content includes various sounds, such as human speech, music, natural sounds, and noises made by objects, and speech itself varies, from quiet voices far away to loud voices nearby.
[0005]
However, even when receiving or reproducing content that contains audio information of such varied types and levels, most conventional receiving or reproducing apparatuses fix the volume at some suitable level and output the content from beginning to end at that level. The viewer (user) can change the volume setting only on feeling that the volume is too low or too high.
[0006]
For this reason, the audio information in the content cannot be fully enjoyed unless the viewing environment is sufficiently quiet, or unless the viewer is young and has hearing that can handle a wide dynamic range.
[0007]
That is, in a noisy viewing environment, quiet sounds are masked by the noise, and with quiet speech in particular it becomes impossible to make out what is being said. Raising the volume to compensate makes the speech louder and easier to hear, but makes the sound too loud at other times, such as while music is playing.
[0008]
In addition, when the viewer is an elderly person with presbycusis (age-related hearing loss), for whom quiet sounds are hard to hear and loud sounds are perceived as distorted, speech is hard to hear unless the volume is constantly adjusted to match the audio level, which is inconvenient.
[0009]
Selectively amplifying audio according to its type has also been considered. However, it is difficult to distinguish speech from other sounds accurately, so misrecognition can occur. Moreover, for viewers with hearing such as presbycusis, simply amplifying speech does not make it easier to hear, so an adequate remedy cannot be expected.
[0010]
Accordingly, the present invention makes speech easier to hear when content, that is, a series of information including at least audio information, is received or reproduced, without having to distinguish speech from other sounds on the basis of the audio information itself in the content.
[0011]
[Means for Solving the Problems]
In the content receiving method of the present invention, a multiplexed signal is received in which speech period identification information, indicating a speech period in which the audio information in the content consists mainly or solely of speech, is multiplexed with content that is a series of information including at least audio information. The content and the speech period identification information are separated from the received multiplexed signal, and the content separated from the multiplexed signal is reproduced to reproduce an audio signal. In the speech period specified by the speech period identification information separated from the multiplexed signal, the reproduced audio signal is level-converted by a level conversion characteristic set by a setting value that determines input/output level conversion in a low input level region and a setting value that determines input/output level conversion in a high input level region. When a volume adjustment instruction is given via an operation input unit, the input/output level of the boundary point between the low input level region and the high input level region in the level conversion characteristic is changed in accordance with that instruction. When an instruction to save the volume adjustment state is given via the operation input unit, the input/output level of the boundary point at that time is recorded in storage means in association with a content identification code indicating the content at that time and a scene identification code indicating the scene at that time. When the multiplexed signal is received again, the content and the speech period identification information are separated, and the content separated from the multiplexed signal is reproduced to reproduce the audio signal, the input/output level of the boundary point associated with the content and scene being reproduced is read from the storage means, and, in the speech period specified by the speech period identification information separated from the multiplexed signal, the reproduced audio signal is level-converted by the level conversion characteristic determined by the read input/output level of the boundary point.
[0012]
In the content reproducing method of the present invention, a multiplexed signal is read from a recording medium, in which speech period identification information, indicating a speech period in which the audio information in the content consists mainly or solely of speech, is multiplexed with content that is a series of information including at least audio information. The content and the speech period identification information are separated from the read multiplexed signal, and the content separated from the multiplexed signal is reproduced to reproduce an audio signal. In the speech period specified by the speech period identification information separated from the multiplexed signal, the reproduced audio signal is level-converted by a level conversion characteristic set by a setting value that determines input/output level conversion in a low input level region and a setting value that determines input/output level conversion in a high input level region. When a volume adjustment instruction is given via an operation input unit, the input/output level of the boundary point between the low input level region and the high input level region in the level conversion characteristic is changed in accordance with that instruction. When an instruction to save the volume adjustment state is given via the operation input unit, the input/output level of the boundary point at that time is recorded in storage means in association with a content identification code indicating the content at that time and a scene identification code indicating the scene at that time. When the multiplexed signal is read from the recording medium again, the content and the speech period identification information are separated, and the content separated from the multiplexed signal is reproduced to reproduce the audio signal, the input/output level of the boundary point associated with the content and scene being reproduced is read from the storage means, and, in the speech period specified by the speech period identification information separated from the multiplexed signal, the reproduced audio signal is level-converted by the level conversion characteristic determined by the read input/output level of the boundary point.
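The save-and-recall step common to both methods can be pictured with a minimal sketch. It assumes the storage means behaves like a key-value table keyed by the pair of content identification code and scene identification code. The class name BoundaryPointStore, the string-typed codes, and the in-memory dictionary are illustrative assumptions and are not taken from the specification.

```python
from typing import Dict, Optional, Tuple

# (input level, output level) of the boundary point between the low and
# high input level regions. The numeric scale is an assumption.
BoundaryPoint = Tuple[float, float]


class BoundaryPointStore:
    """Hypothetical stand-in for the storage means (in-memory only)."""

    def __init__(self) -> None:
        self._saved: Dict[Tuple[str, str], BoundaryPoint] = {}

    def save(self, content_id: str, scene_id: str, point: BoundaryPoint) -> None:
        # Called when the viewer instructs that the adjustment state be saved.
        self._saved[(content_id, scene_id)] = point

    def load(self, content_id: str, scene_id: str) -> Optional[BoundaryPoint]:
        # Called when the same content and scene are received or reproduced
        # again; returns None if nothing was saved for this pair.
        return self._saved.get((content_id, scene_id))


store = BoundaryPointStore()
store.save("prog-1", "scene-3", (0.15, 0.5))
recalled = store.load("prog-1", "scene-3")
```

When load returns None, a device following this sketch would presumably fall back to its preset characteristic.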
[0013]
DETAILED DESCRIPTION OF THE INVENTION
[System Configuration of Receiving Device and Reproducing Device ... FIGS. 1 and 2]
(System configuration of receiving apparatus: Fig. 1)
FIG. 1 shows an embodiment of a content receiving apparatus according to the present invention, which is a case of a broadcast receiving apparatus that receives a digital television broadcast.
[0014]
In the broadcast in this case, the content includes the video information and audio information of a program, and additional information is multiplexed with the content. The additional information includes a content identification code, which is information identifying the content (program); a scene identification code, which is information identifying each scene or a specific scene of the program; and speech period identification information, which indicates a speech period in which the audio information of the program consists mainly or solely of speech.
[0015]
Specifically, video data and audio data are compression-encoded by the MPEG (Moving Picture Experts Group) method or the like and multiplexed, the additional information data is encoded and multiplexed separately from the video/audio data stream, and the whole is modulated and broadcast.
[0016]
The additional information data consists of a header part, which has a code indicating the type of the additional information (for example, whether it is a content identification code, a scene identification code, or speech period identification information), followed by a data part containing the content identification code, scene identification code, speech period identification information, or the like.
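As an illustration of the header-plus-data layout described above, the following sketch parses a run of additional information units. The specification does not give a byte layout, so everything concrete here is an assumption made for the example: a 1-byte type code, a 2-byte big-endian length, the numeric values of the type codes, and the names InfoType and parse_additional_info.

```python
import struct
from enum import IntEnum
from typing import Iterator, Tuple


class InfoType(IntEnum):
    # Hypothetical type codes; the source gives no numeric values.
    CONTENT_ID = 1
    SCENE_ID = 2
    SPEECH_PERIOD = 3


def parse_additional_info(buf: bytes) -> Iterator[Tuple[InfoType, bytes]]:
    """Yield (type, data part) pairs from concatenated units.

    Assumed unit layout: 1-byte type code, 2-byte big-endian length,
    then `length` bytes of data. Fewer than 3 trailing bytes are ignored.
    """
    pos = 0
    while pos + 3 <= len(buf):
        type_code, length = struct.unpack_from(">BH", buf, pos)
        pos += 3
        data = buf[pos:pos + length]
        if len(data) < length:
            raise ValueError("truncated data part")
        pos += length
        # An unknown type code raises ValueError from the enum lookup.
        yield InfoType(type_code), data


sample = bytes([1, 0, 2]) + b"AB" + bytes([3, 0, 1]) + b"\x01"
units = list(parse_additional_info(sample))
```

A detector for one kind of information would then simply filter the yielded pairs by type code.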
[0017]
The broadcast signal is multiplexed with decoding time information such as a decoding time stamp and content presentation time information such as a presentation time stamp.
[0018]
In the channel selection receiving unit 19, a broadcast signal is selected and received by channel selection control of the system controller 17 based on the channel selection operation of the viewer using the operation input unit 16. The channel-received signal is demodulated by the demodulation error correction unit 21, corrected for error, written to the buffer 22, and read from the buffer 22.
[0019]
The signal read from the buffer 22 is supplied to the demultiplexer 23, which separates it into encoded video data, caption data, audio data, and additional information data.
[0020]
The video data, caption data, and audio data are written to the video code buffer 31, the caption code buffer 41, and the audio code buffer 51, respectively, read from those buffers, and then decoded by the video decoder 32, the caption decoder 42, and the audio decoder 52, respectively.
[0021]
The system controller 17 controls the decoding timing in each of the decoders 32, 42, and 52 based on timing information such as the decoding time stamp mentioned above, and controls the output timing in each of the decoders 32, 42, and 52 based on timing information such as the presentation time stamp mentioned above, so that the data from the decoders 32, 42, and 52 are aligned in time.
[0022]
The video data from the video decoder 32 and the caption data from the caption decoder 42 are processed by the video processing unit 33, and the video processing unit 33 superimposes the caption signal in the video signal.
[0023]
The video signal output from the video processing unit 33 is led to the video output terminal 34 and is sent from the video output terminal 34 to a video display device 35 such as a CRT display or a liquid crystal video projector.
[0024]
The video signal is either sent as digital video data, without being converted into an analog video signal by the video processing unit 33, to a video display device 35 that has a D/A (Digital to Analog) converter, or converted into an analog video signal by the video processing unit 33 and then sent to the video display device 35.
[0025]
Audio data from the audio decoder 52 is processed by the audio processing unit 53. The audio signal output from the audio processing unit 53 is led to the audio output terminal 54 and sent from the audio output terminal 54 to an audio output device 55 such as a speaker or headphones.
[0026]
Likewise, the audio signal is either sent as digital audio data, without being converted into an analog audio signal by the audio processing unit 53, to an audio output device 55 that has a D/A converter, or converted into an analog audio signal by the audio processing unit 53 and then sent to the audio output device 55.
[0027]
The additional information data separated by the demultiplexer 23 is written to the additional information code buffer 61, read from the additional information code buffer 61, and then sent to the speech period identification information detection unit 62 and the identification code detection unit 63.
[0028]
In the speech period identification information detection unit 62, the speech period identification information is detected by means of the type code in the header part of the additional information data, and the detected speech period identification information is taken into the system controller 17 and decoded.
[0029]
In the identification code detection unit 63, the content identification code and the scene identification code are detected by means of the type code in the header part of the additional information data, and the detected content identification code and scene identification code are taken into the system controller 17 and decoded.
[0030]
When the speech period identification information is detected, the system controller 17 controls the video processing unit 33 to present on the display screen of the video display device 35 that the scene is the speech period. For example, a mark or a character indicating that it is a speech period is superimposed in the video of the scene.
[0031]
When the viewer sees this display and wants to make the audio easier to hear, the viewer operates the operation input unit 16 to adjust the volume so that the audio becomes easier to hear, as described later. In response, the system controller 17 controls the audio processing unit 53 to perform audio processing that makes the audio easier to hear, as described later.
[0032]
The receiving device may be configured such that the presentation of the speech period is switched on / off by the viewer through a setting operation on the operation input unit 16. Furthermore, the receiving device may be configured so that the speech period is not presented.
[0033]
Even when the presentation is switched off, or when the apparatus is configured not to present the speech period at all, if the viewer listens to the audio output from the audio output device 55 and adjusts the volume so as to make the audio easier to hear, the system controller 17 controls the audio processing unit 53 to perform audio processing that makes the audio easier to hear.
[0034]
When the viewer makes no particular volume adjustment during the speech period, the system controller 17 controls the audio processing unit 53 to perform audio processing according to preset audio processing parameters, as described later.
[0035]
The system controller 17 is connected to a storage device 18, into which the audio processing parameters resulting from the viewer's volume adjustment are written, as described later. The storage device 18 is a non-volatile memory built into the receiving apparatus, such as a flash memory or an EEPROM, or an external storage medium such as a memory card, magnetic disk, optical disc, or magneto-optical disc.
[0036]
In FIG. 1, the speech period identification information detection unit 62, the identification code detection unit 63, and the system controller 17 are shown as functionally separate, but the functions of the speech period identification information detection unit 62 and the identification code detection unit 63 can also be implemented as part of the functions of the system controller 17.
[0037]
(System configuration of the playback device ... Fig. 2)
FIG. 2 shows an embodiment of the content reproduction apparatus of the present invention, which is an optical disk reproduction apparatus.
[0038]
On the optical disc 11, additional information is recorded by being multiplexed with the content. The content in this case includes video information and audio information, and the additional information is the above-described content identification code, scene identification code, speech period identification information, and the like.
[0039]
Specifically, video data and audio data are compression-encoded by the MPEG method or the like and multiplexed, the additional information data is encoded and multiplexed separately from the video/audio data stream, and the whole is modulated and recorded on the optical disc 11.
[0040]
As described above, the additional information data consists of a header part having a code indicating the type of the additional information, followed by a data part containing the content identification code, scene identification code, speech period identification information, or the like.
[0041]
Decoding time information such as a decoding time stamp and content presentation time information such as a presentation time stamp are also recorded on the optical disc 11.
[0042]
The optical disc 11 is driven by a disc motor 13. The optical head (pickup) 14 is driven by a drive unit 15 that includes a feed motor and a two-axis actuator for tracking and focusing.
[0043]
The system controller 17 instructs the drive unit 15 to play the optical disk 11 by the playback operation of the viewer using the operation input unit 16, and a signal is read from the optical disk 11 by the optical head 14. The read signal is demodulated by the demodulation error correction unit 21, corrected for error, written to the buffer 22, and read from the buffer 22.
[0044]
The signal read from the buffer 22 is supplied to the demultiplexer 23, which separates it into encoded video data, caption data, audio data, and additional information data. The rest of the configuration is the same as in the broadcast receiving apparatus of FIG. 1.
[0045]
[Speech processing and volume adjustment ... FIGS. 3 to 9]
As described above, in the broadcast receiving apparatus of FIG. 1 or the optical disc reproduction apparatus of FIG. 2, when the viewer has not adjusted the volume, audio processing is performed in the speech period specified by the speech period identification information according to audio processing parameters set in advance.
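The gating described above, in which level conversion is applied only inside speech periods, might be sketched as follows. This is a minimal illustration, assuming the speech period identification information has already been decoded into sorted, non-overlapping (start, end) intervals on the presentation time axis; that representation and the names `in_speech_period` and `process_block` are assumptions, not taken from the specification.

```python
from bisect import bisect_right

def in_speech_period(pts, periods):
    """Return True if presentation time `pts` lies in a speech period.

    `periods` is an assumed representation: a sorted list of
    non-overlapping half-open intervals (start, end) in seconds.
    """
    starts = [start for start, _ in periods]
    i = bisect_right(starts, pts) - 1  # last period starting at or before pts
    return i >= 0 and pts < periods[i][1]

def process_block(level_db, pts, periods, convert):
    """Apply `convert` only inside speech periods; pass other audio through."""
    if in_speech_period(pts, periods):
        return convert(level_db)
    return level_db
```

Here `convert` stands for whatever level conversion the audio processing unit applies, so the gating logic stays independent of the conversion characteristic itself.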
[0046]
FIG. 3 shows the audio processing characteristic in this case, that is, the input/output characteristic of the audio processing unit 53. This input/output characteristic is a non-linear level conversion characteristic that bends at a point A0 (Ax0, Ay0) and a point B0 (Bx0, By0). An input level between Ax0 and Bx0 is converted to an output level between Ay0 and By0, and when the input audio signal reaches a level of Bx0 or higher, the output audio signal is clipped to the level By0.
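A characteristic of this shape can be sketched as a piecewise-linear map between input and output levels. The sketch below works in decibels and assumes that the segment below point A is a straight line from a fixed floor point up to A; the floor point, the numeric values in the usage note, and the names `Point`, `FLOOR`, and `level_convert` are illustrative assumptions, since the specification gives only the bend points A and B and the clipping behavior.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: float  # input level in dB
    y: float  # output level in dB

# Assumed lower end of the characteristic; not given in the specification.
FLOOR = Point(-80.0, -80.0)

def _lerp(x, p, q):
    """Linear interpolation of the output level between points p and q."""
    return p.y + (x - p.x) / (q.x - p.x) * (q.y - p.y)

def level_convert(x_db, knee, upper, floor=FLOOR):
    """Map an input level to an output level using bend points A and B."""
    if x_db >= upper.x:   # at or above Bx: clip to By
        return upper.y
    if x_db >= knee.x:    # between Ax and Bx: map into Ay..By
        return _lerp(x_db, knee, upper)
    if x_db <= floor.x:   # below the assumed floor
        return floor.y
    return _lerp(x_db, floor, knee)  # low input level region
```

For example, with A0 = `Point(-40, -25)` and B0 = `Point(-10, -5)` (placeholder values), an input of -25 dB maps to -15 dB and any input at or above -10 dB is held at -5 dB. Under the floor-point assumption, moving A toward a lower input level or a higher output level steepens the low segment, which matches the increase in amplification the text attributes to such moves.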
[0047]
In the following, the low-level bend point A, such as the point A0, is called the knee point, and the high-level bend point B, such as the point B0, is called the upper limit point.
[0048]
With such a level conversion characteristic, as described later, when the viewer adjusts the volume, the system controller 17 changes the knee point A, that is, the input/output level of the knee point A, accordingly. The viewer can thus obtain easily audible sound simply by instructing a volume change.
[0049]
For example, if the knee point A is moved from the point A0 to the point A1 in the low input level direction, the amplification factor in the low input level region can be further increased.
[0050]
Conversely, when the knee point A is moved from the point A0 to the point A2 in the high input level direction, the amplification factor in the low input level region decreases, and the input level region in which the output level does not reach Ay0 is expanded.
[0051]
If the knee point A is moved from the point A0 to the point A3 in the high output level direction, the amplification factor in the low input level region can be further increased.
[0052]
On the other hand, as shown in FIG. 4, if the upper limit point B is moved from the point B0 to the point B1 in the high output level direction, the amplification factor in the high input level region can be increased. If the upper limit point B is moved from the point B0 to the point B2 in the high input level direction, the amplification factor in the high input level region can be reduced, and audio signals with higher input levels can be kept from being clipped.
[0053]
As described above, input levels between the knee point A and the upper limit point B are converted to output levels in the preferable range (between Ay0 and By0), so the audio signal level should normally lie between the knee point A and the upper limit point B.
[0054]
When the audio signal level is taken to be the effective (RMS) value of the signal, the difference between the peak value of the audio signal and the audio signal level is called the peak factor of the audio signal, and it is usually about 15 decibels. To hear speech clearly, a dynamic range of about 30 to 40 decibels is necessary. Therefore, taking the peak factor into account, it is desirable that the input level of the knee point A be at least 15 to 25 decibels lower than the input audio signal level. In the following example, the input level of the knee point A is set to a level 20 decibels lower than the input audio signal level.
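The arithmetic behind these figures can be checked with a short sketch. It reads the paragraph as follows: the range from the knee point up to the signal peak should span 30 to 40 decibels, and since the peak sits about 15 decibels above the signal level, the knee must sit 15 to 25 decibels below it. The function names and the example level of -30 dB are illustrative assumptions.

```python
PEAK_FACTOR_DB = 15.0  # typical peak-to-RMS difference stated in the text
KNEE_MARGIN_DB = 20.0  # margin used in the specification's example

def knee_input_level(signal_level_db, margin_db=KNEE_MARGIN_DB):
    """Input level of knee point A, placed `margin_db` below the signal level."""
    return signal_level_db - margin_db

def covered_range_db(signal_level_db, knee_x_db, peak_factor_db=PEAK_FACTOR_DB):
    """Span in dB from the knee input level up to the signal peak."""
    return (signal_level_db + peak_factor_db) - knee_x_db
```

With a signal level of -30 dB the knee lands at -50 dB and the covered span is 35 decibels, in the middle of the 30 to 40 decibel range; margins of 15 and 25 decibels give spans of 30 and 40 decibels respectively.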
[0055]
When the operation input unit 16 is operated to increase the volume, the system controller 17 calculates the level of the audio signal input to the audio processing unit 53. The audio signal level in this case is not an instantaneous level but an average level over a short time on the order of several seconds.
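A short-time average level of this kind might be computed as an RMS value over a window of samples, expressed in decibels. The sketch below assumes samples normalized so that full scale is 1.0 and leaves the window length to the caller; the function name and the small `eps` guard against silence are assumptions added for illustration.

```python
import math

def rms_level_db(samples, eps=1e-12):
    """Average (RMS) level in dB of a window of samples, full scale = 1.0."""
    if not samples:
        raise ValueError("window must contain at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    # eps keeps log10 defined for an all-zero (silent) window
    return 10.0 * math.log10(max(mean_square, eps))
```

Passing the samples of the last few seconds gives the averaged level; an instantaneous level would correspond to a window of a single sample.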
[0056]
When the input audio signal level is 20 decibels or more above the input level of the point A0, as indicated by the level VL1 in FIG. 5, the system controller 17 either moves the knee point A from the point A0 in the high output level direction, as indicated by a point A3, until the instruction to increase the volume ceases, or first moves it from the point A0 along the characteristic curve at that time to a point A4 that is 20 decibels lower than the calculated input audio signal level VL1 and then moves it in the high output level direction, as indicated by a point A5, until the instruction to increase the volume ceases.
[0057]
When the input audio signal level is less than 20 decibels above the input level of the point A0, as indicated by the level VL2 in FIG. 6, the system controller 17 first moves the knee point A from the point A0 in the low input level direction to a point A6 that is 20 decibels lower than the calculated input audio signal level VL2, and then moves it in the high output level direction, as indicated by a point A7, until the instruction to increase the volume ceases.
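The volume-increase behavior of FIGS. 5 and 6 might be sketched as two steps: reposition the knee point so that its input level is 20 decibels below the measured signal level, then raise its output level while the instruction continues. The sketch implements only the second option of the FIG. 5 case (slide along the current curve, then raise). It assumes that a move "in the low input level direction" keeps the output level unchanged, and that the output level of A is capped at that of B; these assumptions and the names `Point`, `reposition_knee`, and `raise_knee` are not from the specification.

```python
from dataclasses import dataclass

MARGIN_DB = 20.0  # knee input level sits this far below the signal level

@dataclass(frozen=True)
class Point:
    x: float  # input level in dB
    y: float  # output level in dB

def reposition_knee(knee, upper, signal_level_db, margin_db=MARGIN_DB):
    """Move knee point A so that its input level is signal level minus margin."""
    target_x = min(signal_level_db - margin_db, upper.x)
    if target_x > knee.x:
        # FIG. 5 case: slide along the current A-B segment (A0 -> A4)
        t = (target_x - knee.x) / (upper.x - knee.x)
        return Point(target_x, knee.y + t * (upper.y - knee.y))
    # FIG. 6 case: move toward lower input level, output unchanged (A0 -> A6)
    return Point(target_x, knee.y)

def raise_knee(knee, upper, step_db):
    """One step in the high output level direction, capped at By (assumed)."""
    return Point(knee.x, min(knee.y + step_db, upper.y))
```

A controller loop would call `reposition_knee` once when the volume-up operation starts and `raise_knee` repeatedly until the operation ends. The volume-decrease cases of FIGS. 7 and 8 would mirror this with a step in the low output level direction.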
[0058]
When the operation input unit 16 is operated to decrease the volume, the system controller 17 likewise calculates the level of the audio signal input to the audio processing unit 53.
[0059]
When the input audio signal level is 20 decibels or more above the input level of the point A0, as indicated by the level VL1 in FIG. 7, the system controller 17 either first moves the knee point A from the point A0 along the characteristic curve at that time to a point A4 that is 20 decibels lower than the calculated input audio signal level VL1 and then moves it in the low output level direction, as indicated by a point A8, until the instruction to decrease the volume ceases, or moves it from the point A0 in the low output level direction, as indicated by a point A9, until the instruction to decrease the volume ceases.
[0060]
When the input audio signal level is less than 20 decibels above the input level of the point A0, as indicated by the level VL2 in FIG. 8, the system controller 17 either first moves the knee point A from the point A0 along the characteristic curve at that time to a point A10 that is 20 decibels lower than the calculated input audio signal level VL2 and then moves it in the low output level direction, as indicated by a point A11, until the instruction to decrease the volume ceases, or moves it from the point A0 in the low output level direction, as indicated by a point A12, until the instruction to decrease the volume ceases.
[0061]
Since the volume of the audio signal is adjusted in advance before broadcasting or at the time of recording, it is rare that the audio level is too high or too low. Therefore, the movement of the knee point A can be within a limited range, and the range can be set in advance.
[0062]
Further, the receiving apparatus or the reproducing apparatus can be configured so that the moving range of the knee point A is changed according to the reference signal level of the received or reproduced audio signal. The reference signal level of the audio signal may be multiplexed with the broadcast signal or the recording signal as additional information, or may be detected by the receiving apparatus or the reproducing apparatus as a long-term average level, for example over about 10 seconds.
[0063]
In addition, the receiving device or the reproducing device can be configured so that the viewer can change the moving range of the knee point A according to his / her hearing ability.
[0064]
Similarly, the receiving device or the reproducing device can be configured so that the upper limit point B can be changed according to the reference signal level of the received or reproduced audio signal or the hearing ability of the viewer.
[0065]
When the viewer adjusts the volume, the viewer can instruct the system controller 17 to save the adjustment state by operating the operation input unit 16. When instructed to save, the system controller 17 records the knee point A (Ax, Ay) and the upper limit point B (Bx, By) at that time in the storage device 18 as voice processing parameters.
[0066]
Accordingly, when other content is received or played back, such as when another broadcast program is received or another disc is played back, the system controller 17 reads the audio processing parameters from the storage device 18 and controls the audio processing in the audio processing unit 53, so that the viewer can listen at an easily audible volume without adjusting the volume again when viewing the other content.
[0067]
In addition, if a memory card is used as the storage device 18, the memory card on which the audio processing parameters are recorded can be attached to another receiving apparatus or reproducing apparatus, and the system controller of that apparatus can read the audio processing parameters, so that the viewer can listen at an easily audible volume on the other receiving apparatus or reproducing apparatus without adjusting the volume.
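Saving the adjustment state and reading it back on another apparatus amounts to serializing the two points A and B in a portable form. The sketch below uses JSON purely as an assumed interchange format; the specification does not state how the parameters are encoded on the storage device or memory card, and the names `dump_params` and `load_params` are hypothetical.

```python
import json

def dump_params(knee, upper):
    """Serialize knee point A and upper limit point B, each as (x, y) in dB."""
    return json.dumps({
        "knee": {"x": knee[0], "y": knee[1]},
        "upper": {"x": upper[0], "y": upper[1]},
    })

def load_params(text):
    """Restore (knee, upper) from text produced by dump_params."""
    data = json.loads(text)
    knee = (data["knee"]["x"], data["knee"]["y"])
    upper = (data["upper"]["x"], data["upper"]["y"])
    return knee, upper
```

Writing the returned string to the card and reading it on the other apparatus is left to the caller, which keeps the format independent of the storage medium.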
[0068]
Furthermore, as shown in FIG. 9, the knee point A (Ax, Ay) and the upper limit point B (Bx, By) serving as audio processing parameters can be recorded in the storage device 18 in association with a content identification code and, when the volume is adjusted differently for different scenes or periods within the same content, in association with a scene identification code or a presentation time stamp that specifies the scene or period.
[0069]
According to this, when receiving or playing back the same content or scene, the system controller 17 reads the audio processing parameters corresponding to the content or scene from the storage device 18 and controls the audio processing in the audio processing unit 53. Thus, when viewing the same content or scene, the viewer can listen to the sound at the optimum volume for the content or scene without adjusting the volume again.
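The association of FIG. 9 can be pictured as a table keyed by content identification code and scene identification code. The sketch below adds a fallback order (scene-specific entry, then content-wide entry, then a general entry, then a preset default); that order is an assumption made for illustration, as are the class and method names.

```python
class ParamStore:
    """Audio processing parameters keyed by (content_id, scene_id)."""

    def __init__(self, default):
        self._default = default  # preset parameters used when nothing matches
        self._table = {}

    def save(self, params, content_id=None, scene_id=None):
        """Record params; None for an id means 'applies to any'."""
        self._table[(content_id, scene_id)] = params

    def lookup(self, content_id, scene_id):
        """Return the most specific saved parameters (assumed fallback order)."""
        for key in ((content_id, scene_id), (content_id, None), (None, None)):
            if key in self._table:
                return self._table[key]
        return self._default
```

For example, after saving one set of parameters for scene "S1" of content "C1" and another for content "C1" as a whole, a lookup for a different scene of "C1" returns the content-wide set.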
[0070]
The example described above is a case where the audio processing is controlled and the volume is adjusted specifically in the speech period. However, in a period other than the speech period, for example a period in which the audio is mainly or only music, the audio processing can also be controlled and the volume adjusted in the same way as in the speech period, according to the audio processing parameters obtained for the speech period. In this case, the viewer can listen at an easily audible volume even in a period other than the speech period, such as a period in which the audio is mainly or only music.
[0071]
[Other Embodiments]
On the broadcasting side, or when recording on a recording medium such as an optical disc, the audio information can be divided into, for example, a speech period in which the audio is mainly or only speech, a music period in which the audio is mainly or only music, and a BGM period in which BGM (background music) is mixed with speech, and identification information indicating each period can be multiplexed with the content that is broadcast or recorded. The system can then be configured so that, in a state where the viewer has not adjusted the volume, the audio processing is controlled by audio processing parameters set in advance for the speech period, the music period, and the BGM period.
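Selecting parameters by period type might look like the sketch below. The three period types come from the paragraph above; the numeric presets, the idea that a viewer-saved entry overrides a preset, and the names `Period`, `PRESETS`, and `params_for` are placeholders and assumptions for illustration only.

```python
from enum import Enum

class Period(Enum):
    SPEECH = "speech"
    MUSIC = "music"
    BGM = "bgm"  # background music mixed with speech

# Placeholder presets: (knee point A, upper limit point B), each (x, y) in dB.
PRESETS = {
    Period.SPEECH: ((-40.0, -25.0), (-10.0, -5.0)),
    Period.MUSIC: ((-50.0, -45.0), (-5.0, -5.0)),
    Period.BGM: ((-45.0, -30.0), (-10.0, -5.0)),
}

def params_for(period, user_params=None, presets=PRESETS):
    """Pick parameters for the current period type.

    A viewer-saved entry in `user_params` takes priority (assumption);
    None is returned when no parameters apply to the period.
    """
    if user_params and period in user_params:
        return user_params[period]
    return presets.get(period)
```

A return value of `None` can be treated by the caller as "leave the audio unprocessed", which covers periods that carry no identification information.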
[0072]
Moreover, although the embodiment described above is a case where the content includes video information and audio information, the present invention is also applicable when the content does not include video information.
[0073]
[Effects of the Invention]
As described above, according to the present invention, the content receiving method receives a multiplexed signal in which speech period identification information, indicating a speech period in which the audio information in the content is mainly or only speech, is multiplexed with content that is a series of information including at least audio information, separates the content and the speech period identification information from the received multiplexed signal, and reproduces the separated content to reproduce an audio signal. In the speech period specified by the separated speech period identification information, the reproduced audio signal is level-converted according to a level conversion characteristic set by a set value that determines input/output level conversion in the low input level region and a set value that determines input/output level conversion in the high input level region. When a volume adjustment instruction is given via the operation input unit, the input/output level of the boundary point between the low input level region and the high input level region in the level conversion characteristic is changed according to the instruction. When an instruction to save the volume adjustment state is given via the operation input unit, the input/output level of the boundary point at that time is recorded in the storage means in association with a content identification code indicating the content at that time and a scene identification code indicating the scene at that time. When the multiplexed signal is received again, the content and the speech period identification information are separated, and the separated content is reproduced to reproduce the audio signal, the input/output level of the boundary point associated with the content and scene being reproduced is read from the storage means, and in the speech period specified by the separated speech period identification information the reproduced audio signal is level-converted according to the level conversion characteristic determined by the read input/output level of the boundary point. Speech can therefore be made easier to hear without distinguishing speech from other sounds on the basis of the audio information in the content itself. In addition, because the adjustment result obtained when the viewer adjusts the volume is saved as an audio processing parameter, it can be used as the audio processing parameter when the same content is viewed again, so that after adjusting the volume once the viewer can always listen at an easily audible volume.
[0074]
Also, according to the present invention, the content reproduction method reads from a recording medium a multiplexed signal in which speech period identification information, indicating a speech period in which the audio information in the content is mainly or only speech, is multiplexed with content that is a series of information including at least audio information, separates the content and the speech period identification information from the read multiplexed signal, and reproduces the separated content to reproduce an audio signal. In the speech period specified by the separated speech period identification information, the reproduced audio signal is level-converted according to a level conversion characteristic set by a set value that determines input/output level conversion in the low input level region and a set value that determines input/output level conversion in the high input level region. When a volume adjustment instruction is given via the operation input unit, the input/output level of the boundary point between the low input level region and the high input level region in the level conversion characteristic is changed according to the instruction. When an instruction to save the volume adjustment state is given via the operation input unit, the input/output level of the boundary point at that time is recorded in the storage means in association with a content identification code indicating the content at that time and a scene identification code indicating the scene at that time. When the multiplexed signal is read from the recording medium again, the content and the speech period identification information are separated, and the separated content is reproduced to reproduce the audio signal, the input/output level of the boundary point associated with the content and scene being reproduced is read from the storage means, and in the speech period specified by the separated speech period identification information the reproduced audio signal is level-converted according to the level conversion characteristic determined by the read input/output level of the boundary point. Speech can therefore be made easier to hear without distinguishing speech from other sounds on the basis of the audio information in the content itself. In addition, because the adjustment result obtained when the viewer adjusts the volume is saved as an audio processing parameter, it can be used as the audio processing parameter when the same content is viewed again, so that after adjusting the volume once the viewer can always listen at an easily audible volume.
[0075]
Furthermore, when the viewer adjusts the volume, the audio signal level is calculated and the audio processing parameters are changed according to the calculation result, so that the sound can be heard at an easily audible volume.
[0076]
Furthermore, by storing the audio processing parameters resulting from the adjustment in an external storage medium and having another receiving apparatus or reproducing apparatus read them, the other apparatus can process the audio signal with the same characteristic, making speech easy to hear there as well.
[Brief description of the drawings]
FIG. 1 is a diagram showing an embodiment of a content receiving apparatus according to the present invention.
FIG. 2 is a diagram showing an embodiment of a content reproduction apparatus according to the present invention.
FIG. 3 is a diagram illustrating an example of input / output characteristics of an audio processing unit.
FIG. 4 is a diagram illustrating an example of input / output characteristics of an audio processing unit.
FIG. 5 is a diagram illustrating an example of input / output characteristics of an audio processing unit when the volume is increased.
FIG. 6 is a diagram illustrating an example of input / output characteristics of an audio processing unit when the volume is increased.
FIG. 7 is a diagram illustrating an example of input / output characteristics of an audio processing unit when the volume is reduced.
FIG. 8 is a diagram illustrating an example of input / output characteristics of an audio processing unit when the volume is decreased.
FIG. 9 is a diagram for explaining the case of storing audio processing parameters.
[Explanation of symbols]
Since all the main parts are described in the figure, they are omitted here.

Claims

A content receiving method comprising:
a first receiving step of receiving a multiplexed signal in which speech period identification information, indicating a speech period in which the audio information in the content is mainly or only speech, is multiplexed with content that is a series of information including at least audio information;
a first separation step of separating the content and the speech period identification information from the received multiplexed signal;
a first reproduction step of reproducing an audio signal by reproducing the content separated from the multiplexed signal;
a first level conversion step of level-converting the reproduced audio signal, in the speech period specified by the speech period identification information separated from the multiplexed signal, according to a level conversion characteristic set by a set value that determines input/output level conversion in a low input level region and a set value that determines input/output level conversion in a high input level region;
a boundary point changing step of changing, when a volume adjustment instruction is given via an operation input unit, the input/output level of the boundary point between the low input level region and the high input level region in the level conversion characteristic according to the instruction;
a recording step of recording in storage means, when an instruction to save the volume adjustment state is given via the operation input unit, the input/output level of the boundary point at that time in association with a content identification code indicating the content at that time and a scene identification code indicating the scene at that time;
a second receiving step of receiving the multiplexed signal again;
a second separation step of separating the content and the speech period identification information from the multiplexed signal received again;
a second reproduction step of reproducing the audio signal by reproducing the content separated from the multiplexed signal; and
a second level conversion step of reading from the storage means the input/output level of the boundary point associated with the content and scene being reproduced and, in the speech period specified by the speech period identification information separated from the multiplexed signal, level-converting the reproduced audio signal according to a level conversion characteristic determined by the read input/output level of the boundary point.

The content receiving method according to claim 1, wherein, in the boundary point changing step, when the volume adjustment instruction is given via the operation input unit, the level of the reproduced audio signal is calculated and the input/output level of the boundary point is changed according to the instruction and the calculation result.

The content receiving method according to claim 1, further comprising a display control step of displaying on display means, in the speech period specified by the speech period identification information separated from the multiplexed signal, information indicating that the current period is a speech period.

A content reproduction method comprising:
a first reading step of reading from a recording medium a multiplexed signal in which speech period identification information, indicating a speech period in which the audio information in the content is mainly or only speech, is multiplexed with content that is a series of information including at least audio information;
a first separation step of separating the content and the speech period identification information from the read multiplexed signal;
a first reproduction step of reproducing an audio signal by reproducing the content separated from the multiplexed signal;
a first level conversion step of level-converting the reproduced audio signal, in the speech period specified by the speech period identification information separated from the multiplexed signal, according to a level conversion characteristic set by a set value that determines input/output level conversion in a low input level region and a set value that determines input/output level conversion in a high input level region;
a boundary point changing step of changing, when a volume adjustment instruction is given via an operation input unit, the input/output level of the boundary point between the low input level region and the high input level region in the level conversion characteristic according to the instruction;
a recording step of recording in storage means, when an instruction to save the volume adjustment state is given via the operation input unit, the input/output level of the boundary point at that time in association with a content identification code indicating the content at that time and a scene identification code indicating the scene at that time;
a second reading step of reading the multiplexed signal from the recording medium again;
a second separation step of separating the content and the speech period identification information from the multiplexed signal read again;
a second reproduction step of reproducing the audio signal by reproducing the content separated from the multiplexed signal; and
a second level conversion step of reading from the storage means the input/output level of the boundary point associated with the content and scene being reproduced and, in the speech period specified by the speech period identification information separated from the multiplexed signal, level-converting the reproduced audio signal according to a level conversion characteristic determined by the read input/output level of the boundary point.

The content reproduction method according to claim 4, wherein, in the boundary point changing step, when the volume adjustment instruction is given via the operation input unit, the level of the reproduced audio signal is calculated and the input/output level of the boundary point is changed according to the instruction and the calculation result.

The content reproduction method according to claim 4, further comprising a display control step of displaying on display means, in the speech period specified by the speech period identification information separated from the multiplexed signal, information indicating that the current period is a speech period.

A content receiving apparatus comprising:
receiving means for receiving a multiplexed signal in which speech period identification information, indicating a speech period in which the audio information in the content is mainly or only speech, is multiplexed with content that is a series of information including at least audio information;
separating means for separating the content and the speech period identification information from the multiplexed signal received by the receiving means;
reproducing means for reproducing an audio signal by reproducing the content separated from the multiplexed signal by the separating means;
audio processing means for level-converting the audio signal reproduced by the reproducing means, in the speech period specified by the speech period identification information separated from the multiplexed signal by the separating means, according to a level conversion characteristic set by a set value that determines input/output level conversion in a low input level region and a set value that determines input/output level conversion in a high input level region;
control means for changing, when a volume adjustment instruction is given via an operation input unit, the input/output level of the boundary point between the low input level region and the high input level region in the level conversion characteristic according to the instruction; and
recording means for recording in storage means, when an instruction to save the volume adjustment state is given via the operation input unit, the input/output level of the boundary point at that time in association with a content identification code indicating the content at that time and a scene identification code indicating the scene at that time,
wherein, when the multiplexed signal is received again by the receiving means, the content and the speech period identification information are separated from the multiplexed signal by the separating means, and the content is reproduced by the reproducing means to reproduce the audio signal, the audio processing means reads from the storage means the input/output level of the boundary point associated with the content and scene being reproduced by the reproducing means and, in the speech period specified by the speech period identification information separated from the multiplexed signal, level-converts the reproduced audio signal according to a level conversion characteristic determined by the read input/output level of the boundary point.

A content reproduction apparatus comprising:
reading means for reading from a recording medium a multiplexed signal in which speech period identification information, indicating a speech period in which the audio information in the content is mainly or only speech, is multiplexed with content that is a series of information including at least audio information;
separating means for separating the content and the speech period identification information from the multiplexed signal read from the recording medium by the reading means;
reproducing means for reproducing an audio signal by reproducing the content separated from the multiplexed signal by the separating means;
audio processing means for level-converting the audio signal reproduced by the reproducing means, in the speech period specified by the speech period identification information separated from the multiplexed signal by the separating means, according to a level conversion characteristic set by a set value that determines input/output level conversion in a low input level region and a set value that determines input/output level conversion in a high input level region;
control means for changing, when a volume adjustment instruction is given via an operation input unit, the input/output level of the boundary point between the low input level region and the high input level region in the level conversion characteristic according to the instruction; and
recording means for recording in storage means, when an instruction to save the volume adjustment state is given via the operation input unit, the input/output level of the boundary point at that time in association with a content identification code indicating the content at that time and a scene identification code indicating the scene at that time,
wherein, when the multiplexed signal is read again from the recording medium by the reading means, the content and the speech period identification information are separated from the multiplexed signal by the separating means, and the content is reproduced by the reproducing means to reproduce the audio signal, the audio processing means reads from the storage means the input/output level of the boundary point associated with the content and scene being reproduced by the reproducing means and, in the speech period specified by the speech period identification information separated from the multiplexed signal, level-converts the reproduced audio signal according to a level conversion characteristic determined by the read input/output level of the boundary point.