JP2006047561A

JP2006047561A - Audio signal encoding device and audio signal decoding device

Info

Publication number: JP2006047561A
Application number: JP2004226813A
Authority: JP
Inventors: Yasuhito Watanabe; 泰仁渡邊
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2004-08-03
Filing date: 2004-08-03
Publication date: 2006-02-16

Abstract

PROBLEM TO BE SOLVED: To provide an audio signal encoding device capable of performing efficient encoding processing, and an audio signal decoding device capable of decoding an encoded signal generated through such encoding processing.

SOLUTION: The audio signal encoding device is provided with an FM synthesis unit 12 which predicts spectrum data in the frequency domain by using an FM synthesis method capable of representing a complicated waveform with fewer parameters than linear prediction, a residual signal calculation unit 13 which finds a residual signal as the difference between the predicted signal and the original signal, and a quantization/encoding unit 14 which encodes the parameters and the residual signal, so that encoding processing which is more efficient than processing using linear prediction can be performed.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to an audio signal encoding device that encodes an audio signal, and to an audio signal decoding device that decodes an encoded bitstream.

Conventionally, the MPEG (Moving Picture Experts Group) Audio standards have been used as a method for encoding audio signals. The MPEG Audio standards include several schemes; in the MPEG-2 AAC and MPEG-4 AAC standards, TNS (Temporal Noise Shaping) can be used as a tool for increasing compression efficiency.

FIG. 10 is a block diagram of an encoding device compliant with the MPEG-2 audio standard AAC (Advanced Audio Coding) standardized in ISO/IEC 13818-7, and FIG. 11 is a block diagram of the TNS processing unit in the reference software of the MPEG-2 AAC audio standard.

In FIG. 10, the psychoacoustic model unit 110 analyzes the input audio signal according to human auditory characteristics and calculates signal-to-mask ratio (SMR) values. The gain control unit 111 is used only in the SSR profile; it divides the input signal into four equally spaced bands and controls the gain of every band except the lowest. The MDCT unit 112 converts the time-domain input audio signal into frequency-domain spectrum data. The TNS processing unit 113 controls the temporal shape of the quantization noise. The intensity/coupling unit 114 and the M/S stereo unit 116 are modules that process stereo signals efficiently by performing stereo correlation coding. The prediction unit 115 performs predictive coding. The normalization coefficient unit 117 calculates normalization coefficients, and the quantization unit 118 nonlinearly quantizes the audio signal on the basis of those coefficients. Each quantized output is encoded by the noiseless coding unit 119, and the multiplexer unit 120 forms the bitstream. The units from the MDCT unit 112 through the M/S stereo unit 116, which perform spectral processing, are collectively called the spectrum processing unit 121, and the units from the normalization coefficient unit 117 through the noiseless coding unit 119, which perform quantization and encoding, are collectively called the quantization/encoding unit 122.

Next, the operation of the TNS processing unit 113 will be described with reference to FIG. 11. In FIG. 11, the MDCT unit 131 converts the time-domain input audio signal into frequency-domain spectrum data. The linear prediction unit 132 performs linear prediction on the frequency-domain spectrum data. When the spectrum data can be predicted by linear prediction, the residual signal calculation unit 133 calculates the residual signal between the spectrum predicted by the linear prediction unit 132 and the spectrum data produced by the MDCT unit 131. The quantization/encoding unit 134 then converts this residual signal and the linear prediction coefficients into the output bitstream.

In this way, TNS processing reduces the variance of the MDCT coefficients and flattens the spectrum. Normally, quantization noise is distributed evenly over the whole time axis, but with TNS processing the quantization noise is concentrated where the waveform amplitude is large on the time axis. This reduces the sound-quality degradation known as pre-echo.

In decoding, the spectrum is calculated from the decoded residual signal and the linear prediction coefficients, and the audio signal is restored by applying the inverse MDCT to that spectrum.
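The prior-art TNS flow described above (prediction across the spectral coefficients, transmission of the residual and the prediction coefficients, reconstruction at the decoder) can be illustrated with a minimal Python sketch. It is not the AAC reference software: the predictor order, the autocorrelation method with the Levinson-Durbin recursion, the toy input data, and the function names `lpc_coefficients`, `tns_residual`, and `tns_reconstruct` are all assumptions chosen for illustration, and coefficient quantization is omitted.

```python
import math


def lpc_coefficients(x, order):
    """Predictor a[1..order] with x[n] ~ sum(a[k] * x[n-k]), by Levinson-Durbin."""
    r = [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # all-zero input or already perfectly predicted
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def tns_residual(spectrum, coeffs):
    """Encoder side: each coefficient minus its prediction from lower bins."""
    out = []
    for n, value in enumerate(spectrum):
        pred = sum(c * spectrum[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(value - pred)
    return out


def tns_reconstruct(residual, coeffs):
    """Decoder side: rebuild each coefficient from the residual and earlier outputs."""
    spectrum = []
    for n, e in enumerate(residual):
        pred = sum(c * spectrum[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        spectrum.append(e + pred)
    return spectrum


# Toy "spectrum" that is easy to predict; real MDCT data would replace this.
spectrum = [math.cos(0.3 * k) for k in range(64)]
coeffs = lpc_coefficients(spectrum, order=2)
residual = tns_residual(spectrum, coeffs)
restored = tns_reconstruct(residual, coeffs)
```

For this simple input the residual carries less energy than the spectrum itself, which is the situation in which prediction pays off; for an irregular spectrum the gain shrinks, which is the limitation the description goes on to raise.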

In addition, a speech encoding/decoding device has been proposed that compresses speech using such linear prediction and a residual signal, limiting the increase in the amount of transmitted information while improving pitch prediction accuracy (see, for example, Patent Document 1).
Patent Document 1: JP-A-9-081191

However, in the TNS processing of conventional audio signal encoding/decoding devices, prediction is effective for simple waveforms, but linear prediction fails for complex waveforms, so the benefit of TNS processing cannot be fully obtained.

The present invention has been made to solve this conventional problem, and its object is to provide an audio signal encoding device capable of performing efficient encoding processing, and an audio signal decoding device capable of decoding the resulting encoded signal.

The audio signal encoding device of the present invention comprises: time-frequency conversion means for converting a time-domain audio signal into the frequency domain; FM synthesis means for approximating the signal converted into the frequency domain by the time-frequency conversion means; residual signal calculation means for calculating the difference between the signal converted into the frequency domain by the time-frequency conversion means and the frequency-domain signal approximated by the FM synthesis means; and encoding means for encoding the frequency-domain residual signal calculated by the residual signal calculation means and the parameters used by the FM synthesis means.

With this configuration, spectrum data in the frequency domain is predicted using an FM synthesis method, which can represent a complex waveform with fewer parameters than linear prediction; a residual signal, which is the difference between the original signal and this predicted signal, is obtained; and the parameters and the residual signal are encoded. Encoding can therefore be performed more efficiently than with processing that uses linear prediction.

The audio signal encoding device of the present invention further comprises applied frequency calculation means for calculating the frequency range over which the residual signal calculation means obtains the residual signal, wherein the residual signal calculation means calculates the residual signal in the applied frequency band calculated by the applied frequency calculation means, and the encoding means encodes, as a parameter, the applied frequency band calculated by the applied frequency calculation means.

With this configuration, the frequency band in which the residual signal is calculated is specified, spectrum data in the frequency domain is predicted using an FM synthesis method that can represent a complex waveform with fewer parameters than linear prediction, and the residual signal is encoded only in the range where the difference from the FM-synthesized signal has a substantial effect. Unnecessary residual signals are therefore not encoded, so encoding is more efficient than processing that uses linear prediction, and encoding efficiency can be raised further without degrading playback quality.

Furthermore, the audio signal decoding device of the present invention comprises: decoding means for decoding an encoded frequency-domain residual signal and the parameters used for FM synthesis; FM synthesis means for performing FM synthesis using the parameters decoded by the decoding means; addition signal calculation means for adding the frequency-domain residual signal decoded by the decoding means and the frequency-domain signal output by the FM synthesis means; and frequency-time conversion means for converting the frequency-domain signal generated by the addition signal calculation means into a time-domain audio signal.

With this configuration, the audio signal can be reproduced by decoding the parameters of the FM-synthesized signal and the residual signal between the audio signal and the FM-synthesized signal, so the audio signal can be decoded from an encoded signal with a higher compression ratio than an encoded signal based on linear prediction.

Furthermore, the audio signal decoding device of the present invention comprises: decoding means for decoding an encoded frequency-domain residual signal, the parameters used for FM synthesis, and a parameter giving the frequency band in which the residual signal was calculated; FM synthesis means for performing FM synthesis using the FM synthesis parameters decoded by the decoding means; addition signal calculation means for adding, in the applied frequency band set by the frequency band parameter, the frequency-domain residual signal decoded by the decoding means and the frequency-domain signal output by the FM synthesis means; and frequency-time conversion means for converting the frequency-domain signal generated by the addition signal calculation means into a time-domain audio signal.

With this configuration, the audio signal can be reproduced by decoding the parameters of the FM-synthesized signal, the residual signal between the audio signal and the FM-synthesized signal, and the applied frequency information indicating the frequencies to which the residual signal applies. The audio signal can therefore be decoded from an encoded signal in which the residual signal is encoded only in the range where the difference from the FM-synthesized signal has a substantial effect, and whose compression ratio is notably higher than that of an encoded signal based on linear prediction.

By providing FM synthesis means that predicts frequency-domain spectrum data using an FM synthesis method capable of representing a complex waveform with fewer parameters than linear prediction, residual signal calculation means that obtains a residual signal as the difference between this predicted signal and the original signal, and encoding means that encodes the parameters and the residual signal, the present invention can provide an audio signal encoding/decoding device that performs encoding more efficiently than processing using linear prediction.

Hereinafter, an audio signal encoding device and an audio signal decoding device according to embodiments of the present invention will be described with reference to the drawings.

(First Embodiment)
An audio signal encoding device according to the first embodiment of the present invention will be described with reference to the block diagram in FIG. 1.

As shown in FIG. 1, the audio signal encoding device 10 comprises a time-frequency conversion unit 11 that converts a time-domain input audio signal into a frequency-domain spectrum signal; an FM synthesis unit 12 that can synthesize high-quality sound with a small number of parameters; a residual signal calculation unit 13 that calculates the difference between the frequency-domain spectrum signal and the frequency-domain spectrum signal calculated by the FM synthesis unit 12; and a quantization/encoding unit 14 that quantizes and encodes the residual signal calculated by the residual signal calculation unit 13 together with the parameters and other data used in the FM synthesis unit 12.

The operation of the audio signal encoding device 10 configured as described above will now be described.

First, the time-domain input audio signal is converted into a frequency-domain spectrum signal by the time-frequency conversion unit 11. The time-frequency conversion unit 11 can use, for example, an FFT or an MDCT. Next, the FM synthesis unit 12 uses the FM synthesis method to approximate the frequency-domain spectrum signal produced by the time-frequency conversion unit 11. FM synthesis is a sound synthesis method widely used in musical instrument synthesizers and mobile phone tone generators, and it can output complex waveforms with few parameters. It creates a wide variety of sounds by deforming a waveform through FM (frequency modulation).

Here, the FM synthesis unit 12 consists of a plurality of oscillators, and generates sound by using the output of one oscillator to modulate another. The number of oscillators can be changed according to circuit scale and computing capacity. Mobile phone tone generators use two to four oscillators, and musical instruments use four to eight. FIG. 2 shows the simplest configuration of the FM synthesis unit.

As shown in FIG. 2, the FM synthesis unit comprises two oscillators, 21 and 22, with oscillator 21 connected to oscillator 22. Oscillator 22 modulates the waveform output from oscillator 21 and outputs the result. The way the oscillators are connected is called an algorithm, and a unit can support a plurality of algorithms. Examples include two oscillators connected in series, as in FIG. 2, and two oscillators 31 and 32 connected in parallel, as in FIG. 3.
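The series and parallel connections can be sketched in Python as follows. This is an illustrative sketch only: the phase-modulation formula commonly used in FM synthesizers, the sample rate, the equal mixing gains, and the function names `fm_series` and `fm_parallel` are assumptions, not details given in this description.

```python
import math


def fm_series(carrier_hz, modulator_hz, index, n_samples, sample_rate=8000.0):
    """Series connection: one oscillator modulates the phase of the other."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        modulator = index * math.sin(2.0 * math.pi * modulator_hz * t)
        out.append(math.sin(2.0 * math.pi * carrier_hz * t + modulator))
    return out


def fm_parallel(freq_a_hz, freq_b_hz, n_samples, sample_rate=8000.0):
    """Parallel connection: two independent oscillators are mixed equally."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        a = math.sin(2.0 * math.pi * freq_a_hz * t)
        b = math.sin(2.0 * math.pi * freq_b_hz * t)
        out.append(0.5 * (a + b))
    return out


# Only a handful of numbers (frequencies, modulation index, connection type)
# describe each waveform, which is the property the encoder relies on.
series_wave = fm_series(440.0, 880.0, index=2.0, n_samples=256)
parallel_wave = fm_parallel(440.0, 880.0, n_samples=256)
```

With a modulation index of zero the series connection reduces to a plain sine wave; raising the index adds sidebands and makes the waveform more complex without adding parameters.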

Next, how the FM synthesis unit 12 is configured will be described with reference to FIG. 4. In the FM synthesis unit 12, the analysis unit 46 analyzes the frequency-domain spectrum signal for features such as the fundamental frequency; the FM synthesis algorithm and parameters are set on the basis of the analysis result; and the FM synthesis unit 45 outputs a time-domain synthesized waveform. The time-frequency conversion unit 49 then converts the synthesized waveform into a frequency-domain spectrum signal.

If the circuit has sufficient computing capacity, the configuration shown in FIG. 5 is also possible. Here, the comparison unit 57 compares the signal converted into a frequency-domain spectrum signal by the time-frequency conversion unit 51 with the signal obtained when the time-frequency conversion unit 59 converts the FM-synthesized waveform generated by the FM synthesis unit 55 into the frequency domain, and the parameters that minimize the difference between the two spectra are calculated. The FM synthesis unit 12 outputs the frequency-domain spectrum signal obtained by converting the FM-synthesized waveform with the smallest error.
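The closed-loop configuration of FIG. 5 can be sketched as an exhaustive search over candidate parameters. The following Python sketch is an assumption-laden illustration: the description does not specify the search strategy or the error measure, so the grid search, the squared error between magnitude spectra, the plain DFT, and the names `fm_wave`, `magnitude_spectrum`, and `search_fm_parameters` are all hypothetical.

```python
import cmath
import math


def fm_wave(carrier_bin, ratio, index, n):
    """Two-oscillator FM frame; the modulator runs at ratio * carrier."""
    return [
        math.sin(2.0 * math.pi * carrier_bin * t / n
                 + index * math.sin(2.0 * math.pi * carrier_bin * ratio * t / n))
        for t in range(n)
    ]


def magnitude_spectrum(x):
    """Magnitudes of the first n/2 DFT bins (direct O(n^2) evaluation)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def search_fm_parameters(target_mag, n, carrier_bins, ratios, indices):
    """Return (error, carrier_bin, ratio, index) with the smallest spectral error."""
    best = None
    for c in carrier_bins:
        for r in ratios:
            for i in indices:
                candidate = magnitude_spectrum(fm_wave(c, r, i, n))
                err = sum((a - b) ** 2 for a, b in zip(target_mag, candidate))
                if best is None or err < best[0]:
                    best = (err, c, r, i)
    return best


# Toy target: a frame that is itself an FM waveform lying on the search grid.
n = 64
target = magnitude_spectrum(fm_wave(8, 2, 1.5, n))
error, carrier_bin, ratio, index = search_fm_parameters(
    target, n, carrier_bins=[4, 8], ratios=[1, 2], indices=[0.5, 1.5, 2.5]
)
```

A grid search is the simplest choice and its cost grows quickly with the number of parameters, which is consistent with the remark that this configuration suits circuits with ample computing capacity.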

Next, the residual signal calculation unit 13 calculates the residual signal between the frequency-domain spectrum signal output by the FM synthesis unit 12 and the frequency-domain spectrum signal produced by the time-frequency conversion unit 11. The quantization/encoding unit 14 then quantizes and encodes this residual signal together with the parameters used for FM synthesis, such as the algorithm and frequencies, and outputs a bitstream.
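The residual calculation and quantization step can be sketched in Python as below. The uniform quantizer, the step size, the dictionary used in place of a real bitstream, and the names `residual_signal`, `quantize`, and `encode_frame` are assumptions for illustration; the description does not specify the quantizer or the bitstream syntax.

```python
def residual_signal(spectrum, approx):
    """Difference between the original spectrum and the FM approximation."""
    return [s - a for s, a in zip(spectrum, approx)]


def quantize(values, step):
    """Uniform scalar quantizer (an assumed stand-in for the real quantizer)."""
    return [round(v / step) for v in values]


def encode_frame(spectrum, approx, fm_params, step=0.05):
    """Bundle the quantized residual with the FM parameters as side information."""
    return {
        "fm_params": fm_params,
        "step": step,
        "residual_q": quantize(residual_signal(spectrum, approx), step),
    }


# Toy real-valued spectra; in practice these would come from the transform
# unit and from the FM synthesis unit respectively.
spectrum = [1.0, -0.5, 0.25, 0.0]
approx = [0.8, -0.4, 0.0, 0.1]
fm_params = {"algorithm": "series", "carrier_hz": 440.0, "modulator_hz": 880.0, "index": 2.0}
frame = encode_frame(spectrum, approx, fm_params)
```

The closer the FM approximation is to the original spectrum, the smaller the residual values become, and small values generally need fewer bits after entropy coding.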

Tools that raise compression efficiency in the MPEG-2 AAC audio standard, such as the M/S stereo unit and the psychoacoustic model unit, can also be applied to this encoding.

According to the audio signal encoding device of the first embodiment of the present invention, frequency-domain spectrum data is predicted using an FM synthesis method that can represent a complex waveform with fewer parameters than linear prediction, so encoding can be performed more efficiently than with processing that uses linear prediction.

(Second Embodiment)
Next, an audio signal encoding device according to the second embodiment of the present invention will be described with reference to the block diagram in FIG. 6.

As shown in FIG. 6, the audio signal encoding device 60 comprises a time-frequency conversion unit 61 that converts a time-domain input audio signal into a frequency-domain spectrum signal; an FM synthesis unit 62 that can synthesize high-quality sound with a small number of parameters; an applied frequency calculation unit 68 that determines the frequency band in which the difference between the frequency-domain spectrum signal and the frequency-domain spectrum signal calculated by the FM synthesis unit 62 is to be calculated; a residual signal calculation unit 63 that calculates that difference; and a quantization/encoding unit 64 that quantizes and encodes the residual signal calculated by the residual signal calculation unit 63 together with the parameters and other data used in the FM synthesis unit 62.

The operation of the audio signal encoding device 60 configured as described above will now be described.

First, the time-domain input audio signal is converted into a frequency-domain spectrum signal by the time-frequency conversion unit 61. The time-frequency conversion unit 61 can use, for example, an FFT or an MDCT. Next, the FM synthesis unit 62 approximates the frequency-domain spectrum signal produced by the time-frequency conversion unit 61 with the FM synthesis method, using the technique described in the first embodiment.

Next, the applied frequency calculation unit 68 determines the frequency band in which the difference between the frequency-domain spectrum signal produced by the time-frequency conversion unit 61 and the signal calculated by the FM synthesis unit 62 is to be taken. For example, as shown in FIG. 7, the difference is calculated above a certain frequency, while below that frequency the signal is input to the quantization/encoding unit 64 without the difference being calculated. Alternatively, the difference may be calculated for each of the bands, called scale factor bands, into which the MPEG-2 AAC audio standard divides the spectrum.

Next, the residual signal calculation unit 63 calculates, in the frequency band determined by the applied frequency calculation unit 68, the residual signal between the frequency-domain spectrum signal output by the FM synthesis unit 62 and the frequency-domain spectrum signal produced by the time-frequency conversion unit 61. The quantization/encoding unit 64 then quantizes and encodes this residual signal, the information on the frequency band in which the residual signal was obtained, and the parameters used for FM synthesis, such as the algorithm and frequencies, and outputs a bitstream.
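The band-limited residual of this embodiment can be sketched in Python as follows. The sketch assumes one reading of the FIG. 7 example: inside the applied band the FM approximation is subtracted, and outside it the spectrum value is passed through unchanged. The half-open bin range used to represent the band and the names `band_above` and `band_limited_residual` are hypothetical.

```python
def band_above(cutoff_bin, n_bins):
    """Applied band covering every bin from cutoff_bin upward (FIG. 7 style)."""
    return (cutoff_bin, n_bins)


def band_limited_residual(spectrum, approx, band):
    """Subtract the FM approximation only inside the half-open band [start, stop)."""
    start, stop = band
    return [
        s - a if start <= k < stop else s
        for k, (s, a) in enumerate(zip(spectrum, approx))
    ]


# Toy values; the band itself is sent as side information with the residual.
spectrum = [1.0, 2.0, 3.0, 4.0]
approx = [0.5, 0.5, 2.5, 3.5]
band = band_above(2, len(spectrum))
to_encode = band_limited_residual(spectrum, approx, band)
```

A per-band variant, for example one flag per scale factor band, would follow the same pattern with a list of ranges instead of a single range.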

Thus, according to the audio signal encoding device of the second embodiment of the present invention, the frequency band in which the residual signal is calculated is specified, frequency-domain spectrum data is predicted using an FM synthesis method that can represent a complex waveform with fewer parameters than linear prediction, and the residual signal is encoded only in the range where the difference from the FM-synthesized signal matters. Unnecessary residual signals are therefore not encoded, so encoding is more efficient than processing that uses linear prediction, and encoding efficiency can be raised further without degrading playback quality.

(Third Embodiment)
Next, an audio signal decoding device according to the third embodiment of the present invention will be described with reference to the block diagram in FIG. 8.

As shown in FIG. 8, the audio signal decoding device 80 comprises a decoding unit 81 that decodes a compressed bitstream; an FM synthesis unit 82 that performs FM synthesis using the parameters used for FM synthesis at encoding time, such as the algorithm and frequencies, and outputs the FM-synthesized signal as a frequency-domain signal; an addition signal calculation unit 83 that adds the FM-synthesized frequency-domain signal and the residual signal; and a frequency-time conversion unit 84 that converts the frequency-domain signal into a time-domain audio signal.

The operation of the audio signal decoding device 80 configured as described above will now be described.

First, the decoding unit 81 decodes the compressed input bitstream into the parameters used for FM synthesis at encoding time, such as the algorithm and frequencies, and the residual signal. Next, the FM synthesis unit 82 performs FM synthesis using the decoded parameters and converts the synthesized waveform into a frequency-domain signal by time-frequency conversion.

Next, the addition signal calculation unit 83 adds the residual signal decoded by the decoding unit 81 and the signal output by the FM synthesis unit 82. The frequency-time conversion unit 84 converts the resulting frequency-domain signal into a time-domain audio signal.
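The decoding steps above (FM synthesis from the decoded parameters, conversion to the frequency domain, addition of the residual, and conversion back to the time domain) can be sketched in Python as follows. The description names FFT and MDCT as possible transforms; an orthonormal DCT is used here purely as a compact stand-in, and the parameter tuple, the unquantized residual, and the names `dct`, `idct`, `fm_wave`, and `decode_frame` are assumptions for illustration.

```python
import math


def _scale(k, n):
    """Orthonormal DCT scaling factor for bin k."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(x):
    """Orthonormal DCT-II, standing in for the time-frequency conversion."""
    n = len(x)
    return [
        _scale(k, n) * sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]


def idct(spectrum):
    """Inverse of dct (DCT-III), standing in for the frequency-time conversion."""
    n = len(spectrum)
    return [
        sum(_scale(k, n) * spectrum[k] * math.cos(math.pi * (t + 0.5) * k / n) for k in range(n))
        for t in range(n)
    ]


def fm_wave(params, n):
    """Two-oscillator series FM frame from (carrier, modulator, index)."""
    carrier, modulator, index = params
    return [
        math.sin(2.0 * math.pi * carrier * t / n
                 + index * math.sin(2.0 * math.pi * modulator * t / n))
        for t in range(n)
    ]


def decode_frame(fm_params, residual, n):
    """Regenerate the FM spectrum, add the residual, return time-domain samples."""
    approx = dct(fm_wave(fm_params, n))
    return idct([a + r for a, r in zip(approx, residual)])


# Round trip on toy data: the encoder side is reproduced inline for the demo.
n = 32
params = (3.0, 6.0, 1.2)
original = [v + 0.1 * math.cos(2.0 * math.pi * 5 * t / n) for t, v in enumerate(fm_wave(params, n))]
residual = [o - a for o, a in zip(dct(original), dct(fm_wave(params, n)))]
decoded = decode_frame(params, residual, n)
```

Because the residual is left unquantized in this demo, the decoded frame matches the original up to floating-point rounding; with a quantized residual the match would be approximate.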

Thus, according to the audio signal decoding device of the third embodiment of the present invention, the audio signal can be reproduced by decoding the parameters of the FM-synthesized signal and the residual signal between the audio signal and the FM-synthesized signal, so the audio signal can be decoded from an encoded signal with a higher compression ratio than an encoded signal based on linear prediction.

(Fourth Embodiment)
Next, an audio signal decoding device according to the fourth embodiment of the present invention will be described with reference to the block diagram in FIG. 9.

As shown in FIG. 9, the audio signal decoding device 90 comprises a decoding unit 91 that decodes a compressed bitstream; an FM synthesis unit 92 that performs FM synthesis using the parameters used for FM synthesis at encoding time, such as the algorithm and frequencies, and outputs the FM-synthesized signal as a frequency-domain signal; an addition signal calculation unit 93 that adds the FM-synthesized frequency-domain signal and the residual signal in the frequency band determined by the applied frequency information; and a frequency-time conversion unit 94 that converts the frequency-domain signal into a time-domain audio signal.

The operation of the audio signal decoding device 90 configured as described above will now be described.

First, the decoding unit 91 decodes the compressed input bitstream into the parameters used for FM synthesis at encoding time, such as the algorithm and frequencies, the residual signal, and the applied frequency information indicating the frequencies to which the residual signal applies. Next, the FM synthesis unit 92 performs FM synthesis using the decoded parameters and converts the synthesized waveform into a frequency-domain signal by time-frequency conversion.

Next, the addition signal calculation unit 93 adds the residual signal decoded by the decoding unit 91 and the signal output by the FM synthesis unit 92 in the frequency band indicated by the applied frequency information decoded by the decoding unit 91. The frequency-time conversion unit 94 converts the resulting frequency-domain signal into a time-domain audio signal.
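The band-limited addition can be sketched in Python as below. The sketch assumes that values outside the applied band were transmitted as plain spectrum values and are therefore passed through unchanged; that reading, the half-open bin range used for the applied frequency information, and the name `add_in_band` are assumptions for illustration.

```python
def add_in_band(decoded, approx, band):
    """Add the FM spectrum to the decoded values only inside [start, stop)."""
    start, stop = band
    return [
        d + a if start <= k < stop else d
        for k, (d, a) in enumerate(zip(decoded, approx))
    ]


# Toy values: bins 0-1 carry the spectrum itself, bins 2-3 carry the residual.
decoded_values = [1.0, 2.0, 0.5, 0.5]
fm_spectrum = [0.5, 0.5, 2.5, 3.5]
applied_band = (2, 4)
spectrum = add_in_band(decoded_values, fm_spectrum, applied_band)
```

The resulting spectrum would then go to the frequency-time conversion, exactly as in the third embodiment.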

Thus, the audio signal decoding device according to the fourth embodiment of the present invention decodes the parameters of the FM-synthesized signal, the residual signal between the audio signal and the FM-synthesized signal, and the applied frequency information indicating the frequencies to which the residual signal was applied, and reproduces the audio signal from them. It can therefore decode an audio signal from an encoded signal in which the residual signal is encoded only over the range where the difference from the FM-synthesized signal has a substantial effect, and whose compression ratio is consequently even higher than that of an encoded signal based on linear prediction.

As described above, the audio signal encoding/decoding device according to the present invention predicts frequency-domain spectral data using an FM synthesis method, which can represent complex waveforms with fewer parameters than linear prediction; obtains a residual signal as the difference between the predicted signal and the original signal; and encodes the parameters and the residual signal. It can therefore perform encoding more efficiently than processing based on linear prediction, and is useful as an audio signal encoding/decoding device or the like.
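The encoder-side residual calculation summarized above can be sketched in the same spirit. This is a minimal sketch under assumptions, not the method of the embodiments: the FM parameters are taken as already chosen (the parameter search is not shown), a DCT-IV stands in for the time-frequency conversion, quantization and entropy coding are omitted, and every name here (`encode_frame`, `fm_params`, `band`) is hypothetical.

```python
import math

def fm_synthesize(carrier_hz, modulator_hz, mod_index, n_samples, sample_rate):
    """Assumed two-operator FM: sin(2*pi*fc*t + I*sin(2*pi*fm*t))."""
    return [math.sin(2 * math.pi * carrier_hz * n / sample_rate
                     + mod_index * math.sin(2 * math.pi * modulator_hz * n / sample_rate))
            for n in range(n_samples)]

def dct4(x):
    """DCT-IV, used here as a stand-in for the time-frequency conversion."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * (k + 0.5))
                for i in range(n))
            for k in range(n)]

def encode_frame(signal, fm_params, band, sample_rate):
    """Return (fm_params, band, residual) for one frame.

    The residual is the difference between the input spectrum and the
    FM-predicted spectrum, kept only for bins in the half-open range band.
    """
    n = len(signal)
    target = dct4(signal)                                    # input spectrum
    predicted = dct4(fm_synthesize(*fm_params, n, sample_rate))  # FM prediction
    lo, hi = band
    residual = [target[k] - predicted[k] for k in range(lo, hi)]
    return fm_params, band, residual
```

When the input frame is exactly the FM waveform, every residual value is zero up to rounding, so only the handful of FM parameters carries information. The closer the FM approximation is to the input, the less there is left to encode in the residual.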

FIG. 1: Block diagram of the audio signal encoding device according to the first embodiment of the present invention
FIG. 2: Block diagram of the FM synthesis unit according to the first embodiment of the present invention
FIG. 3: Block diagram of the FM synthesis unit according to the first embodiment of the present invention
FIG. 4: Diagram showing the method of calculating the FM synthesis parameters according to the first embodiment of the present invention
FIG. 5: Diagram showing the method of calculating the FM synthesis parameters according to the first embodiment of the present invention
FIG. 6: Block diagram of the audio signal encoding device according to the second embodiment of the present invention
FIG. 7: Diagram showing an example of the applied frequency range of the applied frequency calculation unit according to the second embodiment of the present invention
FIG. 8: Block diagram of the audio signal decoding device according to the third embodiment of the present invention
FIG. 9: Block diagram of the audio signal decoding device according to the fourth embodiment of the present invention
FIG. 10: Block diagram of a conventional audio signal encoding device
FIG. 11: Block diagram of the TNS processing unit in a conventional audio signal encoding device

Explanation of symbols

10, 60 Audio signal encoding device
11, 41, 49, 51, 59, 61 Time-frequency conversion unit
12, 62, 82, 92 FM synthesis unit
13, 63 Residual signal calculation unit
14, 64 Quantization/encoding unit
21, 22, 31, 32 Oscillator
45, 55 FM synthesis unit
46 Analysis unit
57 Comparison unit
68 Applied frequency calculation unit
80, 90 Audio signal decoding device
81, 91 Decoding unit
83, 93 Addition signal calculation unit
84, 94 Frequency-time conversion unit
110 Psychoacoustic model unit
111 Gain control unit
112 MDCT unit
113 TNS processing unit
114 Intensity/coupling unit
115 Prediction unit
116 M/S stereo unit
117 Normalization coefficient unit
118 Quantization unit
119 Noiseless coding unit
120 Multiplexer unit
121 Spectrum processing unit
122 Quantization/encoding unit
131 MDCT unit
132 Linear prediction unit
133 Residual signal calculation unit
134 Quantization/encoding unit

Claims

An audio signal encoding apparatus comprising:
time-frequency conversion means for converting a time-domain audio signal into a frequency-domain signal;
FM synthesis means for approximating the signal converted into the frequency domain by the time-frequency conversion means;
residual signal calculation means for calculating, as a residual signal, the difference between the signal converted into the frequency domain by the time-frequency conversion means and the frequency-domain signal approximated by the FM synthesis means; and
encoding means for encoding the frequency-domain residual signal calculated by the residual signal calculation means and the parameters used by the FM synthesis means.

The audio signal encoding apparatus according to claim 1, further comprising applied frequency calculation means for calculating the frequency range over which the residual signal calculation means obtains the residual signal, wherein
the residual signal calculation means calculates the residual signal in the applied frequency band calculated by the applied frequency calculation means, and
the encoding means encodes the applied frequency band calculated by the applied frequency calculation means as a parameter.

An audio signal decoding apparatus comprising:
decoding means for decoding an encoded frequency-domain residual signal and the parameters used for FM synthesis;
FM synthesis means for performing FM synthesis using the parameters decoded by the decoding means;
addition signal calculation means for adding the frequency-domain residual signal decoded by the decoding means and the frequency-domain signal output by the FM synthesis means; and
frequency-time conversion means for converting the frequency-domain signal generated by the addition signal calculation means into a time-domain audio signal.

An audio signal decoding apparatus comprising:
decoding means for decoding an encoded frequency-domain residual signal, the parameters used for FM synthesis, and a parameter indicating the frequency band over which the residual signal was calculated;
FM synthesis means for performing FM synthesis using the FM synthesis parameters decoded by the decoding means;
addition signal calculation means for adding the frequency-domain residual signal decoded by the decoding means and the frequency-domain signal output by the FM synthesis means, in the applied frequency band set by the frequency band parameter; and
frequency-time conversion means for converting the frequency-domain signal generated by the addition signal calculation means into a time-domain audio signal.