
CN101754086B - Decoder and decoding method for multichannel audio coder using sound source location cue - Google Patents

Info

Publication number
CN101754086B
Authority
CN
China
Prior art keywords
signal
information
audio
sound source
multichannel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009102238140A
Other languages
Chinese (zh)
Other versions
CN101754086A (en)
Inventor
徐廷一
白承权
李用主
姜京玉
洪镇佑
金镇雄
安致得
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electronics and Telecommunications Research Institute ETRI
Original Assignee
Electronics and Telecommunications Research Institute ETRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electronics and Telecommunications Research Institute ETRI
Publication of CN101754086A
Application granted
Publication of CN101754086B
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022 Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/26 Pre-filtering or post-filtering
    • G10L19/265 Pre-filtering, e.g. high frequency emphasis prior to encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/02 Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Algebra (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Stereophonic System (AREA)

Abstract

The invention provides an apparatus and method for decoding multichannel audio based on sound source location cues. The decoding apparatus includes: a demultiplexer that receives a signal and parses the received signal into an audio bitstream and an additional-information bitstream; an audio decoder that restores a downmix signal based on the audio bitstream and passes the restored downmix signal to a synthesis upmixer without applying an analysis filter bank to it; the synthesis upmixer, which predicts a multichannel signal using the downmix signal and the additional-information bitstream and upmixes the downmix signal based on the multichannel signal to generate an upmix signal; and a synthesis filter bank windower that applies a synthesis filter bank to the upmix signal to extract a time-domain signal and windows the upmix signal to extract an output signal.

Figure 200910223814

Description

Apparatus and method for decoding multichannel audio based on sound source location cues

Technical Field

The invention discloses an apparatus and method for decoding multichannel audio based on sound source location cues.

Background Art

The basic coding concept of Sound Source Location Cue Coding (SSLCC) derives from Spatial Audio Coding (SAC).

SAC is a compression technique for multichannel audio based on sound source location cues. It maximizes compression efficiency by removing the redundancy among the channel signals on the basis of the spatial cues that listeners perceive in space.

In addition, the multichannel signal is downmixed, and the transmitted downmix audio signal serves as the core signal. The signal can therefore be reproduced by existing stereo audio equipment, which is a basic principle of the SAC approach.

SSLCC is one of the SAC methods. Its spatial cue, as perceived by listeners in space, is the position of the sound source: SSLCC estimates that position, extracts the source location information from the multichannel signal, and represents and transmits it.

Because the amount of information extracted by an SAC coding scheme is very small, it can be carried in a redundant (ancillary) data field. If the receiver's audio decoder does not support SAC coding, it reproduces only the stereo signal using existing stereo audio decoding. If the receiver's audio decoder does support SAC coding, it uses the transmitted redundant information to restore the multichannel audio signal based on sound source location cues from the stereo downmix.

However, restoring the multichannel audio signal from the stereo downmix with the redundant information requires the same time-to-frequency (T/F) transform that was used when the information was extracted by the SAC coding scheme. If that T/F transform is not the one best suited to the receiver, the transform process is adversely affected.

An apparatus and method are therefore needed that restore the multichannel audio signal based on sound source location cues using the T/F transform best suited to the receiver.

Summary of the Invention

The present invention provides a decoding apparatus and method that can carry a multichannel audio signal while remaining backward compatible with basic stereo audio coding, by receiving and compressing the multichannel audio signal and compressing and transmitting a stereo signal via basic stereo coding.

The present invention also provides a decoding apparatus and method in which the T/F transform applied to the stereo downmix audio signal can be changed by selection, using a Time Domain Aliasing Cancellation (TDAC) filter bank.

Technical Solution

According to an exemplary embodiment of the present invention, there is provided an apparatus for decoding multichannel audio based on sound source location cues, comprising: a demultiplexer that receives a signal and parses the received signal into an audio bitstream and an additional-information bitstream; an audio decoder that restores a downmix signal based on the audio bitstream; a synthesis upmixer that predicts a multichannel signal using the downmix signal and the additional-information bitstream and upmixes the downmix signal based on the multichannel signal to generate an upmix signal; and a synthesis filter bank windower that applies a synthesis filter bank to the upmix signal to extract a time-domain signal and windows the upmix signal to extract an output signal.

According to another exemplary embodiment of the present invention, there is provided a method for decoding multichannel audio based on sound source location cues, comprising: parsing a received signal into an audio bitstream and an additional-information bitstream; restoring a downmix signal based on the audio bitstream; predicting a multichannel signal using the downmix signal and the additional information; upmixing the downmix signal based on the multichannel signal to generate an upmix signal; applying a synthesis filter bank to the upmix signal to extract a time-domain signal; and windowing the upmix signal to extract an output signal.

Technical Effects

According to the present invention, a multichannel audio signal can be carried while backward compatibility with basic stereo audio coding is maintained, by receiving and compressing the multichannel audio signal and compressing and transmitting a stereo signal via basic stereo coding.

By using a TDAC filter bank, the present invention allows the T/F transform applied to the stereo downmix audio signal to be changed by selection.

Brief Description of the Drawings

FIG. 1 shows an apparatus for encoding multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention;

FIG. 2 shows an apparatus for decoding multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention;

FIG. 3 shows an apparatus for decoding multichannel audio based on sound source location cues according to another exemplary embodiment of the present invention;

FIG. 4 shows an apparatus for decoding an additional-information bitstream according to an exemplary embodiment of the present invention;

FIG. 5 shows the process by which the synthesis upmixer predicts the gain of each channel according to an exemplary embodiment of the present invention;

FIG. 6 shows a decorrelator according to an exemplary embodiment of the present invention;

FIG. 7 shows a method for decoding multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 shows an apparatus for encoding multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention.

The encoding apparatus 100 according to the exemplary embodiment is a 5-channel multichannel audio encoder based on SSLCC. As shown in FIG. 1, it comprises a preprocessing filter bank unit 110, an analyzer 120, a downmix processor 130, an audio encoder 140, and a multiplexer 150.

The encoding apparatus 100 can also be extended to multichannel content based on sound source location cues with more than five channels.

The preprocessing filter bank unit 110 preprocesses the multichannel input audio signal fed to the encoding apparatus 100 and converts the preprocessed signal into a frequency-domain signal through a filter bank. The filter bank performs the T/F transform on the basis of subband analysis; the MDCT, MDST, DFT, and similar transforms may be used.

The input audio signal may include an input signal LF (Left Front), an input signal RF (Right Front), an input signal C (Center Front), an input signal Ls (Left Surround), and an input signal Rs (Right Surround).

The analyzer 120 extracts spatial cues from the input audio signal converted to the frequency domain by the preprocessing filter bank unit 110 and transmits them as an additional-information bitstream. The analyzer 120 also compresses the input audio signal and passes it to the downmix processor 130.

The downmix processor 130 downmixes, in the frequency domain, the input audio signal received from the analyzer 120. The downmix processor 130 may perform the downmix according to the ITU-T recommendation.

The audio signal downmixed by the downmix processor 130 can be represented as a bitstream by a common stereo audio codec, such as MP3 (MPEG Layer III) or AAC (Advanced Audio Coding).

The audio encoder 140 encodes the audio signal downmixed by the downmix processor 130.

The multiplexer 150 combines the signal encoded by the audio encoder 140 with the additional-information bitstream transmitted from the analyzer 120 and transmits the result.
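The downmix step performed by the downmix processor 130 can be sketched as follows. This is only an illustration: the text does not state the downmix coefficients, so the 1/√2 weights for the center and surround channels are an assumption (they are the values commonly associated with ITU-style stereo downmixes), and the function name `downmix_5_to_2` is hypothetical.

```python
import math

def downmix_5_to_2(lf, rf, c, ls, rs,
                   center_gain=1 / math.sqrt(2),
                   surround_gain=1 / math.sqrt(2)):
    """Fold five channels (equal-length sequences of samples or bins) into L/R.

    The gains are assumed values, not taken from the patent text.
    """
    left = [a + center_gain * m + surround_gain * s
            for a, m, s in zip(lf, c, ls)]
    right = [a + center_gain * m + surround_gain * s
             for a, m, s in zip(rf, c, rs)]
    return left, right

# A center-only signal lands equally in both downmix channels.
left, right = downmix_5_to_2([0.0], [0.0], [1.0], [0.0], [0.0])
```

Because the operation is linear, the same sketch applies whether the inputs are time-domain samples or frequency-domain coefficients.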

FIG. 2 shows an apparatus for decoding multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention.

The decoding apparatus 200 according to the exemplary embodiment is a 5-channel multichannel audio decoder based on SSLCC. As shown in FIG. 2, it comprises a demultiplexer 210, an audio decoder 220, a windowing filter bank 230, a synthesis upmixer 240, and a synthesis filter bank windower 250.

The demultiplexer 210 receives the signal transmitted from the multiplexer 150 and parses the received signal into an audio bitstream and an additional-information bitstream.

The audio decoder 220 restores the downmix signal based on the audio bitstream.

The windowing filter bank 230 applies an analysis filter bank to the downmix signal to perform the T/F transform, then windows the T/F-transformed downmix signal and passes it on.

The synthesis upmixer 240 predicts a multichannel signal using the downmix signal and the additional-information bitstream, and upmixes the downmix signal based on the multichannel signal to generate an upmix signal.

Specifically, the synthesis upmixer 240 separates amplitude information and phase information from the downmix signal, windows a stored random sequence based on the amplitude information to assign weights to the phase information, and predicts the multichannel signal from the amplitude information and the weighted phase information.

To do so, the synthesis upmixer 240 applies a complex transform to the downmix signal and separates the amplitude information and the phase information from the complex-transformed downmix signal.

The synthesis upmixer 240 then models, from the envelope of the amplitude information, a spectral window for modifying the phase information, applies this spectral window to the stored random sequence, and uses the windowed random sequence to assign weights to the phase information.

Finally, the synthesis upmixer 240 combines the amplitude information with the weighted phase information and applies the inverse complex transform to the result to predict the multichannel signal.
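The amplitude/phase handling described for the synthesis upmixer 240 can be illustrated with a minimal sketch. The text does not specify how the spectral window is modeled or how the weights are applied, so everything below is an assumption: the window is taken to be the magnitude envelope normalized to its peak, the weighted random value is added to the phase, and `decorrelate_bins` and `strength` are hypothetical names. The input is assumed to be already complex-transformed (for example, DFT bins).

```python
import cmath

def decorrelate_bins(bins, random_seq, strength=0.5):
    """Perturb the phase of complex bins while keeping their magnitudes.

    bins       -- complex spectrum of the downmix signal (assumed input)
    random_seq -- stored random sequence, one value per bin
    strength   -- assumed scaling of the phase perturbation
    """
    mags = [abs(z) for z in bins]
    phases = [cmath.phase(z) for z in bins]
    # Assumed window model: magnitude envelope normalized to its peak.
    peak = max(mags, default=0.0) or 1.0
    window = [m / peak for m in mags]
    # Window the random sequence, then use it to weight the phase.
    new_phases = [p + strength * w * r
                  for p, w, r in zip(phases, window, random_seq)]
    # Recombine amplitude with the modified phase.
    return [cmath.rect(m, p) for m, p in zip(mags, new_phases)]

spectrum = [1 + 1j, 0.5j, -2 + 0j]
modified = decorrelate_bins(spectrum, [0.3, -0.2, 0.9])
```

In this reading only the phase changes, so the magnitude of every bin is preserved; an inverse complex transform would follow to return to the time domain.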

The synthesis filter bank windower 250 applies a synthesis filter bank to the upmix signal to extract a time-domain signal and windows the upmix signal to extract the output signal.

FIG. 3 shows an apparatus for decoding multichannel audio based on sound source location cues according to another exemplary embodiment of the present invention.

The decoding apparatus 300 according to this embodiment is a decoder for multichannel audio based on sound source location cues that uses a real transform. As shown in FIG. 3, it comprises a demultiplexer 310, a TDAC filter bank 320, a synthesis upmixer 330, and a synthesis filter bank windower 340.

SSLCC is basically built on a DFT filter bank (transform). However, various filter banks can be used in order to interoperate with the core stereo audio codec.

Although the form of the filter bank changes, the operating principle of the SSLCC analyzer 120, and of the synthesis performed by the synthesis upmixer 240 and the synthesis filter bank windower 250, remains the same.

Because no decorrelator is applied in stereo transmission, a real transform can be used. The decoding apparatus 300 uses the MDCT, which is a real transform, to interoperate with the core codec.

The TDAC filter bank 320 restores the downmix signal based on the audio bitstream and passes it to the synthesis upmixer 330, skipping both the application of an analysis filter bank to the downmix signal and the windowing step. The signals passed on by the TDAC filter bank 320 may be a frequency-domain downmix signal L and a frequency-domain downmix signal R.

The synthesis filter bank windower 340 applies a synthesis filter bank to the upmix signal generated by the synthesis upmixer 330 to extract a time-domain signal, and windows the upmix signal with a window matched to the analysis window of the core stereo audio codec to extract the output signal.

The demultiplexer 310 and the synthesis upmixer 330 have the same structure as the demultiplexer 210 and the synthesis upmixer 240 of the decoding apparatus 200, so their detailed description is omitted.

The decoding apparatus 300 can change the T/F transform applied to the stereo downmix audio signal by selection. For example, when the decorrelator in the decoder is switched off, a real T/F transform can be applied. A complex T/F transform can also be used in that case, and even with a complex T/F transform the phase information need not be used.

However, when an operation that needs phase information is switched on in the decoder, that is, when the decorrelator is required, a complex T/F transform must be used. The DFT is the basic complex T/F transform; the MDCT and MDST may also be used together as a complex transform pair.
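The selection rule described in the last two paragraphs can be written down as a small sketch. The function name `allowed_transforms` and the string labels are hypothetical; the rule itself (real transform permitted only while the decorrelator is off) follows the text.

```python
def allowed_transforms(decorrelator_on):
    """Return the T/F transforms the decoder may select.

    With the decorrelator on, phase information is needed, so only
    complex transforms qualify. With it off, the real MDCT is also allowed.
    """
    complex_options = ["DFT", "MDCT+MDST"]  # MDCT/MDST used as a complex pair
    if decorrelator_on:
        return complex_options
    return ["MDCT"] + complex_options

stereo_only = allowed_transforms(decorrelator_on=False)
with_decorrelator = allowed_transforms(decorrelator_on=True)
```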

FIG. 4 shows an apparatus for decoding an additional-information bitstream according to an exemplary embodiment of the present invention.

The apparatus decodes the additional information, namely the VLSA (Virtual Sound Location Angle), from the additional-information bitstream parsed by the demultiplexer 210. As shown in FIG. 4, it may comprise a Huffman decoder 410 and an inverse quantizer 420.

The apparatus for decoding the additional-information bitstream may be part of the windowing filter bank 230 and the synthesis filter bank windower 250.

The Huffman decoder 410 Huffman-decodes the additional-information bitstream using a Huffman codebook to generate a differential index.

The Huffman decoder 410 includes an inverse differential encoder 411, a differential encoder 412, a mapper 413, and a Huffman encoder 414, which are used to generate the Huffman codebook.

The inverse differential encoder 411 performs inverse differential coding based on the previously processed frame and the Huffman-coded information to decode the original index.

The differential encoder 412 removes the negative information from the original index according to the sign-bit information and then performs differential coding to generate index information.

To undo the removal of the negative information, the mapper 413 removes the offset information from the index and then maps the index according to the frequency resolution, separating the first subband from the remaining bands.

Finally, the Huffman encoder 414 applies a Huffman coding method to the first subband and to each of the remaining bands to generate the Huffman codebook.
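The relationship between differential coding and inverse differential coding can be sketched as below. The text does not say whether differences are taken against the previous frame or the previous band, nor how the sign bit and offset are laid out, so this sketch assumes per-band differences against the previous frame and leaves the sign and offset handling out. Both function names are hypothetical.

```python
def differential_encode(prev_indices, indices):
    """Encode each band's index as its difference from the previous frame."""
    return [cur - prev for prev, cur in zip(prev_indices, indices)]

def inverse_differential(prev_indices, diff_indices):
    """Recover the original indices from the previous frame and the differences."""
    return [prev + diff for prev, diff in zip(prev_indices, diff_indices)]

previous_frame = [3, -2, 0, 7]
current_frame = [4, -2, -1, 5]
diffs = differential_encode(previous_frame, current_frame)
restored = inverse_differential(previous_frame, diffs)
```

Small frame-to-frame changes give differences clustered near zero, which is what makes a short codeword for the most frequent difference worthwhile.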

The Huffman decoder 410 decodes the first subband with reference to the Huffman codebook of Table 1.

[Table 1]

Index  Num. of bits  Codeword    Index  Num. of bits  Codeword
0      5             0x17        16     5             0x1d
1      8             0x64        17     5             0x19
2      8             0x65        18     5             0x1c
3      8             0xf0        19     5             0x16
4      8             0xf1        20     5             0x18
5      7             0x33        21     5             0x14
6      7             0x79        22     5             0x13
7      6             0x18        23     5             0x15
8      6             0x22        24     5             0x1b
9      6             0x23        25     5             0x10
10     6             0x3d        26     5             0x0e
11     5             0x0b        27     5             0x0f
12     5             0x12        28     5             0x0d
13     5             0x1a        29     5             0x0a
14     4             0x04        30     2             0x00
15     5             0x1f
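As an illustration of how the Table 1 codebook could be used, the sketch below decodes a string of bits into indices. The codeword values and lengths are copied from Table 1; the assumptions are that each codeword is written most significant bit first and padded with leading zeros to its stated length, and that the bitstream is supplied as a string of '0' and '1' characters. The names `TABLE_1`, `build_lookup`, and `huffman_decode` are hypothetical.

```python
# Table 1: index -> (number of bits, codeword).
TABLE_1 = {
    0: (5, 0x17), 1: (8, 0x64), 2: (8, 0x65), 3: (8, 0xf0),
    4: (8, 0xf1), 5: (7, 0x33), 6: (7, 0x79), 7: (6, 0x18),
    8: (6, 0x22), 9: (6, 0x23), 10: (6, 0x3d), 11: (5, 0x0b),
    12: (5, 0x12), 13: (5, 0x1a), 14: (4, 0x04), 15: (5, 0x1f),
    16: (5, 0x1d), 17: (5, 0x19), 18: (5, 0x1c), 19: (5, 0x16),
    20: (5, 0x18), 21: (5, 0x14), 22: (5, 0x13), 23: (5, 0x15),
    24: (5, 0x1b), 25: (5, 0x10), 26: (5, 0x0e), 27: (5, 0x0f),
    28: (5, 0x0d), 29: (5, 0x0a), 30: (2, 0x00),
}

def build_lookup(codebook):
    """Map each codeword's bit pattern (MSB first, zero padded) to its index."""
    return {format(code, "0{}b".format(bits)): index
            for index, (bits, code) in codebook.items()}

def huffman_decode(bitstring, codebook=TABLE_1):
    """Decode a string of '0'/'1' characters into a list of indices."""
    lookup = build_lookup(codebook)
    indices, current = [], ""
    for bit in bitstring:
        current += bit
        if current in lookup:  # safe because no codeword prefixes another
            indices.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return indices

decoded = huffman_decode("00" + "0100" + "10111")
```

Under the stated bit-order assumption the 31 patterns are prefix-free, so matching bit by bit is unambiguous. Index 30 has the shortest codeword (2 bits), which suggests it is the most frequent symbol.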

When the signal received by the demultiplexer 210 is a 5-bit quantized signal, the Huffman decoder 410 performs Huffman decoding with reference to the Huffman codebook of Table 2.

[Table 2]

Figure GSB00000713279300071

Figure GSB00000713279300081

When the signal received by the demultiplexer 210 is a 4-bit quantized signal, the Huffman decoder 410 performs Huffman decoding with reference to the Huffman codebook of Table 3.

[Table 3]

Figure GSB00000713279300091

The inverse quantizer 420 inverse-quantizes the differential index using an inverse quantization table to restore the additional information. Specifically, the inverse quantizer 420 performs inverse quantization by mapping the VLSA (Virtual Sound Location Angle) information in each frame to the quantization table corresponding to each VLSA. Because the multichannel audio based on sound source location cues according to the exemplary embodiment is decoded frame by frame with the DFT or MDCT, smoothing between frames is achieved mainly through windowed overlap-add.
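Windowed overlap-add, mentioned above as the main source of smoothing between frames, can be sketched as follows. The window shape and overlap are not given in the text; this sketch assumes a squared-sine window with 50% overlap, chosen because the overlapping window halves then sum to one. The function names are hypothetical.

```python
import math

def sin2_window(n):
    """Squared-sine window; halves sum to 1 at 50% overlap (assumed choice)."""
    return [math.sin(math.pi * (i + 0.5) / n) ** 2 for i in range(n)]

def overlap_add(frames, hop):
    """Sum equal-length frames into one signal, each shifted by `hop` samples."""
    n = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + n)
    for k, frame in enumerate(frames):
        for i, value in enumerate(frame):
            out[k * hop + i] += value
    return out

# Three windowed frames of a constant signal with 50% overlap.
n, hop = 8, 4
window = sin2_window(n)
frames = [[1.0 * w for w in window] for _ in range(3)]
signal = overlap_add(frames, hop)
```

Away from the first and last half-frame, the constant input is reconstructed exactly, while each frame fades in and out smoothly, so parameter changes between frames do not produce discontinuities.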

在VLSA信息为LHA(Left Half-plane Angle)的时候,逆量子化器420通过映射表4的量子化表可以进行量子化。When the VLSA information is LHA (Left Half-plane Angle), the inverse quantizer 420 can perform quantization through the quantization table of the mapping table 4.

[表4][Table 4]

Figure GSB00000713279300092
Figure GSB00000713279300092

Figure GSB00000713279300101
Figure GSB00000713279300101

此时,复原所述附加信息的步骤,在VLSA信息为RHA(Right Half-planeAngle)的时候,逆量子化器420通过映射表5的量子化表可以进行量子化。At this time, in the step of restoring the additional information, when the VLSA information is RHA (Right Half-plane Angle), the inverse quantizer 420 can perform quantization through the quantization table of the mapping table 5.

[表5][table 5]

Figure GSB00000713279300102
Figure GSB00000713279300102

When the VSLA information is an LSA (Left Subsequent vector Angle), the inverse quantizer 420 may perform inverse quantization by mapping the quantization table of Table 6.

[Table 6]

Idx       -15  -14  -13  -12  -11  -10   -9   -8   -7   -6   -5
LSA[idx]  -15  -14  -13  -12  -11  -10   -9   -8   -7   -6   -5

Idx        -4   -3   -2   -1    0    1    2    3    4    5    6
LSA[idx]   -4   -3   -2   -1    0    1    2    3    4    5    6

Idx         7    8    9   10   11   12   13   14   15
LSA[idx]    7    8    9   10   11   12   13   14   15

When the VSLA information is an RSA (Right Subsequent vector Angle), the inverse quantizer 420 may perform inverse quantization by mapping the quantization table of Table 7.

[Table 7]

Idx       -15  -14  -13  -12  -11  -10   -9   -8   -7   -6   -5
RSA[idx]  -15  -14  -13  -12  -11  -10   -9   -8   -7   -6   -5

Idx        -4   -3   -2   -1    0    1    2    3    4    5    6
RSA[idx]   -4   -3   -2   -1    0    1    2    3    4    5    6

Idx         7    8    9   10   11   12   13   14   15
RSA[idx]    7    8    9   10   11   12   13   14   15
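The table lookup for LSA and RSA can be sketched as follows. Tables 6 and 7 map each index in the range -15 to 15 to the same value, so the tables below are built directly from that range. How the difference index is accumulated is not spelled out in this section; the sketch assumes each difference is relative to the previous index, and `dequantize` and its `start` parameter are hypothetical names.

```python
# Tables 6 and 7: both map idx in [-15, 15] to the value idx.
LSA_TABLE = {idx: idx for idx in range(-15, 16)}
RSA_TABLE = {idx: idx for idx in range(-15, 16)}

def dequantize(diff_indices, table, start=0):
    """Accumulate difference indices and look each running index up in `table`.

    Assumption: every difference is relative to the previous index,
    beginning from `start`.
    """
    values = []
    idx = start
    for diff in diff_indices:
        idx += diff
        if idx not in table:
            raise ValueError(f"index {idx} is outside the quantization table")
        values.append(table[idx])
    return values
```

For example, `dequantize([3, -1, 0, 5], LSA_TABLE)` walks through the running indices 3, 2, 2, 7 and returns the corresponding table entries.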

In addition, the inverse quantizer 420 may derive the variables satisfying Mathematical Formula 1 from the VSLA information obtained from each piece of index information.

[Mathematical Formula 1]

$$\theta_{lh} = \left( \frac{\mathrm{LHA}[idx] - \mathrm{LSA}[idx]}{P_4 - \mathrm{LSA}[idx]} \right) \times \frac{\pi}{2}$$

$$\theta_{rh} = \left( \frac{\mathrm{RHA}[idx] - P_5}{\mathrm{RSA}[idx] - P_5} \right) \times \frac{\pi}{2}$$

$$g_{Ls} = \sin(\theta_{lh})$$

$$g_{L} = \cos(\theta_{lh}) \cdot \sin\big((\mathrm{LSA}[idx] - 2\pi) \times (-3)\big)$$

$$g_{CL} = \cos(\theta_{lh}) \cdot \cos\big((\mathrm{LSA}[idx] - 2\pi) \times (-3)\big)$$

$$g_{Rs} = \cos(\theta_{rh})$$

$$g_{CR} = \sin(\theta_{rh}) \cdot \cos(\mathrm{RSA}[idx] \times 3)$$

$$g_{R} = \sin(\theta_{rh}) \cdot \sin(\mathrm{RSA}[idx] \times 3)$$
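The gain computation of Mathematical Formula 1 can be sketched as below. This is a direct transcription under stated assumptions: P4 and P5 are not defined in this section, so they are passed in as plain parameters, the angle values are used exactly as the formula writes them, and the function name `channel_gains` is hypothetical.

```python
import math

def channel_gains(lha, lsa, rha, rsa, p4, p5):
    """Evaluate Mathematical Formula 1 for one index.

    p4 and p5 are the constants written P4 and P5 in the formula; their
    values are not given in this section, so the caller must supply them.
    """
    theta_lh = (lha - lsa) / (p4 - lsa) * math.pi / 2
    theta_rh = (rha - p5) / (rsa - p5) * math.pi / 2
    left_arg = (lsa - 2 * math.pi) * -3
    right_arg = rsa * 3
    return {
        "gLs": math.sin(theta_lh),
        "gL": math.cos(theta_lh) * math.sin(left_arg),
        "gCL": math.cos(theta_lh) * math.cos(left_arg),
        "gRs": math.cos(theta_rh),
        "gCR": math.sin(theta_rh) * math.cos(right_arg),
        "gR": math.sin(theta_rh) * math.sin(right_arg),
    }
```

Because each side is built from a sine/cosine pair nested inside another sine/cosine pair, the squared gains on each side sum to 1 for any inputs, so the left and right downmix power is distributed across three channels without being scaled.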

Here, the number of subbands can be obtained from bsFreqRes, the variable included in the additional information that defines the number of subbands, and the inverse quantizer 420 may map the number of transmitted VSLA values according to the number of subbands. The maximum number of bands is 28 (Mpar = 28), and the resolution and the number of bands vary with the bit rate or the frequency characteristics of the frame.

The inverse quantizer 420 may perform the mapping shown in Table 8 through mapsubbands(bsFreqRes, Mpar).

[Table 8]

Figure GSB00000713279300121

Figure GSB00000713279300131

The inverse quantizer 420 may process Mathematical Formula 1 using a per-band resolution designed on the basis of ERB bands. When the mapping of Table 8 gives Mpar = 28, the resolution of each band may be as shown in Table 9.

[Table 9]

m    Mpar=28   kHz
0    0         0.0702
1    1         0.1639
2    2         0.2576
3    3         0.3512
4    4         0.4449
5    5         0.5385
6    6         0.6322
7    7         0.7259
8    8         0.9132
9    9         1.1005
10   10        1.2878
11   11        1.4751
12   12        1.8498
13   13        2.2244
14   14        2.599
15   15        2.9737
16   16        3.7229
17   17        4.4722
18   18        5.2215
19   19        5.9707
20   20        6.72
21   21        7.4693
22   22        8.5932
23   23        9.7171
24   24        11.2156
25   25        13.0888
26   26        15.3366
27   27        24
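A lookup from frequency to parameter band can be sketched from Table 9. The text does not say whether the kHz column holds band edges or centers; the sketch below assumes each value is the upper edge of band m, and `band_index` is a hypothetical helper.

```python
from bisect import bisect_left

# kHz column of Table 9 for Mpar = 28, indexed by m.
# Assumption: each value is the upper edge of band m.
BAND_KHZ = [
    0.0702, 0.1639, 0.2576, 0.3512, 0.4449, 0.5385, 0.6322,
    0.7259, 0.9132, 1.1005, 1.2878, 1.4751, 1.8498, 2.2244,
    2.599, 2.9737, 3.7229, 4.4722, 5.2215, 5.9707, 6.72,
    7.4693, 8.5932, 9.7171, 11.2156, 13.0888, 15.3366, 24.0,
]

def band_index(freq_khz):
    """Return the band m whose upper edge is the first one at or above freq_khz."""
    if freq_khz < 0 or freq_khz > BAND_KHZ[-1]:
        raise ValueError("frequency is outside the range covered by Table 9")
    return bisect_left(BAND_KHZ, freq_khz)
```

The spacing of the values grows with frequency, which is consistent with the ERB-based design mentioned in the text: low bands are narrow and the top band extends up to 24 kHz.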

Here, the cyclic prefix remover 210, the inverse fast Fourier transformer 220, and the guard band remover 230 are provided in the same number as the receiving antennas used by the decoder, so that each corresponds to one receiving antenna.

After the windowed filter bank 230 performs the T/F transform, each band in the frequency domain may be defined as one processing band.

For example, when a 2048-point DFT is performed, each band in the frequency domain may be defined as one processing band according to the start bin and stop bin positions shown in Table 10.

[Table 10]

Figure GSB00000713279300141

Figure GSB00000713279300151

Based on the VSLA information restored from the additional information bitstream, the integrated upmixer 240 may restore the multichannel signal by localizing the sound image within each subband of the downmixed signal. Specifically, as shown in FIG. 5, the power information in each subband is predicted using the panning angle obtained from the additional information bitstream, and the subband signal of each channel can then be predicted by applying that power information.

FIG. 5 illustrates a process in which the integrated upmixer predicts the gain of each channel according to an exemplary embodiment of the present invention.

As shown in FIG. 5, the integrated upmixer 240 predicts the gain of each channel by restoring the sound image information of each channel in stages.

First, the integrated upmixer 240 may restore LHA[idx] 510 and RHA[idx] 520.

The integrated upmixer 240 predicts gLs[idx] 530 and restores LSA[idx] 511 from LHA[idx] 510, and predicts gRs[idx] 540 and restores RSA[idx] 521 from RHA[idx] 520.

Next, the integrated upmixer 240 predicts gL[idx] 550 and gCL[idx] 512 from LSA[idx] 511, and gR[idx] 560 and gCR[idx] 522 from RSA[idx] 521.

Finally, the integrated upmixer 240 predicts gCL[idx]/sqrt(2) 570 from gCL[idx] 512 and gCR[idx] 522. Here, gCL[idx]/sqrt(2), that is, gCL[idx] 512 × 0.7071, may be the value with the center channel gain adjusted.

The integrated upmixer 240 may generate an upmixed signal by upmixing the downmixed signal based on the multichannel signal predicted through the above steps.

If X_dmxL(m, k) is the k-th frequency bin of the m-th subband of the transmitted left downmixed signal, the left upmixing matrix may satisfy Mathematical Formula 2.

[Mathematical Formula 2]

$$\begin{bmatrix} g_{CL}(m) \\ g_{L}(m) \\ g_{Ls}(m) \end{bmatrix} X_{dmxL}(m,k) = \begin{bmatrix} CL(m,k) \\ L(m,k) \\ Ls(m,k) \end{bmatrix}$$

Likewise, the right upmixing matrix for the right downmixed signal may satisfy Mathematical Formula 3.

[Mathematical Formula 3]

$$\begin{bmatrix} g_{CR}(m) \\ g_{R}(m) \\ g_{Rs}(m) \end{bmatrix} X_{dmxR}(m,k) = \begin{bmatrix} CR(m,k) \\ R(m,k) \\ Rs(m,k) \end{bmatrix}$$
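Mathematical Formulas 2 and 3 scale every bin of a subband by that subband's gain. A minimal sketch, assuming the downmix spectrum is held as a list of subbands with each subband a list of complex bins (the data layout and the name `upmix_side` are hypothetical):

```python
def upmix_side(gains, x_dmx):
    """Apply per-subband gains to one downmixed spectrum.

    gains: dict mapping an output channel name to its list of gains g(m).
    x_dmx: list of subbands; x_dmx[m][k] is the k-th bin of subband m.
    Returns a dict mapping each channel name to a spectrum shaped like x_dmx.
    """
    return {
        name: [[g[m] * x for x in band] for m, band in enumerate(x_dmx)]
        for name, g in gains.items()
    }
```

The same function serves both sides: called with the gains for CL, L and Ls on the left downmix it corresponds to Mathematical Formula 2, and with the gains for the right side on the right downmix it corresponds to Mathematical Formula 3.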

The integrated upmixer 240 may also include DFT-based decorrelators D_L and D_R.

D_L and D_R may operate in a high complexity mode or in a low complexity mode, which is the normal mode. D_L and D_R are generated only in the decoder; they operate in the high complexity mode when producing high sound quality and in the low complexity mode when reproducing normal sound quality.

In the high complexity mode, D_L and D_R apply the matrixing of Mathematical Formula 4 to L(m, k) and R(m, k) to generate decorrelated signals.

[Mathematical Formula 4]

Figure GSB00000713279300171

Figure GSB00000713279300172

In the normal mode, D_L and D_R satisfy Mathematical Formula 5 and do not generate decorrelated signals.

[Mathematical Formula 5]

Figure GSB00000713279300173

Figure GSB00000713279300174

The integrated upmixer 240 upmixes the values calculated by Mathematical Formulas 2 and 3 using Mathematical Formula 6.

[Mathematical Formula 6]

$$\begin{bmatrix}
0.7071 & 0.7071 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \alpha(m) & 1-\alpha(m) & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \alpha(m) & 1-\alpha(m) & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \delta & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \delta & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} CL(m,k) \\ CR(m,k) \\ L(m,k) \\ L_{wet}(m,k) \\ R(m,k) \\ R_{wet}(m,k) \\ Ls(m,k) \\ Rs(m,k) \end{bmatrix}
=
\begin{bmatrix} C(m,k) \\ L(m,k) \\ R(m,k) \\ Ls(m,k) \\ Rs(m,k) \\ lfe(m,k) \end{bmatrix},
\quad m < 4$$

$$\begin{bmatrix}
0.7071 & 0.7071 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1-\alpha(m) & \alpha(m) & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1-\alpha(m) & \alpha(m) & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \delta & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \delta & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} CL(m,k) \\ CR(m,k) \\ L(m,k) \\ L_{wet}(m,k) \\ R(m,k) \\ R_{wet}(m,k) \\ Ls(m,k) \\ Rs(m,k) \end{bmatrix}
=
\begin{bmatrix} C(m,k) \\ L(m,k) \\ R(m,k) \\ Ls(m,k) \\ Rs(m,k) \\ lfe(m,k) \end{bmatrix},
\quad m \ge 4$$
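The center, front and LFE rows of Mathematical Formula 6 can be sketched for a single bin as follows. The sketch deliberately leaves out the two surround rows and only shows how the dry/wet weighting and the LFE row switch at m = 4; the function name `mix_front` and its argument names are hypothetical.

```python
SQRT_HALF = 0.7071  # center channel weight used in Mathematical Formula 6

def mix_front(m, alpha, cl, cr, l, l_wet, r, r_wet):
    """Return C, L, R and lfe for one bin of subband m.

    For m < 4 the dry signal is weighted by alpha and the wet signal by
    1 - alpha, and lfe is L + R. For m >= 4 the two weights swap and
    lfe is zero.
    """
    if m < 4:
        w_dry, w_wet = alpha, 1.0 - alpha
        lfe = l + r
    else:
        w_dry, w_wet = 1.0 - alpha, alpha
        lfe = 0.0
    return {
        "C": SQRT_HALF * (cl + cr),
        "L": w_dry * l + w_wet * l_wet,
        "R": w_dry * r + w_wet * r_wet,
        "lfe": lfe,
    }
```

Only the four lowest subbands feed the LFE output, which matches the role of that channel as a low-frequency channel.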

Here, α(m) may be a factor indicating the relationship between the L and R signals in each band. δ is a fixed coefficient; it may be the fixed demixing coefficient for the surround signals, corresponding to the downmix performed in the encoder.

α(m) is obtained by applying Mathematical Formula 7 to the values calculated by Mathematical Formulas 4 and 5.

[Mathematical Formula 7]

$$\alpha(m) = \frac{X_{dmxL}(m,k)\, X_{dmxR}^{*}(m,k)}{X_{dmxL}(m,k)\, X_{dmxL}^{*}(m,k)\; X_{dmxR}(m,k)\, X_{dmxR}^{*}(m,k)} \times \gamma$$

γ is a weighting coefficient that may be used to adjust the degree to which the decorrelated signal is mixed in. Accordingly, α(m) is defined in the range 0 ≤ α(m) ≤ γ, where 0 ≤ γ ≤ 1.
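A per-band mixing factor that respects the stated range 0 ≤ α(m) ≤ γ can be sketched as a normalized cross-correlation of the two downmix spectra. The summation over the bins k of the band, the square root in the denominator and the use of the magnitude are assumptions chosen so that the result stays within the stated range; they are not spelled out in the text, and `mixing_factor` is a hypothetical name.

```python
import math

def mixing_factor(x_l, x_r, gamma):
    """Estimate alpha(m) from the complex bins of one band.

    Assumptions: the products are summed over the bins of the band, the
    denominator is the square root of the product of the two band powers,
    and the magnitude of the cross term is used.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    cross = sum(a * b.conjugate() for a, b in zip(x_l, x_r))
    power_l = sum(abs(a) ** 2 for a in x_l)
    power_r = sum(abs(b) ** 2 for b in x_r)
    if power_l == 0.0 or power_r == 0.0:
        return 0.0  # silent band: nothing to correlate
    return abs(cross) / math.sqrt(power_l * power_r) * gamma
```

With identical left and right bins the factor reaches its upper bound γ, and with orthogonal bins it falls to 0, so γ acts as a ceiling on the weight in the way the text describes.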

In addition, L_wet(m, k) and R_wet(m, k) may be generated as decorrelated signals through the decorrelation process performed by the decorrelator.

FIG. 6 illustrates a decorrelator according to an exemplary embodiment of the present invention.

The decorrelator 600 according to the present invention is the component included in the integrated upmixer 240 that forms the decorrelated signal. As shown in FIG. 6, it may include a complex transformer 610, an amplitude information extractor 620, a phase information extractor 630, a random sequence memory 640, a windower 650, a phase transformer 660, a synthesizer 670, and an inverse complex transformer 680.

The complex transformer 610 may apply a complex transform to the downmixed signal.

The amplitude information extractor 620 and the phase information extractor 630 extract amplitude information and phase information, respectively, from the downmixed signal transformed by the complex transformer 610, thereby separating the downmixed signal.

The windower 650 models a spectral window for modifying the phase information, based on the envelope of the amplitude information extracted by the amplitude information extractor 620, and windows the random sequence stored in the random sequence memory 640 with that spectral window.

The number of random sequences stored in the random sequence memory 640 is determined by the number of downmixed signals. That is, different random sequences are used to generate L_wet(m, k) and R_wet(m, k), and the correlation between the two random sequences is close to 0.

The phase transformer 660 may apply a weight to the phase information extracted by the phase information extractor 630, using the random sequence windowed by the windower 650.

The synthesizer 670 may combine the amplitude information extracted by the amplitude information extractor 620 with the weighted phase information from the phase transformer 660.

The inverse complex transformer 680 applies an inverse complex transform to the information combined in the synthesizer 670 to produce the decorrelated signal.
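The chain of components 610 to 680 can be sketched end to end as follows. Several details are assumptions rather than the method of the text: a naive DFT stands in for the complex transform, the spectral window is the magnitude envelope normalized to its peak, the windowed random sequence is added to the phase, and `depth` is a hypothetical strength parameter.

```python
import cmath
import math
import random

def dft(x):
    """Naive forward DFT (stands in for the complex transformer 610)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT (stands in for the inverse complex transformer 680)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def make_random_sequence(n, seed):
    """Stored random sequence (memory 640); one seed per downmix channel."""
    rng = random.Random(seed)
    return [rng.uniform(-math.pi, math.pi) for _ in range(n)]

def decorrelate(x, random_seq, depth=1.0):
    spectrum = dft(x)
    mags = [abs(c) for c in spectrum]              # amplitude extractor 620
    phases = [cmath.phase(c) for c in spectrum]    # phase extractor 630
    peak = max(mags) or 1.0
    window = [depth * m / peak for m in mags]      # spectral window from the envelope (650)
    shaped = [w * r for w, r in zip(window, random_seq)]
    new_phases = [p + s for p, s in zip(phases, shaped)]              # phase transformer 660
    combined = [cmath.rect(m, p) for m, p in zip(mags, new_phases)]   # synthesizer 670
    return idft(combined)
```

Only the phase is altered, so the magnitude spectrum of the output matches that of the input while its waveform differs. The output of this sketch is complex in general; a real-valued result would additionally require the phase offsets to be conjugate-symmetric across bins.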

FIG. 7 illustrates a method of decoding multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention.

In step S710, the demultiplexer 210 receives the signal transmitted by the multiplexer 150 and parses it into a stereo audio bitstream and an additional information bitstream.

In step S720, the audio decoder 220 may restore the downmixed signal based on the audio bitstream parsed in step S710.

In step S730, the Huffman decoder 410 Huffman-decodes the additional information bitstream parsed in step S710 using a Huffman codebook to generate a difference index.

In step S740, the inverse quantizer 420 inverse-quantizes the difference index generated in step S730 using an inverse quantization table to restore the additional information. Specifically, the inverse quantizer 420 may perform inverse quantization by mapping the VSLA information of each frame to the quantization table corresponding to each VSLA.

In step S750, the integrated upmixer 240 predicts a multichannel signal using the downmixed signal restored in step S720 and the additional information restored in step S740, and upmixes the downmixed signal based on the multichannel signal to generate an upmixed signal.

In step S760, the synthesis filter bank windower 250 applies a synthesis filter bank to the upmixed signal generated in step S750 to extract a time-domain signal, and windows the upmixed signal to extract the output signal.
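The order of steps S710 to S760 can be summarized as a small driver. The stage implementations are passed in as callables, since the text describes what each stage does but not a programming interface; `decode_frame` and the dictionary keys are hypothetical names.

```python
def decode_frame(received, stages):
    """Run one frame through the decoding steps in the order of FIG. 7.

    stages: dict of callables supplied by the caller, one per step.
    """
    audio_bits, side_bits = stages["demultiplex"](received)   # S710
    downmix = stages["decode_audio"](audio_bits)              # S720
    diff_index = stages["huffman_decode"](side_bits)          # S730
    side_info = stages["inverse_quantize"](diff_index)        # S740
    upmixed = stages["upmix"](downmix, side_info)             # S750
    return stages["synthesize"](upmixed)                      # S760
```

The audio path (S720) and the additional information path (S730, S740) depend only on the output of S710, so they can run independently before meeting again at the upmixing step S750.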

As described above, the apparatus and method for decoding multichannel audio based on sound source location cues according to the exemplary embodiments of the present invention receive and compress a multichannel audio signal, and compress and transmit the stereo signal via a core stereo codec, so that multichannel audio can be transmitted while maintaining backward compatibility with existing stereo audio coding.

Although specific exemplary embodiments of the present invention have been described for illustrative purposes, those skilled in the art may make various modifications, additions and substitutions without departing from the spirit and scope of the invention as defined by the claims. The scope of the present invention should therefore be defined by the appended claims and their equivalents.

Claims (10)

1. An apparatus for decoding multichannel audio based on sound source location cues, comprising:
a demultiplexer that receives a signal and parses the received signal into an audio bitstream and an additional information bitstream;
an audio decoder that restores a downmixed signal based on the audio bitstream and sends the restored downmixed signal to an integrated upmixer, wherein an analysis filter bank is not applied to the restored downmixed signal;
the integrated upmixer, which separates amplitude information and phase information from the restored downmixed signal, windows a stored random sequence based on the amplitude information, applies a weight to the phase information, predicts a multichannel signal based on the amplitude information and the weighted phase information and on the restored additional information bitstream, and upmixes the downmixed signal based on the multichannel signal to generate an upmixed signal, wherein the additional information bitstream is Huffman-decoded to generate a difference index and the difference index is inverse-quantized with a quantization table to restore the additional information; and
a synthesis filter bank windower that applies a synthesis filter bank to the upmixed signal to extract a time-domain signal, and windows the upmixed signal to extract an output signal.
2. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the audio decoder has a stereo core codec that uses a time domain aliasing cancellation (TDAC) filter bank.
3. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the synthesis filter bank windower windows the upmixed signal so as to match the analysis windowing of the core stereo audio.
4. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the integrated upmixer applies a complex transform to the downmixed signal and separates the amplitude information and the phase information from the complex-transformed downmixed signal.
5. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the integrated upmixer models a spectral window for modifying the phase information based on the envelope of the amplitude information, windows the stored random sequence with the spectral window, and applies a weight to the phase information using the windowed random sequence.
6. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the integrated upmixer combines the amplitude information with the weighted phase information and applies an inverse complex transform to the combined amplitude information and weighted phase information to predict the multichannel signal.
7. A method of decoding multichannel audio based on sound source location cues, comprising:
parsing a received signal into an audio bitstream and an additional information bitstream;
restoring a downmixed signal based on the audio bitstream, wherein an analysis filter bank is not applied to the downmixed signal;
Huffman-decoding the additional information bitstream to generate a difference index;
inverse-quantizing the difference index with a quantization table to restore additional information;
separating amplitude information and phase information from the restored downmixed signal, windowing a stored random sequence based on the amplitude information, applying a weight to the phase information, and predicting a multichannel signal based on the amplitude information and the weighted phase information and on the restored additional information;
upmixing the downmixed signal based on the multichannel signal to generate an upmixed signal;
applying a synthesis filter bank to the upmixed signal to extract a time-domain signal; and
windowing the upmixed signal to extract an output signal.
8. The method of decoding multichannel audio based on sound source location cues of claim 7, wherein the separating comprises: applying a complex transform to the downmixed signal; and separating the amplitude information and the phase information from the complex-transformed downmixed signal.
9. The method of decoding multichannel audio based on sound source location cues of claim 7, wherein the applying of the weight comprises: modeling a spectral window for modifying the phase information based on the envelope of the amplitude information; windowing the stored random sequence with the spectral window; and applying a weight to the phase information using the windowed random sequence.
10. The method of decoding multichannel audio based on sound source location cues of claim 7, wherein the predicting of the multichannel signal comprises: combining the amplitude information with the weighted phase information; and applying an inverse complex transform to the combined amplitude information and weighted phase information to predict the multichannel signal.
CN2009102238140A 2008-12-03 2009-11-23 Decoder and decoding method for multichannel audio coder using sound source location cue Expired - Fee Related CN101754086B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR20080121709 2008-12-03
KR121709/08 2008-12-03
KR1020090064918A KR101176703B1 (en) 2008-12-03 2009-07-16 Decoder and decoding method for multichannel audio coder using sound source location cue
KR64918/09 2009-07-16

Publications (2)

Publication Number Publication Date
CN101754086A CN101754086A (en) 2010-06-23
CN101754086B true CN101754086B (en) 2012-11-14

Family

ID=42363561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009102238140A Expired - Fee Related CN101754086B (en) 2008-12-03 2009-11-23 Decoder and decoding method for multichannel audio coder using sound source location cue

Country Status (2)

Country Link
KR (1) KR101176703B1 (en)
CN (1) CN101754086B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012045203A1 (en) * 2010-10-05 2012-04-12 Huawei Technologies Co., Ltd. Method and apparatus for encoding/decoding multichannel audio signal
WO2017132082A1 (en) * 2016-01-27 2017-08-03 Dolby Laboratories Licensing Corporation Acoustic environment simulation
KR20220005379A (en) * 2020-07-06 2022-01-13 한국전자통신연구원 Apparatus and method for encoding/decoding audio that is robust against coding distortion in transition section

Also Published As

Publication number Publication date
CN101754086A (en) 2010-06-23
KR20100063639A (en) 2010-06-11
KR101176703B1 (en) 2012-08-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20100623

Assignee: Neo Lab Convergence Inc.

Assignor: Korea Electronic Communication Institute

Contract record no.: 2016990000259

Denomination of invention: Decoder and decoding method for multichannel audio coder using sound source location cue

Granted publication date: 20121114

License type: Exclusive License

Record date: 20160630

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121114

Termination date: 20191123
