
CN101754086B - Decoder and decoding method for multichannel audio coder using sound source location cue - Google Patents


Info

Publication number
CN101754086B
Authority
CN
China
Prior art keywords
signal
information
multichannel
audio
sound source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009102238140A
Other languages
Chinese (zh)
Other versions
CN101754086A (en
Inventor
徐廷一
白承权
李用主
姜京玉
洪镇佑
金镇雄
安致得
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electronics and Telecommunications Research Institute ETRI
Original Assignee
Electronics and Telecommunications Research Institute ETRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electronics and Telecommunications Research Institute ETRI
Publication of CN101754086A
Application granted
Publication of CN101754086B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022 Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/26 Pre-filtering or post-filtering
    • G10L19/265 Pre-filtering, e.g. high frequency emphasis prior to encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/02 Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Algebra (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Stereophonic System (AREA)

Abstract

The present invention provides a decoder and a decoding method for multichannel audio coding using sound source location cues. The decoder includes: a demultiplexer that receives a signal and parses it into an audio bitstream and an additional information bitstream; an audio decoder that restores a downmix signal from the audio bitstream and passes it to a synthesis upmixer without applying an analysis filter bank to the restored downmix signal; a synthesis upmixer that predicts a multichannel signal using the downmix signal and the additional information bitstream, and upmixes the downmix signal based on the multichannel signal to generate an upmix signal; and a synthesis filter bank windowing unit that applies a synthesis filter bank to the upmix signal to extract a time-domain signal and windows the upmix signal to extract an output signal.

Description

Decoding device and method for multichannel audio based on sound source location cues
Technical field
The invention discloses a decoding device and a decoding method for multichannel audio based on sound source location cues.
Background technology
The basic coding concept of sound source location cue coding (SSLCC: Sound Source Location Cue Coding) derives from spatial audio coding (SAC: Spatial Audio Coding).
SAC is a compression technique for multichannel audio based on sound source location cues. It maximizes compression efficiency by removing the redundancy of each channel signal based on the spatial cues by which humans perceive space.
The multichannel signal is basically downmixed, and the transmitted downmix audio signal serves as the core signal. That is, it can be reproduced by an existing stereo audio decoder, which is the basic principle of the SAC method.
Within the SAC method, SSLCC uses the spatial cues by which humans perceive space to estimate the position of a sound source: it extracts the position information of the sound source from the multichannel signal, then represents and transmits it.
Because the amount of information extracted by the SAC coding strategy is very small, it can be sent in an auxiliary field. If the receiver does not support SAC coding, it reproduces only the stereo signal using an existing stereo audio decoder. If the receiver supports SAC coding, it uses the transmitted auxiliary information to restore the multichannel audio signal based on sound source location cues from the downmixed stereo signal.
However, restoring the multichannel audio signal from the downmixed stereo signal with the auxiliary information requires the same time/frequency (T/F) transform method that was used when the information was extracted under the SAC coding strategy. If that T/F transform method is not the one best suited to the receiver, it adversely affects the transform process.
Therefore, a device and a method are needed that restore the multichannel audio signal based on sound source location cues using the T/F transform method best suited to the receiver.
Summary of the invention
The invention provides a decoding device and method in which a multichannel audio signal is received and compressed into a stereo signal, and the stereo signal is compressed and transmitted by a basic stereo codec, so that the multichannel audio signal can be transmitted while remaining backward compatible with basic stereo audio coding.
The invention also provides a decoding device and method that, by using a time domain aliasing cancellation (TDAC) filter bank, can selectively change the time/frequency transform applied to the stereo downmix audio signal.
Technical scheme
According to an exemplary embodiment of the present invention, a decoding device for multichannel audio based on sound source location cues is provided, comprising: a demultiplexer that receives a signal and parses the received signal into an audio bitstream and an additional information bitstream; an audio decoder that restores a downmix signal based on the audio bitstream; a synthesis upmixer that predicts a multichannel signal using the downmix signal and the additional information bitstream, and upmixes the downmix signal based on the multichannel signal to generate an upmix signal; and a synthesis filter bank windowing unit that applies a synthesis filter bank to the upmix signal to extract a time-domain signal and windows the upmix signal to extract an output signal.
According to another exemplary embodiment, a decoding method for multichannel audio based on sound source location cues is provided, comprising: parsing a received signal into an audio bitstream and an additional information bitstream; restoring a downmix signal based on the audio bitstream; predicting a multichannel signal using the downmix signal and the additional information; upmixing the downmix signal based on the multichannel signal to generate an upmix signal; applying a synthesis filter bank to the upmix signal to extract a time-domain signal; and windowing the upmix signal to extract an output signal.
Technique effect
According to the present invention, a multichannel audio signal is received and compressed into a stereo signal, and the stereo signal is compressed and transmitted by a basic stereo codec, so that the multichannel audio signal can be transmitted while remaining backward compatible with basic stereo audio coding.
By using a TDAC filter bank, the present invention can selectively change the time/frequency transform applied to the stereo downmix audio signal.
Description of drawings
Fig. 1 illustrates an encoding device for multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention;
Fig. 2 illustrates a decoding device for multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention;
Fig. 3 illustrates a decoding device for multichannel audio based on sound source location cues according to another exemplary embodiment;
Fig. 4 illustrates a decoding device for the additional information bitstream according to an exemplary embodiment of the present invention;
Fig. 5 illustrates the process by which the synthesis upmixer predicts the gain of each channel according to an exemplary embodiment of the present invention;
Fig. 6 illustrates a decorrelator according to an exemplary embodiment of the present invention;
Fig. 7 illustrates a decoding method for multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention.
Embodiment
Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 illustrates an encoding device for multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention.
The encoding device 100 for multichannel audio based on sound source location cues is a 5-channel multichannel audio encoding device based on SSLCC. As shown in Fig. 1, it consists of a preprocessing filter bank unit 110, an analyzer 120, a downmix processor 130, an audio encoder 140, and a multiplexer 150.
The encoding device 100 can also be extended to multichannel audio content with more than 5 channels.
The preprocessing filter bank unit 110 preprocesses the multichannel input audio signal supplied to the encoding device 100 and transforms the preprocessed input audio signal into a frequency-domain signal via a filter bank. The filter bank performs a T/F transform based on subband analysis; MDCT, MDST, DFT, and the like can be used.
Here, the input audio signal can include input signal LF (Left Front), input signal RF (Right Front), input signal C (Center Front), input signal Ls (Left Surround), and input signal Rs (Right Surround).
The analyzer 120 extracts spatial cues from the input audio signal transformed into the frequency domain by the preprocessing filter bank unit 110, and represents the spatial cues as an additional information bitstream for transmission. The analyzer 120 also compresses the input audio signal and sends it to the downmix processor 130.
Like the analyzer 120, the downmix processor 130 can downmix the input audio signal in the frequency domain. The downmix processor 130 can perform the downmix according to an ITU-T recommendation.
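A minimal sketch of the 5-to-2 downmix step, assuming the coefficients commonly associated with ITU-R BS.775 (center and surround channels scaled by 1/sqrt(2) before being added to the front channels). The text only states that an ITU recommendation is followed, so the coefficient, the function name `downmix_5_to_2`, and the list-based signal layout are assumptions for illustration.

```python
import math

# Assumed downmix weight (1/sqrt(2)); the text does not give the exact coefficients.
G = 1.0 / math.sqrt(2.0)

def downmix_5_to_2(lf, rf, c, ls, rs):
    """Downmix five equally long channel sequences to a stereo pair.

    Works on real time-domain samples or complex frequency-domain bins.
    """
    left = [l + G * ci + G * lsi for l, ci, lsi in zip(lf, c, ls)]
    right = [r + G * ci + G * rsi for r, ci, rsi in zip(rf, c, rs)]
    return left, right

# One-sample example: LF=1.0, RF=0.5, C=1.0, Ls=0.0, Rs=1.0
left, right = downmix_5_to_2([1.0], [0.5], [1.0], [0.0], [1.0])
```

Because the same weights apply to every sample or bin, the function can run either before or after the filter bank, which matches the statement that the downmix can be done in the frequency domain.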
The audio signal downmixed by the downmix processor 130 can be represented as a bitstream by a commonly used stereo audio codec, such as MP3 (MPEG Layer III) or AAC (Advanced Audio Coding).
The audio encoder 140 encodes the audio signal downmixed by the downmix processor 130.
The multiplexer 150 combines the signal encoded by the audio encoder 140 with the additional information bitstream sent from the analyzer 120 and transmits the result.
Fig. 2 illustrates a decoding device for multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention.
The decoding device 200 for multichannel audio based on sound source location cues is a 5-channel multichannel audio decoding device based on SSLCC. As shown in Fig. 2, it consists of a demultiplexer 210, an audio decoder 220, a windowing filter bank 230, a synthesis upmixer 240, and a synthesis filter bank windowing unit 250.
The demultiplexer 210 receives the signal transmitted from the multiplexer 150 and parses the received signal into an audio bitstream and an additional information bitstream.
The audio decoder 220 restores the downmix signal based on the audio bitstream.
The windowing filter bank 230 applies an analysis filter bank to the downmix signal to perform a T/F transform, then windows and forwards the T/F-transformed downmix signal.
The synthesis upmixer 240 predicts the multichannel signal using the downmix signal and the additional information bitstream, and generates an upmix signal by upmixing the downmix signal based on the multichannel signal.
Specifically, the synthesis upmixer 240 separates amplitude information and phase information from the downmix signal, windows a stored random sequence based on the amplitude information to assign weights to the phase information, and predicts the multichannel signal from the amplitude information and the weighted phase information.
To do so, the synthesis upmixer 240 applies a complex transform to the downmix signal and separates the amplitude information and phase information from the complex-transformed downmix signal.
The synthesis upmixer 240 then models the envelope of the amplitude information as a spectral window used to modify the phase information, applies the spectral window to the stored random sequence to window it, and uses the windowed random sequence to assign weights to the phase information.
Finally, the synthesis upmixer 240 combines the amplitude information with the weighted phase information and applies the inverse complex transform to the combined information to predict the multichannel signal.
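The amplitude/phase separation and recombination described for the synthesis upmixer can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the weights are applied to the phase, so the `phase_offsets` argument (an additive phase change per bin) is a hypothetical stand-in, and both function names are invented for this sketch.

```python
import cmath

def split_polar(spectrum):
    """Separate complex spectral bins into amplitude and phase lists."""
    amps, phases = [], []
    for x in spectrum:
        amplitude, phase = cmath.polar(x)
        amps.append(amplitude)
        phases.append(phase)
    return amps, phases

def combine_polar(amps, phases, phase_offsets=None):
    """Recombine amplitude and phase into complex bins.

    phase_offsets is a hypothetical per-bin phase modification; the
    actual weighting rule is not specified in the text.
    """
    if phase_offsets is None:
        phase_offsets = [0.0] * len(amps)
    return [cmath.rect(a, p + d) for a, p, d in zip(amps, phases, phase_offsets)]

spectrum = [1 + 1j, -2j, 0.5 + 0j]
amps, phases = split_polar(spectrum)
restored = combine_polar(amps, phases)                  # round trip
shifted = combine_polar(amps, phases, [0.3, 0.3, 0.3])  # phase changed, amplitude kept
```

Modifying only the phase leaves each bin's amplitude untouched, which is why the two components are separated before the weights are applied.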
The synthesis filter bank windowing unit 250 applies a synthesis filter bank to the upmix signal to extract a time-domain signal, and windows the upmix signal to extract the output signal.
Fig. 3 illustrates a decoding device for multichannel audio based on sound source location cues according to another exemplary embodiment.
The decoding device 300 for multichannel audio based on sound source location cues according to another embodiment is a decoding device suited to a real transform. As shown in Fig. 3, it consists of a demultiplexer 310, a TDAC filter bank 320, a synthesis upmixer 330, and a synthesis filter bank windowing unit 340.
SSLCC is basically built on a DFT filter bank (transform). However, various filter banks can be used in order to interoperate with the core stereo codec.
Although the form of the filter bank changes, the operating principle of the synthesis between the SSLCC analyzer 120 or the synthesis upmixer 240 and the synthesis filter bank windowing unit 250 remains the same.
Because the decorrelator is not applied for stereo transmission, a real transform can be used. The decoding device 300 uses the MDCT, which is a real transform, to interoperate with the core codec.
The TDAC filter bank 320 restores the downmix signal based on the audio bitstream, skips the steps of applying the analysis filter bank and windowing to the downmix signal, and forwards the signal to the synthesis upmixer 330. The signals forwarded by the TDAC filter bank 320 can be the frequency-domain downmix signal L and the frequency-domain downmix signal R.
The synthesis filter bank windowing unit 340 applies the synthesis filter bank to the upmix signal generated in the synthesis upmixer 330 to extract a time-domain signal, and windows the upmix signal with a window matched to the analysis window of the core stereo codec to extract the output signal.
The demultiplexer 310 and the synthesis upmixer 330 have the same structure as the demultiplexer 210 and the synthesis upmixer 240 of the decoding device 200, so a detailed description is omitted.
The decoding device 300 can select the T/F transform applied to the stereo downmix audio signal. For example, when the decorrelator is switched off in the decoding device, a real T/F transform can be used. A complex T/F transform can also be used in that case, and even then the phase information need not be used.
However, when the decoding device needs phase information, that is, when the decorrelator is switched on, a complex T/F transform must be used. For the complex T/F transform the DFT is the default, and MDCT/MDST can also be used as a complex transform pair.
Fig. 4 illustrates a decoding device for the additional information bitstream according to an exemplary embodiment of the present invention.
The decoding device for the additional information bitstream decodes the additional information, namely the VSLA (Virtual Sound Location Angle) bitstream parsed by the demultiplexer 210. As shown in Fig. 4, it can consist of a Huffman decoder 410 and an inverse quantizer 420.
The decoding device for the additional information bitstream can belong to the windowing filter bank 230 and the synthesis filter bank windowing unit 250.
The Huffman decoder 410 Huffman-decodes the additional information bitstream using a Huffman codebook to generate a differential index.
To generate the Huffman codebook, the Huffman decoder 410 includes an inverse differential coder 411, a differential coder 412, a mapper 413, and a Huffman coder 414.
The inverse differential coder 411 performs inverse differential coding based on the information of the previously processed frame and the Huffman coding type to decode the original index.
The differential coder 412 removes from the original index the negative-value information corresponding to the sign bit, then performs differential coding to generate index information.
The mapper 413 removes from the index the offset information used to eliminate the negative-value information, then maps the index according to the frequency resolution, dividing it into a first subband and the remaining bands.
Finally, the Huffman coder 414 applies Huffman coding separately to the first subband and to the remaining bands to generate the Huffman codebooks.
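The differential and inverse differential coding steps can be illustrated with a small round trip. The sketch assumes that each band's index is coded as the difference from the same band in the previous frame, with an all-zero frame as the initial reference; the text does not fix these details, and the function names are illustrative only.

```python
def differential_encode(frames):
    """Replace each frame of indices by its per-band difference from the previous frame."""
    prev = [0] * len(frames[0])  # assumed initial reference: all zeros
    diffs = []
    for frame in frames:
        diffs.append([cur - p for cur, p in zip(frame, prev)])
        prev = frame
    return diffs

def differential_decode(diffs):
    """Invert differential_encode by accumulating the differences frame by frame."""
    prev = [0] * len(diffs[0])
    frames = []
    for diff in diffs:
        cur = [d + p for d, p in zip(diff, prev)]
        frames.append(cur)
        prev = cur
    return frames

frames = [[3, -2, 0], [4, -2, 1], [4, -1, 1]]
diffs = differential_encode(frames)    # [[3, -2, 0], [1, 0, 1], [0, 1, 0]]
restored = differential_decode(diffs)  # equals frames
```

Slowly varying indices produce differences clustered near zero, which is what makes the subsequent Huffman coding effective.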
The Huffman decoder 410 decodes the first subband by referring to the Huffman codebook of Table 1.
[table 1]
Index  Num. of bits  Codeword    Index  Num. of bits  Codeword
0      5             0x17        16     5             0x1d
1      8             0x64        17     5             0x19
2      8             0x65        18     5             0x1c
3      8             0xf0        19     5             0x16
4      8             0xf1        20     5             0x18
5      7             0x33        21     5             0x14
6      7             0x79        22     5             0x13
7      6             0x18        23     5             0x15
8      6             0x22        24     5             0x1b
9      6             0x23        25     5             0x10
10     6             0x3d        26     5             0x0e
11     5             0x0b        27     5             0x0f
12     5             0x12        28     5             0x0d
13     5             0x1a        29     5             0x0a
14     4             0x04        30     2             0x00
15     5             0x1f
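A short sketch of decoding with the Table 1 codebook. The (index, bits, codeword) rows are copied from the table; the assumption that codewords are written most significant bit first, and the function name `huffman_decode`, are not stated in the text. The code relies only on the table being a prefix code, so a codeword can be recognized as soon as its last bit arrives.

```python
# (index, number of bits, codeword) rows copied from Table 1.
TABLE_1 = [
    (0, 5, 0x17), (1, 8, 0x64), (2, 8, 0x65), (3, 8, 0xF0), (4, 8, 0xF1),
    (5, 7, 0x33), (6, 7, 0x79), (7, 6, 0x18), (8, 6, 0x22), (9, 6, 0x23),
    (10, 6, 0x3D), (11, 5, 0x0B), (12, 5, 0x12), (13, 5, 0x1A), (14, 4, 0x04),
    (15, 5, 0x1F), (16, 5, 0x1D), (17, 5, 0x19), (18, 5, 0x1C), (19, 5, 0x16),
    (20, 5, 0x18), (21, 5, 0x14), (22, 5, 0x13), (23, 5, 0x15), (24, 5, 0x1B),
    (25, 5, 0x10), (26, 5, 0x0E), (27, 5, 0x0F), (28, 5, 0x0D), (29, 5, 0x0A),
    (30, 2, 0x00),
]

# Assumption: each codeword is serialized most significant bit first.
DECODE = {format(code, "0{}b".format(bits)): idx for idx, bits, code in TABLE_1}

def huffman_decode(bitstring):
    """Decode a string of '0'/'1' characters into a list of Table 1 indices."""
    indices, current = [], ""
    for bit in bitstring:
        current += bit
        if current in DECODE:  # prefix code: the first match is the codeword
            indices.append(DECODE[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return indices

decoded = huffman_decode("00" + "0100" + "10111")  # -> [30, 14, 0]
```

The shortest codeword (2 bits) belongs to index 30 and the next shortest (4 bits) to index 14, so those two values are the cheapest to transmit.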
When the signal received by the demultiplexer 210 is a 5-bit quantized signal, the Huffman decoder 410 performs Huffman decoding by referring to the Huffman codebook of Table 2.
[table 2]
(Table 2 is reproduced as an image in the original publication: Figure GSB00000713279300071, Figure GSB00000713279300081.)
When the signal received by the demultiplexer 210 is a 4-bit quantized signal, the Huffman decoder 410 performs Huffman decoding by referring to the Huffman codebook of Table 3.
[table 3]
(Table 3 is reproduced as an image in the original publication: Figure GSB00000713279300091.)
The inverse quantizer 420 inverse-quantizes the differential index using an inverse quantization table to restore the additional information. Specifically, the inverse quantizer 420 can perform inverse quantization by mapping the VSLA (Virtual Sound Location Angle) information in each frame to the quantization table corresponding to each VSLA. Because the multichannel audio based on sound source location cues is basically decoded frame by frame with the DFT or MDCT, smoothing between frames is achieved mainly by windowed overlap-add.
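The overlap-add smoothing mentioned above depends on the window. As an illustration only, the sketch below assumes a sine window with 50% overlap (the text does not name the window) and checks the power-complementarity condition that lets overlapping windowed frames sum back without amplitude ripple when the same window is used for analysis and synthesis.

```python
import math

def sine_window(n):
    """Sine window of even length n (assumed window shape, not given in the text)."""
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]

def overlap_power(window):
    """Sum of squared window values at positions that overlap under 50% overlap-add."""
    half = len(window) // 2
    return [window[i] ** 2 + window[i + half] ** 2 for i in range(half)]

w = sine_window(16)
sums = overlap_power(w)  # every entry equals 1 up to rounding
```

Each entry is sin^2 + cos^2 of the same angle, so the overlapped squared windows always add to one regardless of the window length.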
When the VSLA information is LHA (Left Half-plane Angle), the inverse quantizer 420 can perform inverse quantization by mapping the quantization table of Table 4.
[table 4]
(Table 4 is reproduced as an image in the original publication: Figure GSB00000713279300092, Figure GSB00000713279300101.)
In the step of restoring the additional information, when the VSLA information is RHA (Right Half-plane Angle), the inverse quantizer 420 can perform inverse quantization by mapping the quantization table of Table 5.
[table 5]
(Table 5 is reproduced as an image in the original publication: Figure GSB00000713279300102.)
In the step of restoring the additional information, when the VSLA information is LSA (Left Subsequent vector Angle), the inverse quantizer 420 can perform inverse quantization by mapping the quantization table of Table 6.
[table 6]
Idx -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5
LSA[idx] -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5
Idx -4 -3 -2 -1 0 1 2 3 4 5 6
LSA[idx] -4 -3 -2 -1 0 1 2 3 4 5 6
Idx 7 8 9 10 11 12 13 14 15
LSA[idx] 7 8 9 10 11 12 13 14 15
In the step of restoring the additional information, when the VSLA information is RSA (Right Subsequent vector Angle), the inverse quantizer 420 can perform inverse quantization by mapping the quantization table of Table 7.
[table 7]
Idx -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5
RSA[idx] -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5
Idx -4 -3 -2 -1 0 1 2 3 4 5 6
RSA[idx] -4 -3 -2 -1 0 1 2 3 4 5 6
Idx 7 8 9 10 11 12 13 14 15
RSA[idx] 7 8 9 10 11 12 13 14 15
The inverse quantizer 420 can extract parameters satisfying Mathematical Expression 1 from the VSLA information obtained from each index.
[mathematical expression 1]
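$$\theta_{lh} = \left( \frac{LHA[idx] - LSA[idx]}{P_4 - LSA[idx]} \right) \times \frac{\pi}{2}$$
$$\theta_{rh} = \left( \frac{RHA[idx] - P_5}{RSA[idx] - P_5} \right) \times \frac{\pi}{2}$$
$$gLs = \sin(\theta_{lh})$$
$$gL = \cos(\theta_{lh}) \cdot \sin\big((LSA[idx] - 2\pi) \times (-3)\big)$$
$$gCL = \cos(\theta_{lh}) \cdot \cos\big((LSA[idx] - 2\pi) \times (-3)\big)$$
$$gRs = \cos(\theta_{rh})$$
$$gCR = \sin(\theta_{rh}) \cdot \cos(RSA[idx] \times 3)$$
$$gR = \sin(\theta_{rh}) \cdot \sin(RSA[idx] \times 3)$$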
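Mathematical Expression 1 can be evaluated directly once the angles are known. The sketch below transcribes the formulas as written; P4 and P5 are not defined in this section, so they are passed in as plain parameters, and treating all angle values as radians is an assumption. The function name `sslcc_gains` and the sample inputs are illustrative only.

```python
import math

def sslcc_gains(lha, lsa, rha, rsa, p4, p5):
    """Evaluate the six channel gains of Mathematical Expression 1.

    p4 and p5 are the constants P4 and P5 (undefined in this section);
    p4 must differ from lsa and p5 from rsa to avoid division by zero.
    """
    theta_lh = (lha - lsa) / (p4 - lsa) * math.pi / 2
    theta_rh = (rha - p5) / (rsa - p5) * math.pi / 2
    left_arg = (lsa - 2 * math.pi) * -3
    right_arg = rsa * 3
    return {
        "gLs": math.sin(theta_lh),
        "gL": math.cos(theta_lh) * math.sin(left_arg),
        "gCL": math.cos(theta_lh) * math.cos(left_arg),
        "gRs": math.cos(theta_rh),
        "gCR": math.sin(theta_rh) * math.cos(right_arg),
        "gR": math.sin(theta_rh) * math.sin(right_arg),
    }

# Arbitrary sample inputs chosen only to exercise the formulas.
g = sslcc_gains(lha=0.5, lsa=0.2, rha=0.4, rsa=0.1, p4=1.0, p5=1.0)
left_power = g["gLs"] ** 2 + g["gL"] ** 2 + g["gCL"] ** 2
right_power = g["gRs"] ** 2 + g["gCR"] ** 2 + g["gR"] ** 2
```

Whatever the inputs, the three gains on each side have squared values that sum to one, since each side has the form sin^2 + cos^2 (sin^2 + cos^2). The gains therefore split the power of each downmix channel across its three output channels without changing the total.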
The number of subbands can be obtained from bsFreqRes, the parameter in the additional information that defines the number of subbands, and the inverse quantizer 420 can map the transmitted VSLA information according to that number. The maximum number of bands is 28 (Mpar = 28), and the frequency resolution and the number of bands differ according to the bit rate or the characteristics of the frame.
The inverse quantizer 420 can perform the mapping through mapsubbands(bsFreqRes, Mpar) as shown in Table 8 below.
[table 8]
(Table 8 is reproduced as an image in the original publication: Figure GSB00000713279300121, Figure GSB00000713279300131.)
The inverse quantizer 420 can process Mathematical Expression 1 using the resolution of each band, which is designed on the basis of ERB bands. When the mapping of Table 8 is used with Mpar = 28, the resolution of each band can be as shown in Table 9.
[table 9]
m Mpar=28 kHz
0 0 0.0702
1 1 0.1639
2 2 0.2576
3 3 0.3512
4 4 0.4449
5 5 0.5385
6 6 0.6322
7 7 0.7259
8 8 0.9132
9 9 1.1005
10 10 1.2878
11 11 1.4751
12 12 1.8498
13 13 2.2244
14 14 2.599
15 15 2.9737
16 16 3.7229
17 17 4.4722
18 18 5.2215
19 19 5.9707
20 20 6.72
21 21 7.4693
22 22 8.5932
23 23 9.7171
24 24 11.2156
25 25 13.0888
26 26 15.3366
27 27 24
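A small lookup sketch for Table 9. It assumes that the kHz column gives the upper edge of each band, which the text does not state, and the function name `band_for_frequency` is illustrative. The values are copied from the table.

```python
from bisect import bisect_left

# kHz column of Table 9 for Mpar = 28 (assumed to be upper band edges).
BAND_KHZ = [
    0.0702, 0.1639, 0.2576, 0.3512, 0.4449, 0.5385, 0.6322, 0.7259,
    0.9132, 1.1005, 1.2878, 1.4751, 1.8498, 2.2244, 2.599, 2.9737,
    3.7229, 4.4722, 5.2215, 5.9707, 6.72, 7.4693, 8.5932, 9.7171,
    11.2156, 13.0888, 15.3366, 24.0,
]

def band_for_frequency(khz):
    """Return the band index m whose assumed upper edge is the first one >= khz."""
    if khz < 0 or khz > BAND_KHZ[-1]:
        raise ValueError("frequency outside the range covered by Table 9")
    return bisect_left(BAND_KHZ, khz)

low = band_for_frequency(0.05)   # falls in band 0
mid = band_for_frequency(1.0)    # between 0.9132 and 1.1005, so band 9
top = band_for_frequency(24.0)   # last band, 27
```

The spacing grows from under 0.1 kHz at the bottom to several kHz at the top, consistent with the ERB-based design mentioned in the text.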
At this time, the cyclic prefix symbol remover 210, the inverse BIFORE transformer 220, and the guard band remover 230 are each provided in the same number as the receive antennas used by the decoder, so that each corresponds to one receive antenna.
After the windowing filter bank 230 performs the T/F transform, a band in the frequency domain can be defined as one processing band.
For example, when a 2048-point DFT is performed, as shown in Table 10, a band in the frequency domain delimited by its start bin and stop bin can be defined as one processing band.
[table 10]
(Table 10 is reproduced as an image in the original publication: Figure GSB00000713279300141, Figure GSB00000713279300151.)
The synthesis upmixer 240 can restore the multichannel signal by localizing the audio position in each subband of the downmix signal based on the VSLA information restored from the additional information bitstream. Specifically, as shown in Fig. 5, the panning angle obtained from the additional information bitstream is used to predict the dynamic information in each subband, and applying that information yields the per-subband signal of each channel.
Fig. 5 illustrates the process by which the synthesis upmixer predicts the gain of each channel according to an exemplary embodiment of the present invention.
As shown in Fig. 5, the synthesis upmixer 240 predicts the gain of each channel by restoring the audio position information of each channel stage by stage.
First, the synthesis upmixer 240 restores LHA[idx] 510 and RHA[idx] 520.
Next, the synthesis upmixer 240 predicts gLs[idx] 530 and restores LSA[idx] 511 from LHA[idx] 510, and predicts gRs[idx] 540 and restores RSA[idx] 521 from RHA[idx] 520.
Then, the synthesis upmixer 240 predicts gL[idx] 550 and gCL[idx] 512 from LSA[idx] 511, and predicts gR[idx] 560 and gCR[idx] 522 from RSA[idx] 521.
Finally, the synthesis upmixer 240 predicts gCL[idx]/sqrt(2) 570 from gCL[idx] 512 and gCR[idx] 522. Here gCL[idx]/sqrt(2), that is, gCL[idx] 512 × 0.7071, can be the adjusted gain of the center channel.
The synthesis upmixer 240 can generate the upmix signal by upmixing the downmix signal based on the multichannel signal predicted through these steps.
If X_dmxL(m, k) is the k-th frequency bin of the m-th subband of the transmitted left downmix signal, the 'Left upmixing Matrix' satisfies Mathematical Expression 2.
[mathematical expression 2]
$$\begin{bmatrix} g_{CL}(m) \\ g_{L}(m) \\ g_{Ls}(m) \end{bmatrix} X_{dmxL}(m,k) = \begin{bmatrix} CL(m,k) \\ L(m,k) \\ Ls(m,k) \end{bmatrix}$$
Likewise, for the right downmix signal, the 'Right upmixing matrix' can satisfy Mathematical Expression 3.
[mathematical expression 3]
$$\begin{bmatrix} g_{CR}(m) \\ g_{R}(m) \\ g_{Rs}(m) \end{bmatrix} X_{dmxR}(m,k) = \begin{bmatrix} CR(m,k) \\ R(m,k) \\ Rs(m,k) \end{bmatrix}$$
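The per-sub-band scaling in Mathematical Expressions 2 and 3 can be sketched as follows. This is only an illustration: the function name `apply_subband_gains`, the data layout (a list of sub-bands, each a list of complex bins), and the numeric values are assumptions, not part of the described decoder.

```python
def apply_subband_gains(x_dmx, gains):
    """Scale every frequency bin of each sub-band by that sub-band's gain.

    x_dmx : list of sub-bands, each a list of complex bins X_dmx(m, k)
    gains : dict mapping a channel name to its per-sub-band gain list g(m)
    Returns a dict mapping each channel name to its scaled sub-bands.
    """
    out = {}
    for name, g in gains.items():
        if len(g) != len(x_dmx):
            raise ValueError("need exactly one gain per sub-band")
        # The same gain g(m) multiplies every bin k of sub-band m.
        out[name] = [[g[m] * x for x in band] for m, band in enumerate(x_dmx)]
    return out


# Hypothetical left downmix: 2 sub-bands with 2 bins each.
x_dmx_l = [[1 + 1j, 2 + 0j], [0 - 1j, 4 + 2j]]
# Hypothetical gains for the three channels derived from the left downmix.
left_gains = {"CL": [0.5, 0.25], "L": [0.75, 0.5], "Ls": [0.25, 0.75]}
left = apply_subband_gains(x_dmx_l, left_gains)
```

The right downmix would be handled by the same function with its own gain set, so one routine covers both expressions.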
The upmixer 240 can also include DFT-based decorrelators D_L and D_R.
D_L and D_R can operate in a high-complexity mode or in a low-complexity mode, which is the normal mode. D_L and D_R are generated only in the decoder; they operate in the high-complexity mode when high sound quality is to be produced, and in the low-complexity mode when ordinary sound quality is reproduced.
In the high-complexity mode, D_L and D_R apply the matrixing of Mathematical Expression 4 to L(m, k) and R(m, k) to generate the decorrelated signals.
[mathematical expression 4]
Figure GSB00000713279300171
Figure GSB00000713279300172
In the normal mode, D_L and D_R satisfy Mathematical Expression 5 and do not generate the decorrelated signals.
[mathematical expression 5]
Figure GSB00000713279300173
Figure GSB00000713279300174
The upmixer 240 upmixes the values calculated by Mathematical Expressions 2 and 3 using Mathematical Expression 6.
[mathematical expression 6]
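$$\begin{bmatrix} 0.7071 & 0.7071 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \alpha(m) & 1-\alpha(m) & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \alpha(m) & 1-\alpha(m) & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \delta & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \delta \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} CL(m,k) \\ CR(m,k) \\ L(m,k) \\ L_{wet}(m,k) \\ R(m,k) \\ R_{wet}(m,k) \\ Ls(m,k) \\ Rs(m,k) \end{bmatrix} = \begin{bmatrix} C(m,k) \\ L(m,k) \\ R(m,k) \\ Ls(m,k) \\ Rs(m,k) \\ lfe(m,k) \end{bmatrix}, \quad m < 4$$
$$\begin{bmatrix} 0.7071 & 0.7071 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1-\alpha(m) & \alpha(m) & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1-\alpha(m) & \alpha(m) & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \delta & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \delta \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} CL(m,k) \\ CR(m,k) \\ L(m,k) \\ L_{wet}(m,k) \\ R(m,k) \\ R_{wet}(m,k) \\ Ls(m,k) \\ Rs(m,k) \end{bmatrix} = \begin{bmatrix} C(m,k) \\ L(m,k) \\ R(m,k) \\ Ls(m,k) \\ Rs(m,k) \\ lfe(m,k) \end{bmatrix}, \quad m \geq 4$$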
Here, α(m) can be a factor indicating the relation between the L and R signals in each frequency band. δ is a fixed coefficient; it can be the inverse of the mixing coefficient applied to the surround signals when the encoder performs downmixing.
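The mixing step of Mathematical Expression 6 can be written out for a single bin as a rough sketch. It assumes that δ scales the two surround inputs Ls and Rs, that the LFE output is L + R for sub-bands m < 4 and zero otherwise, and that the dry and wet weights swap at m = 4; the function name `mix_bin` and the test values are hypothetical.

```python
import math

CENTER_GAIN = 1 / math.sqrt(2)  # approximately 0.7071


def mix_bin(m, alpha, delta, cl, cr, l, l_wet, r, r_wet, ls, rs):
    """Combine the eight intermediate signals of one bin into 5.1 outputs."""
    if m < 4:
        # Low sub-bands: dry signal weighted by alpha, LFE fed by L + R.
        w_dry, w_wet = alpha, 1 - alpha
        lfe = l + r
    else:
        # Higher sub-bands: weights swapped, no LFE contribution.
        w_dry, w_wet = 1 - alpha, alpha
        lfe = 0
    return {
        "C": CENTER_GAIN * (cl + cr),
        "L": w_dry * l + w_wet * l_wet,
        "R": w_dry * r + w_wet * r_wet,
        "Ls": delta * ls,  # assumption: delta applies to the surround inputs
        "Rs": delta * rs,
        "lfe": lfe,
    }


low = mix_bin(0, 0.25, 2.0, 1, 1, 4, 8, 2, 6, 3, 5)
high = mix_bin(4, 0.25, 2.0, 1, 1, 4, 8, 2, 6, 3, 5)
```

Writing the product row by row avoids building the mostly zero 6×8 matrix for every bin.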
The value of α(m) used in calculating Mathematical Expressions 4 and 5 is obtained using Mathematical Expression 7.
[mathematical expression 7]
$$\alpha(m) = \frac{X_{dmxL}(m,k)\,X_{dmxR}^{*}(m,k)}{X_{dmxL}(m,k)\,X_{dmxL}^{*}(m,k)\;X_{dmxR}(m,k)\,X_{dmxR}^{*}(m,k)} \times \gamma$$
The coefficient γ is a weighting value used to adjust the degree of mixing of the decorrelated signal. Accordingly, α(m) can be defined in the range 0 ≤ α(m) ≤ γ, where 0 ≤ γ ≤ 1.
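A coefficient bounded in this way can be sketched as a γ-scaled normalized cross-correlation. The sketch below assumes, and the text does not state this, that the products in Mathematical Expression 7 are summed over the bins k of sub-band m, that the denominator is the square root of the product of the two energies, and that the magnitude of the cross term is taken. Under those assumptions the result always lies between 0 and γ. The function name is hypothetical.

```python
import math


def alpha_for_subband(x_l, x_r, gamma):
    """Return a gamma-scaled normalized cross-correlation of two bin lists."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    # Cross term between left and right downmix bins (assumed summed over k).
    cross = sum(a * b.conjugate() for a, b in zip(x_l, x_r))
    energy_l = sum((a * a.conjugate()).real for a in x_l)
    energy_r = sum((b * b.conjugate()).real for b in x_r)
    denom = math.sqrt(energy_l * energy_r)
    if denom == 0.0:
        return 0.0  # silent sub-band: no measurable relation
    # Clamp to 1 to absorb floating-point rounding before scaling by gamma.
    return min(abs(cross) / denom, 1.0) * gamma


same = alpha_for_subband([1 + 1j, 2 - 1j], [1 + 1j, 2 - 1j], 0.8)
orthogonal = alpha_for_subband([1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j], 0.8)
```

Identical left and right bins give the upper bound γ, and orthogonal bins give 0.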
The decorrelated signals L_wet(m, k) and R_wet(m, k) can be generated by the decorrelation technique performed by the decorrelator.
Fig. 6 illustrates the decorrelator according to an exemplary embodiment of the present invention.
The decorrelator 600 according to the present invention is the element, included in the upmixer 240, that forms the decorrelated signal. As shown in Fig. 6, it can include a complex transformer 610, an amplitude information extractor 620, a phase information extractor 630, a random sequence memory 640, a windowing unit 650, a phase modifier 660, a synthesizer 670, and an inverse complex transformer 680.
The complex transformer 610 can perform a complex transform on the downmix signal.
The amplitude information extractor 620 and the phase information extractor 630 extract amplitude information and phase information, respectively, from the downmix signal transformed by the complex transformer 610, thereby separating the downmix signal.
Based on the envelope of the amplitude information extracted by the amplitude information extractor 620, the windowing unit 650 models a spectral window for modifying the phase information, and windows the random sequence stored in the random sequence memory 640 with that spectral window.
Here, the number of random sequences stored in the random sequence memory 640 is determined by the number of downmix signals. That is, different random sequences are used to generate L_wet(m, k) and R_wet(m, k), and the correlation between the two random sequences used is close to 0.
The phase modifier 660 can apply a weighting value to the phase information extracted by the phase information extractor 630, using the random sequence windowed by the windowing unit 650.
The synthesizer 670 can combine the amplitude information extracted by the amplitude information extractor 620 with the phase information weighted by the phase modifier 660.
The inverse complex transformer 680 performs an inverse complex transform on the information combined in the synthesizer 670 to compute the decorrelated signal.
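The chain of elements 610 to 680 can be sketched in a few lines. Several details here are assumptions rather than the described method: a plain DFT stands in for the complex transform, the spectral window is simply the magnitude spectrum normalized to its peak, and the "weighting" of the phase is modeled as an additive offset of up to `max_shift` radians. All function and parameter names are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Naive forward DFT (stand-in for the complex transformer 610)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT (stand-in for the inverse complex transformer 680)."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def decorrelate(frame, random_seq, max_shift=math.pi):
    if len(random_seq) != len(frame):
        raise ValueError("random sequence must match the frame length")
    spectrum = dft(frame)
    amplitude = [abs(c) for c in spectrum]           # extractor 620
    phase = [cmath.phase(c) for c in spectrum]       # extractor 630
    peak = max(amplitude) or 1.0
    window = [a / peak for a in amplitude]           # assumed spectral window
    windowed = [w * r for w, r in zip(window, random_seq)]             # unit 650
    new_phase = [p + max_shift * v for p, v in zip(phase, windowed)]   # modifier 660
    combined = [cmath.rect(a, p) for a, p in zip(amplitude, new_phase)]  # synthesizer 670
    # The result is complex in general because the phase offsets are not
    # conjugate-symmetric; a real decoder would have to account for that.
    return idft(combined)


frame = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(16)]
rng = random.Random(0)
seq = [rng.uniform(-1.0, 1.0) for _ in range(16)]  # stand-in for memory 640
wet = decorrelate(frame, seq)
```

Because only the phase is changed, the magnitude spectrum of `wet` matches that of `frame`, which is the property a decorrelator of this kind relies on.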
Fig. 7 illustrates a method of decoding multichannel audio based on sound source location cues according to an exemplary embodiment of the present invention.
In step S710, the demultiplexer 210 receives the signal transmitted by the multiplexer 150 and parses the received signal into a stereo audio bitstream and an additional-information bitstream.
In step S720, the audio decoder 220 can restore the downmix signal based on the audio bitstream parsed in step S710.
In step S730, the Huffman decoder 410 Huffman-decodes the additional-information bitstream parsed in step S710 using a Huffman codebook to generate differential indices.
In step S740, the inverse quantizer 420 inverse-quantizes the differential indices generated in step S730 using an inverse quantization table to restore the additional information. Specifically, the inverse quantizer 420 can perform inverse quantization by mapping the VSLA information of each frame to the quantization table corresponding to each VSLA.
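Steps S730 and S740 can be illustrated with a toy example. The codebook and the quantization table below are invented for the illustration, and the sketch assumes that each differential index is the difference from the previous absolute index, starting from 0; the actual tables and differencing scheme are not given in the text.

```python
def huffman_decode(bits, codebook):
    """Decode a string of '0'/'1' characters with a prefix-free codebook."""
    symbols, code = [], ""
    for bit in bits:
        code += bit
        if code in codebook:
            symbols.append(codebook[code])
            code = ""
    if code:
        raise ValueError("bitstream ended inside a codeword")
    return symbols


def dequantize(diff_indices, table):
    """Accumulate differential indices and look each one up in the table."""
    values, index = [], 0
    for d in diff_indices:
        index += d
        if not 0 <= index < len(table):
            raise ValueError("index outside the quantization table")
        values.append(table[index])
    return values


# Hypothetical prefix code over differential indices.
CODEBOOK = {"0": 0, "10": 1, "110": -1, "111": 2}
# Hypothetical quantization table of angles in degrees.
TABLE = [0.0, 15.0, 30.0, 45.0, 60.0, 75.0, 90.0]

diffs = huffman_decode("111100110", CODEBOOK)
angles = dequantize(diffs, TABLE)
```

Here the nine bits decode to the differences 2, 1, 0, -1, which accumulate to the table indices 2, 3, 3, 2.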
In step S750, the upmixer 240 estimates the multichannel signal using the downmix signal restored in step S720 and the additional information restored in step S740, and generates the upmixed signal by upmixing the downmix signal based on the multichannel signal.
In step S760, the synthesis filter bank windowing unit 250 applies the synthesis filter bank to the upmixed signal generated in step S750 to extract a time-domain signal, and windows the upmixed signal to extract the output signal.
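The windowing part of step S760 can be illustrated with a small overlap-add sketch. It assumes, and the text does not specify this, a sine window with 50% overlap whose squared values sum to one across overlapping frames, and it leaves out the filter bank transform itself so that only the windowing is visible. The names and frame sizes are hypothetical.

```python
import math


def sine_window(n):
    """Sine window; w[i]**2 + w[i + n // 2]**2 equals 1 for even n."""
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]


def overlap_add(frames, hop):
    """Sum equally sized frames placed hop samples apart."""
    n = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + n)
    for i, frame in enumerate(frames):
        for j, v in enumerate(frame):
            out[i * hop + j] += v
    return out


N, HOP = 8, 4
w = sine_window(N)
signal = [float(t) for t in range(20)]
frames = []
for start in range(0, len(signal) - N + 1, HOP):
    seg = signal[start:start + N]
    analysed = [a * b for a, b in zip(seg, w)]            # analysis window
    frames.append([a * b for a, b in zip(analysed, w)])   # matching synthesis window
rebuilt = overlap_add(frames, HOP)
```

Samples covered by two frames are recovered exactly, which is why the synthesis window has to match the analysis window used by the core codec.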
As described above, the apparatus and method for decoding multichannel audio based on sound source location cues according to exemplary embodiments of the present invention receive and compress a multichannel audio signal, and compress and transmit the stereo signal via a core stereo codec, so that multichannel audio can be transmitted while backward compatibility with existing stereo audio coding is provided.
Although specific exemplary embodiments of the present invention have been described for illustrative purposes, those skilled in the art can make various modifications, additions, and substitutions without departing from the spirit and scope of the present invention as defined by the claims. The scope of the present invention should therefore be defined by the appended claims and their equivalents.

Claims (10)

1. An apparatus for decoding multichannel audio based on sound source location cues, comprising:
a demultiplexer that receives a signal and parses the received signal into an audio bitstream and an additional-information bitstream;
an audio decoder that restores a downmix signal based on the audio bitstream and sends the restored downmix signal to an upmixer, wherein an analysis filter bank is not applied to the restored downmix signal;
the upmixer, which separates amplitude information and phase information from the restored downmix signal, windows a stored random sequence based on the amplitude information, applies a weighting value to the phase information, estimates a multichannel signal based on the amplitude information and the weighted phase information and based on the restored additional information, and upmixes the downmix signal based on the multichannel signal to generate an upmixed signal, wherein the additional-information bitstream is Huffman-decoded to generate differential indices, and the differential indices are inverse-quantized with a quantization table to restore the additional information; and
a synthesis filter bank windowing unit that extracts a time-domain signal from the upmixed signal using a synthesis filter bank, and windows the upmixed signal to extract an output signal.
2. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the audio decoder has a core stereo codec that uses a time-domain aliasing cancellation (TDAC) filter bank.
3. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the synthesis filter bank windowing unit windows the upmixed signal so as to map to the analysis windowing of the core stereo audio.
4. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the upmixer performs a complex transform on the downmix signal and separates the amplitude information and the phase information from the complex-transformed downmix signal.
5. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the upmixer models, based on the envelope of the amplitude information, a spectral window for modifying the phase information, windows the stored random sequence using the spectral window, and applies a weighting value to the phase information using the windowed random sequence.
6. The apparatus for decoding multichannel audio based on sound source location cues of claim 1, wherein the upmixer combines the amplitude information with the weighted phase information, and performs an inverse complex transform on the combined amplitude information and weighted phase information to estimate the multichannel signal.
7. A method of decoding multichannel audio based on sound source location cues, comprising:
parsing a received signal into an audio bitstream and an additional-information bitstream;
restoring a downmix signal based on the audio bitstream, wherein an analysis filter bank is not applied to the downmix signal;
Huffman-decoding the additional-information bitstream to generate differential indices;
inverse-quantizing the differential indices with a quantization table to restore additional information;
separating amplitude information and phase information from the restored downmix signal, windowing a stored random sequence based on the amplitude information, applying a weighting value to the phase information, and estimating a multichannel signal based on the amplitude information and the weighted phase information and based on the restored additional information;
upmixing the downmix signal based on the multichannel signal to generate an upmixed signal;
extracting a time-domain signal from the upmixed signal using a synthesis filter bank; and
windowing the upmixed signal to extract an output signal.
8. The method of decoding multichannel audio based on sound source location cues of claim 7, wherein the separating comprises: performing a complex transform on the downmix signal; and separating the amplitude information and the phase information from the complex-transformed downmix signal.
9. The method of decoding multichannel audio based on sound source location cues of claim 7, wherein the applying of the weighting value comprises: modeling, based on the envelope of the amplitude information, a spectral window for modifying the phase information; windowing the stored random sequence using the spectral window; and applying the weighting value to the phase information using the windowed random sequence.
10. The method of decoding multichannel audio based on sound source location cues of claim 7, wherein the estimating of the multichannel signal comprises: combining the amplitude information with the weighted phase information; and performing an inverse complex transform on the combined amplitude information and weighted phase information to estimate the multichannel signal.
CN2009102238140A 2008-12-03 2009-11-23 Decoder and decoding method for multichannel audio coder using sound source location cue Expired - Fee Related CN101754086B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR20080121709 2008-12-03
KR121709/08 2008-12-03
KR64918/09 2009-07-16
KR1020090064918A KR101176703B1 (en) 2008-12-03 2009-07-16 Decoder and decoding method for multichannel audio coder using sound source location cue

Publications (2)

Publication Number Publication Date
CN101754086A CN101754086A (en) 2010-06-23
CN101754086B true CN101754086B (en) 2012-11-14

Family

ID=42363561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009102238140A Expired - Fee Related CN101754086B (en) 2008-12-03 2009-11-23 Decoder and decoding method for multichannel audio coder using sound source location cue

Country Status (2)

Country Link
KR (1) KR101176703B1 (en)
CN (1) CN101754086B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012045203A1 (en) * 2010-10-05 2012-04-12 Huawei Technologies Co., Ltd. Method and apparatus for encoding/decoding multichannel audio signal
KR20240028560A (en) * 2016-01-27 2024-03-05 돌비 레버러토리즈 라이쎈싱 코오포레이션 Acoustic environment simulation
KR20220005379A (en) * 2020-07-06 2022-01-13 한국전자통신연구원 Apparatus and method for encoding/decoding audio that is robust against coding distortion in transition section

Also Published As

Publication number Publication date
KR20100063639A (en) 2010-06-11
CN101754086A (en) 2010-06-23
KR101176703B1 (en) 2012-08-23

Similar Documents

Publication Publication Date Title
CN102892070B (en) Enhancing coding and the Parametric Representation of object coding is mixed under multichannel
CN101930740B (en) Multichannel audio signal decoding using de-correlated signals
CN101821799B (en) Audio coding using upmix
Herre et al. The reference model architecture for MPEG spatial audio coding
EP1749296B1 (en) Multichannel audio extension
KR101823278B1 (en) Audio encoder, audio decoder, methods and computer program using jointly encoded residual signals
CN101202043B (en) Method and system for encoding and decoding audio signal
CN100533990C (en) Coder for compressing coding of multiple audio track digital audio signal
CN101887726A (en) The method of stereo coding and decoding and equipment thereof
WO2009059631A1 (en) Audio coding apparatus and method thereof
EP2095364A1 (en) Method for encoding and decoding object-based audio signal and apparatus thereof
CN105103225A (en) Stereo audio encoder and decoder
CN1875402A (en) Audio signal encoding or decoding
CN105378832A (en) Audio object separation from mixture signal using object-specific time/frequency resolutions
US20110029113A1 (en) Combination device, telecommunication system, and combining method
CN102592598A (en) Apparatus and method for restoring multi-channel audio signal
CN103403799A (en) Apparatus and method for processing an audio signal and for providing a higher temporal granularity for a combined unified speech and audio codec (USAC)
CN104756186A (en) Decoder and method for multi-instance spatial-audio-object-coding employing a parametric concept for multichannel downmix/upmix cases
CN101754086B (en) Decoder and decoding method for multichannel audio coder using sound source location cue
CN102792369A (en) Audio-processing device, audio-processing method and program
CN101031961B (en) Processing of encoded signals
CN113314132B (en) Audio object coding method, decoding method and device in interactive audio system
JP2006003580A (en) Device and method for coding audio signal
CN103165135B (en) Digital audio coarse layering coding method and digital audio coarse layering coding device
CN1783726B (en) Decoder for decoding and reestablishing multi-channel audio signal from audio data code stream

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20100623

Assignee: Neo Lab Convergence Inc.

Assignor: Korea Electronic Communication Institute

Contract record no.: 2016990000259

Denomination of invention: Decoder and decoding method for multichannel audio coder using sound source location cue

Granted publication date: 20121114

License type: Exclusive License

Record date: 20160630

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121114

Termination date: 20191123