CN101479787A

CN101479787A - Method for encoding and decoding object-based audio signal and apparatus thereof

Info

Publication number: CN101479787A
Application number: CNA2007800242526A
Authority: CN
Inventors: 尹圣龙; 房熙锡; 李显国; 金东秀; 林宰显
Original assignee: LG Electronics Inc
Current assignee: LG Electronics Inc
Priority date: 2006-09-29
Filing date: 2007-10-01
Publication date: 2009-07-08
Anticipated expiration: 2027-10-01
Also published as: CN101479787B; CN101484935A; CN101479786A; CN101479785A; CN101484935B; CN101479785B; CN101479786B

Abstract

Provided are an audio encoding method and apparatus and an audio decoding method and apparatus in which audio signals can be encoded or decoded so that the sound image of each object audio signal can be localized at any desired position. The audio decoding method comprises: extracting a downmix signal and object-based side information from an audio signal; generating channel-based side information based on the object-based side information and control information for rendering the downmix signal; processing the downmix signal using a decorrelated channel signal; and generating a multi-channel audio signal using the processed downmix signal and the channel-based side information.

Description

Method and apparatus for encoding and decoding object-based audio signals

Technical field

The present invention relates to an audio encoding method and apparatus and an audio decoding method and apparatus in which the sound image of each object audio signal can be localized at any desired position.

Background art

In general, in multi-channel audio encoding and decoding techniques, the channel signals of a multi-channel signal are downmixed into fewer channel signals, side information regarding the original channel signals is transmitted, and a multi-channel signal having as many channels as the original multi-channel signal is restored.

Object-based audio encoding and decoding techniques are basically similar to multi-channel audio encoding and decoding techniques in that they downmix several sound sources into fewer sound source signals and transmit side information regarding the original sound sources. However, in object-based audio encoding and decoding techniques, object signals, which are the basic elements of a channel signal (e.g., the sound of a musical instrument or a human voice), are treated in the same way as channel signals are treated in multi-channel audio encoding and decoding techniques, and can therefore be encoded and decoded.

In other words, in object-based audio encoding and decoding techniques, each object signal is regarded as an entity to be encoded and decoded. In this regard, object-based techniques differ from multi-channel techniques, in which a multi-channel audio signal is encoded and decoded simply on the basis of inter-channel information, regardless of the number of elements in the channel signals being coded.

Summary of the invention

Technical problem

The present invention provides an audio encoding method and apparatus and an audio decoding method and apparatus in which audio signals can be encoded or decoded so that the sound image of each object audio signal can be localized at any desired position.

Technical solution

According to an aspect of the present invention, there is provided an audio decoding method including: extracting a downmix signal and object-based side information from an audio signal; generating channel-based side information based on the object-based side information and control information for rendering the downmix signal; processing the downmix signal using a decorrelated channel signal; and generating a multi-channel audio signal using the processed downmix signal and the channel-based side information.

According to another aspect of the present invention, there is provided an audio decoding apparatus including: a demultiplexer which extracts a downmix signal and object-based side information from an audio signal; a parameter converter which generates channel-based side information based on the object-based side information and control information for rendering the downmix signal; a downmix processor which, if the downmix signal is a stereo downmix signal, modifies the downmix signal using a decorrelated downmix signal; and a multi-channel decoder which generates a multi-channel audio signal using the modified downmix signal obtained by the downmix processor and the channel-based side information.

According to another aspect of the present invention, there is provided an audio decoding method including: extracting a downmix signal and object-based side information from an audio signal; generating channel-based side information and one or more processing parameters based on the object-based side information and control information for rendering the downmix signal; generating a multi-channel audio signal using the downmix signal and the channel-based side information; and modifying the multi-channel audio signal using the processing parameters.

According to another aspect of the present invention, there is provided an audio decoding apparatus including: a demultiplexer which extracts a downmix signal and object-based side information from an audio signal; a parameter converter which generates channel-based side information and one or more processing parameters based on the object-based side information and control information for rendering the downmix signal; a multi-channel decoder which generates a multi-channel audio signal using the downmix signal and the channel-based side information; and a channel processor which modifies the multi-channel audio signal using the processing parameters.

According to another aspect of the present invention, there is provided a computer-readable recording medium having recorded thereon an audio decoding method, the method including: extracting a downmix signal and object-based side information from an audio signal; generating channel-based side information based on the object-based side information and control information for rendering the downmix signal; processing the downmix signal using a decorrelated channel signal; and generating a multi-channel audio signal using the processed downmix signal and the channel-based side information.

According to another aspect of the present invention, there is provided a computer-readable recording medium having recorded thereon an audio decoding method, the method including: extracting a downmix signal and object-based side information from an audio signal; generating channel-based side information and one or more processing parameters based on the object-based side information and control information for rendering the downmix signal; generating a multi-channel audio signal using the downmix signal and the channel-based side information; and modifying the multi-channel audio signal using the processing parameters.

Advantageous effects

An audio encoding method and apparatus and an audio decoding method and apparatus are provided in which audio signals can be encoded or decoded so that the sound image of each object audio signal can be localized at any desired position.

Description of drawings

The present invention will become more fully understood from the following detailed description and the accompanying drawings, which are given by way of illustration only and do not limit the present invention, and wherein:

Fig. 1 is a block diagram of a typical object-based audio encoding/decoding system;

Fig. 2 is a block diagram of an audio decoding apparatus according to a first embodiment of the present invention;

Fig. 3 is a block diagram of an audio decoding apparatus according to a second embodiment of the present invention;

Fig. 4 illustrates graphs for explaining the influence of an amplitude difference and a time difference, which are independent of each other, on the localization of a sound image;

Fig. 5 illustrates graphs of functions regarding the correspondence between the amplitude differences and time differences required to localize a sound image at a predetermined position;

Fig. 6 illustrates the format of control data including harmonic information;

Fig. 7 is a block diagram of an audio decoding apparatus according to a third embodiment of the present invention;

Fig. 8 is a block diagram of an arbitrary downmix gain (ADG) module that can be used in the audio decoding apparatus illustrated in Fig. 7;

Fig. 9 is a block diagram of an audio decoding apparatus according to a fourth embodiment of the present invention;

Fig. 10 is a block diagram of an audio decoding apparatus according to a fifth embodiment of the present invention;

Fig. 11 is a block diagram of an audio decoding apparatus according to a sixth embodiment of the present invention;

Fig. 12 is a block diagram of an audio decoding apparatus according to a seventh embodiment of the present invention;

Fig. 13 is a block diagram of an audio decoding apparatus according to an eighth embodiment of the present invention;

Fig. 14 is a diagram for explaining the application of three-dimensional (3D) information to a frame by the audio decoding apparatus illustrated in Fig. 13;

Fig. 15 is a block diagram of an audio decoding apparatus according to a ninth embodiment of the present invention;

Fig. 16 is a block diagram of an audio decoding apparatus according to a tenth embodiment of the present invention;

Figs. 17 through 19 are diagrams for explaining an audio decoding method according to an embodiment of the present invention; and

Fig. 20 is a block diagram of an audio encoding apparatus according to an embodiment of the present invention.

Best mode for carrying out the invention

The present invention will now be described in detail with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown.

An audio encoding method and apparatus and an audio decoding method and apparatus according to the present invention may be applied to object-based audio processing operations, but the present invention is not restricted to this. In other words, the audio encoding method and apparatus and the audio decoding method and apparatus may also be applied to various signal processing operations other than object-based audio processing operations.

Fig. 1 is a block diagram of a typical object-based audio encoding/decoding system. In general, the audio signals input to an object-based audio encoding apparatus do not correspond to the channels of a multi-channel signal but are independent object signals. In this regard, an object-based audio encoding apparatus differs from a multi-channel audio encoding apparatus, to which the channel signals of a multi-channel signal are input.

For example, channel signals such as the front left and front right channel signals of a 5.1-channel signal may be input to a multi-channel audio encoding apparatus, whereas object audio signals, which are smaller entities than channel signals, such as a human voice or the sound of a musical instrument (e.g., a violin or a piano), may be input to an object-based audio encoding apparatus.

Referring to Fig. 1, the object-based audio encoding/decoding system includes an object-based audio encoding apparatus and an object-based audio decoding apparatus. The object-based audio encoding apparatus includes an object encoder 100, and the object-based audio decoding apparatus includes an object decoder 111 and a renderer 113.

The object encoder 100 receives N object audio signals and generates an object-based downmix signal having one or more channels, together with side information that includes a number of pieces of information extracted from the N object audio signals, such as energy differences, phase differences, and correlation values. The side information and the object-based downmix signal are merged into a single bitstream, which is transmitted to the object-based decoding apparatus.
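As a rough illustration of the encoder step described above, the following sketch downmixes N object signals to a single channel and derives one piece of side information, the energy of each object relative to the strongest object. The function name `encode_objects`, the plain-sum downmix, and the use of full-band rather than per-subband energies are assumptions made for this example; the text does not specify them.

```python
import math


def encode_objects(objects):
    """Downmix equally long object signals to mono and derive level side information.

    objects: list of N lists of float samples (hypothetical input layout).
    Returns (downmix, level_diffs_db), where level_diffs_db[i] is the energy of
    object i relative to the strongest object in dB (-inf for a silent object).
    """
    if not objects:
        raise ValueError("at least one object signal is required")
    length = len(objects[0])
    if any(len(obj) != length for obj in objects):
        raise ValueError("all object signals must have the same length")

    # Assumption: the downmix is a plain sample-wise sum of all objects.
    downmix = [sum(obj[n] for obj in objects) for n in range(length)]

    # Energy of each object over the whole block.
    energies = [sum(s * s for s in obj) for obj in objects]
    reference = max(energies)
    level_diffs_db = [
        10.0 * math.log10(e / reference) if e > 0 and reference > 0 else float("-inf")
        for e in energies
    ]
    return downmix, level_diffs_db
```

A real encoder would also quantize these values and multiplex them with the coded downmix into one bitstream, which this sketch leaves out.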

The side information may include a flag indicating whether channel-based or object-based audio coding is to be performed, so that the choice between the two can be made according to the flag. The side information may also include envelope information, grouping information, silent period information, and delay information regarding the object signals. The side information may further include object level difference information, inter-object cross-correlation information, downmix gain information, downmix channel level difference information, and absolute object energy information.

The object decoder 111 receives the object-based downmix signal and the side information from the object-based audio encoding apparatus, and restores object signals having properties similar to those of the N object audio signals based on the object-based downmix signal and the side information. The object signals generated by the object decoder 111 have not yet been allocated to any position in a multi-channel space. The renderer 113 therefore allocates each of the object signals generated by the object decoder 111 to a predetermined position in the multi-channel space and determines the levels of the object signals, so that each object signal can be reproduced at the position and level determined by the renderer 113. The control information regarding each of the object signals generated by the object decoder 111 may vary over time, and the levels and spatial positions of the object signals may therefore vary according to the control information.

Fig. 2 is a block diagram of an audio decoding apparatus 120 according to a first embodiment of the present invention. Referring to Fig. 2, the audio decoding apparatus 120 includes an object decoder 121, a renderer 123, and a parameter converter 125. The audio decoding apparatus 120 also includes a demultiplexer (not shown) that extracts a downmix signal and side information from an input bitstream; this applies to all the audio decoding apparatuses according to the other embodiments of the present invention.

The object decoder 121 generates a number of object signals based on the downmix signal and modified side information provided by the parameter converter 125. The renderer 123 allocates each of the object signals generated by the object decoder 121 to a predetermined position in a multi-channel space and determines the levels of the object signals according to control information. The parameter converter 125 generates the modified side information by combining the side information and the control information, and then transmits the modified side information to the object decoder 121.

The object decoder 121 may perform adaptive decoding by analyzing the control information in the modified side information.

For example, if the control information indicates that a first object signal and a second object signal are allocated to the same position in a multi-channel space and have the same level, a typical audio decoding apparatus would decode the first and second object signals separately and then arrange them in the multi-channel space through a mixing/rendering operation.

In contrast, the object decoder 121 of the audio decoding apparatus 120 learns from the control information in the modified side information that the first and second object signals are allocated to the same position in the multi-channel space and have the same level, as if they were a single sound source. Accordingly, the object decoder 121 decodes the first and second object signals as a single sound source rather than decoding them separately. This reduces the complexity of decoding. In addition, because fewer sound sources need to be processed, the complexity of mixing/rendering is also reduced.

The audio decoding apparatus 120 is particularly effective when the number of object signals is greater than the number of output channels, because several object signals are then likely to be allocated to the same spatial position.

Alternatively, the audio decoding apparatus 120 may be used when the first and second object signals are allocated to the same position in the multi-channel space but have different levels. In this case, the audio decoding apparatus 120 decodes the first and second object signals as a single signal instead of decoding them separately, and transmits the decoded first and second object signals to the renderer 123. More specifically, the object decoder 121 obtains information regarding the difference between the levels of the first and second object signals from the control information in the modified side information, and decodes the first and second object signals based on the obtained information. As a result, the first and second object signals can be decoded as a single sound source even though they have different levels.
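The grouping idea in the preceding paragraphs can be sketched as follows: objects that the control information places at the same position are collected into one group, and each member keeps its level relative to the loudest member so that a level difference is preserved. The function name `group_by_position` and the tuple layout of the control information are hypothetical; the text does not define a concrete data format.

```python
from collections import defaultdict


def group_by_position(control_info):
    """Group objects that share a position so each group can be decoded as one source.

    control_info: iterable of (object_id, position, level_db) tuples
    (a hypothetical layout chosen for this example).
    Returns {position: [(object_id, relative_level_db), ...]}, where the level is
    expressed relative to the loudest object in the same group.
    """
    groups = defaultdict(list)
    for object_id, position, level_db in control_info:
        groups[position].append((object_id, level_db))

    result = {}
    for position, members in groups.items():
        loudest = max(level for _, level in members)
        result[position] = [(oid, level - loudest) for oid, level in members]
    return result
```

With this layout, a decoder would need one decoding pass per group rather than one per object, which is where the complexity reduction described above would come from.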

As a further alternative, the object decoder 121 may adjust the levels of the object signals it generates according to the control information, and then decode the object signals whose levels have been adjusted. The renderer 123 then does not need to adjust the levels of the decoded object signals provided by the object decoder 121 and only has to arrange them in the multi-channel space. In short, because the object decoder 121 has already adjusted the levels of the object signals according to the control information, the renderer 123 can readily arrange the object signals in the multi-channel space without any additional level adjustment. The complexity of mixing/rendering can therefore be reduced.

According to the embodiment of Fig. 2, the object decoder of the audio decoding apparatus 120 can adaptively perform a decoding operation by analyzing the control information, thereby reducing the complexity of decoding and the complexity of mixing/rendering. The methods described above as being performed by the audio decoding apparatus 120 may also be used in combination.

Fig. 3 is a block diagram of an audio decoding apparatus 130 according to a second embodiment of the present invention. Referring to Fig. 3, the audio decoding apparatus 130 includes an object decoder 131 and a renderer 133. The audio decoding apparatus 130 is characterized in that side information is provided not only to the object decoder 131 but also to the renderer 133.

The audio decoding apparatus 130 can perform decoding efficiently even when there is an object signal corresponding to a silent period. For example, second through fourth object signals may correspond to a period during which musical instruments are played, while a first object signal corresponds to a silent period during which only the accompaniment is played. In this case, information indicating which of the object signals corresponds to a silent period may be included in the side information, and the side information may be provided to the renderer 133 as well as to the object decoder 131.

The object decoder 131 can minimize the complexity of decoding by not decoding an object signal that corresponds to a silent period. The object decoder 131 sets such an object signal to a value of 0 and transmits the level of the object signal to the renderer 133. In general, an object signal having a value of 0 is treated in the same way as an object signal having a nonzero value and is therefore subjected to the mixing/rendering operation.

In contrast, the audio decoding apparatus 130 transmits side information that indicates which of the object signals corresponds to a silent period to the renderer 133, and can thus prevent such an object signal from being subjected to the mixing/rendering operation performed by the renderer 133. The audio decoding apparatus 130 can therefore prevent an unnecessary increase in the complexity of mixing/rendering.
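A minimal sketch of a renderer that uses such side information is shown below. It mixes object signals into output channels with per-channel gains and skips every object that the side information flags as silent. The function name `render`, the gain-matrix layout, and the returned count of mixed objects are assumptions made for illustration.

```python
def render(objects, gains, silent=frozenset()):
    """Mix object signals into output channels, skipping objects flagged as silent.

    objects: list of N sample lists of equal length.
    gains:   gains[i][c] is the linear gain of object i in output channel c.
    silent:  indices of objects that the side information marks as silent.
    Returns (channels, mixed), where mixed is the number of objects actually mixed.
    """
    num_channels = len(gains[0]) if gains else 0
    length = len(objects[0]) if objects else 0
    channels = [[0.0] * length for _ in range(num_channels)]

    mixed = 0
    for i, obj in enumerate(objects):
        if i in silent:
            continue  # no multiply-accumulate work is spent on a silent object
        mixed += 1
        for c in range(num_channels):
            gain = gains[i][c]
            if gain == 0.0:
                continue
            for n in range(length):
                channels[c][n] += gain * obj[n]
    return channels, mixed
```

The output is the same as when the silent object is mixed as an all-zero signal; the only difference is the amount of work done, which is the point made in the text.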

The renderer 133 may use mixing parameter information included in the control information to localize the sound image of each object signal in a stereo scene. The mixing parameter information may include amplitude information only, or both amplitude information and time information. The mixing parameter information affects not only the localization of stereo sound images but also the user's psychoacoustic perception of spatial sound quality.

For example, by comparing two sound images that are generated using a time panning method and an amplitude panning method, respectively, and reproduced at the same position using a 2-channel stereo speaker, it can be recognized that the amplitude panning method achieves precise localization of sound images, whereas the time panning method provides natural sounds with a profound sense of space. Thus, if the renderer 133 uses only the amplitude panning method to arrange object signals in a multi-channel space, it can localize each sound image precisely but cannot provide the sense of depth that the time panning method offers. Depending on the type of sound source, users may sometimes prefer precise localization over a sense of depth, or vice versa.

Figs. 4(a) and 4(b) illustrate the influence of an intensity (amplitude) difference and a time difference on the localization of a sound image when a signal is reproduced using a 2-channel stereo speaker. Referring to Figs. 4(a) and 4(b), a sound image can be localized at a predetermined angle according to an amplitude difference and a time difference that are independent of each other. For example, an amplitude difference of about 8 dB, or a time difference of about 0.5 ms, which is equivalent to the amplitude difference of 8 dB, may be used to localize a sound image at an angle of 20°. Therefore, even if only an amplitude difference is provided as mixing parameter information, sounds with various properties can be obtained by converting the amplitude difference into a time difference that is equivalent to it in terms of sound image localization.

Fig. 5 illustrates functions regarding the correspondence between the amplitude differences and time differences required to localize a sound image at angles of 10°, 20°, and 30°. The functions shown in Fig. 5 may be obtained from Figs. 4(a) and 4(b). Referring to Fig. 5, various combinations of amplitude difference and time difference may be provided to localize a sound image at a predetermined position. For example, assume that an amplitude difference of 8 dB is provided as mixing parameter information in order to localize a sound image at an angle of 20°. According to the functions shown in Fig. 5, the sound image can also be localized at an angle of 20° using a combination of an amplitude difference of 3 dB and a time difference of 0.3 ms. In this case, time difference information as well as amplitude difference information is provided as mixing parameter information, which enhances the sense of space.
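The trade between an amplitude difference and a time difference can be sketched with a simple linear model. The constant below is calibrated only on the single equivalence quoted in the text (about 8 dB corresponding to about 0.5 ms); the linearity, the constant `MS_PER_DB`, and the function name `split_amplitude_difference` are assumptions, and the actual curves of Fig. 5 need not be linear.

```python
# Assumption: a linear trade-off derived from the quoted 8 dB <-> 0.5 ms equivalence.
MS_PER_DB = 0.5 / 8.0


def split_amplitude_difference(amplitude_db, amplitude_share):
    """Replace part of an amplitude difference with an equivalent time difference.

    amplitude_db:    amplitude difference given as mixing parameter information.
    amplitude_share: fraction in [0, 1] that is kept as an amplitude difference;
                     0 converts everything to time, 1 keeps everything as amplitude.
    Returns (kept_amplitude_db, time_difference_ms).
    """
    if not 0.0 <= amplitude_share <= 1.0:
        raise ValueError("amplitude_share must lie in [0, 1]")
    kept_db = amplitude_db * amplitude_share
    time_ms = (amplitude_db - kept_db) * MS_PER_DB
    return kept_db, time_ms
```

Under this linear assumption, keeping 3 dB of an 8 dB difference leaves 5 dB to be expressed as roughly 0.31 ms, which is close to, but not the same as, the 0.3 ms read from Fig. 5. A lookup table based on psychoacoustic data would replace the constant in practice.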

Therefore, in order to generate sounds with the properties desired by a user during a mixing/rendering operation, the mixing parameter information may be converted appropriately so that whichever of amplitude panning and time panning suits the user can be performed. That is, if the mixing parameter information includes only amplitude difference information and the user wishes for sounds with a profound sense of space, the amplitude difference information may be converted into equivalent time difference information with reference to psychoacoustic data. Alternatively, if the user wishes for both sounds with a profound sense of space and precise localization of sound images, the amplitude difference information may be converted into a combination of amplitude difference information and time difference information that is equivalent to the original amplitude difference information. Alternatively, if the mixing parameter information includes only time difference information and the user wishes for precise localization of sound images, the time difference information may be converted into equivalent amplitude difference information, or into a combination of amplitude difference information and time difference information that satisfies the user's preference by enhancing both the precision of sound image localization and the sense of space.

As yet another alternative, if the mixing parameter information includes both amplitude difference information and time difference information and the user wishes for precise localization of sound images, the combination may be converted into amplitude difference information that is equivalent to the combination of the original amplitude difference information and time difference information. On the other hand, if the mixing parameter information includes both amplitude difference information and time difference information and the user wishes for an enhanced sense of space, the combination may be converted into time difference information that is equivalent to the combination of the original amplitude difference information and time difference information.

Referring to Fig. 6, the control information may include mixing/rendering information and harmonic information regarding one or more object signals. The harmonic information may include at least one of pitch information, fundamental frequency information, and dominant frequency band information regarding one or more object signals, together with descriptions of the energy and spectrum of each subband of each object signal.

Because the resolution of a renderer that operates in units of subbands is insufficient, the harmonic information may be used to process object signals during a rendering operation.

If the harmonic information includes pitch information regarding one or more object signals, the gain of each object signal can be adjusted by attenuating or strengthening a predetermined frequency region using a comb filter or an inverse comb filter. For example, if one of the object signals is a vocal signal, the object signals can be used for karaoke by attenuating only the vocal signal. Alternatively, if the harmonic information includes dominant frequency band information regarding one or more object signals, the dominant frequency band can be attenuated or strengthened. As a further alternative, if the harmonic information includes spectrum information regarding one or more object signals, the gain of each object signal can be controlled by performing attenuation or strengthening that is not restricted by any subband boundary.
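The pitch-based attenuation mentioned above can be pictured with a feed-forward comb filter. The sketch assumes that the pitch period is known as a whole number of samples and that the filter is applied to a time-domain signal; the function name `comb_notch` and its parameters are hypothetical and are not taken from the text.

```python
def comb_notch(signal, period, strength=1.0):
    """Feed-forward comb filter y[n] = x[n] - strength * x[n - period].

    With strength = 1 the filter has nulls at every multiple of the pitch
    frequency (sample_rate / period), so a component that repeats exactly every
    `period` samples is cancelled once the delay line has filled.
    Samples before the start of the signal are treated as zero.
    """
    if period <= 0:
        raise ValueError("period must be a positive number of samples")
    output = []
    for n, x in enumerate(signal):
        delayed = signal[n - period] if n >= period else 0.0
        output.append(x - strength * delayed)
    return output
```

For a karaoke-style use, `period` would be derived from the pitch information of the vocal object, and a `strength` below 1 would attenuate rather than fully cancel the harmonics.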

Fig. 7 is a block diagram of an audio decoding apparatus 140 according to another embodiment of the present invention. Referring to Fig. 7, the audio decoding apparatus 140 uses a multi-channel decoder 141 in place of an object decoder and a renderer, and decodes a number of object signals after the object signals have been appropriately arranged in a multi-channel space.

More specifically, the audio decoding apparatus 140 includes the multi-channel decoder 141 and a parameter converter 145. The multi-channel decoder 141 generates a multi-channel signal whose object signals have already been arranged in a multi-channel space, based on a downmix signal and spatial parameter information, which is channel-based side information provided by the parameter converter 145. The parameter converter 145 analyzes side information and control information transmitted by an audio encoding apparatus (not shown) and generates the spatial parameter information based on the result of the analysis. More specifically, the parameter converter 145 generates the spatial parameter information by combining the side information with the control information, which includes playback setup information and mixing information. That is, the parameter converter 145 converts the combination of the side information and the control information into spatial data corresponding to a One-To-Two (OTT) box or a Two-To-Three (TTT) box.

The audio decoding apparatus 140 can perform a multi-channel decoding operation in which an object-based decoding operation and a mixing/rendering operation are merged, and can thus skip the decoding of each object signal. The complexity of decoding and/or mixing/rendering can therefore be reduced.

For example, when 10 object signals exist and a multi-channel signal obtained from them is to be reproduced by a 5.1-channel speaker system, a typical object-based audio decoding apparatus generates decoded signals corresponding to each of the 10 object signals based on a downmix signal and side information, and then generates a 5.1-channel signal by appropriately arranging the 10 object signals in a multi-channel space so that they suit the 5.1-channel speaker environment. However, generating 10 object signals in the course of generating a 5.1-channel signal is inefficient, and this problem becomes more serious as the difference between the number of object signals and the number of channels of the multi-channel signal to be generated increases.

In contrast, according to the embodiment of Fig. 7, the audio decoding apparatus 140 generates spatial parameter information suitable for a 5.1-channel signal based on the side information and the control information, and provides the spatial parameter information and the downmix signal to the multi-channel decoder 141. The multi-channel decoder 141 then generates a 5.1-channel signal based on the spatial parameter information and the downmix signal. In other words, when the number of channels to be output is 5.1, the audio decoding apparatus 140 can readily generate a 5.1-channel signal from the downmix signal without generating 10 object signals, and is therefore more efficient than a typical audio decoding apparatus in terms of complexity.
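One way to picture what such a parameter converter computes is a channel level difference for a single two-channel split, derived directly from object energies and rendering gains without decoding any object. The sketch assumes that the objects are mutually uncorrelated, so that channel powers add, and works on one band only; the function name `channel_level_difference` is hypothetical and this is not the conversion defined by the patent or by any codec standard.

```python
import math


def channel_level_difference(object_energies, gains_a, gains_b):
    """Level difference in dB between two output channels, from object parameters.

    object_energies: energy of each object (taken from the side information).
    gains_a/gains_b: linear rendering gain of each object in channels a and b
                     (taken from the control information).
    Assumes mutually uncorrelated objects, so channel power is the sum of
    energy * gain**2 over all objects.
    """
    if not (len(object_energies) == len(gains_a) == len(gains_b)):
        raise ValueError("energies and gain lists must have the same length")
    power_a = sum(e * g * g for e, g in zip(object_energies, gains_a))
    power_b = sum(e * g * g for e, g in zip(object_energies, gains_b))
    if power_a <= 0.0 or power_b <= 0.0:
        raise ValueError("both channels must carry some energy")
    return 10.0 * math.log10(power_a / power_b)
```

The cost of this calculation depends on the number of objects and parameter bands, not on the number of audio samples, which is consistent with the efficiency argument made above.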

When calculating the calculated amount required corresponding to the spatial parameter information of each OTT box and TTT box when carrying out audio mixing/play up operate required calculated amount after each object signal decoding by analyzing the side information that come by the audio coding apparatus transmission and control information, this audio decoding apparatus 140 is more effective.

The audio decoding apparatus 140 can be obtained simply by adding, to a typical multi-channel audio decoding apparatus, a module that generates spatial parameter information by analyzing the side information and the control information, and it therefore remains compatible with typical multi-channel audio decoding apparatuses. Likewise, the audio decoding apparatus 140 can improve sound quality by using the existing tools of a typical multi-channel decoder, such as an envelope shaper, a sub-band temporal processing (STP) tool, and a decorrelator. It follows that all the advantages of a typical multi-channel audio decoding method can readily be applied to an object-based audio decoding method.

The spatial parameter information transmitted to the multi-channel decoder 141 by the parametric converter 145 may be compressed so that it is suitable for transmission. Alternatively, the spatial parameter information may have the same format as the data transmitted by a typical multi-channel encoding apparatus. That is, the spatial parameter information may first undergo Huffman decoding or pilot decoding and then be transmitted to each module as uncompressed spatial cue data. The former is suitable for transmitting the spatial parameter information to a multi-channel audio decoding apparatus at a remote location. The latter is convenient because the multi-channel audio decoding apparatus does not need to convert compressed spatial cue data into uncompressed spatial cue data, which is easier to use in a decoding operation.

Configuring the spatial parameter information by analyzing the side information and the control information may introduce a delay between the downmix signal and the spatial parameter information. To compensate, an additional buffer may be provided for either the downmix signal or the spatial parameter information so that the two can be synchronized with each other. This approach is inconvenient, however, because it requires the additional buffer. Alternatively, the side information may be transmitted ahead of the downmix signal to allow for the delay that may occur between the downmix signal and the spatial parameter information. In that case, the spatial parameter information obtained by combining the side information and the control information needs no further adjustment and can be used directly.
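The buffering approach can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the spatial parameters for a frame become available a fixed number of frames after the corresponding downmix frame; the function name `align_streams` and the frame-based delay model are hypothetical and are not taken from the specification.

```python
from collections import deque


def align_streams(downmix_frames, param_frames, param_delay):
    """Pair each downmix frame with the spatial parameters that describe it.

    Assumption: the parameters for frame i only become available while
    downmix frame i + param_delay is arriving, so downmix frames must wait
    in a buffer until their parameters are ready.
    """
    buffer = deque()  # downmix frames waiting for their parameters
    paired = []
    for i, frame in enumerate(downmix_frames):
        buffer.append(frame)
        if i >= param_delay:
            # Parameters for frame (i - param_delay) are now available.
            paired.append((buffer.popleft(), param_frames[i - param_delay]))
    return paired


pairs = align_streams(["d0", "d1", "d2", "d3"], ["p0", "p1", "p2", "p3"], 2)
print(pairs)
```

The buffer never holds more than `param_delay + 1` frames, which is the extra memory cost that transmitting the side information ahead of the downmix signal avoids.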

If the object signals of a downmix signal have different levels, an arbitrary downmix gain (ADG) module, which can compensate the downmix signal directly, may determine the relative levels of the object signals, and each object signal may be assigned to a predetermined position in the multi-channel space using spatial cue data such as channel level difference (CLD) information, inter-channel correlation (ICC) information, and channel prediction coefficient (CPC) information.

For instance, if the control information indicates that a predetermined object signal is to be assigned to a predetermined position in the multi-channel space with a higher level than the other object signals, a typical multi-channel decoder can calculate the differences between the channel energies of the downmix signal and divide the downmix signal among a number of output channels according to the result. However, a typical multi-channel decoder cannot increase or reduce the volume of a particular sound in the downmix signal. In other words, it simply distributes the downmix signal among the output channels without raising or lowering the volume of any sound in it.

It is relatively simple to assign each of the object signals of a downmix signal generated by an object encoder to a predetermined position in the multi-channel space according to the control information. However, increasing or reducing the amplitude of a predetermined object signal requires a special technique. In other words, when the downmix signal generated by an object encoder is used as is, it is difficult to reduce the amplitude of an individual object signal in the downmix signal.

Therefore, according to one embodiment of the invention, the ADG module 147 shown in Figure 8 may be used to change the relative amplitudes of the object signals according to the control information. In particular, the ADG module 147 can increase or reduce the amplitude of any one of the object signals of the downmix signal transmitted by the object encoder. The downmix signal compensated by the ADG module 147 can then undergo multi-channel decoding.

If the ADG module 147 adjusts the relative amplitudes of the object signals of the downmix signal appropriately, object decoding can be performed with a typical multi-channel decoder. Whether the downmix signal generated by the object encoder is a mono signal, a stereo signal, or a multi-channel signal with three or more channels, it can be processed by the ADG module 147. If the downmix signal has two or more channels and the predetermined object signal to be adjusted by the ADG module 147 is present in only one of those channels, the ADG module 147 may be applied only to the channel that contains the predetermined object signal rather than to all channels of the downmix signal. A downmix signal processed by the ADG module 147 in this way can readily be handled by a typical multi-channel decoder without modifying the structure of the decoder.
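The per-channel application of downmix gains described above can be sketched as follows. This is only an illustrative sketch: it assumes the downmix is available as per-band magnitudes for each channel and that the gains are expressed in decibels per band. The function name `apply_adg` and the data layout are hypothetical, and the way the gains are derived from the control information is not shown.

```python
def apply_adg(downmix, gains_db, channels=None):
    """Scale downmix channels by per-band gains given in dB.

    downmix:  dict mapping channel name -> list of per-band magnitudes
    gains_db: list of gains in dB, one per band (assumed format)
    channels: names of the channels to modify; None means all channels
    """
    targets = list(downmix) if channels is None else channels
    # Copy every channel so untouched channels pass through unchanged.
    out = {name: list(bands) for name, bands in downmix.items()}
    for name in targets:
        out[name] = [m * 10 ** (g / 20.0)
                     for m, g in zip(downmix[name], gains_db)]
    return out


stereo = {"L": [1.0, 2.0], "R": [1.0, 2.0]}
# Boost band 0 by 20 dB in the left channel only, where the object lives.
print(apply_adg(stereo, [20.0, 0.0], channels=["L"]))
```

Because the output has the same layout as the input, the compensated downmix can be passed to an unmodified multi-channel decoder, which is the point made in the paragraph above.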

Even when the final output signal is not a multi-channel signal for reproduction over multi-channel loudspeakers but a binaural signal, the ADG module 147 can be used to adjust the relative amplitudes of the object signals in the final output signal.

As an alternative to using the ADG module 147, gain information specifying the gain value to be applied to each object signal may be included in the control information during generation of the object signals. This may require modifying the structure of a typical multi-channel decoder. Even so, the method is convenient in terms of decoding complexity, because a gain value is simply applied to each object signal during the decoding operation and there is no need to calculate the ADG and compensate each object signal.

Fig. 9 is a block diagram of an audio decoding apparatus 150 according to a fourth embodiment of the invention. Referring to Fig. 9, the audio decoding apparatus 150 is characterized in that it generates a binaural signal.

Specifically, the audio decoding apparatus 150 comprises a multi-channel binaural decoder 151, a first parametric converter 157, and a second parametric converter 159.

The second parametric converter 159 analyzes the side information and the control information provided by the audio encoding apparatus, and configures spatial parameter information according to the analysis result. The first parametric converter 157 configures binaural parameter information that can be used by the multi-channel binaural decoder 151 by adding three-dimensional (3D) information, for example head-related transfer function (HRTF) parameters, to the spatial parameter information. The multi-channel binaural decoder 151 generates a virtual 3D signal by applying the virtual 3D parameter information to the downmix signal.

The first parametric converter 157 and the second parametric converter 159 may be replaced by a single module, namely a parameter conversion module 155, which receives the side information, the control information, and the HRTF parameters and configures the binaural parameter information from them.

Conventionally, to generate a binaural signal for headphone reproduction of a downmix signal containing 10 object signals, an object decoder must first generate 10 decoded signals corresponding to the 10 object signals from the downmix signal and the side information. A renderer then assigns each of the 10 object signals to a predetermined position in the multi-channel space, with reference to the control information, to suit a 5-channel loudspeaker environment, and generates a 5-channel signal that can be reproduced over 5-channel loudspeakers. Finally, the renderer applies HRTF parameters to the 5-channel signal to generate a 2-channel signal. In short, this conventional audio decoding method reproduces 10 object signals, converts them into a 5-channel signal, and then generates a 2-channel signal from the 5-channel signal, which is inefficient.

On the other hand, the audio decoding apparatus 150 can readily generate, from the object audio signals, a binaural signal that can be reproduced over headphones. In addition, the audio decoding apparatus 150 configures spatial parameter information by analyzing the side information and the control information, and can therefore generate the binaural signal using a typical multi-channel binaural decoder. Moreover, the audio decoding apparatus 150 can still use a typical multi-channel binaural decoder even when it is equipped with an integrated parametric converter that receives the side information, the control information, and the HRTF parameters and configures the binaural parameter information from them.

Figure 10 is a block diagram of an audio decoding apparatus 160 according to a fifth embodiment of the invention. Referring to Figure 10, the audio decoding apparatus 160 comprises a downmix processor 161, a multi-channel decoder 163, and a parametric converter 165. The downmix processor 161 and the parametric converter 165 may be replaced by a single module 167.

The parametric converter 165 generates spatial parameter information that can be used by the multi-channel decoder 163 and parameter information that can be used by the downmix processor 161. The downmix processor 161 performs a preprocessing operation on the downmix signal and transmits the resulting downmix signal to the multi-channel decoder 163. The multi-channel decoder 163 performs a decoding operation on the downmix signal transmitted by the downmix processor 161 and outputs a stereo signal, a binaural stereo signal, or a multi-channel signal. Examples of the preprocessing operation performed by the downmix processor 161 include modifying or converting the downmix signal by filtering in the time domain or the frequency domain.

If the downmix signal input to the audio decoding apparatus 160 is a stereo signal, it may need to undergo the downmix preprocessing performed by the downmix processor 161 before being input to the multi-channel decoder 163, because the multi-channel decoder 163 cannot map a component of the downmix signal that belongs to the left channel, which is one of the multiple channels, to the right channel, which is another of the multiple channels. Therefore, in order to shift an object signal classified into the left channel toward the right channel, the downmix signal input to the audio decoding apparatus 160 may be preprocessed by the downmix processor 161, and the preprocessed downmix signal may then be input to the multi-channel decoder 163.

The preprocessing of a stereo downmix signal may be performed according to preprocessing information obtained from the side information and from the control information.

Figure 11 is a block diagram of an audio decoding apparatus 170 according to a sixth embodiment of the invention. Referring to Figure 11, the audio decoding apparatus 170 comprises a multi-channel decoder 171, a channel processor 173, and a parametric converter 175.

The parametric converter 175 generates spatial parameter information that can be used by the multi-channel decoder 171 and parameter information that can be used by the channel processor 173. The channel processor 173 post-processes the signal output by the multi-channel decoder 171. Examples of the signal output by the multi-channel decoder 171 include a stereo signal, a binaural stereo signal, and a multi-channel signal.

Examples of the post-processing operation performed by the channel processor 173 include modifying or converting each channel or all channels of the output signal. For instance, if the side information contains fundamental frequency information about a predetermined object signal, the channel processor 173 can remove harmonic components from the predetermined object signal with reference to the fundamental frequency information. A multi-channel audio decoding method alone may not be efficient enough for a karaoke system. However, if fundamental frequency information about a voice object is included in the side information and the harmonic components of the voice object signal are removed during post-processing, a high-performance karaoke system can be realized using the embodiment of Figure 11. The embodiment of Figure 11 can also be applied to object signals other than voice object signals. For instance, it can be used to remove the sound of a predetermined musical instrument. Likewise, it can be used to amplify predetermined harmonic components using the fundamental frequency information about an object signal.
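A simple way to picture harmonic removal guided by fundamental frequency information is to zero the spectral bins nearest each multiple of the fundamental. The sketch below assumes a magnitude spectrum with uniformly spaced bins. The function name `remove_harmonics`, the bin layout, and the hard zeroing are illustrative assumptions rather than the method of the specification, and a practical system would use a smoother notch.

```python
def remove_harmonics(spectrum, bin_hz, f0_hz, width_bins=1):
    """Zero the bins around every harmonic of f0_hz.

    spectrum:   list of per-bin magnitudes, bin k centred at k * bin_hz
    bin_hz:     spacing between bins in Hz
    f0_hz:      fundamental frequency taken from the side information
    width_bins: number of bins zeroed on each side of a harmonic
    """
    if f0_hz <= 0 or bin_hz <= 0:
        raise ValueError("f0_hz and bin_hz must be positive")
    out = list(spectrum)
    top_hz = len(spectrum) * bin_hz  # upper edge of the represented range
    k = 1
    while k * f0_hz < top_hz:
        center = round(k * f0_hz / bin_hz)
        for b in range(center - width_bins, center + width_bins + 1):
            if 0 <= b < len(out):
                out[b] = 0.0
        k += 1
    return out


flat = [1.0] * 16
# 50 Hz bins and a 200 Hz fundamental: harmonics fall on bins 4, 8 and 12.
print(remove_harmonics(flat, bin_hz=50.0, f0_hz=200.0, width_bins=0))
```

Replacing the assignment `out[b] = 0.0` with a multiplication by a gain greater than one would give the harmonic amplification case mentioned at the end of the paragraph above.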

The channel processor 173 may perform additional effect processing on the downmix signal. Alternatively, the channel processor 173 may add the signal obtained by the additional effect processing to the signal output by the multi-channel decoder 171. The channel processor 173 may change the spectrum of an object or modify the downmix signal whenever necessary. If it is not appropriate to perform an effect processing operation such as reverberation directly on the downmix signal and to transmit the resulting signal to the multi-channel decoder 171, the channel processor 173 may instead add the signal obtained by the effect processing operation to the output of the multi-channel decoder 171, rather than performing the effect processing on the downmix signal.

The audio decoding apparatus 170 may be designed to comprise not only the channel processor 173 but also a downmix processor. In that case, the downmix processor is placed before the multi-channel decoder 171 and the channel processor 173 is placed after it.

Figure 12 is a block diagram of an audio decoding apparatus 210 according to a seventh embodiment of the invention. Referring to Figure 12, the audio decoding apparatus 210 uses a multi-channel decoder 213 in place of an object decoder.

Specifically, the audio decoding apparatus 210 comprises the multi-channel decoder 213, a code converter 215, a renderer 217, and a 3D information database 219.

The renderer 217 determines the 3D positions of the object signals according to the 3D information corresponding to index data included in the control information. The code converter 215 generates channel-based side information by synthesizing the position information about the object audio signals to which the renderer 217 has applied the 3D information. The multi-channel decoder 213 outputs a 3D signal by applying the channel-based side information to the downmix signal.

A head-related transfer function (HRTF) may be used as the 3D information. An HRTF is a transfer function that describes the transmission of sound waves between a sound source at an arbitrary position and the ear, and it returns a value that varies with the position and elevation of the sound source. If a signal with no directivity is filtered with an HRTF, the signal is heard as if it were reproduced from a certain direction.
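The filtering step can be sketched in the time domain, where an HRTF corresponds to a pair of impulse responses, one per ear. The example below is a minimal sketch using direct convolution. The function names and the toy impulse responses are hypothetical, and a real decoder would take measured responses from a database and would typically filter in a sub-band or frequency domain.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two non-empty sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binauralize(mono, hrir_left, hrir_right):
    """Filter a non-directional signal with a left/right impulse response pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy responses: the right ear hears the source one sample later and quieter,
# which is the kind of cue that makes a source appear to be on the left.
left, right = binauralize([1.0, 0.5], hrir_left=[1.0], hrir_right=[0.0, 0.5])
print(left, right)
```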

On receiving an input bitstream, the audio decoding apparatus 210 uses a demultiplexer (not shown) to extract an object-based downmix signal and object-based parameter information from the input bitstream. The renderer 217 then extracts, from the control information, the index data used to determine the positions of the object audio signals, and retrieves the 3D information corresponding to the extracted index data from the 3D information database 219.

Specifically, the mixing parameter information included in the control information used by the audio decoding apparatus 210 may comprise not only level information but also the index data needed to look up the 3D information. The mixing parameter information may also comprise time information about the time difference between channels, position information, and one or more parameters obtained by appropriately combining the level information and the time information.

The positions of the object audio signals may initially be determined according to default mixing parameter information, and may subsequently be changed by applying to the object audio signals the 3D information corresponding to the positions desired by the user. Alternatively, if the user wishes to apply a 3D effect to only some of the object audio signals, level information and time information about the other object audio signals, to which the user does not wish to apply the 3D effect, may be used as the mixing parameter information.

The code converter 215 generates channel-based side information about M channels by synthesizing the object-based parameter information about N object signals transmitted by the audio encoding apparatus with the position information of the object signals to which the renderer 217 has applied 3D information such as an HRTF.

The multi-channel decoder 213 generates an audio signal from the downmix signal and the channel-based side information provided by the code converter 215, and generates a 3D multi-channel signal by performing a 3D rendering operation using the 3D information included in the channel-based side information.

Figure 13 is a block diagram of an audio decoding apparatus 220 according to an eighth embodiment of the invention. Referring to Figure 13, the audio decoding apparatus 220 differs from the audio decoding apparatus 210 shown in Figure 12 in that the code converter 225 transmits the channel-based side information and the 3D information to the multi-channel decoder 223 separately. In other words, the code converter 225 of the audio decoding apparatus 220 obtains channel-based side information about M channels from the object-based parameter information about N object signals, and transmits to the multi-channel decoder 223 both the channel-based side information and the 3D information to be applied to each of the N object signals, whereas the code converter 215 of the audio decoding apparatus 210 transmits channel-based side information that already contains the 3D information to the multi-channel decoder 213.

Referring to Figure 14, the channel-based side information and the 3D information may each comprise a plurality of frame indexes. The multi-channel decoder 223 can therefore synchronize the channel-based side information and the 3D information by referring to their frame indexes, and can apply each item of 3D information to the frame of the bitstream to which it corresponds. For example, 3D information with index 2 is applied at the beginning of frame 2, which has index 2.

Because the channel-based side information and the 3D information both contain frame indexes, the temporal position in the channel-based side information at which the 3D information is to be applied can be determined reliably even if the 3D information is updated over time. In other words, the code converter 225 includes frame indexes in both the 3D information and the channel-based side information, so that the multi-channel decoder 223 can easily synchronize the two.
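Frame-index synchronization of this kind can be sketched as a lookup keyed on the frame index. The sketch below assumes that 3D information arrives only for frames where it changes and stays in force until the next update. That hold-until-updated policy, the function name `sync_by_frame_index`, and the tuple layout are illustrative assumptions, not details given in the specification.

```python
def sync_by_frame_index(side_info_frames, info_3d_updates):
    """Attach the applicable 3D information to every side-information frame.

    side_info_frames: list of (frame_index, side_info) tuples
    info_3d_updates:  list of (frame_index, info_3d) tuples, possibly sparse
    Returns a list of (frame_index, side_info, info_3d) tuples in frame order.
    """
    updates = dict(info_3d_updates)
    current = None  # no 3D information has been received yet
    result = []
    for index, side_info in sorted(side_info_frames, key=lambda f: f[0]):
        if index in updates:
            # 3D information with this index starts applying at this frame.
            current = updates[index]
        result.append((index, side_info, current))
    return result


frames = [(1, "side-1"), (2, "side-2"), (3, "side-3")]
print(sync_by_frame_index(frames, [(2, "hrtf-a")]))
```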

The downmix processor 231, the code converter 235, the renderer 237, and the 3D information database 238 may be replaced by a single module 239.

Figure 15 is a block diagram of an audio decoding apparatus 230 according to a ninth embodiment of the invention. Referring to Figure 15, the audio decoding apparatus 230 differs from the audio decoding apparatus 220 shown in Figure 13 in that it further comprises a downmix processor 231.

Specifically, the audio decoding apparatus 230 comprises a code converter 235, a renderer 237, a 3D information database 238, a multi-channel decoder 233, and the downmix processor 231. The code converter 235, the renderer 237, the 3D information database 238, and the multi-channel decoder 233 are the same as their respective counterparts shown in Figure 13. The downmix processor 231 performs a preprocessing operation on a stereo downmix signal in order to adjust positions. The 3D information database 238 may be merged with the renderer 237. The audio decoding apparatus 230 may also be provided with a module for applying a desired effect to the downmix signal.

Figure 16 is a block diagram of an audio decoding apparatus 240 according to a tenth embodiment of the invention. Referring to Figure 16, the audio decoding apparatus 240 differs from the audio decoding apparatus 230 shown in Figure 15 in that it comprises a multipoint control unit combiner 241.

That is, like the audio decoding apparatus 230, the audio decoding apparatus 240 comprises a downmix processor 243, a multi-channel decoder 244, a code converter 245, a renderer 247, and a 3D information database 249. The multipoint control unit combiner 241 combines a plurality of bitstreams obtained by object-based encoding into a single bitstream. For instance, when a first bitstream for a first audio signal and a second bitstream for a second audio signal are input, the multipoint control unit combiner 241 extracts a first downmix signal from the first bitstream, extracts a second downmix signal from the second bitstream, and generates a third downmix signal by combining the first and second downmix signals. In addition, the multipoint control unit combiner 241 extracts first object-based side information from the first bitstream, extracts second object-based side information from the second bitstream, and generates third object-based side information by combining the first and second object-based side information. The multipoint control unit combiner 241 then generates a bitstream by combining the third downmix signal and the third object-based side information, and outputs the generated bitstream.
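The combining step can be pictured with the following deliberately simplified sketch. It assumes that both downmix signals have already been decoded to sample lists in a common format, that combining the downmixes means summing them sample by sample, and that the side information is a list of per-object entries that can simply be concatenated. All three are assumptions made for illustration. The specification says only that the signals and the side information are combined, and a real combiner may need to recompute level parameters relative to the new downmix.

```python
def combine_streams(first, second):
    """Merge two decoded object streams into one.

    Each stream is a dict with (hypothetical layout):
      "downmix":   list of samples in a common, already decoded format
      "side_info": list of per-object entries, e.g. {"id": ..., "energy": ...}
    """
    length = max(len(first["downmix"]), len(second["downmix"]))
    # Zero-pad the shorter downmix so both have the same length.
    a = first["downmix"] + [0.0] * (length - len(first["downmix"]))
    b = second["downmix"] + [0.0] * (length - len(second["downmix"]))
    return {
        "downmix": [x + y for x, y in zip(a, b)],
        "side_info": list(first["side_info"]) + list(second["side_info"]),
    }


party_a = {"downmix": [1.0, 2.0], "side_info": [{"id": "voice", "energy": 5.0}]}
party_b = {"downmix": [0.5], "side_info": [{"id": "piano", "energy": 0.25}]}
print(combine_streams(party_a, party_b))
```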

Therefore, according to the tenth embodiment of the invention, even signals transmitted by two or more communication parties can be processed efficiently, compared with encoding or decoding each object signal separately.

For the multipoint control unit combiner 241 to merge a plurality of downmix signals, extracted from a plurality of bitstreams and associated with different compression codecs, into a single downmix signal, the downmix signals may need to be converted into pulse code modulation (PCM) signals or into signals in a predetermined frequency domain according to their compression codec types, the PCM signals or the signals obtained by the conversion may need to be combined, and the signal obtained by the combination may need to be converted using a predetermined compression codec. A delay may occur in this process, depending on whether the downmix signals are merged as PCM signals or as signals in the predetermined frequency domain. The decoder, however, may be unable to estimate this delay correctly. The delay may therefore need to be included in the bitstream and transmitted along with it. The delay may indicate the number of delay samples in the PCM signal or the number of delay samples in the predetermined frequency domain.

Compared with the number of input signals typically handled in a multi-channel encoding/decoding operation (for example, a 5.1-channel or 7.1-channel encoding/decoding operation), the number of input signals to be handled in an object-based audio encoding/decoding operation is sometimes considerably larger. An object-based audio encoding/decoding method therefore requires a higher bit rate than a typical channel-based multi-channel audio encoding/decoding method. However, because an object-based audio encoding/decoding method processes object signals, which are smaller units than channel signals, it can generate dynamic output signals.

An audio encoding method according to an embodiment of the invention is described in detail below with reference to Figures 17 to 20.

In an object-based audio encoding method, an object signal may be defined to represent an individual sound, such as a human voice or the sound of a musical instrument. Alternatively, sounds with similar characteristics, such as the sounds of stringed instruments (for example, violin, viola, and cello), sounds belonging to the same frequency band, or sounds classified into the same category according to the direction and angle of their sources, may be grouped together and defined by a single object signal. As a further alternative, an object signal may be defined using a combination of these methods.

A plurality of object signals may be transmitted as a downmix signal and side information. While the information to be transmitted is being created, the energy or power of the downmix signal, or of each of the object signals of the downmix signal, is first calculated in order to detect the envelope of the downmix signal. The results of the calculation may be used to transmit the object signals or the downmix signal, or to calculate the level ratios of the object signals.

A linear predictive coding (LPC) algorithm may be used to lower the bit rate. Specifically, a number of LPC coefficients representing the envelope of a signal are generated by analyzing the signal, and the LPC coefficients are transmitted instead of the envelope information about the signal. This method is efficient in terms of bit rate. However, because the LPC coefficients are likely to deviate from the actual envelope of the signal, the method requires additional processing such as error correction. In short, transmitting the envelope information of a signal guarantees high sound quality but increases the amount of information to be transmitted, whereas using LPC coefficients reduces the amount of information to be transmitted but requires additional processing such as error correction and results in lower sound quality.
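For readers unfamiliar with LPC analysis, the following sketch shows one standard way to obtain LPC coefficients, namely the autocorrelation method solved with the Levinson-Durbin recursion. The specification does not say which analysis method is used, so this is offered only as a common textbook choice. The function names are illustrative, and the coefficient convention is the prediction-error filter A(z) = 1 + a1·z^-1 + ... + ap·z^-p.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag (no normalization)."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, err), where a[0] == 1.0 and a[1:] are the coefficients of the
    prediction-error filter, and err is the residual prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err == 0.0:
            break  # signal is perfectly predictable (or silent)
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err


# Autocorrelation of an idealized first-order process with correlation 0.5.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, residual)
```

A handful of such coefficients per frame describes the spectral envelope far more compactly than per-band energies, which is the bit-rate advantage, and also the approximation error, discussed in the paragraph above.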

According to one embodiment of the invention, a combination of these methods may be used. In other words, the envelope of a signal may be represented by the energy or power of the signal, by an index value, or by another value, such as an LPC coefficient, that corresponds to the energy or power of the signal.

Envelope information about a signal may be obtained in units of time segments or frequency bands. Specifically, referring to Figure 17, envelope information about a signal may be obtained in units of frames. Alternatively, if the signal is represented by a frequency band structure using a filter bank such as a quadrature mirror filter (QMF) bank, envelope information about the signal may be obtained in units of frequency sub-bands, groups of frequency sub-bands, or groups of frequency sub-band partitions, where a frequency sub-band partition is a smaller unit than a frequency sub-band. As a further alternative, a combination of the frame-based, sub-band-based, and sub-band-partition-based methods may be used, and such use also falls within the scope of the invention.
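The frame-based case can be sketched in a few lines: the signal is cut into fixed-length frames and the mean power of each frame serves as one envelope value. The function name `frame_power` and the choice of non-overlapping frames are assumptions made for illustration; the same loop applied to the samples of one sub-band would give the sub-band-based variant.

```python
def frame_power(signal, frame_len):
    """Mean power of each non-overlapping frame of the signal.

    The final frame may be shorter than frame_len; its power is averaged
    over the samples it actually contains.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    powers = []
    for start in range(0, len(signal), frame_len):
        frame = signal[start:start + frame_len]
        powers.append(sum(s * s for s in frame) / len(frame))
    return powers


print(frame_power([1.0, 1.0, 2.0, 2.0, 0.0, 0.0], frame_len=2))
```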

As a further alternative, on the assumption that the low-frequency components of a signal generally carry more information than its high-frequency components, the envelope information about the low-frequency components may be transmitted as is, while the envelope information about the high-frequency components may be represented by LPC coefficients or other values, which are transmitted in its place. However, the low-frequency components of a signal do not necessarily carry more information than the high-frequency components. The above method therefore needs to be applied flexibly according to the circumstances.

According to one embodiment of the invention, envelope information or index data corresponding to the part of a signal that appears as the main part on the time/frequency axis (hereinafter, the main part) may be transmitted. Alternatively, values representing the energy or power of the main part of the signal (for example, LPC coefficients) may be transmitted, while no such values are transmitted for the non-main part of the signal. As a further alternative, envelope information or index data corresponding to the main part of the signal may be transmitted together with values representing the energy or power of the non-main part. As a further alternative, only information about the main part of the signal may be transmitted, so that the non-main part can be estimated from the information about the main part. As a further alternative, a combination of the above methods may be used.

For instance, referring to Figure 18, if a signal is divided into main periods and non-main periods, information about the signal may be transmitted in four different ways, as indicated by (a) to (d).

To transmit a plurality of object signals as a combination of a downmix signal and side information, the downmix signal needs to be divided into a plurality of elements as part of the decoding operation, for example taking into account the level ratios of the object signals. To guarantee independence between the elements of the downmix signal, a decorrelation operation additionally needs to be performed.

Object signals, which are the coding units in an object-based coding method, are more independent than channel signals, which are the coding units in a multi-channel coding method. In other words, a channel signal contains a plurality of object signals and therefore needs to be decorrelated. Object signals, on the other hand, are independent of one another, so channel separation can readily be performed using the characteristics of the object signals without a decorrelation operation.

Specifically, referring to Figure 19, object signals A, B, and C take turns appearing as the dominant object on the frequency axis. In this case, there is no need to divide the downmix signal into a plurality of signals according to the level ratios of object signals A, B, and C, and no need to perform decorrelation. Instead, information regarding the dominant periods of object signals A, B, and C may be transmitted, or a gain value may be applied to each frequency component of each of object signals A, B, and C, thereby skipping decorrelation. This reduces the amount of computation and also reduces the bit rate that would otherwise be required for the side information needed for decorrelation.
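The gain-based alternative to decorrelation can be sketched as follows, assuming that the side information simply names the main (dominant) object for each frequency band. The function `separate_by_dominance`, the gain parameters, and the per-band index list are hypothetical and only illustrate the idea.

```python
def separate_by_dominance(downmix_bands, dominant_object, num_objects,
                          main_gain=1.0, other_gain=0.0):
    """Estimate object signals from a downmix using per-band gains.

    downmix_bands: list of per-band downmix values.
    dominant_object: dominant_object[b] is the index of the object that
        dominates band b (the hypothetical side information).
    Returns objects[k][b], the estimate of object k in band b.
    """
    objects = []
    for k in range(num_objects):
        # Apply the larger gain only where object k is dominant.
        gains = [main_gain if dominant_object[b] == k else other_gain
                 for b in range(len(downmix_bands))]
        objects.append([g * x for g, x in zip(gains, downmix_bands)])
    return objects


# Objects A, B and C (indices 0, 1, 2) dominate successive bands.
estimates = separate_by_dominance([0.5, 2.0, 4.0], [0, 1, 2], 3)
```

Because each band is assigned to a single object, no two estimates share the same band content, and no decorrelator is involved. A nonzero `other_gain` would correspond to giving the non-dominant periods a smaller gain instead of discarding them.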

In short, decorrelation is performed to guarantee independence among the signals obtained by dividing a downmix signal according to the level ratios of the object signals. In order to skip it, information regarding the frequency domain that includes each object signal may be transmitted as side information. Alternatively, different gain values may be applied to a dominant period, during which an object signal appears dominant, and to a non-dominant period, during which it appears less dominant, and the side information may mainly consist of information regarding the dominant period. Alternatively, information regarding the dominant period may be transmitted as side information, and no information regarding the non-dominant period may be transmitted. Alternatively, a combination of the above methods, which serve as alternatives to decorrelation, may be used.

The above alternatives to decorrelation may be applied to all object signals or only to those object signals that have a clearly distinguishable dominant period. They may also be applied in units of frames.

The encoding of object audio signals using a residual signal will now be described in detail.

In general, in an object-based audio coding/decoding method, a plurality of object signals are encoded, and the result is transmitted as a combination of a downmix signal and side information. The object signals are then recovered from the downmix signal through decoding according to the side information, and the recovered object signals are appropriately mixed according to control information, for example at the request of a user, to generate a final channel signal. An object-based audio coding/decoding method is generally aimed at freely varying the output channel signal according to control information with the aid of a mixer. However, such a method may also be used to generate channel output in a predefined manner regardless of control information.

To this end, the side information may include not only the information needed to obtain the object signals from the downmix signal but also the mixing parameter information needed to generate a final channel signal. The final channel output signal can then be generated without the aid of a mixer. In this case, a residual coding algorithm may be used to improve sound quality.

A typical residual coding method encodes a signal and also encodes the error between the coded signal and the original signal, that is, the residual signal. During decoding, the coded signal is decoded while the error between the coded signal and the original signal is compensated for, thereby recovering a signal that is as close as possible to the original signal. Because the error between the coded signal and the original signal is generally small, the amount of additional information needed for residual coding can be kept small.
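The residual coding principle described above can be sketched with a toy codec. Uniform quantization is used here only as a stand-in for a real lossy audio codec, and the residual is kept exact for simplicity; in practice the residual would itself be coded, so the restored signal would be close to the original rather than identical. All function names are hypothetical.

```python
def encode(signal, step=0.5):
    """Toy lossy encoder: uniform quantization to integer indices."""
    return [round(x / step) for x in signal]


def decode(indices, step=0.5):
    """Toy decoder: map integer indices back to sample values."""
    return [i * step for i in indices]


def residual_of(original, decoded):
    """Error between the original signal and the decoded signal."""
    return [o - d for o, d in zip(original, decoded)]


def compensate(decoded, residual):
    """Decoder-side compensation: add the residual back."""
    return [d + r for d, r in zip(decoded, residual)]


original = [0.125, 1.0, -0.875, 2.125]
decoded = decode(encode(original))          # lossy reconstruction
residual = residual_of(original, decoded)   # small error signal
restored = compensate(decoded, residual)    # matches the original here
```

With a step of 0.5 the residual never exceeds 0.25 in magnitude, which illustrates why it is typically much cheaper to transmit than the signal itself.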

If the final channel output of a decoder is fixed, not only the mixing parameter information needed to generate a final channel signal but also residual coding information may be provided as side information. This makes it possible to improve sound quality.

Figure 20 is a block diagram of an audio encoding apparatus 310 according to an embodiment of the present invention. Referring to Figure 20, the audio encoding apparatus 310 is characterized by its use of a residual signal.

Specifically, the audio encoding apparatus 310 includes an encoder 311, a decoder 313, a first mixer 315, a second mixer 319, an adder 317, and a bitstream generator 321.

The first mixer 315 performs a mixing operation on an original signal, and the second mixer 319 performs a mixing operation on the signal obtained by encoding and then decoding the original signal. The adder 317 calculates the residual signal between the signal output by the first mixer 315 and the signal output by the second mixer 319. The bitstream generator 321 adds the residual signal to the side information and transmits the result. In this way, sound quality can be improved.
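The signal flow through the two mixers and the adder can be sketched as follows. The quantizer stands in for the encoder 311 followed by the decoder 313, and the single-channel mix with fixed gains is an assumption; the specification does not define these functions, so every name here is hypothetical.

```python
def mix(objects, gains):
    """Mix object signals into one channel using per-object gains."""
    length = len(objects[0])
    return [sum(g * obj[n] for g, obj in zip(gains, objects))
            for n in range(length)]


def lossy_codec(objects, step=0.5):
    """Stand-in for encoder 311 followed by decoder 313."""
    return [[round(x / step) * step for x in obj] for obj in objects]


def encoder_residual(objects, gains):
    """Residual between the mix of the originals and the mix of the
    encoded-then-decoded signals."""
    reference = mix(objects, gains)                    # first mixer 315
    reconstructed = mix(lossy_codec(objects), gains)   # second mixer 319
    return [a - b for a, b in zip(reference, reconstructed)]  # adder 317


objects = [[0.125, 1.0], [2.0, -0.875]]
gains = [1.0, 0.5]
residual = encoder_residual(objects, gains)
```

A decoder that forms the same mix from its decoded objects can add this residual to obtain the mix of the original signals, which is how the residual carried in the side information improves sound quality for a predefined channel output.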

The calculation of a residual signal may be applied to all portions of a signal or only to the low-frequency portions of the signal. Alternatively, it may be applied variably, on a frame-by-frame basis, only to the frequency bands that contain the dominant signals of each frame. Alternatively, a combination of the above methods may be used.
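One possible reading of the frame-by-frame variant is sketched below: for each frame, the residual is computed only for the bands with the highest energy. Treating "bands containing the main signals" as the highest-energy bands is an assumption, and the function names and the band energy input are hypothetical.

```python
def select_residual_bands(band_energies, num_bands):
    """Return the indices of the num_bands highest-energy bands,
    sorted in ascending band order."""
    ranked = sorted(range(len(band_energies)),
                    key=lambda b: band_energies[b], reverse=True)
    return sorted(ranked[:num_bands])


def sparse_residual(reference, reconstructed, bands):
    """Compute the residual only for the selected bands of one frame."""
    return {b: reference[b] - reconstructed[b] for b in bands}


# One frame with four bands; bands 1 and 3 carry most of the energy.
bands = select_residual_bands([0.1, 4.0, 0.3, 2.5], 2)
frame_residual = sparse_residual([1.0, 2.0, 3.0, 4.0],
                                 [1.0, 1.5, 3.0, 4.25], bands)
```

Restricting the residual to the low-frequency portion of the signal would correspond to replacing the selection step with a fixed range of low band indices.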

Since side information that includes residual signal information is larger than side information that does not, the calculation of a residual signal may be applied only to those portions of a signal that directly affect sound quality, thereby preventing an excessive increase in bit rate.

The present invention can be realized as computer-readable code recorded on a computer-readable recording medium. The computer-readable recording medium may be any type of recording device in which data is stored in a computer-readable manner. Examples of the computer-readable recording medium include a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and a carrier wave (for example, data transmission over the Internet). The computer-readable recording medium can be distributed over a plurality of computer systems connected to a network so that the computer-readable code is written to it and executed in a decentralized manner. Functional programs, code, and code segments needed to realize the present invention can be easily constructed by one of ordinary skill in the art.

Industrial applicability

As described above, according to the present invention, the sound image of each object audio signal can be localized while benefiting from the advantages of object-based audio encoding and decoding methods. It is therefore possible to provide more realistic sound when reproducing object audio signals. In addition, the present invention can be applied to interactive games and can thus provide users with a more realistic virtual reality experience.

While the present invention has been shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the spirit and scope of the present invention as defined by the following claims.

Claims

1. An audio decoding method, comprising:

extracting a downmix signal and object-based side information from an audio signal;

generating channel-based side information based on the object-based side information and control information for rendering the downmix signal;

processing the downmix signal using a decorrelated channel signal; and

generating a multi-channel audio signal using the processed downmix signal and the channel-based side information.

2. The audio decoding method as claimed in claim 1, further comprising:

modifying the downmix signal using the object-based side information and the control information before generating the multi-channel audio signal.

3. The audio decoding method as claimed in claim 2, wherein modifying the downmix signal comprises performing at least one of level adjustment, sound image processing, and effect addition on the downmix signal.

4. The audio decoding method as claimed in claim 3, wherein modifying the downmix signal further comprises modifying the downmix signal in a time domain or in a frequency domain.

5. The audio decoding method as claimed in claim 3, further comprising performing reverberation processing on the multi-channel audio signal.

6. The audio decoding method as claimed in claim 3, further comprising adding a predetermined signal obtained through effect processing to the multi-channel audio signal.

7. The audio decoding method as claimed in claim 1, wherein the decorrelated channel signal is based on the downmix signal.

8. An audio decoding apparatus, comprising:

a demultiplexer which extracts a downmix signal and object-based side information from an audio signal;

a parameter converter which generates channel-based side information based on the object-based side information and control information for rendering the downmix signal;

a downmix processor which, if the downmix signal is a stereo downmix signal, modifies the downmix signal using a decorrelated downmix signal; and

a multi-channel decoder which generates a multi-channel audio signal using the modified downmix signal obtained by the downmix processor and the channel-based side information.

9. The audio decoding apparatus as claimed in claim 8, wherein the downmix processor modifies the downmix signal using the object-based side information and the control information.

10. The audio decoding apparatus as claimed in claim 9, wherein the downmix processor modifies the downmix signal by performing at least one of level adjustment, sound image processing, and effect addition on the downmix signal.

11. The audio decoding apparatus as claimed in claim 9, wherein the downmix processor modifies the downmix signal in a time domain or in a frequency domain.

12. The audio decoding apparatus as claimed in claim 9, further comprising a channel processor which performs reverberation processing on the multi-channel audio signal.

13. The audio decoding apparatus as claimed in claim 9, further comprising a channel processor which adds a predetermined signal obtained through effect processing to the multi-channel audio signal.

14. An audio decoding method, comprising:

generating channel-based side information and one or more processing parameters based on object-based side information and control information for rendering the downmix signal;

generating a multi-channel audio signal using the downmix signal and the channel-based side information; and

modifying the multi-channel audio signal using the processing parameters.

15. The audio decoding method as claimed in claim 14, wherein modifying the multi-channel audio signal comprises performing reverberation processing on the multi-channel audio signal using the processing parameters.

16. The audio decoding method as claimed in claim 14, wherein modifying the multi-channel audio signal comprises adding a signal obtained through effect processing to the multi-channel audio signal.

17. An audio decoding apparatus, comprising:

a parameter converter which generates channel-based side information and one or more processing parameters based on object-based side information and control information for rendering the downmix signal;

a multi-channel decoder which generates a multi-channel audio signal using the downmix signal and the channel-based side information; and

a channel processor which modifies the multi-channel audio signal using the processing parameters.

18. The apparatus as claimed in claim 17, wherein the channel processor performs reverberation processing on the multi-channel audio signal using the processing parameters.

19. The apparatus as claimed in claim 17, wherein the channel processor adds a signal obtained through effect processing to the multi-channel audio signal.

20. A computer-readable recording medium having recorded thereon an audio decoding method, the method comprising:

generating a multi-channel audio signal using the processed downmix signal and the channel-based side information.

21. A computer-readable recording medium having recorded thereon an audio decoding method, the method comprising:

generating channel-based side information and one or more processing parameters based on object-based side information and control information for rendering the downmix signal;