EP0685833B1 - Method for speech coding using linear prediction - Google Patents
- Publication number
- EP0685833B1 (application number EP95401262A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- signal
- state
- short
- quantization
- determined
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/038—Vector quantisation, e.g. TwinVQ audio
Definitions
- The present invention relates to a linear-prediction speech coding method, in which a speech signal digitized in successive frames is subjected to analysis by synthesis to obtain, for each frame, quantization values of synthesis parameters from which an estimate of the speech signal can be reconstructed, the analysis by synthesis including a short-term linear prediction of the speech signal to determine the coefficients of a short-term synthesis filter.
- Low bit rate speech coders (typically 5 kbit/s for a sampling frequency of 8 kHz) give their best performance on signals with a "telephone" spectrum, that is, confined to the 300-3400 Hz band and with a pre-emphasis at high frequencies.
- IRS: Intermediate Reference System
- This template has been defined for telephone handsets, both at the input (microphone) and at the output (earpiece).
- The input signal of the speech coder may, however, have a "flatter" spectrum, for example when a hands-free installation is used, employing a microphone with a linear frequency response.
- The usual vocoders are designed to be independent of the input device with which they operate, and they are not informed of the characteristics of that input either. If microphones with different characteristics are likely to be connected to the vocoder, or more generally if the vocoder is likely to receive acoustic signals with different spectral characteristics, there are cases where the vocoder is used sub-optimally.
- A main purpose of the invention is to improve the performance of a vocoder by making it less dependent on the spectral characteristics of the signal supplied to it.
- The invention provides a speech coding method of the type indicated at the beginning, in which a spectral state of the speech signal is determined from among first and second states such that the signal contains proportionately less energy at low frequencies in the first state than in the second state, and one or the other of two quantization modes is applied to obtain the quantization values of the coefficients of the short-term synthesis filter, according to the determined spectral state of the speech signal.
- Detecting the spectral state makes it possible to adapt the coder to the characteristics of the input signal.
- The performance of the coder can be improved or, for identical performance, the number of bits needed for coding can be reduced.
- The coefficients of the short-term synthesis filter are represented by a set of p ordered line-spectrum frequency parameters, called "LSP parameters", p being the order of the linear prediction.
- LSP parameters: ordered line-spectrum frequency parameters
- The distribution of these p LSP parameters can be analyzed to give information about the spectral state of the signal and so contribute to the detection of this state.
- LSP parameters can be scalar or vector quantized.
- In the scalar case, the i-th LSP parameter is quantized by subdividing a variation interval, included in a respective reference interval, into 2^{N_i} segments, N_i being the number of coding bits devoted to the quantization of this parameter.
- A first possibility is to use, at least for the first ordered LSP parameters, reference intervals each chosen from two distinct intervals according to the determined spectral state of the speech signal.
- An additional possibility is to give at least some of the numbers of coding bits N_i one or the other of two distinct values according to the determined spectral state of the speech signal, so as to allocate the bits dynamically.
- In the case of differential vector quantization, the set of p ordered LSP parameters is subdivided into m groups of consecutive parameters and, at least for the first group, a differential quantization can be performed with respect to a mean vector chosen from two distinct vectors according to the determined spectral state of the speech signal.
- The speech coder illustrated in FIG. 1A is based on the principle of analysis by synthesis. Its general organization is conventional, except for the short-term prediction unit 8 and the unit 20 for detecting the spectral state of the signal.
- The speech coder processes the amplified output signal from a microphone 5.
- A low-pass filter 6 eliminates the frequency components of this signal above the upper limit (for example 4000 Hz) of the bandwidth processed by the coder.
- The signal is then digitized by the analog-to-digital converter 7, which delivers the input signal S_I in the form of successive frames of 10 to 30 ms consisting of samples taken at a rate of, for example, 8000 Hz.
- The coefficients a_i of the short-term synthesis filter (1 ≤ i ≤ p) can be obtained by short-term linear prediction of the input signal, the number p designating the order of the linear prediction, which is typically equal to 10 for narrowband speech.
- The short-term prediction unit 8 determines estimates â_i of the coefficients a_i, which correspond to a quantization of these coefficients by quantization values q(a_i).
- Each input signal frame S_I is first subjected to the inverse filter 9 of transfer function A(z), then to a filter 10 of transfer function 1/A(z/γ), where γ denotes a predefined factor, generally between 0.8 and 0.9.
- The combined filter thus constituted, of transfer function W(z) = A(z)/A(z/γ), is a perceptual weighting filter for the residual error of the coder.
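A minimal sketch of the weighting filter W(z) = A(z)/A(z/γ) may help here. Substituting z/γ for z multiplies the coefficient of z^-i by γ^i, which is all that filter 10 needs beyond the coefficients of filter 9. The sign convention A(z) = 1 + Σ c_i z^-i, the zero initial filter state and all function names are assumptions made for illustration (the source does not state them), and the long-term filter 11 that sits between filters 9 and 10 in the coder is left out.

```python
def expand_bandwidth(a_poly, gamma):
    """Return the coefficients of A(z/gamma) given those of A(z).

    a_poly = [1, c_1, ..., c_p] stands for A(z) = 1 + sum_i c_i z^-i
    (assumed sign convention). Replacing z by z/gamma multiplies the
    coefficient of z^-i by gamma**i.
    """
    return [c * gamma ** i for i, c in enumerate(a_poly)]


def fir_filter(b, x):
    """Compute y[n] = sum_k b[k] * x[n-k] with zero initial state."""
    return [sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
            for n in range(len(x))]


def allpole_filter(a, x):
    """Filter x through 1/A(z), a = [1, c_1, ..., c_p], zero initial state."""
    y = []
    for n in range(len(x)):
        feedback = sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(x[n] - feedback)
    return y


def perceptual_weighting(a_poly, gamma, frame):
    """Apply W(z) = A(z) / A(z/gamma) to one frame of samples."""
    residual = fir_filter(a_poly, frame)            # role of filter 9: A(z)
    weighted_poly = expand_bandwidth(a_poly, gamma)
    return allpole_filter(weighted_poly, residual)  # role of filter 10: 1/A(z/gamma)
```

With γ = 1 the two filters cancel and the frame comes back unchanged; values of γ below 1 move the poles of filter 10 toward the origin, which broadens the formant peaks of 1/A(z/γ).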
- The coefficients used in filters 9 and 10 are the estimates â_i supplied by the short-term prediction unit 8.
- The output R1 of the inverse filter 9 has a long-term periodicity, corresponding to the pitch of the speech.
- The signal R1 is subjected to an inverse filter 11 of transfer function B(z), whose output R2 is supplied to the input of the filter 10.
- The output S_W of the filter 10 thus corresponds to the input signal S_I with its long-term correlation removed by the filter 11 of transfer function B(z), and perceptually weighted by the filters 9, 10 of combined transfer function W(z).
- The filter 11 includes a subtractor whose positive input receives the signal R1 and whose negative input receives a long-term estimate obtained by delaying the signal R1 by T samples and amplifying it.
- The signal R1 and the long-term estimate are supplied to a unit 13, which maximizes the correlation between these two signals to determine the optimal delay T and gain b.
- Unit 13 explores all the integer and/or fractional values of the delay T between two bounds to select the one that maximizes the normalized correlation.
- The gain b is deduced from the value of T and is quantized by discretization, which leads to a quantization value q(b); the quantized value b̂ corresponding to this quantization value q(b) is the one supplied as the gain of the amplifier of the filter 11.
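The delay search performed by unit 13 can be sketched as follows. This is only an illustration under stated assumptions: integer delays only, a signed normalized correlation corr/sqrt(energy) as the score, and a least-squares gain b = corr/energy for the selected delay. The source does not give the exact criterion or the search bounds, and the function and parameter names are hypothetical.

```python
import math


def search_long_term_predictor(r1, frame_start, frame_len, t_min, t_max):
    """Pick the integer delay T that maximizes the normalized correlation
    between r1[n] and r1[n - T] over the current frame, and derive gain b.

    r1 holds past samples followed by the current frame, which starts at
    index frame_start. Returns (T, b), or (None, 0.0) if no delay is usable.
    """
    best_t, best_score, best_gain = None, -math.inf, 0.0
    for t in range(t_min, t_max + 1):
        if frame_start - t < 0:
            break  # not enough past samples for this delay or larger ones
        corr = 0.0
        energy = 0.0
        for n in range(frame_start, frame_start + frame_len):
            corr += r1[n] * r1[n - t]
            energy += r1[n - t] ** 2
        if energy == 0.0:
            continue  # delayed segment is silent, skip it
        score = corr / math.sqrt(energy)  # assumed normalization
        if score > best_score:
            best_t, best_score, best_gain = t, score, corr / energy
    return best_t, best_gain
```

For an impulse train of period 5, the search returns T = 5 with a gain of 1, since the delayed signal then matches the frame exactly.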
- Speech synthesis in the coder is carried out in a closed loop comprising an excitation generator 12, a filter 14 having the same transfer function as the filter 10, a correlator 15, and a unit 19 for maximizing the normalized correlation.
- The nature of the excitation generator 12 distinguishes between different types of analysis-by-synthesis coders, according to the form of the excitation.
- MPLPC: linear prediction analysis with multi-pulse excitation
- CELP: linear prediction analysis with vector excitation
- The applicant has used regular pulse sequence excitation, or RPCELP, as described in its European patent application No. 0 347 307.
- The excitation is represented by an entry address k in a dictionary of excitation vectors, and by an associated gain G.
- The selected and amplified excitation vector is subjected to the filter 14 of transfer function 1/A(z/γ), whose coefficients â_i (1 ≤ i ≤ p) are supplied by the short-term prediction unit 8.
- The resulting signal S_W* is supplied to one input of the correlator 15, the other input of which receives the output signal S_W of the filter 10.
- The output of the correlator 15 is the normalized correlation, which is maximized by the unit 19; this amounts to minimizing the coding error.
- The unit 19 selects the address k and the gain G of the excitation generator that maximize the correlation delivered by the correlator 15.
- The maximization consists in determining the optimal address k, the gain G being deduced from k.
- The unit 19 quantizes the numerical value of the gain G by discretization, which leads to a quantization value q(G).
- The quantized value Ĝ corresponding to this quantization value q(G) is the one supplied as the gain of the amplifier of the excitation generator 12.
- The excitation vector selected in the dictionary of the generator 12, the associated gain G, the parameters b and T of the long-term filter 13, and the coefficients a_i of the short-term prediction filter, to which is added a state bit Y described later, constitute the synthesis parameters. Their quantization values k, q(G), q(b), T, q(a_i), Y are transmitted to the receiver so that it can reconstruct an estimate of the speech signal S_I. These quantization values are combined on the same channel by the multiplexer 21 for transmission.
- The associated decoder illustrated in FIG. 1B comprises a unit 50 which restores the quantized values k, Ĝ, T, b̂, â_i on the basis of the quantization values received.
- An excitation generator 52 identical to the generator 12 of the coder receives the quantized values of the parameters k and G.
- The output R̂2 of the generator 52 (which is an estimate of R2) is subjected to the long-term prediction filter 53 of transfer function 1/B(z), whose coefficients are the quantized values of the parameters T and b.
- The output R̂1 of the filter 53 (which is an estimate of R1) is subjected to the short-term prediction filter 54 of transfer function 1/A(z), whose coefficients are the quantized values of the parameters a_i.
- The resulting signal Ŝ is the estimate of the input signal S_I of the coder.
- FIG. 2 shows an example of the constitution of the short-term prediction unit 8 of the coder.
- The modeling coefficients a_i are calculated for each frame, for example by the autocorrelation method.
- Block 40 calculates the autocorrelations for 0 ≤ j ≤ p, n denoting the index of a sample of the current frame, and L the number of samples per frame.
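The autocorrelation method mentioned above can be sketched as follows. The autocorrelation formula R(j) = Σ_{n=j}^{L-1} s(n)·s(n-j), the absence of any analysis window, the use of the Levinson-Durbin recursion to solve for the predictor, and the convention that s(n) is predicted by Σ a_i·s(n-i) are all assumptions; the extracted text does not preserve the formula used by block 40 or name the solver.

```python
def autocorrelation(frame, p):
    """R(j) = sum_{n=j}^{L-1} frame[n] * frame[n-j] for 0 <= j <= p (assumed form)."""
    L = len(frame)
    return [sum(frame[n] * frame[n - j] for n in range(j, L))
            for j in range(p + 1)]


def levinson_durbin(r, p):
    """Solve the normal equations for an order-p predictor.

    Returns (a, err) where a = [a_1, ..., a_p] predicts s(n) as
    sum_i a_i * s(n-i), and err is the final prediction error energy.
    """
    a = [0.0] * (p + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, p + 1):
        if err <= 0.0:
            break  # degenerate (e.g. all-zero) frame: stop early
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / err  # reflection coefficient of order i
        updated = a[:]
        updated[i] = k_i
        for k in range(1, i):
            updated[k] = a[k] - k_i * a[i - k]
        a = updated
        err *= 1.0 - k_i * k_i
    return a[1:], err
```

For autocorrelations of the form R(j) = 0.5^j, the recursion returns a_1 = 0.5 and a_2 = 0, as expected for a first-order process.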
- The representation parameters thus obtained are quantized to reduce the number of bits needed to identify them.
- The two solid lines correspond to the limits of the IRS template, defined for microphones in CCITT Recommendation P.48.
- An IRS-type microphone signal has a strong attenuation in the lower part of the spectrum (between 0 and 300 Hz) and a relative emphasis at high frequencies.
- A linear-type signal, provided for example by the microphone of a hands-free system, has a flatter spectrum, notably without the strong attenuation at low frequencies (a typical example of such a linear-type signal is shown by a dashed line on the diagram in FIG. 3).
- The detection device 20 comprises a high-pass filter 16 receiving the input acoustic signal S_I and delivering a filtered signal S_I'.
- The filter 16 is typically a digital filter of the biquad type with a sharp cutoff at 400 Hz.
- The energies E1 and E2 contained in each frame of the acoustic input signal S_I and of the filtered signal S_I' are calculated by two units 17, 18, each of which sums the squares of the samples of each frame it receives.
- The energy E1 of each frame of the input signal S_I is sent to the input of a threshold comparator 25, which delivers a bit Z of value 0 when the energy E1 is less than a predetermined energy threshold, and of value 1 when the energy E1 is greater than the threshold.
- The energy threshold is typically of the order of -38 dB relative to the saturation energy of the signal.
- The comparator 25 serves to inhibit the determination of the state of the signal when the latter contains too little energy to be representative of the characteristics of the source. In this case, the determined state of the signal remains unchanged.
- The energies E1 and E2 are sent to a digital divider 26, which calculates the ratio E2/E1 for each frame.
- This E2 / E1 ratio is sent to another threshold comparator 27 which delivers a bit X of value 0 when the E2 / E1 ratio is greater than a predetermined threshold, and of value 1 when the E2 / E1 ratio is less than the threshold.
- This threshold on the E2 / E1 ratio is typically of the order of 0.93.
- Bit X is representative of a signal condition on each frame.
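The per-frame decision made by units 17, 18 and comparators 25, 27 can be sketched as below. The sketch takes the high-pass filtered frame as an input rather than implementing filter 16, whose coefficients are not given. The function names, the `saturation_energy` parameter (for example, frame length times the squared full-scale amplitude) and the handling of exact equality with a threshold are assumptions.

```python
def frame_energy(frame):
    """Sum of the squares of the samples of one frame (units 17 and 18)."""
    return sum(s * s for s in frame)


def frame_condition(frame, filtered_frame, saturation_energy,
                    energy_floor_db=-38.0, ratio_threshold=0.93):
    """Return the bits (Z, X) for one frame.

    Z = 1 when the frame energy E1 exceeds the energy threshold, else 0.
    X = 0 when E2/E1 is above the ratio threshold (little low-frequency
    energy, IRS-like), else 1 (flatter, linear-like spectrum).
    """
    e1 = frame_energy(frame)
    e2 = frame_energy(filtered_frame)
    # Threshold expressed in dB relative to the saturation energy.
    energy_threshold = saturation_energy * 10.0 ** (energy_floor_db / 10.0)
    z = 1 if e1 > energy_threshold else 0
    # Compare E2 with threshold * E1 to avoid dividing by a zero energy.
    x = 0 if e2 > ratio_threshold * e1 else 1
    return z, x
```

When Z is 0 the value of X carries no information; in the coder the state determination simply ignores such frames.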
- The state bit Y is not taken directly equal to the condition bit X; it results from the processing of successive condition bits X by a state determination circuit 29, so that the determined state Y is modified only after several successive frames show a signal condition X different from that corresponding to the previously determined state.
- The operation of the state determination circuit 29 is illustrated in FIG. 5, where the upper timing diagram illustrates an example of the evolution of the bit X supplied by the comparator 27.
- The state bit Y (lower timing diagram) is initialized to 0, because IRS characteristics are the most frequently encountered.
- A counting variable V is incremented each time the condition bit X of a frame differs from the one corresponding to the determined state Y, and decremented, unless it is already 0, each time X agrees with Y. As soon as the variable V reaches a predetermined threshold (8 in the example considered), it is reset to 0 and the value of the bit Y is changed, so that the signal is determined to have changed state.
- In this example, the signal is in state Y_A up to frame M, in state Y_B between frames M and N (change of signal source), then again in state Y_A from frame N.
- The above counting mode can for example be obtained by the circuit 29 shown in FIG. 4.
- This circuit includes a four-bit counter 32, whose most significant bit corresponds to the state bit Y and whose three least significant bits represent the counting variable V.
- The bits X and Y are supplied to the inputs of an EXCLUSIVE-OR gate 33, whose output is applied to the incrementing input of the counter 32 via an AND gate 34 whose other input receives the bit Z supplied by the threshold comparator 25.
- The inverted output of the gate 33 is supplied to a decrementing input of the counter 32 via another AND gate 35, whose two other inputs respectively receive the bit Z supplied by the comparator 25 and the output of a three-input OR gate 36 receiving the three least significant bits of the counter 32.
- The counter 32 is arranged to split the pulses received on its decrementing input when its least significant bit is 0 or when at least one of the next two bits is 1, as shown by the OR gate 37 in FIG. 4.
- When the bit Z is 0, the state determination circuit 29 is not activated, because the AND gates 34, 35 prevent the value of the counter 32 from being changed.
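The hysteresis applied to the state bit Y can be sketched in software as below, following the counting rule stated in claim 4: V goes up when X disagrees with Y, goes down (but not below 0) when X agrees, and Y toggles when V reaches the threshold. The class name, the initial value V = 0 and the decrement step of exactly 1 are assumptions; the counter of FIG. 4 may handle decrementing pulses differently.

```python
class SpectralStateTracker:
    """Software analogue of the state determination circuit 29 (a sketch)."""

    def __init__(self, threshold=8):
        self.threshold = threshold  # 8 in the example of the description
        self.y = 0                  # state bit Y, initialized to 0 (IRS)
        self.v = 0                  # counting variable V (assumed to start at 0)

    def update(self, x, z):
        """Process one frame's condition bit X and energy bit Z; return Y."""
        if z == 0:
            return self.y           # too little energy: state left unchanged
        if x != self.y:
            self.v += 1             # frame disagrees with the current state
            if self.v >= self.threshold:
                self.v = 0
                self.y ^= 1         # enough evidence: the state changes
        elif self.v > 0:
            self.v -= 1             # frame agrees: forget one disagreement
        return self.y
```

With the threshold of 8, a source change is recognized after eight more disagreeing frames than agreeing ones, so isolated misclassified frames do not flip the quantization mode.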
- The state bit Y thus determined is supplied to the short-term linear prediction unit 8 to choose the quantization mode for the coefficients of the short-term synthesis filter.
- The parameters used to represent the coefficients a_i of the short-term synthesis filter are the line spectrum frequencies (LSF), or line spectrum pairs (LSP). These parameters are known to have good statistical properties and to make it easy to ensure the stability of the synthesized filter (see N. Sugamura and F. Itakura: "Speech Analysis And Synthesis Method Developed At ECL In NTT: From LPC to LSP", Speech Communication, North Holland, Vol. 5, No. 2, 1986, pp. 199-215).
- The LSP parameters are calculated by block 42 from the prediction coefficients a_i obtained by block 41, by means of Chebyshev polynomials (see P. Kabal and R. P. Ramachandran: "The Computation Of Line Spectral Frequencies Using Chebyshev Polynomials", IEEE Trans. ASSP, Vol. 34, No. 6, 1986, pp. 1419-1426). They can also be obtained directly from the autocorrelations of the signal, by the split Levinson algorithm (see P. Delsarte and Y. Genin: "The Split Levinson Algorithm", IEEE Trans. ASSP, Vol. 34, No. 3, 1986).
- Block 43 quantizes the LSF frequencies, or more precisely the values cos 2πf_i, hereinafter called LSP parameters, which lie between -1 and +1; this simplifies dynamic range problems.
- The LSF frequency calculation method yields the frequencies in increasing order, that is, in order of decreasing cosine.
- In the example considered, with p = 10, the quantization consists of m = 3 independent vector quantizations, of respective dimensions 3, 3 and 4, defining the LSP groups I (1,2,3), II (4,5,6) and III (7,8,9,10).
- Each group is quantized by selecting, from a respective prerecorded quantization table, the vector having the minimum Euclidean distance from the parameters of this group.
- For group I, two disjoint quantization tables T_{I,1} and T_{I,2} are defined, of respective sizes 2^{n1} and 2^{n2}.
- For group II, two quantization tables T_{II,1} and T_{II,2}, of respective sizes 2^{p1} and 2^{p2} and having a common part, are defined so as to reduce the memory space needed.
- For group III, a single quantization table T_{III} of size 2^q is defined. The addresses AD_I, AD_II, AD_III of the three vectors selected from the three quantization tables for the three groups are the quantization values q(a_i) of the coefficients of the short-term synthesis filter, which are sent to the multiplexer 21.
- When the state bit Y indicates a linear-type signal, block 43 selects the tables T_{I,2} and T_{II,2}, whose statistics are established to be representative of an input signal of linear type; otherwise it selects the tables T_{I,1} and T_{II,1}.
- Table T_{III} is used in all cases, since the upper part of the spectrum is less sensitive to the differences between the IRS and linear characteristics.
- The state bit Y is also supplied to the multiplexer 21.
- A unit 45 calculates the estimates â_i from the discretized values of the LSP parameters given by the three selected vectors.
- The estimates â_i thus obtained are supplied by the unit 45 to the short-term filters 9, 10 and 14 of the coder.
- The same calculation is carried out by the restoration unit 50 of the decoder, the quantized cosine vectors being retrieved from the quantization addresses AD_I, AD_II and AD_III.
- The decoder contains the same quantization tables as the coder, and selects among them according to the state bit Y received.
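The state-dependent table selection can be illustrated with a short sketch. It assumes p = 10 with the 3/3/4 grouping described above; the function names, the dictionary `tables_by_state` mapping Y to the pair of tables for groups I and II, and any table contents used with it are hypothetical.

```python
def nearest_vector(table, target):
    """Address of the table entry at minimum Euclidean distance from target."""
    def squared_distance(entry):
        return sum((c - t) ** 2 for c, t in zip(entry, target))
    return min(range(len(table)), key=lambda addr: squared_distance(table[addr]))


def quantize_lsp_groups(lsp, y, tables_by_state, table_iii):
    """Return the addresses (AD_I, AD_II, AD_III) for ten ordered LSP values.

    tables_by_state[y] is the pair (table for group I, table for group II)
    associated with spectral state y; table_iii is shared by both states.
    """
    table_i, table_ii = tables_by_state[y]
    return (nearest_vector(table_i, lsp[0:3]),
            nearest_vector(table_ii, lsp[3:6]),
            nearest_vector(table_iii, lsp[6:10]))


def dequantize_lsp_groups(addresses, y, tables_by_state, table_iii):
    """Decoder side: rebuild the ten quantized LSP values from addresses and Y."""
    table_i, table_ii = tables_by_state[y]
    ad_i, ad_ii, ad_iii = addresses
    return list(table_i[ad_i]) + list(table_ii[ad_ii]) + list(table_iii[ad_iii])
```

Because both sides index the same tables with the same Y, the only side information needed beyond the three addresses is the single state bit.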
- The use of two families of quantization tables selected according to the spectral state Y has the advantage of giving better efficiency in terms of the number of coding bits required. Indeed, for equal performance, the total number of bits used to quantize the LSP parameters in each case is less than the number of bits needed when a single family of tables is used regardless of the detected spectral state.
- n1 8
- Block 43 can also be arranged to perform differential vector quantization.
- Each group of parameters I, II, III is then quantized differentially with respect to a mean vector.
- For group I, two distinct mean vectors V_{I,1} and V_{I,2} and a table TD_I for quantizing the differences are defined.
- For group II, two distinct mean vectors V_{II,1} and V_{II,2} and a table TD_II for quantizing the differences are defined.
- For group III, a single mean vector V_{III} and a table TD_III for quantizing the differences are defined.
- The mean vectors V_{I,1} and V_{II,1} are established to be representative of the statistics of IRS-type signals, while the mean vectors V_{I,2} and V_{II,2} are established to be representative of the statistics of linear-type signals.
- The advantage of this differential quantization is that only one quantization table per group needs to be stored in the coder and in the decoder.
- The quantization values q(a_i) are the addresses of the three optimal difference vectors in the three tables, to which is added the bit Y determining which mean vectors are to be added to these difference vectors to restore the quantized LSP parameters.
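For one group of parameters, the differential variant can be sketched as follows: the mean vector selected by Y is subtracted, the difference is matched against the single difference table of the group, and the decoder adds the same mean back. The function names and the `mean_vectors` dictionary keyed by Y are hypothetical.

```python
def quantize_differential(group, mean_vectors, y, diff_table):
    """Return the address in diff_table of the entry closest to (group - mean).

    mean_vectors[y] is the mean vector associated with spectral state y;
    diff_table is the single table of difference vectors for this group.
    """
    mean = mean_vectors[y]
    diff = [g - m for g, m in zip(group, mean)]

    def squared_distance(entry):
        return sum((d - c) ** 2 for d, c in zip(diff, entry))

    return min(range(len(diff_table)),
               key=lambda addr: squared_distance(diff_table[addr]))


def restore_differential(address, mean_vectors, y, diff_table):
    """Decoder side: add the state-dependent mean to the selected difference."""
    mean = mean_vectors[y]
    return [m + d for m, d in zip(mean, diff_table[address])]
```

The same difference table serves both spectral states; only the small mean vectors are duplicated, which is the memory saving the description points out.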
- In the case of scalar quantization, each parameter is represented separately by the nearest quantized value.
- For each parameter cos 2πf_i, an upper bound M_i and a lower bound m_i are defined such that, over a large number of speech samples, approximately 90% of the values of cos 2πf_i encountered lie between m_i and M_i.
- The reference interval between the two bounds is divided into 2^{N_i} equal segments, where N_i is the number of coding bits devoted to the quantization of the parameter cos 2πf_i.
- The ordering property of the frequencies f_i is used to replace, in certain cases, the upper bound M_i by the quantized value ĉos 2πf_{i-1} of the previous cosine.
- The quantization of cos 2πf_i is then carried out by subdividing the variation interval [m_i, min{M_i, ĉos 2πf_{i-1}}] into 2^{N_i} equal segments.
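The scalar scheme with its adaptive upper bound can be sketched as below. Representing each segment by its midpoint, clamping out-of-range values to the first or last segment, and the function names are assumptions; the source only specifies the subdivision of [m_i, min{M_i, ĉos 2πf_{i-1}}] into 2^{N_i} equal segments.

```python
def _variation_interval(bounds_i, prev_quantized):
    """Interval [m_i, min(M_i, previous quantized cosine)] for parameter i."""
    m_i, big_m_i = bounds_i
    upper = big_m_i if prev_quantized is None else min(big_m_i, prev_quantized)
    return m_i, max(upper, m_i)  # guard against an empty interval


def quantize_ordered_cosines(cosines, bounds, bits):
    """Quantize decreasing cosines; return (segment indices, quantized values)."""
    indices, quantized = [], []
    prev = None
    for c, bounds_i, n_i in zip(cosines, bounds, bits):
        low, high = _variation_interval(bounds_i, prev)
        levels = 1 << n_i                  # 2**N_i segments
        step = (high - low) / levels
        if step == 0.0:
            idx = 0
        else:
            idx = min(levels - 1, max(0, int((c - low) / step)))
        prev = low + (idx + 0.5) * step    # segment midpoint (assumed)
        indices.append(idx)
        quantized.append(prev)
    return indices, quantized


def dequantize_ordered_cosines(indices, bounds, bits):
    """Decoder side: rebuild the quantized cosines from the segment indices."""
    quantized = []
    prev = None
    for idx, bounds_i, n_i in zip(indices, bounds, bits):
        low, high = _variation_interval(bounds_i, prev)
        step = (high - low) / (1 << n_i)
        prev = low + (idx + 0.5) * step
        quantized.append(prev)
    return quantized
```

Since the decoder rebuilds each variation interval from the previously decoded cosine, no extra information is transmitted, and shrinking the interval makes the segments finer for the same number of bits.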
- Detecting the spectral state of the signal makes it possible to define two families of reference intervals [m_{i,1}, M_{i,1}] and [m_{i,2}, M_{i,2}] for the first r parameters (1 ≤ i ≤ r ≤ p).
- Another possibility, which can supplement or replace the previous one, is to define for some of the parameters different numbers of coding bits N_i depending on whether the signal is of IRS or linear type. For the same total number of coding bits, one can in particular take smaller numbers N_i in the IRS case than in the linear case for the first LSP parameters (the largest cosines), since the dynamic range of the first LSP parameters is reduced in the IRS case; the decrease in the first N_i is offset by an increase in the N_i of the last LSP parameters, which makes the quantization of these latter parameters finer.
- These different allocations of coding bits are stored both in the coder and in the decoder, so that the LSP parameters can be recovered by examining the state bit Y.
- The calculated LSP parameters can directly give a fairly precise idea of the spectral envelope of the speech signal.
- In the IRS case, the amplitude of the resonances located in the lower part of the spectrum is weaker than in the linear case. So, by analyzing the differences between the first consecutive LSF frequencies, it can be determined whether the input signal is more of IRS type (large differences) or of linear type (smaller differences). This determination can be made for each signal frame to obtain the condition bit X, which is then processed by a state determination circuit similar to the circuit 29 of FIG. 4 to obtain the state bit Y used by the quantization block 43.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Claims (13)
- Method for linear-prediction speech coding, in which a speech signal (S_I) digitized in successive frames is subjected to analysis by synthesis in order to obtain, for each frame, quantization values of synthesis parameters (a_i, b, T, k, G) making it possible to obtain an estimate (Ŝ) of the speech signal, and in which the quantization values are output, the analysis by synthesis comprising a short-term linear prediction of the speech signal in order to determine the quantization values of the coefficients of a short-term synthesis filter, characterized in that a spectral state (Y) of the speech signal is determined from among first and second states (Y_A, Y_B) such that the signal contains proportionately less energy at low frequencies in the first state than in the second state, and one or the other of two quantization modes is applied in order to obtain the quantization values of the coefficients of the short-term synthesis filter according to the determined spectral state (Y) of the speech signal.
- Method according to claim 1, characterized in that the determined state (Y) of the speech signal is not modified as long as the signal has an energy below a predetermined threshold.
- Method according to claim 1 or 2, characterized in that it is determined frame by frame whether the signal is in a first condition corresponding to the first state (Y_A) or in a second condition corresponding to the second state (Y_B), and the state (Y) of the signal is determined on the basis of the frame-by-frame conditions (X), the determined state being modified only after several successive frames show a signal condition different from that corresponding to the previously determined state.
- Method according to claim 3, characterized in that a counting variable (V) is incremented when the condition (X) of the signal in a frame differs from that corresponding to the determined state (Y) of the signal, in that this counting variable (V) is decremented when the condition of the signal in a frame is that corresponding to the determined state of the signal, except when this variable is 0, and in that, when the counting variable (V) reaches a predetermined threshold, it is reset to 0 and it is determined that the signal has changed state.
- Method according to claim 3 or 4, characterized in that the speech signal (S_I) is subjected to high-pass filtering, and the energy (E2) of the high-pass filtered signal (S_I') is compared with that (E1) of the unfiltered signal in order to determine, frame by frame, whether the signal is in the first condition, for which the energy of the high-pass filtered signal is greater than a predetermined fraction of the energy of the unfiltered signal, or in the second condition, for which the energy of the high-pass filtered signal is less than the predetermined fraction of the energy of the unfiltered signal.
- Method according to claim 3 or 4, characterized in that the coefficients (a_i) of the short-term synthesis filter are represented by a set of line spectrum frequencies (f_i), and in that the distribution of the line spectrum frequencies in each frame of the speech signal (S_I) is analyzed in order to determine whether the signal is in the first or the second condition.
- Verfahren nach einem der Ansprüche 1 bis 6, dadurch gekennzeichnet, daß die Koeffizienten (aI) des Kurzzeit-Synthesefilters durch eine Menge von p geordneten Frequenzparametern von Spektrallinien (cos2πfi) dargestellt werden, und zwar unterteilt in m Gruppen von aufeinanderfolgenden Frequenzparametern, wobei p die Ordnung der linearen Kurzzeitvorhersage ist und m eine ganze Zahl größer oder gleich 1 ist, und dadurch, daß wenigstens die erste Gruppe in Bezug auf einen mittleren Vektor differentiell quantifiziert wird, der aus zwei unterschiedlichen Vektoren (VI,1, VI,2) gemäß dem bestimmten spektralen Zustand (Y) des Sprachsignals ausgewählt wird.
- Verfahren nach Anspruch 7, dadurch gekennzeichnet, daß die Anzahl m gleich 3 ist und dadurch, daß jede der ersten drei Gruppen der aufeinanderfolgenden Frequenzparameter in Bezug auf einen entsprechenden mittleren Vektors differentiell quantifiziert wird, der aus zwei unterschiedlichen entsprechenden Vektoren gemäß dem bestimmten spektralen Zustand (Y) des Sprachsignals ausgewählt wird.
- Verfahren nach einem der Ansprüche 1 bis 6, dadurch gekennzeichnet, daß die Koeffizienten (ai) des Kurzzeit-Synthesefilters durch eine Menge von p geordneten Frequenzparametern von Spektrallinien (cos2πfi) bestimmt werden, wobei die Menge in m Gruppen von aufeinanderfolgenden Frequenzparametern unterteilt ist, wobei p die Ordnung der linearen Kurzzeit-Vorhersage ist und m eine ganze Zahl größer oder gleich 1 ist, und dadurch, daß wenigstens die erste Gruppe quantifiziert wird, indem in einer Quantifizierungstabelle ein Vektor ausgewählt wird, der einen minimalen Abstand zu den Frequenzparametern der Gruppe aufweist, wobei diese Quantifizierungstabelle aus zwei unterschiedlichen Tabellen (TI,1, TI,2) gemäß dem bestimmten spektralen Zustand (Y) des Sprachsignals ausgewählt wird.
- Method according to claim 9, characterized in that the number m is equal to 3, and in that each of the first two groups of consecutive frequency parameters is quantized by selecting from a respective quantization table the vector exhibiting a minimum distance from the frequency parameters of the group, each of the two quantization tables relating to the first two groups being selected from two respective distinct tables depending on the determined spectral state (Y) of the speech signal.
- Method according to claim 10, characterized in that the two distinct quantization tables ($T_{I,1}$, $T_{I,2}$) relating to the first group are disjoint, and in that the two distinct quantization tables ($T_{II,1}$, $T_{II,2}$) relating to the second group have a common part.
- Method according to any one of claims 1 to 6, characterized in that the coefficients ($a_i$) of the short-term synthesis filter are represented by a set of p ordered line spectrum frequency parameters ($\cos 2\pi f_i$), p being the order of the short-term linear prediction, in that each of the p parameters is quantized by subdividing into $2^{N_i}$ segments a variation interval contained in a respective reference interval ($[m_i, M_i]$), $N_i$ being the number of coding bits used to quantize this parameter, and in that, for at least the first ordered parameters, reference intervals are used which are each selected from two distinct intervals ($[m_{i,1}, M_{i,1}]$, $[m_{i,2}, M_{i,2}]$) depending on the determined spectral state (Y) of the speech signal.
- Method according to any one of claims 1 to 6 or according to claim 12, characterized in that the coefficients ($a_i$) of the short-term synthesis filter are represented by a set of p ordered line spectrum frequency parameters ($\cos 2\pi f_i$), p being the order of the short-term linear prediction, in that each of the p parameters is quantized by subdividing into $2^{N_i}$ segments a variation interval ($[m_i, \min\{M_i, \cos 2\pi f_{i-1}\}]$) contained in a respective reference interval ($[m_i, M_i]$), $N_i$ being the number of coding bits used to quantize this parameter, and in that at least some of the numbers of coding bits $N_i$ are assigned one or the other of two distinct values depending on the determined spectral state (Y) of the speech signal.
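The scalar quantization described in the last two claims above can be illustrated with a minimal sketch. It is not the patented implementation: the names `REF`, `BITS`, `quantize` and `dequantize`, the order p = 4, and all numeric values are hypothetical. The sketch also assumes that the upper bound of the variation interval is taken from the previously reconstructed parameter (so that a decoder can reproduce it) and that each segment is reconstructed at its midpoint; the claims themselves only state the bound as cos 2πf(i-1).

```python
import math

# Hypothetical per-state settings for p = 4 parameters (illustrative values only).
# REF[Y][i] is the reference interval (m_i, M_i); BITS[Y][i] is N_i.
REF = {
    1: [(0.6, 1.0), (0.2, 0.95), (-0.4, 0.8), (-0.9, 0.5)],
    2: [(0.3, 1.0), (-0.2, 0.9), (-0.7, 0.6), (-1.0, 0.2)],
}
BITS = {1: [3, 3, 3, 3], 2: [4, 3, 3, 2]}


def _segment(lo, hi, prev, n):
    """Return (number of segments, segment width) for one parameter."""
    # Variation interval [m_i, min(M_i, previous parameter)].
    upper = hi if prev is None else min(hi, prev)
    levels = 1 << n  # 2**N_i segments
    return levels, max(upper - lo, 0.0) / levels


def quantize(freqs, state):
    """Quantize c_i = cos(2*pi*f_i) for increasing f_i in (0, 0.5)."""
    indices, prev = [], None
    for f, (lo, hi), n in zip(freqs, REF[state], BITS[state]):
        levels, step = _segment(lo, hi, prev, n)
        c = math.cos(2.0 * math.pi * f)
        idx = int((c - lo) / step) if step > 0 else 0
        idx = max(0, min(levels - 1, idx))  # clamp values outside the interval
        indices.append(idx)
        prev = lo + (idx + 0.5) * step  # segment midpoint (assumed reconstruction)
    return indices


def dequantize(indices, state):
    """Rebuild the parameters from the indices and the spectral state Y."""
    values, prev = [], None
    for idx, (lo, hi), n in zip(indices, REF[state], BITS[state]):
        _, step = _segment(lo, hi, prev, n)
        prev = lo + (idx + 0.5) * step
        values.append(prev)
    return values


if __name__ == "__main__":
    freqs = [0.05, 0.12, 0.2, 0.3]  # normalized line spectrum frequencies
    for y in (1, 2):
        idx = quantize(freqs, y)
        print(y, idx, dequantize(idx, y))
```

Because the cosine decreases on [0, π], increasing frequencies give decreasing parameters, which is why the previous parameter can serve as an upper bound. Selecting `REF[state]` and `BITS[state]` from the detected state Y is the only state-dependent step in this sketch.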
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR9406825A FR2720850B1 (fr) | 1994-06-03 | 1994-06-03 | Linear prediction speech coding method |
FR9406825 | 1994-06-03 |
Publications (2)
Publication Number | Publication Date |
---|---|
EP0685833A1 (de) | 1995-12-06 |
EP0685833B1 (de) | 2000-04-26 |
Family
ID=9463861
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP95401262A Expired - Lifetime EP0685833B1 (de) | 1994-06-03 | 1995-05-31 | Method for speech coding by means of linear prediction |
Country Status (4)
Country | Link |
---|---|
US (1) | US5642465A (de) |
EP (1) | EP0685833B1 (de) |
DE (1) | DE69516455T2 (de) |
FR (1) | FR2720850B1 (de) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08179796A (ja) * | 1994-12-21 | 1996-07-12 | Sony Corp | Speech coding method |
FR2729247A1 (fr) * | 1995-01-06 | 1996-07-12 | Matra Communication | Analysis-by-synthesis speech coding method |
JP3196595B2 (ja) * | 1995-09-27 | 2001-08-06 | 日本電気株式会社 | Speech coding device |
JPH09230896A (ja) * | 1996-02-28 | 1997-09-05 | Sony Corp | Speech synthesis device |
JP3094908B2 (ja) * | 1996-04-17 | 2000-10-03 | 日本電気株式会社 | Speech coding device |
US6253172B1 (en) * | 1997-10-16 | 2001-06-26 | Texas Instruments Incorporated | Spectral transformation of acoustic signals |
US6094629A (en) * | 1998-07-13 | 2000-07-25 | Lockheed Martin Corp. | Speech coding system and method including spectral quantizer |
US7379865B2 (en) * | 2001-10-26 | 2008-05-27 | At&T Corp. | System and methods for concealing errors in data transmission |
KR20050049103A (ko) * | 2003-11-21 | 2005-05-25 | 삼성전자주식회사 | Method and apparatus for dialog enhancement using formant bands |
WO2009081569A1 (ja) * | 2007-12-25 | 2009-07-02 | Panasonic Corporation | Ultrasonic diagnostic apparatus |
ES2924180T3 (es) * | 2009-12-14 | 2022-10-05 | Fraunhofer Ges Forschung | Vector quantization device, speech coding device, vector quantization method, and speech coding method |
EP2551848A4 (de) * | 2010-03-23 | 2016-07-27 | Lg Electronics Inc | Method and device for processing an audio signal |
CN105551497B (zh) * | 2013-01-15 | 2019-03-19 | 华为技术有限公司 | Encoding method, decoding method, encoding device, and decoding device |
CN112927724B (zh) | 2014-07-29 | 2024-03-22 | 瑞典爱立信有限公司 | Method for estimating background noise and background noise estimator |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
NL8500843A (nl) | 1985-03-22 | 1986-10-16 | Koninkl Philips Electronics Nv | Multi-pulse excitation linear-predictive speech coder. |
- 1994-06-03: FR application FR9406825A, patent FR2720850B1 (fr); not active, Expired - Fee Related
- 1995-05-31: DE application DE69516455T, patent DE69516455T2 (de); not active, Expired - Fee Related
- 1995-05-31: EP application EP95401262A, patent EP0685833B1 (de); not active, Expired - Lifetime
- 1995-06-05: US application US08/465,263, patent US5642465A (en); not active, Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
FR2720850B1 (fr) | 1996-08-14 |
FR2720850A1 (fr) | 1995-12-08 |
DE69516455T2 (de) | 2001-01-25 |
US5642465A (en) | 1997-06-24 |
DE69516455D1 (de) | 2000-05-31 |
EP0685833A1 (de) | 1995-12-06 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
EP0685833B1 (de) | Method for speech coding by means of linear prediction | |
EP0782128B1 (de) | Method for analysing an audio-frequency signal by linear prediction, and application to a method for coding and decoding an audio-frequency signal | |
EP0127718B1 (de) | Method for activity detection in a voice transmission system | |
EP2419900B1 (de) | Method and device for the objective evaluation of the voice quality of a speech signal, taking into account the classification of the background noise contained in the signal | |
EP0768770B1 (de) | Method and device for generating background noise in a digital transmission system | |
EP2415047B1 (de) | Classifying background noise contained in an audio signal | |
EP1593116B1 (de) | Method for differentiated digital speech and music processing, noise filtering, and creation of special effects, and device for carrying out the method | |
EP1692689B1 (de) | Optimized multiple coding method | |
FR2522179A1 (fr) | Speech recognition method and apparatus for recognizing particular phonemes of the voice signal regardless of the speaker | |
FR2639459A1 (fr) | Signal processing method and apparatus for forming data from a sound source | |
EP0428445B1 (de) | Method and device for coding prediction filters in very low bit rate vocoders | |
FR2690551A1 (fr) | Method for quantizing a predictor filter for a very low bit rate vocoder. | |
EP0195441B1 (de) | Method for low bit rate speech coding using a multipulse excitation signal | |
FR2984580A1 (fr) | Method for detecting a predetermined frequency band in an audio data signal, detection device and corresponding computer program | |
EP0616315A1 (de) | Device for digital speech coding and decoding, method for searching a pseudo-logarithmic LTP delay codebook, and method for LTP analysis | |
EP2589045B1 (de) | Adaptive linear predictive coding/decoding | |
EP0685836B1 (de) | Method and apparatus for preprocessing an acoustic signal before speech coding | |
EP1192619B1 (de) | Audio coding and decoding by interpolation | |
EP1192621B1 (de) | Audio coding with harmonic components | |
Moreau | Predictive speech coding at low bit rates: a unified approach | |
EP1605440A1 (de) | Method for source separation of a signal mixture | |
EP0454552A2 (de) | Method and device for low bit rate speech coding | |
FR2737360A1 (fr) | Methods for coding and decoding audio-frequency signals, and coder and decoder for implementing such methods | |
FR2760285A1 (fr) | Method and device for generating a noise signal for the non-voice output of a decoded speech signal | |
FR2741743A1 (fr) | Method and device for improving speech intelligibility in low bit rate vocoders | |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): DE ES GB IT NL SE |
17P | Request for examination filed | Effective date: 19951228 |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: MATRA NORTEL COMMUNICATIONS |
GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
17Q | First examination report despatched | Effective date: 19990803 |
GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
GRAA | (Expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): DE ES GB IT NL SE |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: NL; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20000426. Ref country code: ES; Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY; Effective date: 20000426 |
RIC1 | Information provided on IPC code assigned before grant | Free format text: 7G 10L 19/04 A, 7G 10L 101:10 Z |
REF | Corresponds to: | Ref document number: 69516455; Country of ref document: DE; Date of ref document: 20000531 |
GBT | GB: translation of EP patent filed (GB section 77(6)(a)/1977) | Effective date: 20000601 |
ITF | IT: translation for an EP patent filed | |
NLV1 | NL: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the Patents Act | |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: SE; Payment date: 20010419; Year of fee payment: 7 |
REG | Reference to a national code | Ref country code: GB; Ref legal event code: IF02 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: SE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20020601 |
EUG | SE: European patent has lapsed | |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: DE; Payment date: 20040528; Year of fee payment: 10 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: GB; Payment date: 20050414; Year of fee payment: 11 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: IT; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED; Effective date: 20050531 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: DE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20051201 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: GB; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20060531 |
GBPC | GB: European patent ceased through non-payment of renewal fee | Effective date: 20060531 |