[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

EP0734013B1 - Determination of an excitation vector in a CELP coder - Google Patents

Determination of an excitation vector in a CELP coder Download PDF

Info

Publication number
EP0734013B1
EP0734013B1 EP96410028A EP96410028A EP0734013B1 EP 0734013 B1 EP0734013 B1 EP 0734013B1 EP 96410028 A EP96410028 A EP 96410028A EP 96410028 A EP96410028 A EP 96410028A EP 0734013 B1 EP0734013 B1 EP 0734013B1
Authority
EP
European Patent Office
Prior art keywords
excitation
vector
code
subset
criterion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP96410028A
Other languages
German (de)
French (fr)
Other versions
EP0734013A3 (en
EP0734013A2 (en
Inventor
Mustapha Bouraoui
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
STMicroelectronics SA
Original Assignee
STMicroelectronics SA
SGS Thomson Microelectronics SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by STMicroelectronics SA, SGS Thomson Microelectronics SA filed Critical STMicroelectronics SA
Publication of EP0734013A2 publication Critical patent/EP0734013A2/en
Publication of EP0734013A3 publication Critical patent/EP0734013A3/en
Application granted granted Critical
Publication of EP0734013B1 publication Critical patent/EP0734013B1/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/10Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001Codebooks
    • G10L2019/0013Codebook search algorithms
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001Codebooks
    • G10L2019/0013Codebook search algorithms
    • G10L2019/0014Selection criteria for distances

Definitions

  • the present invention relates to the compression of speech signals to be transmitted over a telephone line and more particularly the determination of an excitation vector as part of a compression using the Prediction method Linear Excited by Code (CELP).
  • CELP Prediction method Linear Excited by Code
  • Figure 1 very schematically represents a circuit CELP compression.
  • Such a circuit is based on modeling vocal cords and the built-up resonance chamber through the oral cavity, throat and larynx. Such a compression method is therefore optimized for the treatment of speech signals.
  • the oral, throat and larynx cavities are modeled by a "linear prediction" filter 10 whose transfer function generally has ten poles.
  • the ropes are modeled by an excitation E treated by a comb filter 12.
  • a digitized speech signal S is analyzed in frames by an analysis circuit 14.
  • the analysis circuit 14 determines coefficients a 1 to a 10 of the transfer function of the filter 10, the step p of the filter in comb 12, and a gain G applied to the excitation E at 16 at the input of the filter 12.
  • the values a i , p and G are calculated for each frame to take account respectively of the variations of the oral cavity, of the spectrum frequency of the vocal cords and the amplitude of the sound. In this way an attempt is made to obtain that the output of the filter 10 is equal to the signal S.
  • the coefficients a i , p, and G are transmitted so that a decoder which receives these coefficients reconstruct the corresponding frames of the signal S.
  • the decoder must also know the excitation E to use.
  • the determination of the coefficients ai, p and G does not pose any particular problem.
  • the procedure finding the optimal excitement remains the heaviest in terms of computational load and it's still a big interest to simplify it even at the cost of a significant reduction the quality of the compression.
  • the excitation E was selected in a table 18 (called "codebook") containing several possible excitations which in fact represented pieces of white noise.
  • codebook a table 18 containing several possible excitations which in fact represented pieces of white noise.
  • a control circuit 20 scan table 18 until the difference e, formed at 22, between the current frame of signal S and the corresponding frame at the output of the filter 10 is minimal. (Of course, instead of compare the signal S to the output of the filter 10, we can also compare the excitation E to the frame of the signal S having undergone the reverse processing of filters 10 and 12.)
  • Each excitation contained in table 18 is a sequence of corresponding digital samples respectively to the samples of each of the frames of the signal to be compressed. For compression to be of acceptable quality, it is necessary to store a relatively large number of around 1000, excitation sequences.
  • each sample of an excitation sequence can only take three values, namely, 0, 1 or -1 (ternary excitation sequence).
  • 0 1 or -1 ternary excitation sequence
  • Figure 2 shows an example of excitation sequence E which has been proposed to further reduce the complexity of research.
  • This excitation sequence is called binary. It includes several non-zero samples with values 1 and -1, two non-zero samples, or pulses, being separated by a constant number of null samples, here 3.
  • Such a sequence excitation can be represented by a binary number (or excitation code) C whose bits are associated with the pulses and correspond to the polarity of these.
  • the code C supplied by the control circuit 20 corresponds directly to an excitation sequence and table 18 is deleted.
  • the complexity is also reduced by the fact that the samples to be taken into account are reduced to pulses, whose number is, in the example of figure 2, four times less than the total number of samples in a sequence.
  • the structure of filters 10 and 12 is simplified.
  • Each C code is associated with an excitation vector C having as components the values 1 and -1 corresponding to the bits at 0 and 1 of code C.
  • m scal 2 (T, C i ) / mod 2 (FC i ) where C i is the excitation vector tested; where T is a target vector formed by samples of the frame being analyzed of the signal S having undergone the reverse processing of filters 10 and 12, these samples being those corresponding to the values 1 and -1 of the vector C i ; and where F denotes the matrix representing the transfer function of filters 10 and 12, in which only the rows corresponding to the values 1 and -1 of the vector C i have been kept.
  • the notations scal (.,.) And mod (.,.) Denote the scalar product and the module respectively.
  • the denominator of the criterion m is approximately constant, whatever the excitation vector C i .
  • the criterion m is approximately maximized by maximizing the numerator. This numerator is maximized when each component of the excitation vector C i is that of the same sign as the corresponding sample of the target vector T.
  • an approximate optimal excitation code is obtained directly by taking for its bits the sign bits of the target vector samples (or the inverse of the sign bits).
  • An object of the present invention is to provide a method for limiting the number of calculations required to maximize the aforementioned criterion m in the case where the usable excitation codes belong to a subset representative of a larger whole.
  • the present invention provides a method for determining an excitation vector associated with a frame of a speech signal to be compressed according to claim 1.
  • the error correction bits are bits of a correction code from Hamming.
  • H Hamming correcting code denoted H (N, n, 3) is used below, where 3 denotes the minimum Hamming distance separating two elements belonging to the representative subset.
  • the Hamming distance between two values is defined as the number of bitwise differences between these two values.
  • a group of excitation codes is formed including an initial code found by maximizing the simplified criterion m as well as all the other codes obtained at from there by changing a single bit.
  • This has the consequence, using a Hamming correcting code to correct a single bit (minimum Hamming distance 3), that each of the group's excitation codes is close to a code distinct from the usable subset.
  • the error correction function of Hamming which brings each of the group's codes back to the code closer to the subset.
  • we retain as approximate optimal code that which maximizes the complete criterion m that is, by calculating its numerator and denominator.
  • FIG. 3 represents in diagram form the method according to the invention which has just been described.
  • the analyzed frame of the signal S undergoes the reverse processing at 24 of that of the filters 10 and 12 of FIG. 1.
  • a target vector T is thus obtained, of which only the samples corresponding to the pulses of the excitation sequences are kept.
  • samples of the vector T are retained only the sign bits (or their inverses) to provide an initial excitation code C 0 .
  • This code C 0 is "corrupted" at 28 to form a group of codes comprising the code C 0 and all the other codes C 1 to C N obtained by modifying a single bit of the code C 0 .
  • Each code C 0 to C N undergoes an "error correction" at 30 to provide a group of corrected codes C ' 0 to C' N.
  • each of the vectors associated with the corrected codes is compared with the target vector T, and the optimal associated excitation code Copt is that associated with the vector which maximizes the complete criterion m.
  • the position of the first pulse of the excitation sequences E is variable.
  • this position can be one of the first four, which is determined by two additional bits to be transmitted to the decoder and which multiplies by four the number of excitation vectors to be tested.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Description

La présente invention concerne la compression des signaux de parole à transmettre sur une ligne téléphonique et plus particulièrement la détermination d'un vecteur d'excitation dans le cadre d'une compression selon la méthode de la Prédiction Linéaire Excitée par Code (CELP).The present invention relates to the compression of speech signals to be transmitted over a telephone line and more particularly the determination of an excitation vector as part of a compression using the Prediction method Linear Excited by Code (CELP).

La figure 1 représente très schématiquement un circuit de compression CELP. Un tel circuit est basé sur une modélisation des cordes vocales et de la chambre de résonance constituée par les cavités buccale, de la gorge et du larynx. Une telle méthode de compression est donc optimisée pour le traitement des signaux de parole.Figure 1 very schematically represents a circuit CELP compression. Such a circuit is based on modeling vocal cords and the built-up resonance chamber through the oral cavity, throat and larynx. Such a compression method is therefore optimized for the treatment of speech signals.

Les cavités buccale, de la gorge et du larynx sont modélisées par un filtre de "prédiction linéaire" 10 dont la fonction de transfert comporte en général dix pôles. Les cordes vocales sont modélisées par une excitation E traitée par un filtre en peigne 12.The oral, throat and larynx cavities are modeled by a "linear prediction" filter 10 whose transfer function generally has ten poles. The ropes are modeled by an excitation E treated by a comb filter 12.

Un signal de parole numérisé S est analysé par trames par un circuit d'analyse 14. A chaque trame, le circuit d'analyse 14 détermine des coefficients a1 à a10 de la fonction de transfert du filtre 10, le pas p du filtre en peigne 12, et un gain G appliqué à l'excitation E en 16 à l'entrée du filtre 12. Les valeurs ai, p et G sont calculées à chaque trame pour tenir compte respectivement des variations de la cavité buccale, du spectre de fréquence des cordes vocales et de l'amplitude du son. On tente de cette manière d'obtenir que la sortie du filtre 10 soit égale au signal S. Alors, au lieu de transmettre les échantillons du signal S, on transmet les coefficients ai, p, et G afin qu'un décodeur qui reçoit ces coefficients reconstitue les trames correspondantes du signal S.A digitized speech signal S is analyzed in frames by an analysis circuit 14. At each frame, the analysis circuit 14 determines coefficients a 1 to a 10 of the transfer function of the filter 10, the step p of the filter in comb 12, and a gain G applied to the excitation E at 16 at the input of the filter 12. The values a i , p and G are calculated for each frame to take account respectively of the variations of the oral cavity, of the spectrum frequency of the vocal cords and the amplitude of the sound. In this way an attempt is made to obtain that the output of the filter 10 is equal to the signal S. Then, instead of transmitting the samples of the signal S, the coefficients a i , p, and G are transmitted so that a decoder which receives these coefficients reconstruct the corresponding frames of the signal S.

Bien entendu, le décodeur doit également connaítre l'excitation E à utiliser. La détermination des coefficients ai, p et G ne pose pas de problème particulier. Toutefois, la procédure de recherche de l'excitation optimale reste la plus lourde en termes de charge de calcul et il est toujours d'un grand intérêt de la simplifier même au prix d'une sensible réduction de la qualité de la compression.Of course, the decoder must also know the excitation E to use. The determination of the coefficients ai, p and G does not pose any particular problem. However, the procedure finding the optimal excitement remains the heaviest in terms of computational load and it's still a big interest to simplify it even at the cost of a significant reduction the quality of the compression.

A l'origine du codage CELP, l'excitation E était sélectionnée dans une table 18 (appelée "codebook") contenant plusieurs excitations possibles qui représentaient, en fait, des morceaux de bruit blanc. Dans ce cas, un circuit de commande 20 balaye la table 18 jusqu'à ce que la différence e, formée en 22, entre la trame courante du signal S et la trame correspondante en sortie du filtre 10 soit minimale. (Bien entendu, au lieu de comparer le signal S à la sortie du filtre 10, on peut également comparer l'excitation E à la trame du signal S ayant subi le traitement inverse des filtres 10 et 12.)At the origin of the CELP coding, the excitation E was selected in a table 18 (called "codebook") containing several possible excitations which in fact represented pieces of white noise. In this case, a control circuit 20 scan table 18 until the difference e, formed at 22, between the current frame of signal S and the corresponding frame at the output of the filter 10 is minimal. (Of course, instead of compare the signal S to the output of the filter 10, we can also compare the excitation E to the frame of the signal S having undergone the reverse processing of filters 10 and 12.)

Avec cette technique, outre les coefficients ai, p et G, on fournit l'adresse C sélectionnant la meilleure excitation E dans la table 18 à un décodeur muni d'une table homologue.With this technique, in addition to the coefficients ai, p and G, we provide the address C selecting the best excitation E in table 18 to a decoder provided with a peer table.

Chaque excitation contenue dans la table 18 est une séquence d'échantillons numériques correspondant respectivement aux échantillons de chacune des trames du signal à comprimer. Pour que la compression soit de qualité acceptable, il est nécessaire de stocker un nombre relativement important, de l'ordre de 1000, de séquences d'excitation. Each excitation contained in table 18 is a sequence of corresponding digital samples respectively to the samples of each of the frames of the signal to be compressed. For compression to be of acceptable quality, it is necessary to store a relatively large number of around 1000, excitation sequences.

Pour limiter la complexité de la procédure de recherche, on a proposé que chaque échantillon d'une séquence d'excitation ne puisse prendre que trois valeurs, à savoir, 0, 1 ou -1 (séquence d'excitation ternaire). On s'est aperçu que cela ne modifiait pas la qualité de la compression de manière perceptible.To limit the complexity of the search procedure, it has been proposed that each sample of an excitation sequence can only take three values, namely, 0, 1 or -1 (ternary excitation sequence). We realized that not noticeably change the quality of the compression.

La figure 2 représente un exemple de séquence d'excitation E que l'on a proposé pour réduire davantage la complexité de la recherche. Cette séquence d'excitation est dite binaire. Elle comprend plusieurs échantillons non-nuls de valeurs 1 et -1, deux échantillons non-nuls, ou impulsions, étant séparés par un nombre constant d'échantillons nuls, ici 3. Une telle séquence d'excitation peut être représentée par un nombre binaire (ou code d'excitation) C dont les bits sont associés aux impulsions et correspondent à la polarité de celles-ci. En procédant ainsi, le code C fourni par le circuit de commande 20 correspond directement à une séquence d'excitation et la table 18 est supprimée. La complexité est par ailleurs réduite du fait que les échantillons à prendre en compte se réduisent aux impulsions, dont le nombre est, dans l'exemple de la figure 2, quatre fois inférieur au nombre total d'échantillons d'une séquence. En outre, la structure des filtres 10 et 12 est simplifiée.Figure 2 shows an example of excitation sequence E which has been proposed to further reduce the complexity of research. This excitation sequence is called binary. It includes several non-zero samples with values 1 and -1, two non-zero samples, or pulses, being separated by a constant number of null samples, here 3. Such a sequence excitation can be represented by a binary number (or excitation code) C whose bits are associated with the pulses and correspond to the polarity of these. By proceeding thus, the code C supplied by the control circuit 20 corresponds directly to an excitation sequence and table 18 is deleted. The complexity is also reduced by the fact that the samples to be taken into account are reduced to pulses, whose number is, in the example of figure 2, four times less than the total number of samples in a sequence. In in addition, the structure of filters 10 and 12 is simplified.

Cette technique dégrade très légèrement la qualité de la compression, mais cette dégradation est compensée de manière simple par un traitement destiné à supprimer les effets de la régularité de l'espacement des échantillons non-nuls.This technique slightly degrades the quality of compression, but this degradation is compensated so simple treatment to suppress the effects of regularity of spacing of non-zero samples.

A chaque code C est associé un vecteur d'excitation C ayant pour composantes les valeurs 1 et -1 correspondant aux bits à 0 et 1 du code C. Les termes "vecteur" et "code" sont confondus dans la suite de la description.Each C code is associated with an excitation vector C having as components the values 1 and -1 corresponding to the bits at 0 and 1 of code C. The terms "vector" and "code" are confused in the following description.

Afin de réduire davantage le nombre d'essais pour minimiser l'erreur, on a proposé de limiter le nombre de vecteurs ou codes d'excitation possibles à un sous-ensemble représentatif d'un ensemble plus grand. L'article intitulé "A Comparison of some Algebraic Structures for CELP Coding of Speech" de J.P. Adoul et C. Lamblin dans Proc. ICASSP, 1987, décrit une telle méthode. Pour créer un sous-ensemble représentatif de tous les codes C de N bits, on forme l'ensemble des valeurs de n bits (n<N), dont chacune est complétée par N-n bits de correction d'erreur.To further reduce the number of trials for minimize the error, we proposed to limit the number of vectors or possible excitation codes to a representative subset of a larger whole. The article entitled "A Comparison of some Algebraic Structures for CELP Coding of Speech "by J.P. Adoul and C. Lamblin in Proc. ICASSP, 1987, describes such a method. To create a representative subset of all C codes of N bits, we form the set of values of n bits (n <N), each of which is supplemented by N-n bits error correction.

Pour trouver le meilleur vecteur d'excitation C, on cherche généralement à maximiser un critère de sélection défini par : m = scal2(T, Ci)/mod2(FCi), où Ci est le vecteur d'excitation essayé ; où T est un vecteur cible formé par des échantillons de la trame en cours d'analyse du signal S ayant subi le traitement inverse des filtres 10 et 12, ces échantillons étant ceux correspondant aux valeurs 1 et -1 du vecteur Ci ; et où F désigne la matrice représentant la fonction de transfert des filtres 10 et 12, dans laquelle on n'a conservé que les rangées correspondant aux valeurs 1 et -1 du vecteur Ci. Les notations scal(.,.) et mod(.,.) désignent respectivement le produit scalaire et le module.To find the best excitation vector C, we generally seek to maximize a selection criterion defined by: m = scal 2 (T, C i ) / mod 2 (FC i ) where C i is the excitation vector tested; where T is a target vector formed by samples of the frame being analyzed of the signal S having undergone the reverse processing of filters 10 and 12, these samples being those corresponding to the values 1 and -1 of the vector C i ; and where F denotes the matrix representing the transfer function of filters 10 and 12, in which only the rows corresponding to the values 1 and -1 of the vector C i have been kept. The notations scal (.,.) And mod (.,.) Denote the scalar product and the module respectively.

L'essai de tous les vecteurs d'excitation Ci selon ce critère représente un nombre de calculs considérable à effectuer entre les arrivées de deux trames du signal S.The test of all the excitation vectors C i according to this criterion represents a considerable number of calculations to be carried out between the arrivals of two frames of the signal S.

Il a été constaté que le dénominateur du critère m est approximativement constant, quel que soit le vecteur d'excitation Ci. Ainsi, le critère m est approximativement maximisé en maximisant le numérateur. Ce numérateur est maximisé lorsque chaque composante du vecteur d'excitation Ci est celle de même signe que l'échantillon correspondant du vecteur cible T. En d'autres termes, un code d'excitation optimal approximatif est obtenu directement en prenant pour ses bits les bits de signe des échantillons du vecteur cible (ou les inverses des bits de signe).It was found that the denominator of the criterion m is approximately constant, whatever the excitation vector C i . Thus, the criterion m is approximately maximized by maximizing the numerator. This numerator is maximized when each component of the excitation vector C i is that of the same sign as the corresponding sample of the target vector T. In other words, an approximate optimal excitation code is obtained directly by taking for its bits the sign bits of the target vector samples (or the inverse of the sign bits).

Cette solution n'est pas applicable dans le cas où les codes d'excitation utilisables sont limités à un sous-ensemble représentatif d'un ensemble plus grand obtenu, par exemple, à l'aide d'un code correcteur d'erreurs.This solution is not applicable in the case where the usable excitation codes are limited to a subset representative of a larger set obtained, for example, at using an error correction code.

Un objet de la présente invention est de prévoir un procédé permettant de limiter le nombre de calculs nécessaires pour maximiser le critère m susmentionné dans le cas où les codes d'excitation utilisables appartiennent à un sous-ensemble représentatif d'un ensemble plus grand.An object of the present invention is to provide a method for limiting the number of calculations required to maximize the aforementioned criterion m in the case where the usable excitation codes belong to a subset representative of a larger whole.

Pour atteindre cet objet, la présente invention prévoit un procédé de détermination d'un vecteur d'excitation associé à une trame d'un signal de parole à comprimer conformément à la revendication 1. To achieve this object, the present invention provides a method for determining an excitation vector associated with a frame of a speech signal to be compressed according to claim 1.

Selon un mode de réalisation de la présente invention, les bits de correction d'erreurs sont des bits d'un code correcteur de Hamming.According to an embodiment of the present invention, the error correction bits are bits of a correction code from Hamming.

Ces objets, caractéristiques et avantages ainsi que d'autres de la présente invention seront exposés en détail dans la description suivante de modes de réalisation particuliers, faite à titre non-limitatif à l'aide des figures jointes parmi lesquelles :

  • la figure 1, précédemment décrite, illustre une méthode de compression CELP ;
  • la figure 2, précédemment décrite, représente un exemple de séquence d'excitation et du code correspondant ; et
  • la figure 3 illustre des traitements à effectuer selon la présente invention pour sélectionner un vecteur d'excitation optimal dans le cas où celui-ci appartient à un sous-ensemble obtenu en utilisant un code correcteur d'erreurs.
  • These objects, characteristics and advantages as well as others of the present invention will be explained in detail in the following description of particular embodiments, given without limitation by means of the attached figures, among which:
  • Figure 1, previously described, illustrates a CELP compression method;
  • FIG. 2, previously described, shows an example of excitation sequence and the corresponding code; and
  • FIG. 3 illustrates the processing operations to be carried out according to the present invention for selecting an optimal excitation vector in the case where it belongs to a subset obtained using an error correcting code.
  • Pour maximiser le critère m susmentionné, on a constaté que le dénominateur de ce critère, le carré du module du vecteur FCi, est approximativement constant, quel que soit le vecteur d'excitation Ci. Cette approximation est relativement bonne puisque le module du vecteur Ci est constant. Ainsi, pour maximiser le critère m de manière approchée, il suffit de maximiser un critère simplifié qui est le produit scalaire du vecteur cible T par le vecteur d'excitation Ci. Ce produit scalaire est maximal lorsque chaque composante (1 ou -1) du vecteur Ci est celle de même signe que l'échantillon correspondant du vecteur cible T. On obtient ainsi directement un vecteur d'excitation optimal approché Copt à partir du vecteur cible T.To maximize the above-mentioned criterion m, it has been found that the denominator of this criterion, the square of the module of the vector FC i , is approximately constant, whatever the excitation vector C i . This approximation is relatively good since the modulus of the vector C i is constant. Thus, in order to maximize the criterion m, it suffices to maximize a simplified criterion which is the scalar product of the target vector T by the excitation vector C i . This scalar product is maximum when each component (1 or -1) of the vector C i is that of the same sign as the corresponding sample of the target vector T. We thus obtain directly an optimal excitation vector approximated Copt from the target vector T.

    Ce procédé n'est pas directement applicable dans le cas où les codes d'excitation possibles appartiennent à un sous-ensemble représentatif d'un ensemble plus grand, par exemple lorsque ce sous-ensemble est formé à partir de valeurs de n bits auxquels on ajoute N-n bits d'un code correcteur d'erreurs. En effet, il est alors très probable que le code d'excitation trouvé n'appartienne pas au sous-ensemble. Dans ce cas, on pourrait envisager de ramener le code d'excitation trouvé à un code d'excitation appartenant au sous-ensemble en lui appliquant une fonction de correction d'erreur associée au code correcteur. On tombe alors sur le code d'excitation du sous-ensemble qui est le plus proche du code d'excitation trouvé. Cette "correction d'erreur" a pour conséquence de modifier au moins un bit du code d'excitation, ce bit pouvant avoir dans certains cas une grande influence sur la valeur du critère m, de manière que le code d'excitation final fournisse de mauvais résultats.This process is not directly applicable in the cases where the possible excitation codes belong to a subset representative of a larger whole, for example when this subset is formed from values of n bits to which we add N-n bits of an error correcting code. In effect then it is very likely that the excitation code found does not belong to the subset. In this case, we might consider reducing the excitation code found to a excitation code belonging to the subset by applying it an error correction function associated with the correction code. We then fall on the excitation code of the subset which is closest to the excitation code found. This "error correction" has the effect of modifying at least one bit of the excitation code, this bit can have in some cases a great influence on the value of the criterion m, so that the final excitation code gives poor results.

    A titre d'exemple, on utilise ci-après un code correcteur de Hamming noté H(N, n, 3), où 3 désigne la distance minimale de Hamming séparant deux éléments appartenant au sous-ensemble représentatif. La distance de Hamming entre deux valeurs est définie comme étant le nombre de différences bit à bit entre ces deux valeurs. Avec cette solution, on crée un sous-ensemble de 2n vecteurs d'excitation de N bits.By way of example, a Hamming correcting code denoted H (N, n, 3) is used below, where 3 denotes the minimum Hamming distance separating two elements belonging to the representative subset. The Hamming distance between two values is defined as the number of bitwise differences between these two values. With this solution, we create a subset of 2 n excitation vectors of N bits.

    Selon l'invention, on forme un groupe de codes d'excitation comprenant un code initial trouvé en maximisant le critère m simplifié ainsi que tous les autres codes obtenus à partir de celui-ci en modifiant un seul bit. Ceci a pour conséquence, en utilisant un code correcteur de Hamming permettant de corriger un bit unique (distance de Hamming minimale 3), que chacun des codes d'excitation du groupe est proche d'un code distinct du sous-ensemble utilisable. Ensuite, on applique à chacun des codes du groupe la fonction de correction d'erreur de Hamming, ce qui ramène chacun des codes du groupe au code le plus proche du sous-ensemble. On obtient un groupe de codes "corrigés" appartenant au sous-ensemble et qui "entoure" le code initialement trouvé. Parmi les codes corrigés, on retient comme code optimal approché celui qui maximise le critère m complet, c'est-à-dire en calculant son numérateur et son dénominateur.According to the invention, a group of excitation codes is formed including an initial code found by maximizing the simplified criterion m as well as all the other codes obtained at from there by changing a single bit. This has the consequence, using a Hamming correcting code to correct a single bit (minimum Hamming distance 3), that each of the group's excitation codes is close to a code distinct from the usable subset. Then we apply to each of the group codes the error correction function of Hamming, which brings each of the group's codes back to the code closer to the subset. We get a group of codes "corrected" belonging to the subset and which "surrounds" the code originally found. Among the corrected codes, we retain as approximate optimal code that which maximizes the complete criterion m, that is, by calculating its numerator and denominator.

    La figure 3 représente sous forme de schéma le procédé selon l'invention qui vient d'être décrit. La trame analysée du signal S subit en 24 le traitement inverse de celui des filtres 10 et 12 de la figure 1. On obtient ainsi un vecteur cible T, dont on ne garde que les échantillons correspondant aux impulsions des séquences d'excitation. En 26, on ne retient des échantillons du vecteur T que les bits de signe (ou leurs inverses) pour fournir un code d'excitation initial C0. Ce code C0 est "corrompu" en 28 pour former un groupe de codes comprenant le code C0 et tous les autres codes C1 à CN obtenus en modifiant un seul bit du code C0. Chaque code C0 à CN subit en 30 une "correction d'erreur" pour fournir un groupe de codes corrigés C'0 à C'N. En 32, chacun des vecteurs associés aux codes corrigés est comparé au vecteur cible T, et on retient comme code d'excitation optimal approché Copt celui associé au vecteur qui maximise le critère m complet.FIG. 3 represents in diagram form the method according to the invention which has just been described. The analyzed frame of the signal S undergoes the reverse processing at 24 of that of the filters 10 and 12 of FIG. 1. A target vector T is thus obtained, of which only the samples corresponding to the pulses of the excitation sequences are kept. In 26, samples of the vector T are retained only the sign bits (or their inverses) to provide an initial excitation code C 0 . This code C 0 is "corrupted" at 28 to form a group of codes comprising the code C 0 and all the other codes C 1 to C N obtained by modifying a single bit of the code C 0 . Each code C 0 to C N undergoes an "error correction" at 30 to provide a group of corrected codes C ' 0 to C' N. In 32, each of the vectors associated with the corrected codes is compared with the target vector T, and the optimal associated excitation code Copt is that associated with the vector which maximizes the complete criterion m.

    Généralement, pour obtenir de meilleurs résultats, la position de la première impulsion des séquences d'excitation E est variable. Dans l'exemple de la figure 2, cette position peut être l'une des quatre premières, ce qui est déterminé par deux bits supplémentaires à transmettre au décodeur et qui multiplie par quatre le nombre de vecteurs d'excitation à essayer. Dans ce cas, on commence par former, pour chacune des quatre positions possibles, un vecteur cible et un vecteur d'excitation dont les composantes sont celles de même signe que celles du vecteur cible correspondant. On retient comme vecteur d'excitation optimal approché, parmi les quatre vecteurs ainsi obtenus, celui qui maximise le critère m complet.Generally, for best results, the position of the first pulse of the excitation sequences E is variable. In the example in Figure 2, this position can be one of the first four, which is determined by two additional bits to be transmitted to the decoder and which multiplies by four the number of excitation vectors to be tested. In this case, we start by forming, for each of the four positions possible, a target vector and an excitation vector whose components are those of the same sign as those of the vector corresponding target. We use as excitation vector optimal approached, among the four vectors thus obtained, that which maximizes the complete criterion m.

    Claims (2)

    1. A method for determining an excitation vector (Copt) associated with a frame of a speech signal (S) to compress, said vector belonging to a subset associated with a larger set of excitation vectors likely to maximize a criterion (m), and having as components values 1 and -1 corresponding to a sequence of excitation samples (E) of a linear prediction filter (10), said criterion being equal to the square of the ratio between, on the one hand, the scalar product of the excitation vector by a target vector (T) formed by samples of the frame submitted to an inverse linear prediction filtering and, on the other hand, the module of the excitation vector submitted to a direct linear prediction filtering, the excitation vectors being associated with excitation codes having bits corresponding to the signs of the components of the excitation vectors, the excitation code subset associated with said vector subset being formed by binary values completed by error correction bits, any excitation code being associated with an excitation code of the subset through an error correction function, characterized in that it includes the following steps:
      preselecting an excitation vector (C0) having as components those with the same signs as the corresponding samples of the target vector, or those with the opposite signs; and
      if the preselected excitation vector (C0) does not belong to said subset, selecting as an excitation vector the vector which maximizes said criterion (m) among the vectors (C'0... C'N) of the subset, which are respectively obtained by submitting the error correction function the preselected associated with the preselected vector (Co) and the group of codes (C1... CN) the closest therefrom, formed so that each of the closest codes differs from the preselected code by a single bit; and
      selecting as the excitation code (Copt), among the corrected codes, the one associated with the vector which maximizes said criterion (m).
    2. A method according to claim 1, wherein the error correction bits are the bits of a Hamming correcting code.
    EP96410028A 1995-03-24 1996-03-19 Determination of an excitation vector in a CELP coder Expired - Lifetime EP0734013B1 (en)

    Applications Claiming Priority (2)

    Application Number Priority Date Filing Date Title
    FR9503735A FR2732148B1 (en) 1995-03-24 1995-03-24 DETERMINATION OF AN EXCITATION VECTOR IN A CELP ENCODER
    FR9503735 1995-03-24

    Publications (3)

    Publication Number Publication Date
    EP0734013A2 EP0734013A2 (en) 1996-09-25
    EP0734013A3 EP0734013A3 (en) 1997-05-28
    EP0734013B1 true EP0734013B1 (en) 2001-08-22

    Family

    ID=9477567

    Family Applications (1)

    Application Number Title Priority Date Filing Date
    EP96410028A Expired - Lifetime EP0734013B1 (en) 1995-03-24 1996-03-19 Determination of an excitation vector in a CELP coder

    Country Status (5)

    Country Link
    US (1) US5719994A (en)
    EP (1) EP0734013B1 (en)
    JP (1) JPH0990996A (en)
    DE (1) DE69614594D1 (en)
    FR (1) FR2732148B1 (en)

    Families Citing this family (1)

    * Cited by examiner, † Cited by third party
    Publication number Priority date Publication date Assignee Title
    DE69516522T2 (en) * 1995-11-09 2001-03-08 Nokia Mobile Phones Ltd., Salo Method for synthesizing a speech signal block in a CELP encoder

    Family Cites Families (3)

    * Cited by examiner, † Cited by third party
    Publication number Priority date Publication date Assignee Title
    US4791654A (en) * 1987-06-05 1988-12-13 American Telephone And Telegraph Company, At&T Bell Laboratories Resisting the effects of channel noise in digital transmission of information
    US5138661A (en) * 1990-11-13 1992-08-11 General Electric Company Linear predictive codeword excited speech synthesizer
    FI98104C (en) * 1991-05-20 1997-04-10 Nokia Mobile Phones Ltd Procedures for generating an excitation vector and digital speech encoder

    Also Published As

    Publication number Publication date
    US5719994A (en) 1998-02-17
    JPH0990996A (en) 1997-04-04
    DE69614594D1 (en) 2001-09-27
    EP0734013A3 (en) 1997-05-28
    FR2732148B1 (en) 1997-06-13
    EP0734013A2 (en) 1996-09-25
    FR2732148A1 (en) 1996-09-27

    Similar Documents

    Publication Publication Date Title
    EP0782128B1 (en) Method of analysing by linear prediction an audio frequency signal, and its application to a method of coding and decoding an audio frequency signal
    EP0608174B1 (en) System for predictive encoding/decoding of a digital speech signal by an adaptive transform with embedded codes
    EP0749626B1 (en) Speech coding method using linear prediction and algebraic code excitation
    EP0481895B1 (en) Method and apparatus for low bit rate transmission of a speech signal using CELP coding
    EP0428445B1 (en) Method and apparatus for coding of predictive filters in very low bitrate vocoders
    FR2702075A1 (en) A method of generating a spectral weighting filter of noise in a speech coder.
    EP0685833B1 (en) Method for speech coding using linear prediction
    CA2340028C (en) Neural network and its use for speech recognition
    WO2006075078A1 (en) Method and device for carrying out optimal coding between two long-term prediction models
    EP0734013B1 (en) Determination of an excitation vector in a CELP coder
    EP0616315A1 (en) Digital speech coding and decoding device, process for scanning a pseudo-logarithmic LTP codebook and process of LTP analysis
    EP0573358B1 (en) Variable speed voice synthesizer method and apparatus
    EP1383109A1 (en) Method and device for wide band speech coding
    EP0347307B1 (en) Coding method and linear prediction speech coder
    EP1192619B1 (en) Audio coding and decoding by interpolation
    EP1192618B1 (en) Audio coding with adaptive liftering
    EP0796490B1 (en) Signal prediction method and device for a speech coder
    EP1192621B1 (en) Audio encoding with harmonic components
    EP1605440A1 (en) Method for signal source separation from a mixture signal
    WO2001003121A1 (en) Encoding and decoding with harmonic components and minimum phase
    CA2079884A1 (en) Method and device for low-speed speech coding
    WO2001003119A1 (en) Audio encoding and decoding including non harmonic components of the audio signal
    EP1383110A1 (en) Method and device for wide band speech coding, particularly allowing for an improved quality of voised speech frames
    EP1383112A2 (en) Method and device for enlarged bandwidth speech coding, allowing in particular an improved quality of voiced frames
    WO2001003116A1 (en) Methods and device for audio analysis and synthesis

    Legal Events

    Date Code Title Description
    PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

    Free format text: ORIGINAL CODE: 0009012

    AK Designated contracting states

    Kind code of ref document: A2

    Designated state(s): DE FR GB IT

    PUAL Search report despatched

    Free format text: ORIGINAL CODE: 0009013

    AK Designated contracting states

    Kind code of ref document: A3

    Designated state(s): DE FR GB IT

    17P Request for examination filed

    Effective date: 19971030

    RAP3 Party data changed (applicant data changed or rights of an application transferred)

    Owner name: STMICROELECTRONICS S.A.

    17Q First examination report despatched

    Effective date: 19991001

    RIC1 Information provided on ipc code assigned before grant

    Free format text: 7G 10L 19/10 A

    GRAG Despatch of communication of intention to grant

    Free format text: ORIGINAL CODE: EPIDOS AGRA

    GRAG Despatch of communication of intention to grant

    Free format text: ORIGINAL CODE: EPIDOS AGRA

    GRAH Despatch of communication of intention to grant a patent

    Free format text: ORIGINAL CODE: EPIDOS IGRA

    GRAH Despatch of communication of intention to grant a patent

    Free format text: ORIGINAL CODE: EPIDOS IGRA

    GRAA (expected) grant

    Free format text: ORIGINAL CODE: 0009210

    AK Designated contracting states

    Kind code of ref document: B1

    Designated state(s): DE FR GB IT

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: IT

    Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRE;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.SCRIBED TIME-LIMIT

    Effective date: 20010822

    REF Corresponds to:

    Ref document number: 69614594

    Country of ref document: DE

    Date of ref document: 20010927

    GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

    Effective date: 20011005

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: DE

    Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

    Effective date: 20011123

    RAP2 Party data changed (patent owner data changed or rights of a patent transferred)

    Owner name: STMICROELECTRONICS S.A.

    REG Reference to a national code

    Ref country code: GB

    Ref legal event code: IF02

    PLBE No opposition filed within time limit

    Free format text: ORIGINAL CODE: 0009261

    STAA Information on the status of an ep patent application or granted ep patent

    Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

    26N No opposition filed
    PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

    Ref country code: GB

    Payment date: 20060816

    Year of fee payment: 11

    PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

    Ref country code: FR

    Payment date: 20060825

    Year of fee payment: 11

    GBPC Gb: european patent ceased through non-payment of renewal fee

    Effective date: 20070319

    REG Reference to a national code

    Ref country code: FR

    Ref legal event code: ST

    Effective date: 20071130

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: GB

    Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

    Effective date: 20070319

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: FR

    Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

    Effective date: 20070402