MXPA99007002A - Method and apparatus for using state determination to control functional elements in digital telephone systems - Google Patents
- Publication number
- MXPA99007002A
- Authority
- MX
- Mexico
- Prior art keywords
- speech
- echo
- signal
- status information
- noise
Abstract
A method and apparatus for controlling various functional elements (34, 38, 42, 44) in a digital telephone system using state determination from an echo canceller (10). The echo canceller (10) is used to determine which one of five talk states two speakers are engaged in during a telephone conversation. This state determination information is used to control a tone detector function (34), a noise suppressor function (38), an adaptive equalizer function, a transmission mute function (42), and a vocoder encoder function (44) within a vocoder (12). During the talk state where the far-end speaker is active and the near-end speaker is inactive, the echo canceller (10) provides a signal which prevents background noise estimates from being performed in the noise suppressor (38) and the vocoder encoder (44). The same signal is used to disable the tone detector (34) and to enable the transmission mute function (42) during this talk state.
Description
METHOD AND APPARATUS FOR USING STATE DETERMINATION TO CONTROL FUNCTIONAL ELEMENTS IN DIGITAL TELEPHONE SYSTEMS
FIELD OF THE INVENTION
The present invention relates to digital telephone systems. More particularly, the present invention relates to a novel and improved method and apparatus for using the state determination of an echo canceller to control various functional blocks of a digital telephone system.
BACKGROUND OF THE INVENTION
The transmission of speech by digital techniques has become widespread, particularly in cellular telephone and PCS applications. This in turn has created an interest in improving speech processing techniques. Three such techniques are the addition of echo cancellers, noise suppressors and voice coders/decoders, or vocoders, to the existing elements of digital telephone systems. Echo cancellers are used to reduce the undesirable echo signals caused by impedance mismatches in land-based telephone networks or, in the case of
mobile phones, the echo caused by the acoustic coupling between the loudspeaker and the microphone in hands-free telephones. Vocoders are used to remove the natural redundancies of speech from a digitized signal in order to reduce the data transmission rate and, consequently, the amount of information that must be transmitted over a given transmission channel. Noise suppressors are used to minimize background noise. Currently, echo cancellers, vocoders and noise suppressors are used in digital telephone systems in both land-based and mobile applications.
There are two types of echo cancellers: the network echo canceller and the acoustic echo canceller. An example of a typical echo canceller is disclosed in U.S. Patent No. 5,307,405 entitled "NETWORK ECHO CANCELLER", which is assigned to the assignee of the present invention and is incorporated by reference herein. A network echo canceller cancels the echo produced in a telephone network. A land-based telephone is connected to a telephone exchange via a two-wire line that supports transmission in both directions. For calls over distances greater than approximately 35 miles, the two directions of transmission must be segregated onto
physically separate wires, resulting in a four-wire line. The device that interconnects the two-wire and four-wire segments is known as a hybrid. The impedance mismatch at the hybrid results in an echo, which must be eliminated by a network echo canceller.
Acoustic echo cancellers are used in teleconferencing and hands-free telephony applications. An acoustic echo canceller eliminates the acoustic echo that results from feedback between the loudspeaker and the microphone.
In a typical digital telephone system, speech is converted from an analog signal into digital PCM samples by an A/D converter. Normally, a data rate of 64 kbps is chosen in order to preserve good speech quality. Once the speech signal has been digitized, it can be manipulated to achieve certain benefits, such as maximizing system capacity, improving speech quality, suppressing noise and minimizing transmission errors. After the speech signal has been converted into PCM samples, the undesirable echo can be eliminated by an echo canceller, the background noise can be minimized by a noise suppressor, and data compression can be performed by a vocoder before the
modulation and upconversion for transmission. An example of a variable rate vocoder is disclosed in U.S. Patent No. 5,414,796 entitled "VARIABLE RATE VOCODER", which is assigned to the assignee of the present invention and which is incorporated herein by reference. The encoded speech signal can be modulated by any of several techniques, including TDMA, CDMA or analog modulation. The use of CDMA techniques in a multiple access communication system is disclosed in U.S. Patent No. 4,901,307 entitled "SPREAD SPECTRUM MULTIPLE ACCESS COMMUNICATION SYSTEM USING SATELLITE OR TERRESTRIAL REPEATERS", which is assigned to the assignee of the present invention and which is incorporated herein by reference.
Combining the echo canceller with the vocoder and the noise suppressor brings certain benefits as well as problems. One problem with introducing an echo canceller into the front-end electronics of a digital telephone system is that, because of its location relative to the other functional blocks, it alters the speech signal passed to those blocks. When the echo canceller is placed first in the chain of functional blocks, the noise suppressor and the vocoder must estimate the
background noise based on an echo-cancelled signal instead of the actual background noise. If the echo canceller does not remove all the echo from the speech signal, the residual echo may cause errors in the background noise estimates made by the noise suppressor and the vocoder.
Herein, a mobile user is referred to as the near-end speaker and the land-based user is referred to as the far-end speaker. A typical vocoder may contain a noise suppressor whose function is to remove background noise from the near-end speech signal. An example of a typical noise suppressor is disclosed in U.S. Patent No. 4,811,404 entitled "NOISE SUPPRESSION SYSTEM", which is assigned to Motorola, Inc. and incorporated by reference herein. Noise suppression is performed by computing an estimate of the actual background noise energy during periods when the near-end speaker is silent. A problem occurs if the near-end speaker is silent and the far-end speaker is talking. In the mobile phone, speech from the far-end speaker can be acoustically coupled from the loudspeaker to the microphone, resulting in an echo that will be heard by the far-end speaker unless it is eliminated. In a land-based system, speech
from the near end can be coupled into the speech signal from the far-end speaker because of the impedance mismatch at the aforementioned hybrid. An echo canceller is used to eliminate the echo but, owing to the limitations of the echo canceller, the echo will not be completely eliminated. A noise suppressor placed after the echo canceller can interpret the residual echo as background noise and update its background noise estimate based on the residual echo. This corrupts the background noise estimate, resulting in degraded noise suppression. The vocoder suffers as well, because it provides an inaccurate background noise estimate to the system's synthesized noise generator. In addition, the vocoder's encoding rate decisions will be adversely affected.
Therefore, an object of the present invention is to avoid erroneous updates of the background noise estimate in the noise suppressor and in the vocoder encoder when the near-end speaker is silent and the far-end speaker is active. It is another object of the present invention to use the state determination signal from the echo canceller to control other functional elements within a digital telephone system, such as a tone detector,
a transmit mute function and an adaptive equalizer.
SUMMARY OF THE INVENTION
The present invention is a novel and improved combination of the functional elements within a digital telephone system. In accordance with the present invention, an echo canceller is used in combination with a vocoder, wherein the echo canceller provides information to various functional blocks within the vocoder for the purposes of noise suppression, DTMF tone detection, transmit muting and speech encoding. An immediate benefit of combining an echo canceller with a vocoder is the savings in cost, weight and space that result from combining two integrated circuits into a single integrated circuit.
In the present invention, an echo canceller is used that determines, among other things, which talk state the two speakers are in. In the exemplary embodiment, five different talk states are possible: near-end speech only, far-end speech only, both speakers talking, neither speaker talking, and the
hangover state, which is the short period of time that immediately follows a pause in the conversation.
The present invention utilizes the state determination of the echo canceller in various functional blocks within the vocoder. Of particular importance is the use of the state determination signal in the noise suppressor function within the vocoder. In the exemplary embodiment, the noise suppressor operates by dividing the input signal into selected frequency bands, generating a signal-to-noise ratio estimate for each frequency band, and then applying a gain to each frequency band in accordance with a predefined gain table. The speech/noise determination is made as follows. The raw signal-to-noise ratio estimates for each frequency band are used to index a voice metric table to obtain voice metric values for each channel. A voice metric is a measurement of the overall voice-like characteristics of the channel energy. The individual channel voice metric values are summed to create a final energy parameter, which is then compared to a background noise update threshold. If the voice metric sum does not meet the threshold, the input frame is considered to be noise and the background noise estimate is updated. If the voice metric sum exceeds the
threshold, then that frame is treated as voice and the background noise estimate is not updated.
Problems may arise if the noise suppressor treats the residual echo from the echo canceller as background noise. In this case, the noise estimation algorithm will recompute the background noise based on the residual echo, which corrupts the noise estimate. The present invention eliminates this problem by providing a status information signal from the echo canceller, which disables background noise updates in the noise suppressor when the talk state is determined to be far-end only. Without the status information from the echo canceller, the noise suppressor would erroneously update its background noise estimate based on the residual echo signal from the echo canceller. In an alternative embodiment, a second signal from the echo canceller is supplied to the noise suppressor, indicating whether any echo is actually present at the input to the echo canceller. The second signal allows background noise estimates to be made in the noise suppressor if no echo is present at the echo canceller, even when the status information signal would otherwise disable the update.
In addition, in the present invention, the state determination of the echo canceller is used to control the tone detector function within the vocoder. The tone detector checks the transmit signal for DTMF tones. If tones are detected, the normal transmit signal is muted and a signaling message is sent over the air, which causes tones to be generated at the receiver. This is done because a sufficiently high frame erasure rate can degrade a vocoded tone enough that it will not be detected. The tone detector can be disabled by the state determination signal from the echo canceller during the far-end only talk state, resulting in power savings.
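The background noise update rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the treatment of a voice metric sum exactly equal to the threshold, the exponential smoothing law and the value of `alpha` are all hypothetical. Only the decision logic follows the text: update when the summed voice metric does not exceed the threshold, and never during far-end only talk unless the optional second signal reports that no echo is present.

```python
def may_update_noise(voice_metric_sum, update_threshold,
                     far_end_only, echo_present=True):
    """Return True if this frame may update the background noise estimate."""
    # Gating by the echo canceller's state: block updates during the
    # far-end only talk state, since residual echo could pass for noise.
    # The alternative embodiment's second signal (echo_present) re-allows
    # the update when no echo is actually present.
    if far_end_only and echo_present:
        return False
    # Speech/noise decision: a frame whose summed voice metric exceeds the
    # threshold is treated as voice. Treating equality as noise is an
    # assumption; the text does not specify it.
    return voice_metric_sum <= update_threshold


def update_noise_estimate(noise_energy, frame_energy, allowed, alpha=0.1):
    """Per-channel noise update (hypothetical smoothing law and alpha)."""
    if not allowed:
        return list(noise_energy)
    return [(1.0 - alpha) * n + alpha * f
            for n, f in zip(noise_energy, frame_energy)]
```

With `echo_present` left at its default, the sketch behaves like the primary embodiment, in which the far-end only state always blocks the update.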
In addition, the present invention utilizes the state determination from the echo canceller to control the transmit mute function within the vocoder. The transmit mute replaces the PCM samples with synthesized noise that matches the spectral characteristics of the actual background noise. The spectral information and volume of the synthesized noise are supplied by the analysis performed by the vocoder encoder. The transmit mute function is enabled when the state determination of the echo canceller indicates far-end speech only. In this way, all echo is eliminated from the transmit signal.
The present invention also utilizes the state determination of the echo canceller to control an adaptive equalizer. This equalizer modifies the frequency response of the received near-end signal to compensate for degradation of the frequency response of the transmission path. The equalizer estimates the frequency characteristics of the transmission path during near-end speech and uses this estimate to construct a filter that shapes the overall frequency response to a desired characteristic. Since this estimate of the received frequency response would be corrupted by the presence of an echo signal, the echo canceller allows the equalizer to update its frequency response estimate only during the near-end only speech state.
Finally, the present invention utilizes the state determination of the echo canceller to control the background noise estimation function performed by the vocoder encoder. This background noise estimate is used to generate the synthesized noise information that will be used by the aforementioned transmit mute block, and to generate the threshold information used to decide at which data rate to encode. The aim is to match the synthesized noise to the actual noise-suppressed background noise, so that the far-end listener does not notice the periods of synthesized noise replacement. The background noise estimate is improved by providing the echo canceller information to the background noise estimation function. The echo canceller disables background noise estimation during the periods of synthesized noise replacement, so that the background noise estimate is not updated on synthesized noise.
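The controls described in this summary can be gathered into one lookup from talk state to enable flags. The sketch below is illustrative only: the names `TalkState`, `Controls` and `controls_for` are hypothetical, and the handling of the hangover state is an assumption because the text does not say how each element behaves in it. Bypassing the tone detector when both speakers are silent follows the detailed description rather than this summary.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TalkState(Enum):
    NEAR_END_ONLY = auto()
    FAR_END_ONLY = auto()
    DOUBLETALK = auto()
    SILENCE = auto()
    HANGOVER = auto()


@dataclass(frozen=True)
class Controls:
    tone_detector_on: bool            # tone detector 34
    noise_update_allowed: bool        # noise suppressor 38
    tx_mute_on: bool                  # Tx mute 42
    equalizer_update_on: bool         # adaptive equalizer
    encoder_noise_estimate_on: bool   # vocoder encoder 44


def controls_for(state: TalkState) -> Controls:
    far_only = state is TalkState.FAR_END_ONLY
    return Controls(
        # Bypassed in far-end only talk and when both speakers are silent.
        tone_detector_on=state not in (TalkState.FAR_END_ONLY,
                                       TalkState.SILENCE),
        # Residual echo must not be learned as background noise.
        noise_update_allowed=not far_only,
        # Synthesized noise replaces the transmit samples.
        tx_mute_on=far_only,
        # Only clean near-end speech may train the equalizer.
        equalizer_update_on=state is TalkState.NEAR_END_ONLY,
        # No estimation while synthesized noise is being substituted.
        encoder_noise_estimate_on=not far_only,
    )
```

A table-driven function like this keeps the state machine in the echo canceller and leaves each functional block with a single boolean to honor, which mirrors the single status information signal described in the text.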
BRIEF DESCRIPTION OF THE DRAWINGS
The features, objects and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings, in which like reference characters identify corresponding elements throughout, and wherein:
Figure 1 is a functional block diagram of a mobile digital telephone;
Figure 2 is a functional block diagram of an echo canceller and a vocoder;
Figure 3 is a functional block diagram of an echo canceller;
Figure 4 is a functional block diagram of a noise suppressor;
Figure 5 is a functional block diagram of a tone detector;
Figure 6 is a functional block diagram of a transmit mute processor; and
Figure 7 is a functional block diagram of a vocoder encoder.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Figure 1 is an overall block diagram of a digital cellular or PCS telephone. For simplicity of explanation, only a subset of the elements is shown. The digital cellular telephone consists of handset 6, which includes microphone 4 and loudspeaker 2; analog-to-digital (A/D) converter 8; echo canceller 10; vocoder 12; transceiver 14; and antenna 16. It should be understood that other architectures can be used for the system merely by changing the location or position of the various operative elements.
During transmission, near-end speech is received by microphone 4 provided in handset 6. The near-end speech signal
is transformed by microphone 4 into an electro-acoustic signal represented by v(t), as shown in Figure 1. The received far-end speech signal x(t) is acoustically coupled to the speech signal v(t) at adder 5; this coupling is modeled as passing x(t) through unknown echo channel 7 to produce the echo signal y(t). The output of adder 5 is shown as the combined speech/echo signal v(t) + y(t). Unknown echo channel 7 and adder 5 are not elements of the system itself but rather parasitic results of the physical proximity of microphone 4 and loudspeaker 2. The speech/echo signal v(t) + y(t) is then converted from an analog signal into PCM samples by analog-to-digital converter 8. In an exemplary embodiment, the PCM samples are output by A/D converter 8 at a rate of 64 kbits per second and are represented by the signal s(n), as shown in Figure 1.
Echo canceller 10 removes the echo signal y(t) from the digitized speech/echo signal s(n). In the exemplary embodiment, echo canceller 10 operates in accordance with the echo canceller described in the aforementioned U.S. Patent No. 5,307,405. In the exemplary embodiment, the echo canceller 10
cancels the echo by determining which of several talk states the speakers are in. The states are near-end speech only, far-end speech only, simultaneous speech from both the near end and the far end (doubletalk), no speech from either speaker, and hangover. Once the talk state is determined by echo canceller 10, the estimated echo signal y(n) is removed from the digitized speech/echo signal s(n). Because the echo signal cannot be completely eliminated, a residual echo signal will remain as part of the digitized speech signal.
This echo-cancelled speech signal, s'(n), is then processed by vocoder 12. In the exemplary embodiment, vocoder 12 is a variable rate code excited linear prediction (CELP) vocoder, as described in the aforementioned U.S. Patent No. 5,414,796. In the exemplary embodiment, vocoder 12 operates in conjunction with a noise suppression system, as described in detail in the aforementioned U.S. Patent No. 4,811,404. Vocoder 12 performs various functions on the signal s'(n), which include, but are not limited to, speech compression, noise suppression, transmit volume control and
receive volume control, DTMF tone detection and transmit muting. In the present invention, vocoder 12 uses the state determination results from echo canceller 10, shown as "status information" in Figure 1, in its algorithm to decide when to update its background noise estimate. Additional details of echo canceller 10 and vocoder 12 are shown in Figure 2 and are discussed in greater detail later herein.
The vocoded speech signal, s''(n), is then supplied to transceiver 14, where it is modulated in accordance with a predetermined modulation format, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA) or analog modulation. In the exemplary embodiment, transceiver 14 modulates the signal in accordance with the CDMA modulation format, as described in the aforementioned U.S. Patent No. 4,901,307. Transceiver 14 then upconverts and amplifies the modulated signal. The modulated signal is then transmitted through antenna 16 to the base station transceivers (not shown).
A similar, reciprocal process is applied to received speech. The CDMA-modulated signal is received at antenna 16 and provided to transceiver 14. Transceiver 14 amplifies, downconverts and demodulates the received signal. In the exemplary embodiment, transceiver 14 demodulates the received signal in accordance with the CDMA demodulation format, as described in the aforementioned U.S. Patent Nos. 5,103,459 and 4,901,307. The demodulated signal, z''(n), is supplied to vocoder 12. In the exemplary embodiment, vocoder 12 receives encoded variable-length data packets every 20 ms at data rates ranging from 1200 to 9600 bps and decodes the packets into 64 kbps PCM samples, in accordance with the aforementioned U.S. Patent No. 5,414,796. The decoded signal, z'(n), is then supplied to echo canceller 10, where it is used as a reference to remove the unwanted echo signal y(t) from the desired speech signal. The decoded signal output from echo canceller 10 is shown as z(n) in Figure 1. Finally, the decoded signal z(n) is converted into an analog waveform by the A/D converter 8, which is then converted into acoustic far-end speech using the loudspeaker 2
provided in the handset 6.
Figure 2 is a functional block diagram of echo canceller 10 and vocoder 12. In a preferred embodiment, echo canceller 10 and vocoder 12 are implemented in a digital signal processor, such as the ADSP-2181 of the ADSP-2100 family of digital signal processors manufactured by Analog Devices of Norwood, Massachusetts. It should be understood that other digital signal processors may be programmed to function in accordance with the teachings herein. Alternatively, echo canceller 10 and vocoder 12 may be implemented using discrete processors or in the form of an application-specific integrated circuit (ASIC). It should also be understood that vocoder 12 can be configured using any combination of the functional blocks shown in Figure 2.
During transmission, the digitized speech/echo signal s(n) is received by Tx PCM filters 52 from A/D converter 8. The low-frequency components are filtered out because echo canceller 10 cannot synthesize a DC component. The filtered signal is supplied to adder 32 within echo canceller 10, where the estimated echo signal y'(n) is subtracted from it.
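The two steps just described, removing low-frequency content and then subtracting the echo estimate, can be sketched as below. This is an assumption-laden illustration: the text does not specify the design of Tx PCM filters 52, so a first-order DC-blocking filter with a hypothetical `pole` value stands in for them, and the function names are invented for the example.

```python
def dc_block(samples, pole=0.995):
    """First-order DC blocker: y[n] = x[n] - x[n-1] + pole * y[n-1].

    Stand-in for the high-pass behavior of the Tx PCM filters; the real
    filter design is not given in the text.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + pole * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


def subtract_echo(filtered, echo_estimate):
    """Role of adder 32: remove the estimated echo y'(n) sample by sample."""
    return [r - y for r, y in zip(filtered, echo_estimate)]
```

Feeding a constant (pure DC) input through `dc_block` produces an output that decays toward zero, which is why the adaptive filter never has to model a DC component.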
The estimated echo signal y'(n) is produced by processing the received digital speech signal z'(n) using an adaptive filtering operation performed within echo canceller 10. An example of echo canceller 10 is disclosed in the aforementioned U.S. Patent No. 5,307,405. Echo canceller 10 is described in greater detail below.
The output produced by echo canceller 10 contains the desired digitized speech signal plus a residual signal left over from the echo cancellation process. The residual signal is present because the echo canceller can never completely remove all of the echo from the digitized speech signal. The output signal is then supplied to tone detector 34, which checks whether the signal contains DTMF tones. If it does, Tx mute 42 is activated by tone detector 34 and transceiver 14 is instructed to send DTMF tone signals. To save computation, tone detector 34 is bypassed if echo canceller 10 determines that the talk state is far-end only or that both speakers are silent.
In the telephone, the output signal of the
echo canceller 10 is then processed by noise suppressor 38, which attenuates strong background noise. Alternatively, in the base station, an adaptive equalizer is used in place of noise suppressor 38 to dynamically control the frequency content of the digitized speech signal from the near-end user. An example of an adaptive equalizer is disclosed in co-pending U.S. Patent Application Serial No. 08/456,277, filed on April 28, 1995, entitled "METHOD AND APPARATUS FOR PERFORMING ADAPTIVE EQUALIZATION", assigned to the assignee of the present invention and incorporated herein by reference. An example of noise suppressor 38 is disclosed in the aforementioned U.S. Patent No. 4,811,404. It should be understood that implementations of noise suppressor 38 other than the one disclosed in U.S. Patent No. 4,811,404 may be used.
Noise suppressor 38 updates its estimate of the background noise characteristics by measuring the spectral characteristics of the input signal. The present invention provides the state determination signal of echo canceller 10 to assist in the decision to update the background noise estimate. Allowing the echo canceller to assist in enabling and disabling the
update of the background noise estimate provides significant advantages that will become apparent later herein.
The noise-suppressed speech signal from noise suppressor 38 is then supplied to Tx mute 42, which, when enabled, replaces the digitized speech signal with synthesized noise that, in the preferred embodiment, matches the spectral characteristics of the actual background noise. If Tx mute 42 is disabled, the speech signal is supplied to vocoder encoder 44 unchanged. Tx mute 42 is activated by echo canceller 10 during the far-end only talk state.
The speech signal is then passed from Tx mute 42 to vocoder encoder 44. An example of vocoder encoder 44 and vocoder decoder 46 is disclosed in the aforementioned U.S. Patent No. 5,414,796. In the exemplary embodiment, vocoder encoder 44 accepts digitized speech samples at 64 kbps and compresses them to achieve a reduced data rate. This is achieved by eliminating all the natural redundancies inherent in speech. The basis of this technique is to compute the parameters of a filter, called the LPC filter, which makes short-term predictions of the
speech waveform, using a model of the human vocal tract. In addition, the long-term effects, related to the pitch of the speech, are modeled by computing the parameters of a pitch filter, which essentially models the human vocal cords. Finally, these filters must be excited, and this is done by determining which one of a number of random excitation waveforms in a codebook results in the closest approximation to the original speech when that waveform excites the two filters mentioned above.
A background noise estimate is also made within vocoder encoder 44, which estimates the energy of the background noise during periods of silence. Since the background noise estimate should only be updated on actual background noise, it is desirable to use the status information signal from echo canceller 10 to determine when both the near-end and the far-end speakers are silent. Without this information from echo canceller 10, the background noise estimate could be updated even while synthesized noise is being supplied by Tx mute 42, which is undesirable. Additional details of vocoder encoder 44 are provided later herein.
In the receive direction, referring
again to Figure 2, the data from transceiver 14 is accepted and processed in vocoder decoder 46. In the exemplary embodiment, vocoder decoder 46 accepts variable-length data packets at data rates ranging from 1200 to 9600 bps or 1200 to 13000 bps and produces 64 kbps PCM samples, in accordance with the aforementioned U.S. Patent No. 5,414,796, shown as z'(n). These PCM samples are then routed through echo canceller 10 to A/D converter 8. The signal z'(n) is also used by echo canceller 10 as a reference signal to cancel the echo in the Tx direction. The output of echo canceller 10 in the Rx direction is shown as z(n).
To better understand the present invention, an understanding of the operation of the various functional blocks is needed. Figure 3 is a detailed block diagram of echo canceller 10. An example of echo canceller 10 is disclosed in the aforementioned U.S. Patent No. 5,307,405. It should be understood that in the exemplary embodiment, echo canceller 10 is essentially a state machine that has defined functions for each of the five different talk states described above.
In Figure 3, as in Figure 2, the speech signal of the mobile station is labeled
as the near-end speech s(n), while the far-end speech signal from Rx PCM filters 50 is labeled as z'(n). The signal z'(n) is amplified by variable gain stage 170 and is coupled as y(n) at adder 5, as modeled by passing it through unknown echo channel 7. To remove low-frequency background noise, the sum of the echo signal y(n) and the near-end speech signal s(n) is high-pass filtered by Tx PCM filters 52 to produce the signal r(n). The signal r(n) is supplied as an input to each of adders 32 and 150 and to control unit 152.
The input far-end speech z'(n) is fed to variable gain stage 170 and then stored in buffer 154 for input to a set of transversal adaptive filters (initial filter 156, state filter 158 and echo canceller filter 160) and to control unit 152. During normal operation of echo canceller 10, the signal y1(n) is output from state filter 158 to one input of adder 150, where it is subtracted from the signal r(n). The resulting output of adder 150 is the signal e1(n), which is input to control unit 152. The output of echo canceller filter 160, the echo replica signal y'(n), is supplied through filter switch 162 to one input of adder 32, where it is subtracted from the signal r(n). The resulting residual echo signal e(n) output from adder 32 is fed back as an input to control unit 152. The residual echo signal e(n) output from adder 32 can be supplied directly as the output of echo canceller 10, shown as s'(n), or through additional processing elements, not shown.
To prevent high levels of background noise from interfering with the state determination, echo canceller 10 executes a differential energy algorithm on the signals z'(n) and e(n). This algorithm continuously monitors the background noise level and compares it with the signal energy to determine whether the speaker is talking. Three thresholds, T1(Bi), T2(Bi) and T3(Bi), which are functions of the background noise level Bi, are first computed. If the energy of the signal x(n) exceeds all three thresholds, it is determined that the speaker is talking. If the signal energy exceeds T1 and T2 but not T3, it is determined that the speaker is probably uttering an unvoiced sound, such as the "sp" sound in the word "speed". If the signal energy is smaller than all three thresholds, it is determined that the speaker is not talking.
As illustrated in Figure 3, two filters
that adapt independently, filters 158 and 160, track the unknown echo channel. While the filter 160 performs the actual echo cancellation, the filter 158 is used by the control unit 152 to determine in which of the various states the echo canceller 10 should be operating. This status information is supplied to the various functional blocks within the vocoder 12, including the tone detector 34, the adaptive noise suppressor / equalizer 38, the Tx mute 42 and the vocoder encoder 44.

Figure 4 is a functional block diagram of the noise suppressor 38. An example of the noise suppressor 38 is disclosed in the aforementioned U.S. Patent No. 4,811,404. It should be understood that implementations of the noise suppressor 38 other than that disclosed in U.S. Patent No. 4,811,404 may be used. The noise suppression system includes a mechanism 210 for separating the input signal into a plurality of pre-processed signals representative of selected frequency channels; a mechanism 310 that generates an estimate of the signal-to-noise ratio (SNR) in each individual channel; a mechanism 830 that calculates the noise energy in each frequency channel; a mechanism 590 that produces a gain value for each channel
by automatically selecting one of a plurality of gain values from a particular gain table in response to the channel SNR estimates; a mechanism 250 that modifies the gain of each of the plurality of pre-processed signals in response to the selected gain values to provide a plurality of post-processed, noise-suppressed output signals; and a mechanism 260 that combines the post-processed signals back into time-domain PCM data.

The voice metric calculator 810 is used to perform the speech / noise decision. First, the raw SNR estimates from the channel SNR estimator 310 are used to index a voice metric table to obtain a voice metric value for each channel. The voice metric is a measure of the overall voice-like characteristics of the channel energy. The individual channel voice metric values are summed to create a first multichannel energy parameter, which is then compared to a background noise update threshold in the threshold comparator 820. If the voice metric sum does not exceed the threshold, the input frame is considered to be noise, and the background noise estimate is updated by enabling the noise energy calculator 830 to recalculate the noise energy in each channel. The
estimated noise energy is used by the gain table 590 to select the appropriate gain for each channel. If the voice metric sum exceeds the update threshold, the frame is considered to be a voice frame and the noise energy calculator 830 is disabled from updating the noise energy estimate.

The present invention provides an additional enable signal from the echo canceller 10 that disables the noise energy calculator 830 when the echo canceller 10 determines that only far-end speech is occurring. This enable signal takes precedence over the enable signal from the threshold comparator 820; that is, if the noise energy calculator 830 is disabled by the signal from the echo canceller 10, it remains disabled even when an enable signal is provided by the threshold comparator 820. Using the status information of the echo canceller 10 in this way prevents the background noise estimate from being updated erroneously.

In a second embodiment, echo canceller 10 provides an enable signal to the noise energy calculator 830 that enables background noise estimates when it is determined that the conversation state is silence for both speakers. Without the enable signal from the echo canceller
10, no update of the background noise estimate will occur.

In a third embodiment, a second signal from the echo canceller is supplied to the noise suppressor, indicating whether any echo is actually present at the input to the echo canceller. As shown in Figure 4, the second signal is labeled "echo present?" and allows background noise estimates to be made if there is no echo present at the echo canceller input, even if the status information signal would otherwise disable the update. This embodiment is needed if it is desirable to update the background noise estimate during the far-end-only speech state when the far-end speech does not produce echo in the transmit signal.

In the base station, an adaptive filter is used in place of the noise suppressor 38. The purpose of the adaptive filter is to alter the near-end speech to compensate for frequency degradation in the transmission from the near-end speaker to the far-end speaker. The coefficients of the adaptive filter are updated during periods of near-end-only speech. The status information from the echo canceller 10 can be used to enable this update when only near-end speech is detected.

The state determination information provided by the echo canceller 10 is also used to control the tone detector 34. As shown in Figure 5, the tone detector 34 comprises one functional block, the DTMF tone detector 70. In the exemplary embodiment, PCM-coded data is received by the DTMF tone detector 70 at 64 kbps, and the detector is operated every 105 data frames. The DTMF tone detector 70 uses the Goertzel algorithm, with the frequency and displacement tests specified in the AT&T application note titled "Dual-Tone Multifrequency Receiver Using the WE DSP16 Digital Signal Processor", to determine whether or not DTMF tones are present. The Goertzel algorithm and the AT&T application note are well known to those skilled in the art.
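As an illustration of the Goertzel step named above, the following Python sketch evaluates the signal power at the eight standard DTMF frequencies over one block of 8 kHz PCM samples (64 kbps at 8 bits per sample). It is a minimal sketch under stated assumptions, not the detector of the AT&T application note: the function names, the 105-sample block length, and the simple `min_ratio` energy test are all hypothetical, and the note's frequency and displacement tests are not reproduced.

```python
import math

SAMPLE_RATE_HZ = 8000                  # 64 kbps PCM = 8000 samples/s x 8 bits
LOW_FREQS = (697, 770, 852, 941)       # standard DTMF row tones, Hz
HIGH_FREQS = (1209, 1336, 1477, 1633)  # standard DTMF column tones, Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq_hz, sample_rate=SAMPLE_RATE_HZ):
    """Squared magnitude of the block's spectrum at freq_hz (Goertzel recursion)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_dtmf(samples, min_ratio=0.1):
    """Return the keypad symbol for the block, or None if no tone pair is found.

    min_ratio is an assumed, illustrative threshold on each peak's share of
    the block energy; it stands in for the tests of the application note.
    """
    n = len(samples)
    energy = sum(x * x for x in samples)
    if n == 0 or energy == 0.0:
        return None
    low = [goertzel_power(samples, f) for f in LOW_FREQS]
    high = [goertzel_power(samples, f) for f in HIGH_FREQS]
    row, col = low.index(max(low)), high.index(max(high))
    if low[row] < min_ratio * n * energy or high[col] < min_ratio * n * energy:
        return None
    return KEYPAD[row][col]


def make_tone(freqs, n=105):
    """Hypothetical test helper: sum of unit-amplitude sinusoids, n samples."""
    return [sum(math.sin(2.0 * math.pi * f * i / SAMPLE_RATE_HZ) for f in freqs)
            for i in range(n)]


if __name__ == "__main__":
    print(detect_dtmf(make_tone((770, 1336))))  # 770 Hz + 1336 Hz is the "5" key
```

Because the recursion needs only one multiply per sample per frequency, skipping it whenever the echo canceller reports far-end-only speech or silence is where the processing saving described in this section would come from.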
If DTMF tones are detected, the DTMF tone detector 70 sends a signal to the Tx mute 42 instructing it to replace the DTMF tones with synthesized noise. The PCM data is then sent unchanged to the noise suppressor 38, even though the tones are subsequently muted, because the background noise estimate can still be updated by the noise suppressor 38 during the pauses between DTMF tones. The status information of the echo canceller 10 is used to deactivate the DTMF tone detector 70 if the echo canceller 10 determines that only far-end speech is occurring or that both speakers are silent. This saves processing power. When the DTMF tone detector 70 is deactivated, the PCM data from the echo canceller 10 is supplied unchanged to the noise suppressor 38.

In a second embodiment, the status information of the echo canceller 10 is used to activate the tone detector 70 if echo canceller 10 determines that only near-end speech is occurring. For all other conversation states, the tone detector 70 is deactivated.

The state determination signal of the echo canceller 10 is also used to control the Tx mute 42. As shown in Figure 6, the PCM data is received by the switch 76. If the echo canceller 10 detects only far-end speech, a signal is sent to the switch 76, which replaces the PCM data with synthesized noise from the synthesized noise generator 74. The synthesized noise generator 74 uses LPC parameters and volume information from the vocoder encoder 44 to match the spectral characteristics of the actual background noise. Subsequently, an analysis of the LPC parameters and the information of the
volume control. If muting does not occur, the Tx mute function is bypassed, allowing the PCM data to be sent unchanged to the vocoder encoder 44.

The state determination function of the echo canceller 10 is also used to control the vocoder encoder 44. A functional block diagram of the vocoder encoder 44 is shown in Figure 7. The PCM data from the Tx mute 42 is supplied to the voice activity detector 80 and the threshold generator 78. The voice activity detector 80 calculates the amount of voice activity in the PCM data signal. When the near-end speaker is talking, the voice activity is relatively high. During periods of near-end silence or short pauses between words, the voice activity is relatively low. The threshold generator 78 calculates three threshold levels based on the background noise level of the noise-suppressed PCM data. The threshold levels are updated each time the voice activity detector determines a minimum level of speaker activity. However, if the state determination of the echo canceller 10 indicates that the conversation state is far-end-only speech, a state determination signal from the echo canceller 10 is provided to the
threshold generator 78 that disables the updating of the background noise estimate. It is necessary to avoid updating the background noise estimate in that situation because, when the near-end speaker is silent, synthesized noise replaces the actual data signal in the Tx mute 42, as discussed above. It is not desirable to update the background noise estimate based on the synthesized noise.

In a second embodiment, the echo canceller 10 provides an enable signal that enables the threshold generator 78 to perform the background noise estimates when it is determined that the conversation state is silence for both speakers. In this embodiment, background noise updates are not made unless the enable signal is provided by the echo canceller 10.

The three calculated thresholds mentioned above are sent to the threshold comparator 82, where they form the basis for the rate coding decision. The level of voice activity is compared to these thresholds frame by frame. In the exemplary embodiment, each frame contains 160 samples, or 20 msec of data. If the voice activity energy exceeds the highest threshold during any PCM data frame, it is determined that the near-end speaker is talking and
that the frame is multiplexed through the mux 84 and is encoded at full rate using CELP 86. If the voice activity energy during any frame is lower than the lowest threshold, that frame is multiplexed through the mux 84 and is encoded at one-eighth rate using CELP 92. If the voice activity energy during any frame falls between the highest threshold and the lowest threshold, that frame is encoded at a rate of either one half or one quarter using CELP 88 and CELP 90, respectively. The outputs of the CELP processing blocks 86 to 92 are supplied to the post-processing element 94, where they are combined to produce a variable data rate signal of, in the exemplary embodiment, between 1.2 kbps and 9.6 kbps. The output of the post-processing element 94 is sent to the control microprocessor (not shown).

The foregoing description of the preferred embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without the use of the inventive faculty. Thus, the present invention is not intended to be limited to the embodiments shown herein but
is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
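The way the echo canceller's conversation state gates the other functional blocks in the foregoing description can be summarized in a short Python sketch. It is illustrative only. The names `TalkState`, `Controls` and `controls_for` are hypothetical, and the `DOUBLETALK` and `HANGOVER` state names are assumptions, since this part of the description names only the far-end-only, near-end-only and both-silent states. The sketch follows the first embodiment of each block, plus the "echo present?" override of the third noise suppressor embodiment.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TalkState(Enum):
    BOTH_SILENT = auto()
    FAR_END_ONLY = auto()
    NEAR_END_ONLY = auto()
    DOUBLETALK = auto()  # assumed name, not taken from this section
    HANGOVER = auto()    # assumed name, not taken from this section


@dataclass(frozen=True)
class Controls:
    tone_detector_on: bool                  # DTMF tone detector 70
    noise_update_allowed: bool              # noise energy calculator 830
    tx_mute_on: bool                        # switch 76 selects synthesized noise
    vocoder_threshold_update_allowed: bool  # threshold generator 78


def controls_for(state, echo_present=True):
    """Map a conversation state to the enable signals sent to each block."""
    far_only = state is TalkState.FAR_END_ONLY
    return Controls(
        # Detector is off for far-end-only speech and when both are silent.
        tone_detector_on=state not in (TalkState.FAR_END_ONLY,
                                       TalkState.BOTH_SILENT),
        # Far-end-only speech blocks the update unless no echo is present.
        # "Allowed" still leaves the final say to threshold comparator 820.
        noise_update_allowed=(not far_only) or (not echo_present),
        # Synthesized noise replaces the PCM data only for far-end-only speech.
        tx_mute_on=far_only,
        # The muted frames carry synthesized noise, so they must not feed
        # the vocoder's background noise thresholds.
        vocoder_threshold_update_allowed=not far_only,
    )


if __name__ == "__main__":
    for s in TalkState:
        print(s.name, controls_for(s))
```

Keeping the mapping in one pure function mirrors the idea that a single status information signal drives every block, so the blocks need no state detection of their own.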
Claims (6)
NOVELTY OF THE INVENTION

Having described the present invention, it is considered a novelty and, therefore, the content of the following CLAIMS is claimed as property:

1. An apparatus for speech processing in a digital telephone system, comprising: an echo canceller for receiving a digitized speech-plus-echo signal, for receiving a far-end speech signal and for providing an echo-suppressed output signal, the echo canceller comprising a state determining means for determining in which conversation state two speakers are located, the state determining means providing a status information signal indicative of the conversation states; and a speech processing means for receiving the echo-suppressed output signal and the status information signal and for using the status information to control the speech processing means.
2. The apparatus according to claim 1, wherein the speech processing means comprises: a tone detector; a noise suppressor; a transmission silencing means; and a vocoder encoder.
3. The apparatus according to claim 1, wherein the speech processing means comprises: a tone detector; an adaptive equalizer; a transmission silencing means; and a vocoder encoder.
4. The apparatus according to claim 1, wherein the speech processing means is a tone detector.
5. The apparatus according to claim 4, wherein the tone detector comprises: an input means for receiving samples of the digitized speech and for receiving the status information signal; an output means for providing samples of the digitized speech and for providing a signal indicative of the selection and duration of the DTMF tone; a tone detection means for detecting DTMF tones; and a controller means for deactivating the tone detection means when the status information signal indicates a far-end-only speech state.
6. The apparatus according to claim 4, wherein the tone detector comprises: an input means for receiving samples of [...] speaking at the far end.
9. The apparatus according to claim 7, wherein the noise suppressor comprises: an input means for receiving the digitized speech samples and for receiving the status information signal; an output means for providing the noise-suppressed digitized speech signal; a background noise estimation means for generating an estimated background noise signal used to suppress background noise; and a controller means for enabling the background noise estimation means when the status information signal indicates that both speakers are silent.
10. The apparatus according to claim 1, wherein the speech processing means is a transmission silencing means.
11. The apparatus according to claim 10, wherein the transmission silencing means comprises: an input means for receiving samples of the digitized speech and for receiving the status information signal; an output means for providing either the digitized speech samples or a synthesized noise signal; a noise generation means for generating the synthesized noise signal; and a controller means for replacing the digitized speech samples with the synthesized noise when the status information signal indicates only speech at the far end.
12. The apparatus according to claim 1, wherein the speech processing means is a vocoder encoder.
13. The apparatus according to claim 12, wherein the vocoder encoder comprises: an input means for receiving the digitized speech samples and for receiving the status information signal; an output means for providing a digital speech packet encoded at a reduced data rate; a background noise estimation means for generating the threshold information used to determine at what rate to encode the digitized speech samples; and a controller means for deactivating the background noise estimation means when the status information signal indicates only speech at the far end.
14. The apparatus according to claim 12, wherein the vocoder encoder comprises: an input means for receiving the digitized speech samples and for receiving the status information signal; an output means for providing a digital speech packet encoded at a reduced data rate; a background noise estimation means for generating the threshold information used to determine at what rate to encode the digitized speech samples; and a controller means for activating the background noise estimation means when the status information signal indicates that both speakers are silent.
15. The apparatus according to claim 1, wherein the speech processing means is an adaptive equalizer.
16. The apparatus according to claim 15, wherein the adaptive equalizer comprises: an input means for receiving the digitized speech samples and for receiving the status information signal; an output means for providing a frequency-compensated digitized speech signal; a frequency estimation means for estimating the spectral content of the digitized speech samples; and a controller means for activating the frequency estimation means when the status information signal indicates only speech at the far end.
17. The apparatus according to claim 1, wherein the echo canceller further comprises: an echo detection means for determining the presence or absence of echo at the input to the echo canceller; and an output means for providing a signal indicative of the presence or absence of echo at the input to the echo canceller.
18. The apparatus according to claim 15, wherein the speech processing means is a noise suppressor, comprising: an input means for receiving digitized speech samples, for receiving the status information signal and for receiving the signal indicative of whether or not echo is present at the echo canceller input; an output means for providing a noise-suppressed digitized speech signal; a background noise estimation means for generating an estimated background noise signal used to suppress background noise; and a controller means for activating the background noise estimation means when the status information signal indicates only speech at the far end and the echo detection means indicates that no echo is present.
19. In a speech processing device comprising an echo canceller and a digital processing element, a method for controlling the operation of the digital processing element using the status information from the echo canceller, the method comprising the steps of: generating, by the echo canceller, a status information signal indicative of a plurality of conversation states; and controlling the digital processing element using the status information signal from the echo canceller.
20. The method according to claim 19, wherein the step of controlling further comprises the steps of: deactivating the tone detector function within the digital processing element when the status information signal indicates only speech at the far end; deactivating the calculation of the background noise estimate performed by the noise suppression function within the processing element when the status information signal indicates only speech at the far end; activating the transmission mute function that replaces digitized speech with synthesized noise when the status information signal indicates only speech at the far end; and deactivating the calculation of the background noise estimate performed by the vocoder encoder function within the digital processing element.
21. The method according to claim 19, wherein the step of controlling further comprises the step of: deactivating the tone detector function within the digital processing element when the status information signal indicates only speech at the far end.
22. The method according to claim 19, wherein the step of controlling further comprises the step of: activating the tone detector function within the digital processing element when the status information signal indicates only speech at the near end.
23. The method according to claim 19, wherein the step of controlling further comprises the step of: deactivating the calculation of the background noise estimate performed by the noise suppression function within the processing element when the status information signal indicates only speech at the far end.
24. The method according to claim 19, wherein the step of controlling further comprises the step of: activating the calculation of the background noise estimate performed by the noise suppression function within the processing element when the status information signal indicates that both speakers are silent.
25. The method according to claim 19, wherein the step of controlling further comprises the steps of: activating the transmission mute function that replaces the digitized speech with synthesized noise when the status information signal indicates only speech at the far end; and deactivating the transmission mute function for all other conversation states.
26. The method according to claim 19, wherein the step of controlling further comprises the step of: deactivating the calculation of the background noise estimate performed by the vocoder encoder function within the digital processing element when the status information signal indicates only speech at the far end.
27. The method according to claim 19, wherein the step of controlling further comprises the step of: activating the calculation of the background noise estimate performed by the vocoder encoder function within the digital processing element when the status information signal indicates that both speakers are silent.
28. The method according to claim 19, wherein the step of controlling further comprises the step of: activating the update of the frequency response performed by the adaptive equalizer function within the digital processing element when the status information signal indicates only speech at the near end.
29. The method according to claim 19, further comprising the steps of: generating, by the echo canceller, an echo-present signal indicating whether or not echo is present at the input to the echo canceller; and controlling the digital processing element using the status information signal from the echo canceller and the echo-present signal.
30. The method according to claim 29, wherein the step of controlling further comprises: deactivating the calculation of the background noise estimate performed by the noise suppression function within the digital processing element when the status information signal indicates only speech at the far end and the echo-present signal indicates that echo is present at the input to the echo canceller; and activating the calculation of the background noise estimate when the status information signal indicates only speech at the far end and the echo-present signal indicates that no echo is present at the echo canceller input.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08834397 | 1997-01-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
MXPA99007002A (en) | 2000-01-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5920834A (en) | Echo canceller with talk state determination to control speech processor functional elements in a digital telephone system | |
EP0861531B1 (en) | Acoustic echo elimination in a digital mobile communications system | |
EP1208689B1 (en) | Acoustical echo cancellation device | |
JP3447735B2 (en) | Network echo canceller | |
US7031269B2 (en) | Acoustic echo canceller | |
US9294851B2 (en) | Hearing assistance devices with echo cancellation | |
WO2000016497A1 (en) | Echo canceler adaptive filter optimization | |
US7558729B1 (en) | Music detection for enhancing echo cancellation and speech coding | |
JP2003514264A (en) | Noise suppression device | |
JP2001251652A (en) | Method for cooperatively reducing echo and/or noise | |
US6711259B1 (en) | Method and apparatus for noise suppression and side-tone generation | |
MXPA99007002A (en) | Method and apparatus for using state determination to control functional elements in digital telephone systems | |
WO2001019062A1 (en) | Suppression of residual acoustic echo | |
EP1341365A1 (en) | Method and arrangement for processing a speech signal | |
Eriksson et al. | Mobile crosstalk control—Enhancing speech quality in digital cellular networks |