WO2005124739A1 - Noise suppression device and noise suppression method - Google Patents
- Publication number
- WO2005124739A1 (PCT/JP2005/009859)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- power spectrum
- noise
- band
- pitch harmonic
- voicedness
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
Definitions
- the present invention relates to a noise suppression device and a noise suppression method, and more particularly to a noise suppression device and a noise suppression method used in a voice communication device and a voice recognition device for suppressing background noise.
- a low bit rate speech coding apparatus can provide high-quality speech communication for speech without background noise. For speech that includes background noise, however, unpleasant distortion peculiar to low bit rate coding may occur, degrading sound quality.
- SS method: a spectral subtraction method
- in the SS method, the spectral characteristics of the estimated noise component are regarded as stationary, and the estimated noise is uniformly subtracted from the speech power spectrum as a noise base.
- in practice, the spectral characteristics of the noise components are not stationary, so residual noise after noise base subtraction, particularly residual noise between voice pitches, may cause the unnatural distortion known as musical noise.
- Patent Document 1 Japanese Patent No. 2714656
- Patent Document 2 Japanese Patent Publication No. 10-513030
- Non-Patent Document 1 "Suppression of acoustic noise in speech using spectral subtraction", Boll, IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-27, pp. 113-120, 1979
- the present invention has been made in view of the above, and an object of the present invention is to provide a noise suppression device and a noise suppression method capable of improving noise suppression accuracy while reducing voice distortion.
- a noise suppression device of the present invention includes:
- a suppression unit that suppresses the noise component from a speech power spectrum including a noise component, using detection results of a sound band and a noise band in the speech power spectrum;
- extraction means for extracting a pitch harmonic power spectrum from the speech power spectrum;
- voicedness determination means for determining voicedness of the speech power spectrum based on the extracted pitch harmonic power spectrum;
- restoration means for restoring the extracted pitch harmonic power spectrum; and
- correcting means for correcting the detection result based on a pitch harmonic power spectrum selected from the restored pitch harmonic power spectrum and the extracted pitch harmonic power spectrum in accordance with the result of the determination by the voicedness determination means.
- a noise suppression method is a noise suppression method for suppressing the noise component from the speech power spectrum using detection results of a sound band and a noise band in the speech power spectrum including the noise component,
- a noise suppression program is a noise suppression program that suppresses the noise component from the speech power spectrum using detection results of a sound band and a noise band in the speech power spectrum including a noise component.
- FIG. 1 is a block diagram showing a configuration of a noise suppression device according to Embodiment 1 of the present invention.
- FIG. 2A is a diagram showing detection results of the sound band and the noise band.
- FIG. 2B is a diagram showing an extraction result of a pitch harmonic power spectrum.
- FIG. 2C is a diagram showing a result of extraction of a peak of a pitch harmonic.
- FIG. 2E is a diagram showing a correction result of the detection result shown in FIG. 2A.
- FIG. 3 is a block diagram showing a configuration of a noise suppression device according to Embodiment 2 of the present invention.
- FIG. 4 is a block diagram showing a configuration of a noise suppression device according to Embodiment 3 of the present invention.
- FIG. 5 is a block diagram showing a configuration of a noise suppression device according to Embodiment 4 of the present invention.
- FIG. 6 is a flowchart illustrating an operation of the noise suppression apparatus according to Embodiment 4 of the present invention.
- FIG. 1 is a block diagram showing a configuration of a noise suppression device according to Embodiment 1 of the present invention.
- the noise suppressing apparatus 100 includes a windowing section 101, an FFT (Fast Fourier Transform) section 102, a noise base estimating section 103, a band-specific sound/noise detecting section 104, a pitch harmonic structure extracting section 105, a voicedness determination section 106, a pitch frequency estimation section 107, a pitch harmonic structure restoration section 108, a band-specific sound/noise correction section 109, a subtraction/attenuation coefficient calculation section 110, a multiplication section 111, and an IFFT (Inverse Fast Fourier Transform) section 112.
- Windowing section 101 divides an input audio signal including a noise component into frames of a predetermined time unit, applies a windowing process to each frame using a Hanning window, and outputs the frame to FFT section 102.
- FFT section 102 performs FFT on a frame input from windowing section 101, that is, an audio signal divided into frame units, and converts the audio signal into a frequency domain. As a result, a speech power spectrum is obtained. Therefore, the audio signal of each frame is an audio spectrum having a predetermined frequency band.
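The windowing and transform steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `hann_window` and `magnitude_spectrum` are hypothetical, a naive DFT stands in for the FFT, and the magnitude is taken as the square root of the squared real and imaginary parts.

```python
import cmath
import math

def hann_window(n):
    """Return n coefficients of a Hanning (Hann) window."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def magnitude_spectrum(frame):
    """Window one frame and return |D(k)| for k = 0 .. len(frame) // 2."""
    n = len(frame)
    windowed = [x * w for x, w in zip(frame, hann_window(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive DFT bin; a practical implementation would use an FFT routine.
        d = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(math.sqrt(d.real ** 2 + d.imag ** 2))
    return spectrum

# Example: a 64-sample frame holding a sinusoid with 8 cycles per frame.
frame = [math.sin(2.0 * math.pi * 8 * i / 64) for i in range(64)]
s = magnitude_spectrum(frame)
peak_bin = max(range(len(s)), key=lambda k: s[k])
```

In this example the largest component falls in the bin that matches the sinusoid's frequency, which is the behaviour the later band detection relies on.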
- the speech power spectrum generated from the frame in this manner is output to noise base estimating section 103, band-specific sound/noise detecting section 104, pitch harmonic structure extracting section 105, pitch frequency estimating section 107, subtraction/attenuation coefficient calculation section 110, and multiplication section 111.
- Noise base estimating section 103 estimates a frequency amplitude spectrum of a signal containing only a noise component, that is, a noise base, based on the input speech power spectrum.
- the estimated noise base is output to band-specific sound/noise detection section 104, pitch harmonic structure extraction section 105, voicedness determination section 106, pitch frequency estimation section 107, and subtraction/attenuation coefficient calculation section 110.
- for each frequency component of the frequency band of the speech power spectrum, noise base estimating section 103 compares the speech power spectrum generated from the latest frame by FFT section 102 with the noise base estimated from the speech power spectrum of the preceding frame. If the difference between the two exceeds a preset threshold, it is determined that the latest frame contains a speech component, and the noise base is not updated. On the other hand, if the difference does not exceed the threshold, it is determined that the latest frame does not contain a speech component, and the noise base is updated.
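The update rule described above might look like the sketch below. The exact form of the patent's update equation is not recoverable from this text, so the ratio test, the exponential moving average, and the names `update_noise_base`, `alpha`, and `theta` are all assumptions made for illustration.

```python
def update_noise_base(speech, prev_noise, alpha=0.9, theta=2.0):
    """Update the noise base per frequency component.

    alpha: moving average coefficient (assumed value).
    theta: threshold on speech / noise base (assumed form of the comparison).
    """
    updated = []
    for s, n in zip(speech, prev_noise):
        if s > theta * n:
            # A speech component is assumed present: keep the previous estimate.
            updated.append(n)
        else:
            # Noise only: smooth the new observation into the estimate.
            updated.append(alpha * n + (1.0 - alpha) * s)
    return updated

noise = update_noise_base([2.0, 10.0], [1.0, 1.0])
```

Here the first component is within the threshold and is smoothed towards the new observation, while the second is treated as speech and left unchanged.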
- Band-specific sound/noise detection section 104 detects the sound band and the noise band in the speech power spectrum based on the speech power spectrum from FFT section 102 and the noise base from noise base estimation section 103. The detection result is output to band-specific sound/noise correction section 109.
- Pitch harmonic structure extracting section 105 extracts the pitch harmonic structure, that is, the pitch harmonic power spectrum, based on the speech power spectrum from FFT section 102 and the noise base from noise base estimating section 103.
- the extracted pitch harmonic spectrum is output to voicedness judgment section 106 and pitch harmonic structure restoration section 108.
- Voicedness determination section 106 determines the voicedness of the speech power spectrum based on the noise base from noise base estimation section 103 and the pitch harmonic power spectrum from pitch harmonic structure extraction section 105. The determination result is output to pitch frequency estimation section 107 and pitch harmonic structure restoration section 108.
- Pitch frequency estimation section 107 estimates the pitch frequency of the speech power spectrum based on the speech power spectrum from FFT section 102 and the noise base from noise base estimation section 103. Also, as a result of the determination by the voicedness determination unit 106, if the voicedness of the speech power spectrum is equal to or lower than a predetermined level, pitch frequency estimation is avoided. The estimation result is output to pitch harmonic structure restoration section 108.
- based on the pitch harmonic power spectrum from pitch harmonic structure extracting section 105 and the estimation result from pitch frequency estimating section 107, pitch harmonic structure restoring section 108 restores the pitch harmonic structure, that is, the pitch harmonic power spectrum. If, as a result of the determination by voicedness determination section 106, the voicedness of the speech power spectrum is equal to or lower than a predetermined level, restoration of the pitch harmonic power spectrum is skipped. The restored pitch harmonic power spectrum is output to band-specific sound/noise correcting section 109.
- band-specific sound/noise correcting section 109 selects either the pitch harmonic power spectrum restored by pitch harmonic structure restoring section 108 or the pitch harmonic power spectrum extracted by pitch harmonic structure extracting section 105, according to the result of the determination by voicedness determination section 106.
- the detection result is corrected based on the selected pitch harmonic power spectrum. For example, when the voicedness of the speech power spectrum is determined to be equal to or lower than a predetermined level, the extracted pitch harmonic power spectrum is selected. In this case, the detection result is corrected by combining the pitch harmonic power spectrum from pitch harmonic structure extracting section 105 with the detection result from band-specific sound/noise detecting section 104.
- when the restored pitch harmonic power spectrum is selected, band-specific sound/noise correcting section 109 combines the pitch harmonic power spectrum from pitch harmonic structure restoring section 108 with the detection result from band-specific sound/noise detecting section 104 to correct the detection result.
- the corrected detection result is output to subtraction/attenuation coefficient calculation section 110.
- subtraction/attenuation coefficient calculation section 110 calculates the subtraction/attenuation coefficient based on the speech power spectrum from FFT section 102, the noise base from noise base estimating section 103, and the detection result from band-specific sound/noise correcting section 109. The calculated subtraction/attenuation coefficient is output to multiplication section 111.
- Multiplication section 111 multiplies the sound band and the noise band in the speech power spectrum from FFT section 102 by the subtraction/attenuation coefficient from subtraction/attenuation coefficient calculation section 110. As a result, a speech power spectrum in which noise components are suppressed is obtained. The result of this multiplication is output to IFFT section 112.
- the combination of subtraction/attenuation coefficient calculation section 110 and multiplication section 111 constitutes a suppression unit that suppresses the noise component from the speech power spectrum using the detection results of the sound band and the noise band in the speech power spectrum including the noise component.
- IFFT section 112 performs an IFFT on the speech power spectrum obtained as the multiplication result from multiplication section 111. As a result, a speech signal in which noise components are suppressed is generated from the noise-suppressed speech power spectrum.
- FIGS. 2A to 2E are diagrams for explaining the operation of correcting the detection results of the sound band and the noise band.
- Speech power spectrum S(k) is expressed by the following equation (1).
- k indicates a number for specifying a frequency component of the frequency band of the speech power spectrum.
- Re{D(k)} and Im{D(k)} are the real part and the imaginary part, respectively, of the speech spectrum after the FFT.
- Equation (1) uses the square root
- noise base estimating section 103 generates a noise base based on speech power spectrum S(k).
- N(n-1, k) is the noise base in the previous frame.
- ⁇ is the noise-based moving average coefficient
- ⁇ is the audio component
- band-specific sound/noise detecting section 104 detects the sound band and the noise band in speech power spectrum S(k) based on speech power spectrum S(k) and noise base N(n, k).
- pitch harmonic structure extracting section 105 extracts pitch harmonic power spectrum H(k) from speech power spectrum S(k). Pitch harmonic power spectrum H(k) is calculated using equation (4), which is defined for frequency components 1 ≤ k ≤ HB/2.
- voicedness determination section 106 determines the voicedness of speech power spectrum S(k) based on noise base N(n, k) and pitch harmonic power spectrum H(k).
- the frequency band (1 to HP) is set as the target band for voicedness determination. That is, HP is the upper limit frequency component in the determination target band.
- the frequency band (1 to HB/2) is divided into low, middle, and high bands, and each band is used as a specific frequency band to determine voicedness.
- the frequency band (1 to HB/2) may be divided into a low band and a high band, and each band may be used as a specific frequency band to determine voicedness.
- the pitch harmonic power spectrum H (k) is extracted with high quality.
- voicedness determination section 106 has a configuration for identifying whether the original voice is a consonant or a vowel based on the voicedness determination result for each band obtained by dividing the frequency band.
- whether to restore the pitch harmonic power spectrum H(k) can then be decided differently for consonants and vowels.
- the voicedness of a specific frequency band is determined using the following equation (5), by calculating the ratio between the sum of the power of the part of pitch harmonic power spectrum H(k) corresponding to the specific frequency band and the sum of the power of the part of noise base N(n, k) corresponding to the specific frequency band. If the result of this determination is that the voicedness of the specific frequency band is higher than a predetermined level, pitch frequency estimation and pitch harmonic structure restoration described later are performed.
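The band-wise ratio test described above can be sketched as below. Equation (5) itself is not legible in this text, so the sketch assumes that both inputs already hold power values per frequency component and that voicedness is a simple ratio of band sums; `is_voiced` and `level` are hypothetical names.

```python
def is_voiced(pitch_harmonic, noise_base, lo, hi, level=1.5):
    """Return True if the band lo..hi (inclusive) is judged voiced.

    pitch_harmonic, noise_base: per-component power values (assumed).
    level: the predetermined voicedness level (assumed value).
    """
    h_power = sum(pitch_harmonic[lo:hi + 1])
    n_power = sum(noise_base[lo:hi + 1])
    if n_power == 0.0:
        # No noise estimate in the band: voiced only if harmonics exist.
        return h_power > 0.0
    return h_power / n_power > level

voiced = is_voiced([0.0, 4.0, 0.0, 4.0], [1.0, 1.0, 1.0, 1.0], 0, 3)
silent = is_voiced([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], 0, 3)
```

Calling the function once per low, middle, and high band gives the per-band decisions that gate pitch frequency estimation and restoration.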
- the band-specific sound/noise correction unit 109 uses the extracted pitch harmonic spectrum H (k) to extract the speech spectrum.
- the detection accuracy of the sound band and the noise band can be significantly improved.
- Pitch frequency estimating section 107 uses equation (6) to calculate the characteristics of noise base N (n, k).
- the restoration is performed in the following procedure when it is determined that the voicedness of a specific frequency band is higher than a predetermined level.
- Pitch harmonic peaks (p1 to p5, p9 to p12) are extracted.
- the extraction of the pitch harmonic peak may be performed only for a specific frequency band.
- the interval between the extracted peaks is calculated. When the calculated interval exceeds a predetermined threshold value (for example, 1.5 times the pitch frequency), it is determined that the pitch harmonic power spectrum H(k) is missing in that interval, and, as shown in FIG. 2D, the missing peaks are inserted based on the estimated pitch frequency.
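The peak restoration procedure can be illustrated with the following sketch. It assumes that peak positions and the estimated pitch are expressed as integer frequency-component indices, and the function name `restore_peaks` is hypothetical; the 1.5 factor is the example value given in the text.

```python
def restore_peaks(peaks, pitch, factor=1.5):
    """Fill gaps in a sorted list of harmonic peak positions.

    peaks: sorted frequency-component indices of the extracted peaks.
    pitch: estimated pitch interval in frequency components (assumed unit).
    factor: a gap wider than factor * pitch is treated as a missing harmonic.
    """
    if not peaks:
        return []
    restored = [peaks[0]]
    for nxt in peaks[1:]:
        # Insert peaks at pitch intervals while the gap remains too wide.
        while nxt - restored[-1] > factor * pitch:
            restored.append(restored[-1] + pitch)
        restored.append(nxt)
    return restored

# Peaks p1..p5 and p9..p12 extracted; p6..p8 are missing.
extracted = [10, 20, 30, 40, 50, 90, 100, 110, 120]
restored = restore_peaks(extracted, pitch=10)
```

With these inputs the three missing harmonics between the fifth and ninth peaks are re-inserted at pitch spacing.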
- the band-specific sound/noise correction unit 109 corrects the detection result S(k) by comparing it with the restored pitch harmonic power spectrum H(k).
- the portion that overlaps with the restored pitch harmonic power spectrum H(k) is regarded as the sound band.
- the portion that does not overlap with the restored pitch harmonic power spectrum H(k) is regarded as the noise band.
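One way to read the correction rule above is as a per-component logical AND between the detection result and the harmonic structure. The sketch below makes that assumption explicit; `correct_detection` and the boolean representation of the bands are illustrative choices, not taken from the patent.

```python
def correct_detection(detected, harmonic_mask):
    """Correct per-component sound/noise flags.

    detected[k]: True where the detector marked a sound band.
    harmonic_mask[k]: True where the pitch harmonic power spectrum is present.
    Returns True for sound band, False for noise band.
    """
    # Only components that overlap the harmonic structure stay sound band.
    return [d and h for d, h in zip(detected, harmonic_mask)]

corrected = correct_detection(
    [True, True, False, False],
    [True, False, True, False],
)
```

Under this reading, residual noise between pitch harmonics that the detector marked as sound is reclassified as noise band.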
- the subtraction/attenuation coefficient calculation unit 110 calculates the subtraction/attenuation coefficient Gc(k) for the sound band and the noise band in the corrected detection result S(k).
- ⁇ is a constant and g is a predetermined constant greater than zero and less than 1.
- $G_c(k) = g_c, \quad k \in \text{noise band} \quad \cdots (8)$
- the detection result S (k) is
- the noise suppression accuracy can be further improved.
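The coefficient calculation and multiplication steps can be sketched as follows. Only the noise-band rule (a constant attenuation g_c between 0 and 1) is legible in the text; the sound-band formula used here is an assumed spectral-subtraction style gain, and `gains`, `suppress`, and `mu` are hypothetical names.

```python
def gains(speech, noise_base, is_sound, mu=1.0, g_c=0.1):
    """Compute a per-component subtraction/attenuation coefficient.

    Sound band: (S - mu * N) / S, floored at 0 (assumed form).
    Noise band: constant attenuation g_c, with 0 < g_c < 1.
    """
    out = []
    for s, n, sound in zip(speech, noise_base, is_sound):
        if sound and s > 0.0:
            out.append(max((s - mu * n) / s, 0.0))
        else:
            out.append(g_c)
    return out

def suppress(speech, coeffs):
    """Multiply each spectral component by its coefficient."""
    return [s * g for s, g in zip(speech, coeffs)]

g = gains([4.0, 2.0], [1.0, 1.0], [True, False])
clean = suppress([4.0, 2.0], g)
```

Keeping a small non-zero g_c in the noise band leaves a low residual floor instead of zeroing the components outright.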
- FIG. 3 is a block diagram showing a configuration of a noise suppression device according to Embodiment 2 of the present invention. Since the noise suppression device described in the present embodiment has the same basic configuration as that described in Embodiment 1, the same or corresponding components are given the same reference characters, and detailed description thereof is omitted.
- the noise suppressing device 200 shown in FIG. 3 has a configuration in which a speech/noise frame determining unit 201 is added to the components of the noise suppressing device 100 described in the first embodiment.
- Speech/noise frame determination section 201 determines whether the frame from which the speech power spectrum was obtained is a speech frame or a noise frame, based on the speech power spectrum from FFT section 102 and the noise base from noise base estimating section 103. The result of the determination is output to voicedness determination section 106 and band-specific sound/noise correction section 109.
- the frame determination operation of speech/noise frame determination section 201 will now be described more specifically.
- speech/noise frame determination section 201 first calculates two ratios using the following equations, based on the speech power spectrum S(k) from FFT section 102 and the noise base N(n, k) from noise base estimating section 103.
- One of the two ratios is the ratio SNR between the speech power and the noise power in the lower frequency band of the speech power spectrum S (k).
- HL is the upper limit frequency component in the above low frequency range.
- HF is the upper limit frequency component in the frequency band of the audio power spectrum S (k).
- frame determination is performed using the following equation (11).
- frame information SNF is generated.
- Frame information SNF is information indicating whether the frame subject to determination is a speech frame or a noise frame.
- M is the number of hangover frames. Also, when R is less than or equal to ⁇
- the result of the frame judgment is a speech frame.
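A frame decision with hangover, in the spirit of the description above, might be sketched like this. The equations for the two ratios and for the decision are not legible in this text, so the power-ratio definition, the use of the larger ratio, and the names `band_snr`, `FrameClassifier`, `threshold`, and `hangover` are all assumptions.

```python
def band_snr(speech, noise, upper):
    """Ratio of speech power to noise power over components 1..upper."""
    sp = sum(s * s for s in speech[1:upper + 1])
    npow = sum(n * n for n in noise[1:upper + 1])
    return sp / npow if npow > 0.0 else float("inf")

class FrameClassifier:
    """Speech/noise frame decision with M hangover frames (assumed logic)."""

    def __init__(self, hl, hf, threshold=2.0, hangover=3):
        self.hl = hl                  # upper limit of the low band
        self.hf = hf                  # upper limit of the full band
        self.threshold = threshold    # assumed decision threshold
        self.hangover = hangover      # M, number of hangover frames
        self.remaining = 0

    def classify(self, speech, noise):
        r = max(band_snr(speech, noise, self.hl),
                band_snr(speech, noise, self.hf))
        if r > self.threshold:
            self.remaining = self.hangover
            return "speech"
        if self.remaining > 0:
            # Hangover: keep reporting speech for a few frames after it ends.
            self.remaining -= 1
            return "speech"
        return "noise"

noise = [1.0] * 5
loud = [0.0, 3.0, 3.0, 3.0, 3.0]
quiet = [0.0, 1.0, 1.0, 1.0, 1.0]
clf = FrameClassifier(hl=2, hf=4, threshold=2.0, hangover=1)
results = [clf.classify(loud, noise),
           clf.classify(quiet, noise),
           clf.classify(quiet, noise)]
```

The hangover keeps weak speech tails from being reclassified as noise immediately after a strong speech frame.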
- when the frame to be determined is determined to be a speech frame, normal operation (the operation described in the first embodiment) is performed in voicedness determination section 106 and band-specific sound/noise correction section 109. On the other hand, when the frame to be determined is determined to be a noise frame, voicedness determination section 106 forcibly determines that the voicedness of speech power spectrum S(k) is equal to or lower than the predetermined level,
- and band-specific sound/noise correction section 109 corrects the entire band as a noise band.
- the voicing of the entire band of the audio power spectrum S (k) is equal to or less than the predetermined level.
- the load on the correction unit can be reduced.
- the ratio SNR of the power in the low band of audio power spectrum S (k) is
- the power spectrum of a high-sound component can be emphasized, while the power spectrum of a low-correlation noise component can be reduced. As a result, the accuracy of frame determination can be improved.
- FIG. 4 is a block diagram showing a configuration of a noise suppression device according to Embodiment 3 of the present invention. Note that the noise suppression device described in the present embodiment has the same basic configuration as the noise suppression device described in Embodiment 1; the same or corresponding components are given the same reference characters, and a detailed description thereof is omitted.
- Noise suppression device 300 shown in FIG. 4 has a configuration in which a subtraction/attenuation coefficient averaging unit 301 is added to the components of noise suppression device 100 described in the first embodiment.
- the subtraction/attenuation coefficient averaging unit 301 averages the subtraction/attenuation coefficient obtained as a result of the calculation by the subtraction/attenuation coefficient calculation unit 110 in each of the time domain and the frequency domain.
- the averaged subtraction/attenuation coefficient is output to the multiplication unit 111.
- the combination of the subtraction/attenuation coefficient calculation unit 110, the subtraction/attenuation coefficient averaging unit 301, and the multiplication unit 111 constitutes a suppression unit that suppresses the noise component from the speech power spectrum using the detection results of the sound band and the noise band in the speech power spectrum including the noise component.
- the subtraction/attenuation coefficient obtained by the calculation in the subtraction/attenuation coefficient calculation unit 110 is averaged in the time domain using the following equation (12).
- the moving average coefficient that satisfies the relationship is the moving average coefficient that satisfies the relationship.
- the subtraction/attenuation coefficient is averaged in the frequency domain.
- K — K is the number of frequency components as the averaging target range.
- the subtraction / attenuation coefficient subjected to the time averaging process using Equation (12) is compared with the subtraction / attenuation coefficient subjected to the frequency averaging process using Equation (13).
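The two averaging steps can be sketched as below. Equations (12) and (13) are not legible in this text, so the exponential moving average over frames, the simple mean over neighbouring components, and the names `time_average`, `frequency_average`, `beta`, `k_low`, and `k_high` are assumptions for illustration.

```python
def time_average(prev_avg, current, beta=0.5):
    """Smooth coefficients across frames with a moving average (assumed form)."""
    return [beta * p + (1.0 - beta) * c for p, c in zip(prev_avg, current)]

def frequency_average(coeffs, k_low=1, k_high=1):
    """Average each coefficient with its neighbours k - k_low .. k + k_high."""
    out = []
    for k in range(len(coeffs)):
        lo = max(0, k - k_low)
        hi = min(len(coeffs), k + k_high + 1)
        window = coeffs[lo:hi]
        out.append(sum(window) / len(window))
    return out

t_avg = time_average([0.0, 1.0], [1.0, 1.0])
f_avg = frequency_average([0.0, 3.0, 0.0])
```

Both operations trade a little responsiveness for smoother gain trajectories, which is the stated motivation for the averaging unit.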
- according to the present embodiment, since the time averaging process is performed on the subtraction/attenuation coefficient used for noise suppression, speech discontinuity caused by a rapid change in the subtraction/attenuation coefficient on the time axis can be reduced, and speech distortion caused by fluctuation of residual noise can also be reduced.
- in addition, since the frequency averaging process is performed, the discontinuity of the attenuation on the frequency axis is reduced, so speech distortion can be reduced even when the noise attenuation is increased.
- the subtraction/attenuation coefficient averaging unit 301 described in the present embodiment can also be used in the noise suppression device 200 described in the second embodiment.
- FIG. 5 is a block diagram showing a configuration of a noise suppression device according to Embodiment 4 of the present invention. Note that the noise suppression device described in the present embodiment has the same basic configuration as the noise suppression device described in Embodiment 1; the same or corresponding components are given the same reference characters, and a detailed description thereof is omitted.
- the noise suppressing device 400 shown in FIG. 5 has a configuration in which a deadlock prevention unit 401 is added to the components of the noise suppressing device 100 described in the first embodiment.
- noise base estimating section 103 in noise suppression apparatus 400 may stop updating the noise base when the level of the noise component changes abruptly, that is, it may enter a deadlock state.
- the deadlock prevention unit 401 has a counter.
- the counter is provided for each frequency component in the frequency band of the speech power spectrum and counts the number of consecutive times the corresponding frequency component is determined to be higher than a predetermined value with respect to the noise base estimated by noise base estimating section 103.
- based on the counted number, deadlock prevention unit 401 prevents noise base estimating section 103 from stopping the update of the noise base, that is, prevents the deadlock state.
- step S 1000 the deadlock prevention unit 401 uses the speech power spectrum S (k)
- the noise base estimating unit 103 performs normal noise base estimation (S1010). Then, in step S1020, the number count(k) counted by the counter provided in the deadlock prevention unit 401 is reset to zero. Then, the process returns to step S1000.
- step S 1000 the speech power spectrum S (k)
- in step S1040, the deadlock prevention unit 401 compares the number count(k) with a predetermined threshold. As a result of the comparison, when count(k) is larger than the threshold (S1040: YES), the deadlock prevention unit 401 takes the minimum value of the noise power spectrum in a predetermined band including the corresponding frequency component k as the updated value of the noise base N(n, k) (S1050).
- then, the noise base N(n, k) is updated using the updated value (S1060).
- in step S1040, when count(k) is equal to or smaller than the threshold (S1040: NO), the process directly returns to step S1000.
- the power in the voice power spectrum S (k) is equal to or more than the predetermined value for the predetermined number of consecutive times.
- the noise base N (n, k) can be updated with the minimum value of the noise power spectrum in a predetermined band including the frequency component k, and as a result, speech section noise is reduced.
- the deadlock state can be prevented regardless of the sound section.
- the predetermined band is preferably provided between peaks in the pitch harmonic. As a result, the valley of the noise power spectrum can be detected, and the minimum value of the noise power spectrum serving as the updated value can be easily detected.
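The flow of steps S1000 to S1060 can be sketched as follows. Several details are not legible in this text, so the ratio test in S1000, the increment step, the use of the current frame's spectrum to find the band minimum, and the names `deadlock_step`, `ratio`, `max_count`, and `half_band` are assumptions.

```python
def deadlock_step(speech, noise_base, counts,
                  ratio=2.0, max_count=2, half_band=2, alpha=0.9):
    """Process one frame; noise_base and counts are updated in place."""
    n_bins = len(speech)
    for k in range(n_bins):
        if speech[k] <= ratio * noise_base[k]:            # S1000 (assumed test)
            # S1010: normal noise base estimation (assumed moving average).
            noise_base[k] = alpha * noise_base[k] + (1.0 - alpha) * speech[k]
            counts[k] = 0                                  # S1020
            continue
        counts[k] += 1                                     # increment (assumed)
        if counts[k] > max_count:                          # S1040
            lo = max(0, k - half_band)
            hi = min(n_bins, k + half_band + 1)
            # S1050/S1060: replace with the minimum in the surrounding band.
            noise_base[k] = min(speech[lo:hi])

speech = [10.0] * 5
noise_base = [1.0] * 5
counts = [0] * 5
deadlock_step(speech, noise_base, counts)
deadlock_step(speech, noise_base, counts)
after_two = list(noise_base)
deadlock_step(speech, noise_base, counts)
after_three = list(noise_base)
```

In this example the noise level jumps and stays high; after the counter exceeds the threshold the stalled noise base is forced up to the band minimum, so normal estimation can resume on the next frame.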
- deadlock prevention section 401 described in the present embodiment can also be used in noise suppression apparatuses 200 and 300 described in Embodiments 2 and 3.
- a computer may execute the noise suppression method as software. That is, a program for executing the noise suppression method described in the above embodiment is previously stored in, for example, a ROM (Read Only Memory) or the like.
- the noise suppression method of the present invention can be executed by recording the program on a recording medium and running the program on a CPU (Central Processing Unit).
- Each functional block used in the description of each of the above embodiments is typically realized as an LSI that is an integrated circuit. These may be individually made into one chip, or may be made into one chip so as to include a part or all of them.
- although referred to here as an LSI, it may also be called an IC, a system LSI, a super LSI, or a general LSI, depending on the degree of integration.
- the method of circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. It is also possible to use an FPGA (Field Programmable Gate Array) that can be programmed after the LSI is manufactured, or a reconfigurable processor that can reconfigure the connections and settings of circuit cells inside the LSI.
- other circuit integration technology may also be used to integrate the functional blocks; application of biotechnology is also a possibility.
- the noise suppression device and the noise suppression method of the present invention have an effect of improving noise suppression accuracy while reducing voice distortion, and can be applied to a voice communication device, a voice recognition device, and the like.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Noise Elimination (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/629,381 US20080281589A1 (en) | 2004-06-18 | 2005-05-30 | Noise Suppression Device and Noise Suppression Method |
EP05743170A EP1768108A4 (en) | 2004-06-18 | 2005-05-30 | NOISE SUPPRESSION DEVICE AND NOISE SUPPRESSION METHOD |
JP2006514681A JPWO2005124739A1 (ja) | 2004-06-18 | 2005-05-30 | Noise suppression device and noise suppression method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2004-181454 | 2004-06-18 | ||
JP2004181454 | 2004-06-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2005124739A1 (ja) | 2005-12-29 |
Family
ID=35509948
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2005/009859 WO2005124739A1 (ja) | Noise suppression device and noise suppression method | 2004-06-18 | 2005-05-30 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20080281589A1 (ja) |
EP (1) | EP1768108A4 (ja) |
JP (1) | JPWO2005124739A1 (ja) |
CN (1) | CN1969320A (ja) |
WO (1) | WO2005124739A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008116686A (ja) * | 2006-11-06 | 2008-05-22 | Nec Engineering Ltd | Noise suppression device |
JP2010217552A (ja) * | 2009-03-17 | 2010-09-30 | Yamaha Corp | Sound processing device and program |
WO2012038998A1 (ja) * | 2010-09-21 | 2012-03-29 | 三菱電機株式会社 | Noise suppression device |
JP2019060942A (ja) * | 2017-09-25 | 2019-04-18 | 富士通株式会社 | Speech processing program, speech processing method, and speech processing device |
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006006366A1 (ja) * | 2004-07-13 | 2006-01-19 | Matsushita Electric Industrial Co., Ltd. | Pitch frequency estimation device and pitch frequency estimation method |
US7873114B2 (en) * | 2007-03-29 | 2011-01-18 | Motorola Mobility, Inc. | Method and apparatus for quickly detecting a presence of abrupt noise and updating a noise estimate |
EP2031583B1 (en) * | 2007-08-31 | 2010-01-06 | Harman Becker Automotive Systems GmbH | Fast estimation of spectral noise power density for speech signal enhancement |
EP2058803B1 (en) * | 2007-10-29 | 2010-01-20 | Harman/Becker Automotive Systems GmbH | Partial speech reconstruction |
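KR101317813B1 (ko) * | 2008-03-31 | 2013-10-15 | (주)트란소노 | Method for processing a noisy speech signal, apparatus therefor, and computer-readable recording medium |
KR101335417B1 (ko) * | 2008-03-31 | 2013-12-05 | (주)트란소노 | Method for processing a noisy speech signal, apparatus therefor, and computer-readable recording medium |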
US9142221B2 (en) * | 2008-04-07 | 2015-09-22 | Cambridge Silicon Radio Limited | Noise reduction |
US9253568B2 (en) * | 2008-07-25 | 2016-02-02 | Broadcom Corporation | Single-microphone wind noise suppression |
US8515097B2 (en) * | 2008-07-25 | 2013-08-20 | Broadcom Corporation | Single microphone wind noise suppression |
JP5245714B2 (ja) * | 2008-10-24 | 2013-07-24 | ヤマハ株式会社 | Noise suppression device and noise suppression method |
WO2010113220A1 (ja) * | 2009-04-02 | 2010-10-07 | 三菱電機株式会社 | Noise suppression device |
US8423357B2 (en) * | 2010-06-18 | 2013-04-16 | Alon Konchitsky | System and method for biometric acoustic noise reduction |
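JP5566846B2 (ja) * | 2010-10-15 | 2014-08-06 | 本田技研工業株式会社 | Noise power estimation device and noise power estimation method, and speech recognition device and speech recognition method |
CN103620113B (zh) * | 2011-04-28 | 2015-12-23 | Abb技术有限公司 | Determining CD and MD variations from scanning measurements of a sheet of material |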
US20130282373A1 (en) * | 2012-04-23 | 2013-10-24 | Qualcomm Incorporated | Systems and methods for audio signal processing |
US9865277B2 (en) * | 2013-07-10 | 2018-01-09 | Nuance Communications, Inc. | Methods and apparatus for dynamic low frequency noise suppression |
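CN104778949B (zh) * | 2014-01-09 | 2018-08-31 | 华硕电脑股份有限公司 | Audio processing method and audio processing device |
JP6206271B2 (ja) * | 2014-03-17 | 2017-10-04 | 株式会社Jvcケンウッド | Noise reduction device, noise reduction method, and noise reduction program |
CN104242850A (zh) * | 2014-09-09 | 2014-12-24 | 联想(北京)有限公司 | Audio signal processing method and electronic device |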
US9734844B2 (en) * | 2015-11-23 | 2017-08-15 | Adobe Systems Incorporated | Irregularity detection in music |
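CN106998214A (zh) * | 2017-04-05 | 2017-08-01 | 深圳天珑无线科技有限公司 | Harmonic processing method and device |
CN109862463A (zh) * | 2018-12-26 | 2019-06-07 | 广东思派康电子科技有限公司 | Earphone voice playback method, earphone, and computer-readable storage medium thereof |
CN111292758B (zh) * | 2019-03-12 | 2022-10-25 | 展讯通信(上海)有限公司 | Voice activity detection method and device, and readable storage medium |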
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0836400A (ja) * | 1994-07-25 | 1996-02-06 | Kokusai Electric Co Ltd | Voice state determination circuit |
JPH09152894A (ja) * | 1995-11-30 | 1997-06-10 | Denso Corp | Speech/silence discriminator |
JPH09311698A (ja) * | 1996-05-21 | 1997-12-02 | Oki Electric Ind Co Ltd | Background noise cancellation device |
JP2001249698A (ja) * | 2000-03-06 | 2001-09-14 | Yrp Kokino Idotai Tsushin Kenkyusho:Kk | Speech coding parameter acquisition method, speech decoding method, and device |
JP2002149200A (ja) * | 2000-08-31 | 2002-05-24 | Matsushita Electric Ind Co Ltd | Speech processing device and speech processing method |
JP2003280696A (ja) * | 2002-03-19 | 2003-10-02 | Matsushita Electric Ind Co Ltd | Speech enhancement device and speech enhancement method |
JP2004020679A (ja) * | 2002-06-13 | 2004-01-22 | Matsushita Electric Ind Co Ltd | Noise suppression device and noise suppression method |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5659622A (en) * | 1995-11-13 | 1997-08-19 | Motorola, Inc. | Method and apparatus for suppressing noise in a communication system |
AU2001241475A1 (en) * | 2000-02-11 | 2001-08-20 | Comsat Corporation | Background noise reduction in sinusoidal based speech coding systems |
AU2002241476A1 (en) * | 2000-11-22 | 2002-07-24 | Defense Group Inc. | Noise filtering utilizing non-gaussian signal statistics |
US7716046B2 (en) * | 2004-10-26 | 2010-05-11 | Qnx Software Systems (Wavemakers), Inc. | Advanced periodic signal enhancement |
2005
- 2005-05-30 WO PCT/JP2005/009859 patent/WO2005124739A1/ja not_active Application Discontinuation
- 2005-05-30 CN CN200580020128.3A patent/CN1969320A/zh active Pending
- 2005-05-30 JP JP2006514681A patent/JPWO2005124739A1/ja not_active Withdrawn
- 2005-05-30 EP EP05743170A patent/EP1768108A4/en not_active Withdrawn
- 2005-05-30 US US11/629,381 patent/US20080281589A1/en not_active Abandoned
Non-Patent Citations (4)
Title |
---|
PATEL N.V. ET AL: "Audio characterization for video indexing", PROC. OF SPIE, vol. 2670, 1996, pages 373 - 384, XP000950031 * |
See also references of EP1768108A4 * |
WANG Y. ET AL: "Comb Filtering o Mochiita Onsei to Zatsuon no Bunri no Kento", THE ACOUSTICAL SOCIETY OF JAPAN (ASJ) 2002 NEN SHUNKI KENKYU HAPPYOKAI KOEN RONBUNSHU-I-, 18 March 2002 (2002-03-18), pages 609 - 610, XP002995868 * |
WANG Y. ET AL: "Pitch Choka Kozo no Shufuku o Mochiita Onsei Kyochoho no Kento", THE ACOUSTICAL SOCIETY OF JAPAN (ASJ) 2001 NEN SHUKI KENKYU HAPPYOKAI KOEN RONBUNSHU-I-, 2 October 2001 (2001-10-02), pages 603 - 604, XP002995869 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008116686A (ja) * | 2006-11-06 | 2008-05-22 | Nec Engineering Ltd | Noise suppression device |
JP4757775B2 (ja) * | 2006-11-06 | 2011-08-24 | Necエンジニアリング株式会社 | Noise suppression device |
JP2010217552A (ja) * | 2009-03-17 | 2010-09-30 | Yamaha Corp | Sound processing device and program |
WO2012038998A1 (ja) * | 2010-09-21 | 2012-03-29 | 三菱電機株式会社 | Noise suppression device |
JP5183828B2 (ja) * | 2010-09-21 | 2013-04-17 | 三菱電機株式会社 | Noise suppression device |
US8762139B2 (en) | 2010-09-21 | 2014-06-24 | Mitsubishi Electric Corporation | Noise suppression device |
JP2019060942A (ja) * | 2017-09-25 | 2019-04-18 | 富士通株式会社 | Speech processing program, speech processing method, and speech processing device |
US11069373B2 (en) | 2017-09-25 | 2021-07-20 | Fujitsu Limited | Speech processing method, speech processing apparatus, and non-transitory computer-readable storage medium for storing speech processing computer program |
Also Published As
Publication number | Publication date |
---|---|
CN1969320A (zh) | 2007-05-23 |
JPWO2005124739A1 (ja) | 2008-04-17 |
EP1768108A1 (en) | 2007-03-28 |
EP1768108A4 (en) | 2008-03-19 |
US20080281589A1 (en) | 2008-11-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2005124739A1 (ja) | Noise suppression device and noise suppression method | |
CA2732723C (en) | Apparatus and method for processing an audio signal for speech enhancement using a feature extraction | |
JP3574123B2 (ja) | Noise suppression device | |
US7286980B2 (en) | Speech processing apparatus and method for enhancing speech information and suppressing noise in spectral divisions of a speech signal | |
US6415253B1 (en) | Method and apparatus for enhancing noise-corrupted speech | |
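JP5752324B2 (ja) | Single-channel suppression of impulsive interference in noisy speech signals | |
JP3960834B2 (ja) | Speech enhancement device and speech enhancement method | |
WO2006006366A1 (ja) | Pitch frequency estimation device and pitch frequency estimation method | |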
US20020128830A1 (en) | Method and apparatus for suppressing noise components contained in speech signal | |
US10332541B2 (en) | Determining noise and sound power level differences between primary and reference channels | |
US11183172B2 (en) | Detection of fricatives in speech signals | |
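JP4173525B2 (ja) | Noise suppression device and noise suppression method | |
JP4445460B2 (ja) | Speech processing device and speech processing method | |
JP5131149B2 (ja) | Noise suppression device and noise suppression method | |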
JP2006126859A5 (ja) | ||
JP3761497B2 (ja) | Speech recognition device, speech recognition method, and speech recognition program | |
Islam et al. | Speech enhancement in adverse environments based on non-stationary noise-driven spectral subtraction and snr-dependent phase compensation | |
JP4098271B2 (ja) | Noise suppression device | |
JP2006201622A (ja) | Band-division noise suppression device and band-division noise suppression method | |
Singh et al. | Sigmoid based Adaptive Noise Estimation Method for Speech Intelligibility Improvement | |
BRPI0911932B1 (pt) | Apparatus and method for processing an audio signal for speech enhancement using a feature extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2006514681 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 11629381 Country of ref document: US Ref document number: 2005743170 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 200580020128.3 Country of ref document: CN |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: DE |
|
WWP | Wipo information: published in national office |
Ref document number: 2005743170 Country of ref document: EP |