KR102012522B1 - Apparatus for processing directional sound - Google Patents
Apparatus for processing directional sound
- Publication number
- KR102012522B1
- Authority
- KR
- South Korea
- Prior art keywords
- input
- sound
- microphones
- signal
- coherence
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The present invention discloses a directional sound processing apparatus. The apparatus includes: a plurality of microphones for receiving ambient sound; a coherence calculator for calculating coherence for each frequency band of the sound signals input from the plurality of microphones; an estimator for estimating the existence probability of a speech signal based on the sound signal input from each of the plurality of microphones; a weight calculator configured to assign different weights to the respective frequency bands according to the estimated existence probability of the speech signal, and to output a result value by applying the assigned weights to the coherence; and a controller which compares the result value with a preset threshold and determines whether to process the sound signals input from the plurality of microphones.
Description
The present invention relates to an apparatus for processing directional acoustic signals, and more particularly, to an apparatus for selectively acquiring and processing a sound in a desired direction in a noise environment.
A microphone is a device that receives sound waves or ultrasonic waves and generates an electrical signal according to their vibrations.
Recently, with the development of robot-related technology, the microphone is used as an interface for communication between a robot and a user, and the robot recognizes the user's voice by converting the sound signal input through the microphone into an electrical signal.
This recognition process is always accompanied by noise generated in the environment surrounding the microphone. To recognize a wider variety of commands accurately and to improve user convenience, it is essential to receive only the sound signal, excluding the noise.
In addition, many devices, such as mobile communication terminals and navigation devices, have a voice recognition function. In such devices, it is essential to process only the voice signal from a desired direction so that operation is based on accurate voice recognition.
A number of conventional sound equipment companies have proposed an apparatus for blocking ambient noise by applying active noise canceling technology.
Conventional devices, however, are used for the purpose of creating a quiet environment that eliminates all ambient noise.
In addition, although the conventional technology of obtaining only a sound coming from a specific direction by using a multi-channel microphone has been applied to headphones and hearing aids, there is a limitation in utilization because it only accepts sound in a direction in which a person's eyes are directed.
More specifically, conventional noise canceling methods include a method using a generalized sidelobe canceller (GSC) and a method using a phase difference between signals input to two microphones.
The former method suffers from output divergence or slow convergence, caused by the convergence problem of the adaptive filter in the GSC.
The latter method has no convergence problem because it does not use an adaptive filter. However, because it uses only the phase information between two microphones, the system reacts sensitively to small changes in phase, so temporary errors occur frequently.
Korean Patent Laid-Open No. 2008-0000478 (title of invention: Method and apparatus for removing noise of signals input by a plurality of microphones in a portable terminal) discloses a method and apparatus for removing noise using the phase difference of signals input to a portable terminal.
However, as described above, this conventional patent uses only the phase difference to remove directional signals, so the stability of the system is low and noise cannot be removed effectively from the voice signal.
To solve the above problems of the prior art, the present invention proposes a directional sound signal processing apparatus that can obtain only the sound from a specific direction desired by the user.
To solve the above technical problem, an embodiment of the present invention provides a directional sound signal processing apparatus including: a plurality of microphones for receiving ambient sound; a coherence calculator for calculating coherence for each frequency band of the sound signals input from the plurality of microphones; an estimator for estimating the existence probability of a speech signal based on the sound signal input from each of the plurality of microphones; a weight calculator configured to assign different weights to the respective frequency bands according to the estimated existence probability of the speech signal, and to output a result value by applying the assigned weights to the coherence; and a controller which compares the result value with a preset threshold and determines whether to process the sound signals input from the plurality of microphones.
The apparatus may further include an interface unit for receiving the angle from which the user wants to listen, and a phase converting unit for converting the phases of the sound signals input from the plurality of microphones according to the listening angle input through the interface unit.
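The phase conversion by listening angle can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes two microphones spaced `mic_spacing_m` apart on a line facing the front, a far-field source, a speed of sound of 343 m/s, and a sign convention in which the second microphone lags the first for positive angles. The function name and all parameters are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the patent


def steer_second_channel(y_spectrum, freqs_hz, angle_deg, mic_spacing_m):
    """Phase-shift the second microphone's spectrum toward a listening angle.

    Assumption: 0 degrees is the front (no inter-microphone delay) and
    90 degrees is the side (maximum delay), with the second microphone
    receiving the wavefront later than the first.
    """
    # Far-field delay between the two microphones for the chosen angle.
    delay_s = mic_spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    # Advancing the lagging channel by the same delay aligns the phases
    # of sound arriving from the listening angle.
    return [
        y * cmath.exp(1j * 2.0 * math.pi * f * delay_s)
        for y, f in zip(y_spectrum, freqs_hz)
    ]
```

With a listening angle of 0° the spectrum is returned unchanged, which matches the default setting described for sound arriving from the front.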
The estimator may include a background noise estimator for estimating background noise through an acoustic signal input from each of the plurality of microphones; And a signal existence probability estimator estimating a speech signal existence probability of each frequency band based on the background noise.
The weight calculator may assign a weight of 1 when the signal existence probability is equal to or greater than a preset value, and may assign the signal existence probability itself as the weight when it is less than the preset value.
The result value may be the sum, over the frequency bands, of the coherence of each band multiplied by its weight.
The apparatus may further include a sound output selection unit configured to select, according to the determination of the controller, whether the sound signals input from the plurality of microphones are output to at least one of a speaker, a voice recognition unit, and a transmission unit.
According to the present invention, since the background noise is estimated and the signal existence probability is determined to determine whether to process the acoustic signal, only the voice of the actual user except the noise can be selectively processed.
FIG. 1 is a diagram showing the configuration of a directional sound processing apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing a detailed configuration of the spatial acoustic activity detection unit according to the present invention.
FIG. 3 is a diagram illustrating an on/off process of the sound output selection unit according to the present invention.
FIG. 4 illustrates acoustic attenuation and processing conditions in accordance with the present invention.
As the present invention allows for various changes and numerous embodiments, particular embodiments will be illustrated in the drawings and described in detail in the written description. However, this is not intended to limit the present invention to specific embodiments, it should be understood to include all modifications, equivalents, and substitutes included in the spirit and scope of the present invention. In describing the drawings, similar reference numerals are used for similar elements.
Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the following description, the same reference numerals are used for the same elements regardless of the figure number, to facilitate overall understanding.
FIG. 1 is a block diagram of a directional sound processing apparatus according to a preferred embodiment of the present invention.
As shown in FIG. 1, the directional sound processing apparatus according to the present invention includes a plurality of microphones 100-1 to 100-n, a spatial voice
As shown in FIG. 1, in the directional sound processing apparatus according to the present invention, sounds from different directions, such as front and side sounds, are input through a plurality of
The spatial acoustic
Preferably, the spatial acoustic
At this time, the spatial acoustic
Here, processing of the signal input through the microphones includes determining whether to output the input sound signal through a speaker, whether to pass it to a voice recognition processing unit (not shown) when the device is a voice recognition device, or whether to deliver it to the counterpart device when the device is a call device.
Preferably, the sound
FIG. 2 is a view showing a detailed configuration of the spatial acoustic activity detection unit according to the present invention.
As shown in FIG. 2, the spatial acoustic
In FIG. 2, only the first microphone 100-1 and the second microphone 100-2 are shown for convenience of description, but those skilled in the art will understand that the present invention is not limited to two microphones.
Assume that the sound signal input through the first microphone 100-1 is $X(t, w)$ and the sound signal input through the second microphone 100-2 is $Y(t, w)$. Here, $t$ is the time (frame) index and $w$ is the frequency band index.
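The per-band coherence of the two microphone inputs X and Y can be sketched as follows. Equation 1 is not reproduced in this text, so the sketch assumes the standard normalized form: the magnitude of the cross-correlation divided by the square root of the product of the two auto-correlations, each averaged over a short run of frames. The function name, the averaging over frames, and the `eps` guard are assumptions for illustration.

```python
import math


def band_coherence(x_frames, y_frames, eps=1e-12):
    """Per-band coherence of two channels, averaged over frames.

    x_frames, y_frames: lists of frames; each frame is a list of complex
    spectral values indexed by frequency band w.
    Returns one value per band, close to 1 for a coherent source and
    close to 0 for uncorrelated inputs.
    """
    n_bands = len(x_frames[0])
    coherence = []
    for w in range(n_bands):
        # Cross-correlation of X and Y in band w, summed over frames.
        cross = sum(x[w] * y[w].conjugate() for x, y in zip(x_frames, y_frames))
        # Auto-correlations of X and Y in band w.
        auto_x = sum(abs(x[w]) ** 2 for x in x_frames)
        auto_y = sum(abs(y[w]) ** 2 for y in y_frames)
        coherence.append(abs(cross) / math.sqrt(auto_x * auto_y + eps))
    return coherence
```

Averaging over more than one frame matters here: with a single frame the normalized magnitude is 1 in every band regardless of the input.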
The
The
Here, the listening angle is defined as the angle from which the user wants to hear sound. For example, assuming that sound arriving from the front of the microphones 100-1 and 100-2 has an angle of 0°, the listening angle can generally be set to 0°. However, when the user wants to hear sound arriving from the side, the listening angle may be set to, for example, 90°, and the
After the phase shift, the
Here, the terms of Equation 1 are the frequency coherence, the cross-correlation of the inputs X and Y of the first microphone 100-1 and the second microphone 100-2, and the auto-correlations of X and Y, respectively.
According to an exemplary embodiment of the present invention, the
In more detail, the
The
The signal
Here, the terms of Equation 2 include the prior speech absence probability.
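The estimation of the speech signal existence probability from a background noise estimate can be sketched as follows. Equation 2 is not reproduced in this text, so the sketch substitutes a commonly used form based on the prior speech absence probability and the a posteriori and a priori signal-to-noise ratios; whether this matches Equation 2 is an assumption. The recursive noise update, the function names, and the parameters `prior_absence`, `xi_floor`, and `alpha` are likewise hypothetical.

```python
import math


def speech_presence_probability(power, noise_power, prior_absence=0.5, xi_floor=1e-3):
    """Per-band speech presence probability p(t, w).

    power: noisy power spectrum of the current frame, one value per band.
    noise_power: background noise power estimate, one value per band.
    """
    ratio = prior_absence / (1.0 - prior_absence)
    probs = []
    for s, n in zip(power, noise_power):
        gamma = s / max(n, 1e-12)        # a posteriori SNR
        xi = max(gamma - 1.0, xi_floor)  # crude a priori SNR estimate
        v = gamma * xi / (1.0 + xi)
        probs.append(1.0 / (1.0 + ratio * (1.0 + xi) * math.exp(-v)))
    return probs


def update_noise(noise_power, power, probs, alpha=0.9):
    """Recursive noise update that slows down where speech is likely."""
    updated = []
    for n, s, p in zip(noise_power, power, probs):
        a = alpha + (1.0 - alpha) * p  # a approaches 1 as p approaches 1
        updated.append(a * n + (1.0 - a) * s)
    return updated
```

In this sketch a band whose power equals the noise estimate gets a probability near the prior, while a band far above the noise estimate gets a probability near 1 and leaves the noise estimate almost unchanged.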
Based on p (t, w) calculated through Equation 2, the
According to an embodiment of the present invention, if p(t, w) is 0.8 or greater, the weight is set to 1; otherwise, p(t, w) itself is assigned as the weight.
Here, 0.8 is an experimentally determined value; the value at or above which the weight is set to 1 is not limited thereto and may be varied.
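The weighting rule and the resulting decision can be sketched as follows. The weight rule and the weighted sum follow the description; the function names are hypothetical, and the direction of the comparison (process when the result value is at or above the threshold) is an assumption, since the text only states that the result value is compared with a preset threshold.

```python
def band_weights(presence_probs, preset=0.8):
    """Weight 1 where p(t, w) >= preset, otherwise p(t, w) itself."""
    return [1.0 if p >= preset else p for p in presence_probs]


def weighted_coherence_sum(coherence, weights):
    """Result value: sum over bands of coherence multiplied by weight."""
    return sum(c * w for c, w in zip(coherence, weights))


def should_process(coherence, presence_probs, threshold, preset=0.8):
    """Assumed decision rule: process when the result reaches the threshold."""
    result = weighted_coherence_sum(coherence, band_weights(presence_probs, preset))
    return result >= threshold
```

For example, with per-band coherence `[1.0, 0.5, 0.4]` and probabilities `[0.9, 0.8, 0.5]`, the weights are `[1.0, 1.0, 0.5]`, so the bands judged likely to contain speech contribute their coherence in full while the third band contributes half.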
The
In addition, the
The
On the other hand, if the result of the
According to the present invention, the
As shown in FIG. 3, the
Preferably, the
In more detail, as illustrated in FIG. 3B, the
As shown in FIG. 4, when sound is input from a direction not desired by the user, that signal is attenuated and only the sound input from the direction desired by the user is output, so the apparatus can be used efficiently in a noisy environment.
Embodiments of the present invention may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the present invention, or may be well known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices such as ROM, RAM, and flash memory. Examples of program instructions include machine code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of an embodiment of the present invention, and vice versa.
The preferred embodiments of the present invention described above are disclosed for purposes of illustration, and those skilled in the art will be able to make various modifications, changes, and additions within the spirit and scope of the present invention. Such modifications, changes, and additions should be considered to be within the scope of the following claims.
Claims (6)
A plurality of microphones for receiving ambient sound;
A coherence calculator for calculating coherence for each frequency band of the sound signals input from the plurality of microphones;
An estimator for estimating the existence probability of a speech signal based on the sound signals input from the plurality of microphones;
A weight calculator configured to assign different weights to the respective frequency bands according to the estimated existence probability of the speech signal, and to output a result value by applying the assigned weights to the coherence;
A controller which determines whether to process the sound signals input from the plurality of microphones by comparing the result value with a preset threshold value; and
A sound output selector configured to select, according to the determination of the controller, whether the sound signals input from the plurality of microphones are output to at least one of a speaker, a voice recognizer, and a transmitter.
An interface unit for receiving an angle at which the user wants to listen; And
And a phase shifter for converting phases of sound signals input from the plurality of microphones according to a listening angle input through the interface unit.
The estimating unit,
A background noise estimator for estimating background noise through an acoustic signal input from each of the plurality of microphones; And
And a signal existence probability estimator for estimating a speech signal existence probability of each frequency band based on the background noise.
The weight calculation unit
Assigns a weight of 1 when the signal existence probability is equal to or greater than a preset value, and assigns the signal existence probability itself as the weight when it is less than the preset value.
The directional sound signal processing apparatus, wherein the result value is the sum, over the frequency bands, of the coherence of each band multiplied by its weight.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020130037586A KR102012522B1 (en) | 2013-04-05 | 2013-04-05 | Apparatus for processing directional sound |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020130037586A KR102012522B1 (en) | 2013-04-05 | 2013-04-05 | Apparatus for processing directional sound |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20140121168A (en) | 2014-10-15 |
KR102012522B1 (en) | 2019-08-20 |
Family
ID=51992804
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020130037586A KR102012522B1 (en) | 2013-04-05 | 2013-04-05 | Apparatus for processing directional sound |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR102012522B1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111833901B (en) * | 2019-04-23 | 2024-04-05 | 北京京东尚科信息技术有限公司 | Audio processing method, audio processing device, system and medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101009854B1 (en) * | 2007-03-22 | 2011-01-19 | 고려대학교 산학협력단 | Method and apparatus for estimating noise using harmonics of speech |
KR100943224B1 (en) * | 2007-10-16 | 2010-02-18 | 한국전자통신연구원 | An intelligent robot for localizing sound source by frequency-domain characteristics and method thereof |
-
2013
- 2013-04-05 KR KR1020130037586A patent/KR102012522B1/en active IP Right Grant
Also Published As
Publication number | Publication date |
---|---|
KR20140121168A (en) | 2014-10-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101210313B1 (en) | System and method for utilizing inter-microphone level differences for speech enhancement | |
KR101444100B1 (en) | Noise cancelling method and apparatus from the mixed sound | |
US10580428B2 (en) | Audio noise estimation and filtering | |
KR101449433B1 (en) | Noise cancelling method and apparatus from the sound signal through the microphone | |
US9264804B2 (en) | Noise suppressing method and a noise suppressor for applying the noise suppressing method | |
US20170140771A1 (en) | Information processing apparatus, information processing method, and computer program product | |
US8611552B1 (en) | Direction-aware active noise cancellation system | |
KR101475864B1 (en) | Apparatus and method for eliminating noise | |
US20170092256A1 (en) | Adaptive block matrix using pre-whitening for adaptive beam forming | |
CN104158990A (en) | Method for processing an audio signal and audio receiving circuit | |
KR102076760B1 (en) | Method for cancellating nonlinear acoustic echo based on kalman filtering using microphone array | |
US9185506B1 (en) | Comfort noise generation based on noise estimation | |
JP6833616B2 (en) | Echo suppression device, echo suppression method and echo suppression program | |
US9330677B2 (en) | Method and apparatus for generating a noise reduced audio signal using a microphone array | |
US20200286501A1 (en) | Apparatus and a method for signal enhancement | |
US9485572B2 (en) | Sound processing device, sound processing method, and program | |
KR20160014709A (en) | Echo suppression | |
CN111383647A (en) | Voice signal processing method and device and readable storage medium | |
CN109215672B (en) | Method, device and equipment for processing sound information | |
US11205437B1 (en) | Acoustic echo cancellation control | |
KR102012522B1 (en) | Apparatus for processing directional sound | |
KR100949910B1 (en) | Method and apparatus for acoustic echo cancellation using spectral subtraction | |
KR102063824B1 (en) | Apparatus and Method for Cancelling Acoustic Feedback in Hearing Aids | |
CN113824843B (en) | Voice call quality detection method, device, equipment and storage medium | |
US10692514B2 (en) | Single channel noise reduction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |