CN102565759B - Binaural sound source localization method based on sub-band signal to noise ratio estimation - Google Patents
- Publication number
- CN102565759B (granted publication) · CN201110448129A (application)
- Authority
- CN
- China
- Prior art keywords
- signal
- orientation
- subband
- itd
- noise ratio
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
A binaural sound source localization method based on sub-band signal-to-noise ratio (SNR) estimation is an improved sound source localization method in which the mean interaural time difference (ITD) of each azimuth is used as the localization cue to build an azimuth mapping model. During actual localization, a two-channel acoustic signal is input and first transformed to the frequency domain, which is divided into several sub-bands; the SNR of each sub-band is estimated, the power spectra of the sub-bands selected according to their SNR are used to compute the per-frame ITD parameters, the ITD parameters are matched one by one against the azimuth characteristic model built by the training module, and the azimuth is output based on the Euclidean distance measure. The method improves sound source localization performance in noisy environments.
Description
Technical field
The invention belongs to the technical field of auditory localization and relates to a binaural sound source localization method based on sub-band signal-to-noise ratio (SNR) estimation.
Background technology
As an emerging interdisciplinary field, auditory localization technology can complement the transmission and recognition of visual information and increase the fidelity of three-dimensional virtual environments. The main localization algorithms at present are multi-microphone-array algorithms and two-channel (binaural) algorithms. Multi-microphone-array algorithms suffer from a large computational load, a large array size, and strong sensitivity to interference such as reverberation. Binaural localization methods imitate the auditory characteristics of the human ear using a two-channel acoustic signal and can achieve fairly accurate localization. The most representative approach estimates the interaural time difference ITD (Interaural Time Difference) by cross-correlation; for noisy signals, however, the localization performance of cross-correlation-based ITD estimation degrades.
Summary of the invention
The problem to be solved by the invention is the following: current multi-microphone-array localization algorithms suffer from a large computational load, a large array size, and strong sensitivity to interference such as reverberation, while existing two-channel localization methods perform poorly on noisy signals.
The technical scheme of the invention is a binaural sound source localization method based on sub-band SNR estimation. Data training is performed first: the training data are acoustic signals of known azimuth; through feature extraction, the interaural time difference (ITD) parameter of the acoustic signal of each azimuth is estimated, and the mean of the ITD parameters of the multi-frame acoustic signal of each azimuth is taken as the parameter of the vector quantization (VQ) model of that azimuth, establishing the azimuth mapping model. During actual localization, a two-channel acoustic signal is input and first transformed to the frequency domain, which is divided into several sub-bands; the SNR of each sub-band is estimated and compared with a preset SNR threshold, the sub-bands whose SNR exceeds the threshold are selected, the sub-band ITD characteristic parameters are computed, the parameters are matched one by one against the azimuth mapping model established in training, and the azimuth is estimated and output based on the Euclidean distance.
The concrete steps are:
1) Data training:
11) Use the head-related impulse response (HRIR) data of the KEMAR artificial head for the 37 azimuths on the right half of the horizontal plane, i.e. θ = 0°–180°, and convolve them with white noise to generate virtual sounds of known direction;
12) Pre-process the virtual sound obtained in step 11), including amplitude normalization, pre-emphasis, framing and windowing, to obtain stationary single-frame signals for each frame of the acoustic signal of each azimuth;
13) Perform endpoint detection on the stationary single-frame signals of step 12) to obtain valid single-frame signals;
14) Compute the interaural time difference (ITD) characteristic parameter of each single-frame signal to obtain the ITD training samples;
15) From the ITD training samples of step 14), take the mean of the ITD training samples of the multi-frame acoustic signal of each azimuth as the parameter of the vector quantization (VQ) model of the corresponding azimuth ITD, and establish the azimuth mapping model.
2) Localization steps for the sound source to be localized:
21) Pre-process the captured acoustic signal, including amplitude normalization, pre-emphasis, framing and windowing, to obtain stationary single-frame signals for each frame;
22) Perform endpoint detection on the single-frame signals of step 21) to obtain valid single-frame signals;
23) Apply the FFT to the valid single-frame signals of step 22), divide the spectrum into several sub-bands, and estimate the SNR of each sub-band; the sub-bands are divided uniformly into 7–13 sub-bands;
24) Compare the SNR of each sub-band with the preset SNR threshold, set the amplitude of sub-bands below the threshold to 0, select the sub-bands whose SNR exceeds the threshold, and compute the sub-band ITD characteristic parameters;
25) Match the sub-band ITD characteristic parameters one by one against the azimuth mapping model established in data training, and estimate and output the azimuth information according to the Euclidean distance.
Compared with existing two-channel localization techniques, the proposed method clearly improves localization performance under noise: at an SNR of 0 dB the localization accuracy of the invention reaches 89% while that of the original method is only 63%; at an SNR of 10 dB the accuracy of the invention reaches 94% versus 82% for the original method.
Description of drawings
Fig. 1 is a schematic diagram of the spatial coordinate system used for sound source localization in the invention.
Fig. 2 is a block diagram of the localization system of the invention.
Embodiment
The invention first performs data training: the training data are acoustic signals of known azimuth; through feature extraction, the interaural time difference (ITD) parameter of the acoustic signal of each azimuth is estimated, and the mean of the ITD parameters of the multi-frame acoustic signal of each azimuth is taken as the parameter of the vector quantization VQ (Vector Quantization) model of that azimuth, establishing the azimuth mapping model. During actual localization, a two-channel acoustic signal is input and first transformed to the frequency domain by the fast Fourier transform FFT (Fast Fourier Transform); the spectrum is divided into several sub-bands, the SNR of each sub-band is estimated and compared with a preset SNR threshold, the sub-bands whose SNR exceeds the threshold are selected, the sub-band ITD characteristic parameters are computed, the parameters are matched one by one against the azimuth mapping model established in training, and the azimuth is estimated and output based on the Euclidean distance.
Fig. 1 is a schematic diagram of the spatial coordinate system used for sound source localization in the invention. In the invention the sound source position is uniquely determined by the coordinates (r, θ, φ). Here 0 ≤ r < +∞ is the distance between the sound source and the origin; the elevation angle φ is the angle between the direction vector and the horizontal plane, with −90°, 0° and +90° denoting directly below, the horizontal plane and directly above, respectively; the azimuth 0° ≤ θ < 360° is the angle between the projection of the direction vector onto the horizontal plane and the median plane. On the horizontal plane θ = 0° denotes the direction straight ahead, and going clockwise θ = 90°, 180° and 270° denote directly to the right, directly behind and directly to the left, respectively.
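For illustration only, a small Python helper that realizes the coordinate convention just described (θ = 0° ahead, clockwise azimuth, elevation positive upward); the Cartesian axis assignment is an assumption made for this sketch and is not part of the claimed method.

```python
import numpy as np

def source_position(r, theta_deg, phi_deg):
    """Convert the (r, theta, phi) convention of Fig. 1 to Cartesian coordinates.

    Illustrative helper only (not part of the claimed method); the axis
    assignment x = front, y = right, z = up is an assumption that matches
    theta = 0 ahead, clockwise azimuth and phi = +90 straight up.
    """
    theta, phi = np.radians(theta_deg), np.radians(phi_deg)
    x = r * np.cos(phi) * np.cos(theta)  # front/back component
    y = r * np.cos(phi) * np.sin(theta)  # right/left component (clockwise azimuth)
    z = r * np.sin(phi)                  # up/down component
    return x, y, z

# theta = 90 degrees on the horizontal plane lies directly to the right:
print(source_position(1.0, 90.0, 0.0))  # ~(0.0, 1.0, 0.0)
```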
The inventive method comprises two major stages, data training and sound source localization:
1) Data training:
11) Use the head-related impulse response (HRIR) data of the KEMAR artificial head for the 37 azimuths on the right half of the horizontal plane, i.e. θ = 0°–180°, and convolve them with white noise to generate virtual sounds of known direction;
12) Pre-process the virtual sound obtained in step 11), including amplitude normalization, pre-emphasis, framing and windowing, to obtain stationary single-frame signals for each frame of the acoustic signal of each azimuth;
13) Perform endpoint detection on the stationary single-frame signals of step 12) to obtain valid single-frame signals;
14) Compute the interaural time difference (ITD) characteristic parameter of each single-frame signal to obtain the ITD training samples;
15) From the ITD training samples of step 14), take the mean of the ITD training samples of the multi-frame acoustic signal of each azimuth as the parameter of the vector quantization (VQ) model of the corresponding azimuth ITD, and establish the azimuth mapping model.
2) Localization steps for the sound source to be localized:
21) Pre-process the captured acoustic signal, including amplitude normalization, pre-emphasis, framing and windowing, to obtain stationary single-frame signals for each frame;
22) Perform endpoint detection on the single-frame signals of step 21) to obtain valid single-frame signals;
23) Apply the FFT to the valid single-frame signals of step 22), divide the spectrum into several sub-bands, and estimate the SNR of each sub-band; the sub-bands are divided uniformly into 7–13 sub-bands;
24) Compare the SNR of each sub-band with the preset SNR threshold, set the amplitude of sub-bands below the threshold to 0, select the sub-bands whose SNR exceeds the threshold, and compute the sub-band ITD characteristic parameters;
25) Match the sub-band ITD characteristic parameters one by one against the azimuth mapping model established in data training, and estimate and output the azimuth information according to the Euclidean distance.
The implementation of the technical solution of the invention is described in detail below, module by module and with reference to the accompanying drawings, following the steps above:
Fig. 2 shows the block diagram of binaural sound source localization based on sub-band SNR estimation of a two-channel acoustic signal. HRTF (Head-Related Transfer Function) denotes the head-related transfer function; its impulse response is convolved with white noise to generate the directional virtual sound signals used for training. The processing flows of the training stage and the test stage are marked separately in the figure; the function and implementation of each module are described below.
1. Pre-processing module, corresponding to the pre-processing described in steps 12) and 21):
The acoustic signal captured by the acquisition device may contain a lot of electronic noise and background noise; to suppress the influence of this noise on the subsequent analysis, pre-processing is required. The pre-processing of the method comprises amplitude normalization, pre-emphasis, framing and windowing. The invention uses a frame length of 30 ms and a frame shift of 10 ms.
Pre-emphasis uses the first-order digital filter H(z) = 1 − μz^(−1) with μ = 0.97. The framed speech signal is then windowed with a Hamming window; the n-th windowed frame can be expressed as x_n(m) = w_H(m)·x(nN + m), 0 ≤ m < N, where N is the number of samples per frame (here N = 1323) and w_H(m) = 0.54 − 0.46·cos(2πm/(N − 1)) is the Hamming window.
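A minimal Python sketch of this pre-processing chain (amplitude normalization, pre-emphasis with μ = 0.97, 30 ms frames with a 10 ms shift, Hamming windowing); the 44.1 kHz sampling rate is an assumption inferred from the quoted 1323-sample frame length.

```python
import numpy as np

def preprocess(x, fs=44100, mu=0.97, frame_ms=30, shift_ms=10):
    """Amplitude normalization, pre-emphasis, framing and Hamming windowing.

    Sketch of steps 12)/21); fs = 44100 Hz is assumed from the quoted
    1323-sample frame length (30 ms x 44.1 kHz).
    """
    x = np.asarray(x, dtype=float)
    x = x / (np.max(np.abs(x)) + 1e-12)           # amplitude normalization
    x = np.append(x[0], x[1:] - mu * x[:-1])      # pre-emphasis H(z) = 1 - mu*z^-1
    N = int(fs * frame_ms / 1000)                 # frame length: 1323 samples
    hop = int(fs * shift_ms / 1000)               # frame shift: 441 samples
    w = np.hamming(N)                             # w(m) = 0.54 - 0.46*cos(2*pi*m/(N-1))
    frames = [w * x[s:s + N] for s in range(0, len(x) - N + 1, hop)]
    return np.array(frames)
```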
2. Endpoint detection module, corresponding to the endpoint detection described in steps 13) and 22):
The purpose of endpoint detection is to find the start and end points of the useful signal within the received acoustic signal, so that only the useful signal is processed. Accurate endpoint detection not only reduces the amount of stored data and the processing time but also excludes the interference of silent segments and noise. The inventive method detects the single-channel signal by combining short-time energy and zero-crossing-rate features; this combined approach to endpoint detection is prior art and is only briefly introduced here:
The short-time energy is the average energy of a frame, computed as E_n = (1/N)·Σ_{m=0}^{N−1} x_n(m)², where x_n(m), m = 0, 1, …, N − 1 is the n-th pre-processed frame of the captured acoustic signal and X_n(k), k = 0, 1, …, N − 1 is the corresponding frequency-domain signal. The short-time energy threshold can be set to a fixed value or to the multi-frame mean energy.
The short-time zero-crossing rate is the number of times the waveform of a frame crosses the zero level expressed as a percentage of the frame length; for a discrete signal it suffices to compare the signs of adjacent samples, giving Z_n = (1/(2N))·Σ_{m=1}^{N−1} |sgn(x_n(m)) − sgn(x_n(m−1))|, where sgn(x) is the sign function. The decision thresholds used by the invention are Z_min = 0.01 and Z_max = 0.4; the lower limit Z_min is set to filter out the influence of some silent frames.
A frame whose short-time energy and zero-crossing rate lie within the decision thresholds is regarded as useful signal, so the start and end positions of the sound segment can be determined.
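A minimal sketch of the combined energy/zero-crossing-rate detector described above; the exact decision rule (energy above the multi-frame mean and zero-crossing rate between Z_min and Z_max) is an assumption made for illustration, since only the thresholds themselves are stated.

```python
import numpy as np

def detect_endpoints(frames, energy_thresh=None, z_min=0.01, z_max=0.4):
    """Keep the frames whose short-time energy and zero-crossing rate indicate sound.

    Sketch of steps 13)/22). The decision rule (energy above the multi-frame
    mean, zero-crossing rate between Z_min and Z_max) is an assumption made
    for illustration.
    """
    energy = np.mean(frames ** 2, axis=1)                        # short-time energy E_n
    signs = np.sign(frames)
    zcr = 0.5 * np.mean(np.abs(np.diff(signs, axis=1)), axis=1)  # zero-crossing rate Z_n
    if energy_thresh is None:
        energy_thresh = np.mean(energy)                          # multi-frame mean energy
    keep = (energy > energy_thresh) & (zcr > z_min) & (zcr < z_max)
    return frames[keep]
```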
3. Sub-band SNR estimation module, corresponding to step 23):
The useful signal found by endpoint detection is transformed to the frequency domain and divided into several sub-bands, and the SNR is estimated within each sub-band; the sub-bands are divided uniformly, into 7–13 sub-bands in the invention. The formulas are as follows:
The frequency-domain signal model can be written in vector form as
X(k) = S(k) + N(k),
X(k) = {x_l(k), x_r(k)}^T,
S(k) = {s_l(k), s_r(k)}^T,
where X(k) is the noisy signal, S(k) is the clean acoustic signal, N(k) is the noise, k is the frequency index, and the subscripts l and r denote the left and right channels, respectively.
For a two-channel acoustic signal, the attenuation along the travel path differs with frequency and, unlike multi-microphone-array methods, binaural localization has only two channels available; the method therefore estimates the SNR per sub-band. A frame is divided into several sub-bands in the frequency domain, the covariance matrix of each sub-band is estimated, and the SNR of each sub-band is then computed from that covariance matrix. From the vector form of the frequency-domain signal model, the covariance matrix of the i-th sub-band is R_i = E[X_i(k)·X_i(k)^H], where X_i(k) is the frequency-domain vector formed by the left and right acoustic signals of the i-th sub-band; P_li, P_ri and σ² denote the left- and right-channel signal energies of the i-th sub-band and the noise power spectral density, and IID is the interaural intensity difference of the sub-band signal.
The signal and noise power spectral densities of the i-th sub-band follow from the elements of this covariance matrix: once P_li has been obtained, the noise power is σ² = R_1 − P_li and the right-channel signal energy is P_ri = R_4 − σ², where R_1 and R_4 denote the first and fourth (diagonal) elements of the covariance matrix.
In the sub-band SNR estimation, an interaural intensity difference is inherently present between the spectra of the two channels in each sub-band; the sub-band size therefore determines the performance of the algorithm.
The choice of the number of sub-bands depends on factors such as the type of source signal and the SNR, and should be moderate. On the one hand, if there are too many sub-bands, each sub-band contains too few frequency bins and, at low SNR, more unreliable frequencies are included, degrading the algorithm. On the other hand, since the frequency data of an entire sub-band is discarded when its average SNR is low, the number of sub-bands should not be too small either. Based on simulation tests with different parameter settings, the invention uses 7–13 sub-bands as a balanced choice.
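Putting this module together, the sketch below estimates one SNR value per sub-band from the 2×2 covariance of the binaural spectra. Because the exact derivation of P_li from the covariance elements is not reproduced above, the sketch assumes that the diagonal elements carry signal-plus-noise power (R_1 = P_li + σ², R_4 = P_ri + σ²) and that the off-diagonal magnitude approximates the coherent cross-power, which yields a quadratic in σ²; it is an illustrative reconstruction, not the patented estimator.

```python
import numpy as np

def subband_snr(Xl, Xr, n_subbands=9):
    """Per-sub-band SNR (in dB) from the 2x2 covariance of the binaural spectra.

    Illustrative reconstruction only. Assumptions: the diagonal covariance
    elements are signal-plus-noise powers, R1 = P_l + sigma^2 and
    R4 = P_r + sigma^2, and the off-diagonal magnitude approximates the
    coherent cross-power, |R2|^2 ~= P_l * P_r, so sigma^2 solves
    (R1 - s)(R4 - s) = |R2|^2.
    """
    n_bins = len(Xl)
    edges = np.linspace(0, n_bins, n_subbands + 1, dtype=int)   # uniform sub-bands
    snr_db = np.zeros(n_subbands)
    for i in range(n_subbands):
        xl, xr = Xl[edges[i]:edges[i + 1]], Xr[edges[i]:edges[i + 1]]
        R1 = np.mean(np.abs(xl) ** 2)                 # left auto-power
        R4 = np.mean(np.abs(xr) ** 2)                 # right auto-power
        R2 = np.abs(np.mean(xl * np.conj(xr)))        # cross-power magnitude
        b, c = R1 + R4, R1 * R4 - R2 ** 2             # s^2 - b*s + c = 0
        disc = max(b * b - 4.0 * c, 0.0)
        sigma2 = max((b - np.sqrt(disc)) / 2.0, 1e-12)  # smaller root = noise power
        p_signal = max(R1 + R4 - 2.0 * sigma2, 0.0)     # P_l + P_r
        snr_db[i] = 10.0 * np.log10(p_signal / (2.0 * sigma2) + 1e-12)
    return snr_db, edges
```

The returned band edges can be reused when masking the spectra in the ITD module sketched further below.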
4. ITD feature extraction module, corresponding to the computation of the ITD characteristic parameters in steps 14) and 24):
After pre-processing and endpoint detection, the binaural signal, together with the SNR parameter of each sub-band of each frame, is fed into the ITD feature extraction module. A fixed SNR threshold is used, and the ITD is computed only on the frequency bands whose SNR exceeds the threshold. When extracting the localization cue, the spectral components with high SNR are used for the ITD estimate and the low-SNR components are discarded, which effectively improves the accuracy of the localization cue extracted from noisy signals and thus improves localization performance.
The ITD estimation procedure for the i-th frame is as follows:
(1) From the sub-band SNRs and the threshold, compute the SNR indicator SNRIndex of each frequency bin: SNRIndex = 1 for the bins of sub-bands whose SNR exceeds the threshold, and SNRIndex = 0 otherwise.
(2) Modify the left and right spectra according to the SNR indicator; in the binaural spectra, the spectrum of every sub-band whose SNR is below the threshold is set to 0:
P_ll = P_l ⊙ SNRIndex,
P_rr = P_r ⊙ SNRIndex,
where ⊙ denotes the element-wise product, P_l and P_r are the left and right acoustic signal spectra, and P_ll and P_rr are the modified left and right spectra.
(3) Estimate the ITD with the generalized cross-correlation method. The cross-power spectrum of the left and right signals is computed as P_lr = P_ll · P_rr* (with * denoting complex conjugation). Transforming P_lr by the IFFT yields the cross-correlation function R_lr(k), where R_lr(k) denotes the cross-correlation of the binaural signal at a time difference of k samples; the ITD of the frame corresponds to the lag k that maximizes R_lr(k).
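A minimal sketch of the SNR-masked generalized cross-correlation of this module: sub-bands whose SNR falls below the threshold are zeroed before the cross-spectrum is formed and inverse-transformed, and the ITD is read off the correlation peak. The band_edges argument (FFT-bin boundaries of the sub-bands, e.g. as returned by the subband_snr sketch above) and the default threshold value are assumptions for illustration.

```python
import numpy as np

def estimate_itd(frame_l, frame_r, snr_db, band_edges, snr_threshold_db=0.0, fs=44100):
    """ITD of one frame by SNR-masked generalized cross-correlation.

    Sketch of step 24) and the module above: sub-bands below the SNR
    threshold get SNRIndex = 0 and are zeroed before the cross-spectrum.
    The threshold value and sampling rate are tunable assumptions.
    """
    Xl, Xr = np.fft.rfft(frame_l), np.fft.rfft(frame_r)
    snr_index = np.zeros(len(Xl))
    for i in range(len(band_edges) - 1):
        if snr_db[i] > snr_threshold_db:
            snr_index[band_edges[i]:band_edges[i + 1]] = 1.0   # SNRIndex = 1: keep band
    Pll, Prr = Xl * snr_index, Xr * snr_index                  # masked spectra
    Plr = Pll * np.conj(Prr)                                   # cross-power spectrum
    r_lr = np.fft.fftshift(np.fft.irfft(Plr, n=len(frame_l)))  # cross-correlation R_lr(k)
    lag = int(np.argmax(r_lr)) - len(r_lr) // 2                # lag of the peak, in samples
    return lag / fs                                            # ITD in seconds
```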
5. Training module, corresponding to step 15):
The training module builds the statistical model of the localization feature. Its input is an acoustic signal of known azimuth; through the feature extraction process the ITD parameter of the acoustic signal of each azimuth is estimated, and the mean of the ITD parameters of the multi-frame acoustic signal of each azimuth is taken as the VQ model of that azimuth.
The invention uses virtual sounds generated by convolving white noise with the HRIR data measured by the MIT Media Lab as training data; the HRIR data of the KEMAR artificial head for the 37 azimuths on the right half of the horizontal plane (θ = 0°–180°) are used to synthesize the virtual sound signals for training, with an angular interval of 5°.
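A sketch of how the training module could assemble the azimuth mapping model from the KEMAR HRIR data; frame_itd_fn is a hypothetical wrapper around the pre-processing, endpoint-detection and ITD sketches above, and the dictionary layout of the model is an assumption made for illustration.

```python
import numpy as np

def build_orientation_model(hrir_pairs, white_noise, frame_itd_fn):
    """Azimuth -> mean-ITD orientation mapping model (steps 11-15).

    hrir_pairs maps each azimuth (0..180 degrees in 5-degree steps) to its
    (left, right) HRIR; frame_itd_fn is a hypothetical wrapper returning one
    ITD per valid frame. The VQ model of each azimuth reduces to the mean ITD.
    """
    model = {}
    for azimuth, (hrir_l, hrir_r) in hrir_pairs.items():
        left = np.convolve(white_noise, hrir_l)     # directional virtual sound, left ear
        right = np.convolve(white_noise, hrir_r)    # directional virtual sound, right ear
        itds = frame_itd_fn(left, right)            # per-frame ITD training samples
        model[azimuth] = float(np.mean(itds))       # mean ITD = codeword of this azimuth
    return model
```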
6. Localization module, corresponding to step 25):
The localization module matches the acoustic signal to be localized one by one against the azimuth characteristic models built by the training module and finds the azimuth with the maximum likelihood. Localization proceeds as follows:
(1) compute the SNR of each sub-band of each frame of the acoustic signal to be localized;
(2) apply the FFT to the acoustic signal to be localized and set to 0 the amplitude of the frequency bands whose SNR is below the threshold;
(3) extract the ITD characteristic parameters of the acoustic signal to be localized;
(4) search for the minimum Euclidean distance over the ranges 0°–90° and 270°–360° according to the ITD characteristic parameter, and output the localized azimuth:
p* = argmin_p ‖x − λ_p‖, p = 1, 2, …, P,
where λ_p (p = 1, 2, …, P, with P the number of positions) is the model ITD value, x is the measured ITD value, and p* is the output front-half-plane source position.
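A minimal sketch of step (4): a nearest-neighbour search of the measured ITD against the trained mean-ITD codewords under the Euclidean distance, which for a scalar ITD feature reduces to an absolute difference.

```python
import numpy as np

def match_azimuth(measured_itd, model):
    """Nearest azimuth under the Euclidean distance, p* = argmin_p ||x - lambda_p||.

    model is the azimuth -> mean-ITD mapping built in training; with a scalar
    ITD feature the Euclidean distance reduces to an absolute difference.
    """
    azimuths = np.array(sorted(model))
    codewords = np.array([model[a] for a in azimuths])
    p_star = int(np.argmin((codewords - measured_itd) ** 2))
    return azimuths[p_star]
```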
A localization system was built according to the above framework, trained first on data and then used for binaural sound source localization. Experimental comparison with the existing two-channel localization technique shows that the proposed method clearly improves localization performance under noise: at an SNR of 0 dB the localization accuracy of the invention reaches 89% while that of the prior-art method is only 63%; at an SNR of 10 dB the accuracy of the invention reaches 94% versus 82% for the prior-art method.
Claims (1)
1. A binaural sound source localization method based on sub-band SNR estimation, characterized in that data training is performed first: the training data are acoustic signals of known azimuth; through feature extraction, the interaural time difference (ITD) parameter of the acoustic signal of each azimuth is estimated, and the mean of the ITD parameters of the multi-frame acoustic signal of each azimuth is taken as the parameter of the vector quantization (VQ) model of that azimuth, establishing the azimuth mapping model; during actual localization, a two-channel acoustic signal is input and first transformed to the frequency domain, which is divided into several sub-bands; the SNR of each sub-band is estimated and compared with a preset SNR threshold, the sub-bands whose SNR exceeds the threshold are selected, the sub-band ITD characteristic parameters are computed, the parameters are matched one by one against the azimuth mapping model established in data training, and the azimuth is estimated and output based on the Euclidean distance; the concrete steps comprise:
1) Data training:
11) using the head-related impulse response (HRIR) data of the KEMAR artificial head for the 37 azimuths on the right half of the horizontal plane, i.e. θ = 0°–180°, and convolving them with white noise to generate virtual sounds of known direction;
12) pre-processing the virtual sound obtained in step 11), including amplitude normalization, pre-emphasis, framing and windowing, to obtain stationary single-frame signals for each frame of the acoustic signal of each azimuth;
13) performing endpoint detection on the stationary single-frame signals of step 12) to obtain valid single-frame signals;
14) computing the interaural time difference (ITD) characteristic parameter of each single-frame signal to obtain the ITD training samples;
15) from the ITD training samples of step 14), taking the mean of the ITD training samples of the multi-frame acoustic signal of each azimuth as the parameter of the vector quantization (VQ) model of the corresponding azimuth ITD, and establishing the azimuth mapping model;
2) Localization steps for the sound source to be localized:
21) pre-processing the captured acoustic signal, including amplitude normalization, pre-emphasis, framing and windowing, to obtain stationary single-frame signals for each frame;
22) performing endpoint detection on the single-frame signals of step 21) to obtain valid single-frame signals;
23) applying the FFT to the valid single-frame signals of step 22), dividing the spectrum into several sub-bands, and estimating the SNR of each sub-band, the sub-bands being divided uniformly into 7–13 sub-bands;
24) comparing the SNR of each sub-band with the preset SNR threshold, setting the amplitude of sub-bands below the threshold to 0, selecting the sub-bands whose SNR exceeds the threshold, and computing the sub-band ITD characteristic parameters;
25) matching the sub-band ITD characteristic parameters one by one against the azimuth mapping model established in data training, and estimating and outputting the azimuth information according to the Euclidean distance.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110448129 CN102565759B (en) | 2011-12-29 | 2011-12-29 | Binaural sound source localization method based on sub-band signal to noise ratio estimation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110448129 CN102565759B (en) | 2011-12-29 | 2011-12-29 | Binaural sound source localization method based on sub-band signal to noise ratio estimation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102565759A CN102565759A (en) | 2012-07-11 |
CN102565759B true CN102565759B (en) | 2013-10-30 |
Family
ID=46411648
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201110448129 Expired - Fee Related CN102565759B (en) | 2011-12-29 | 2011-12-29 | Binaural sound source localization method based on sub-band signal to noise ratio estimation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102565759B (en) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103901400B (en) * | 2014-04-10 | 2016-08-17 | 北京大学深圳研究生院 | A kind of based on delay compensation and ears conforming binaural sound source of sound localization method |
CN103901401B (en) * | 2014-04-10 | 2016-08-17 | 北京大学深圳研究生院 | A kind of binaural sound source of sound localization method based on ears matched filtering device |
CN104464750B (en) * | 2014-10-24 | 2017-07-07 | 东南大学 | A kind of speech separating method based on binaural sound sources positioning |
CN104468576A (en) * | 2014-12-10 | 2015-03-25 | 深圳市彩煌通信技术有限公司 | Protocol conversion method based on passive optical network and protocol converter |
CN105204002B (en) * | 2015-10-19 | 2019-01-04 | Tcl集团股份有限公司 | A kind of sound localization method and system |
CN106226739A (en) * | 2016-07-29 | 2016-12-14 | 太原理工大学 | Merge the double sound source localization method of Substrip analysis |
CN106373589B (en) * | 2016-09-14 | 2019-07-26 | 东南大学 | A kind of ears mixing voice separation method based on iteration structure |
CN107799124A (en) * | 2017-10-12 | 2018-03-13 | 安徽咪鼠科技有限公司 | A kind of VAD detection methods applied to intelligent sound mouse |
CN108122559B (en) * | 2017-12-21 | 2021-05-14 | 北京工业大学 | Binaural sound source positioning method based on deep learning in digital hearing aid |
CN109164415B (en) * | 2018-09-07 | 2022-09-16 | 东南大学 | Binaural sound source positioning method based on convolutional neural network |
CN109298642B (en) * | 2018-09-20 | 2021-08-27 | 三星电子(中国)研发中心 | Method and device for monitoring by adopting intelligent sound box |
CN110133596B (en) * | 2019-05-13 | 2023-06-23 | 江苏第二师范学院(江苏省教育科学研究院) | Array sound source positioning method based on frequency point signal-to-noise ratio and bias soft decision |
CN110221249A (en) * | 2019-05-16 | 2019-09-10 | 西北工业大学 | Compressed sensing based broadband sound source localization method |
CN111707990B (en) * | 2020-08-19 | 2021-05-14 | 东南大学 | Binaural sound source positioning method based on dense convolutional network |
CN116316706B (en) * | 2023-05-08 | 2023-07-21 | 湖南大学 | Oscillation positioning method and system based on complementary average inherent time scale decomposition |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6178245B1 (en) * | 2000-04-12 | 2001-01-23 | National Semiconductor Corporation | Audio signal generator to emulate three-dimensional audio signals |
EP1600791B1 (en) * | 2004-05-26 | 2009-04-01 | Honda Research Institute Europe GmbH | Sound source localization based on binaural signals |
CN101982793B (en) * | 2010-10-20 | 2012-07-04 | 武汉大学 | Mobile sound source positioning method based on stereophonic signals |
- 2011-12-29: CN application CN 201110448129 filed; granted as patent CN102565759B (status: not active, Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
CN102565759A (en) | 2012-07-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | | Granted publication date: 20131030; Termination date: 20161229 |