CN102565759B - Binaural sound source localization method based on sub-band signal to noise ratio estimation - Google Patents

Info

Publication number
CN102565759B
Authority
CN
China
Prior art keywords
signal
orientation
subband
itd
noise ratio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201110448129
Other languages
Chinese (zh)
Other versions
CN102565759A (en)
Inventor
周琳
周菲菲
吴镇扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN 201110448129
Publication of CN102565759A
Application granted
Publication of CN102565759B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A binaural sound source localization method based on sub-band signal-to-noise ratio (SNR) estimation. The method improves on existing sound source localization: the mean interaural time difference (ITD) at each azimuth serves as the localization cue for the source azimuth, and an azimuth mapping model is built from it. During actual localization, a two-channel acoustic signal is input and first transformed to the frequency domain, the spectrum is divided into several sub-bands, and the SNR is estimated in each sub-band. Based on the sub-band SNR, the power spectra of the selected sub-bands are used to compute the ITD parameter of each frame. The ITD feature parameter is then matched one by one against the azimuth feature models built by the training module, and the azimuth is output based on a Euclidean distance measure. The method improves sound source localization performance in noisy environments.

Description

A binaural sound source localization method based on sub-band SNR estimation
Technical field
The invention belongs to the technical field of sound source localization and relates to a binaural sound source localization method based on sub-band signal-to-noise ratio (SNR) estimation.
Background art
Sound source localization is an emerging interdisciplinary field. It can assist in conveying and identifying visual information and increases the fidelity of three-dimensional virtual environments. The main localization algorithms at present are those based on multi-microphone arrays and those based on two-channel (binaural) signals. Multi-microphone array algorithms suffer from high computational cost, large array size, and strong sensitivity to factors such as reverberation. Localization methods based on two-channel acoustic signals simulate the auditory characteristics of the human ear and can achieve fairly accurate localization. The most representative approach is cross-correlation-based estimation of the interaural time difference (ITD); for noisy signals, however, the localization performance of cross-correlation-based ITD estimation degrades sharply.
Summary of the invention
The problem to be solved by the invention is as follows: existing multi-microphone array localization algorithms suffer from high computational cost, large array size, and strong sensitivity to factors such as reverberation, while existing localization methods based on two-channel acoustic signals perform poorly on noisy signals.
The technical solution of the invention is a binaural sound source localization method based on sub-band SNR estimation. Data training is carried out first. The training data are acoustic signals of known azimuth; through feature extraction, the ITD parameter of the acoustic signal at each azimuth is estimated, and the mean of the ITD parameters over the multiple frames at each azimuth is used as the parameter of the vector quantization (VQ) model of the ITD for that azimuth, thereby establishing the azimuth mapping model. During actual localization, a two-channel acoustic signal is input and first transformed to the frequency domain, where it is divided into several sub-bands. The SNR of each sub-band is estimated and compared with a preset SNR threshold, the sub-bands whose SNR exceeds the threshold are selected, and the sub-band ITD feature parameter is computed. This feature is matched one by one against the azimuth mapping model established by training, and the azimuth is estimated and output based on the Euclidean distance.
The specific steps are:
1) Data training:
11) Using the head-related impulse response (HRIR) data of the KEMAR small pinna for 37 azimuths on the right side of the horizontal plane, i.e., θ = 0° to 180°, convolve the HRIRs with white noise to generate virtual sounds of known azimuth;
12) Preprocess the virtual sounds obtained in step 11), including amplitude normalization, pre-emphasis, framing, and windowing, so that each frame of the acoustic signal at each azimuth yields a stationary single-frame signal;
13) Perform endpoint detection on the stationary single-frame signals of step 12) to obtain the valid single-frame signals;
14) Compute the ITD feature parameter of each single-frame signal to obtain the ITD training samples;
15) From the ITD training samples of step 14), take the mean of the ITD training samples over the multiple frames at each azimuth as the parameter of the VQ model of the ITD for the corresponding azimuth, thereby establishing the azimuth mapping model;
2) Localization of the sound source to be located:
21) Preprocess the acquired acoustic signal, including amplitude normalization, pre-emphasis, framing, and windowing, so that each frame yields a stationary single-frame signal;
22) Perform endpoint detection on the single-frame signals of step 21) to obtain the valid single-frame signals;
23) Apply the FFT to the valid single-frame signals of step 22), divide the spectrum into several sub-bands, and estimate the SNR of each sub-band; the sub-bands are divided uniformly, into 7 to 13 sub-bands;
24) Compare the SNR of each sub-band with the preset SNR threshold, set the amplitude of the sub-bands below the threshold to 0, select the sub-bands whose SNR exceeds the threshold, and compute the sub-band ITD feature parameter;
25) Match the sub-band ITD feature parameter one by one against the azimuth mapping model established by training, and output the azimuth estimate according to the Euclidean distance.
Compared with existing two-channel localization techniques, the proposed method markedly improves sound source localization performance in noise. At an SNR of 0 dB, the localization accuracy of the invention reaches 89%, versus only 63% for the original method; at an SNR of 10 dB, it reaches 94%, versus 82% for the original method.
Description of the drawings
Fig. 1 is a schematic diagram of the spatial coordinate system used for sound source localization in the invention.
Fig. 2 is a block diagram of the localization system of the invention.
Detailed description of the embodiments
In the invention, data training is carried out first. The training data are acoustic signals of known azimuth; through feature extraction, the ITD parameter of the acoustic signal at each azimuth is estimated, and the mean of the ITD parameters over the multiple frames at each azimuth is used as the parameter of the vector quantization (VQ) model of the ITD for that azimuth, thereby establishing the azimuth mapping model. During actual localization, a two-channel acoustic signal is input and first transformed to the frequency domain by the fast Fourier transform (FFT), where it is divided into several sub-bands. The SNR of each sub-band is estimated and compared with a preset SNR threshold, the sub-bands whose SNR exceeds the threshold are selected, and the sub-band ITD feature parameter is computed. This feature is matched one by one against the azimuth mapping model established by training, and the azimuth is estimated and output based on the Euclidean distance.
Fig. 1 is a schematic diagram of the spatial coordinate system used for sound source localization. In the invention, the sound source position is uniquely determined by three coordinates: the distance r, the azimuth angle θ, and the elevation angle. Here 0 ≤ r < +∞ is the distance between the sound source and the origin. The elevation angle, ranging from -90° to +90°, is the angle between the direction vector and the horizontal plane; -90°, 0°, and +90° denote directly below, the horizontal plane, and directly above, respectively. The azimuth angle 0° ≤ θ < 360° is the angle between the projection of the direction vector onto the horizontal plane and the median plane. In the horizontal plane, θ = 0° denotes directly ahead, and, proceeding clockwise, θ = 90°, 180°, and 270° denote directly right, directly behind, and directly left, respectively.
The method of the invention comprises two major stages, data training and sound source localization:
1) Data training:
11) Using the head-related impulse response (HRIR) data of the KEMAR small pinna for 37 azimuths on the right side of the horizontal plane, i.e., θ = 0° to 180°, convolve the HRIRs with white noise to generate virtual sounds of known azimuth;
12) Preprocess the virtual sounds obtained in step 11), including amplitude normalization, pre-emphasis, framing, and windowing, so that each frame of the acoustic signal at each azimuth yields a stationary single-frame signal;
13) Perform endpoint detection on the stationary single-frame signals of step 12) to obtain the valid single-frame signals;
14) Compute the ITD feature parameter of each single-frame signal to obtain the ITD training samples;
15) From the ITD training samples of step 14), take the mean of the ITD training samples over the multiple frames at each azimuth as the parameter of the VQ model of the ITD for the corresponding azimuth, thereby establishing the azimuth mapping model;
2) Localization of the sound source to be located:
21) Preprocess the acquired acoustic signal, including amplitude normalization, pre-emphasis, framing, and windowing, so that each frame yields a stationary single-frame signal;
22) Perform endpoint detection on the single-frame signals of step 21) to obtain the valid single-frame signals;
23) Apply the FFT to the valid single-frame signals of step 22), divide the spectrum into several sub-bands, and estimate the SNR of each sub-band; the sub-bands are divided uniformly, into 7 to 13 sub-bands;
24) Compare the SNR of each sub-band with the preset SNR threshold, set the amplitude of the sub-bands below the threshold to 0, select the sub-bands whose SNR exceeds the threshold, and compute the sub-band ITD feature parameter;
25) Match the sub-band ITD feature parameter one by one against the azimuth mapping model established by training, and output the azimuth estimate according to the Euclidean distance.
The implementation of the technical solution is described in detail below, following the implementation steps above and with reference to the drawings:
Fig. 2 shows the block diagram of sound source localization from two-channel acoustic signals based on SNR estimation. The HRTF (head-related transfer function) is convolved with white noise to produce the directional virtual sound signals used for training. The figure marks the processing flows of the acoustic signal in the training stage and the testing stage separately; the function and implementation of each module are described below.
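The generation of a directional virtual sound for training can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `make_virtual_sound`, the sample count, and the random seed are assumptions, and the two HRIR arrays stand in for a measured left/right pair (loading the measured data is not shown).

```python
import numpy as np

def make_virtual_sound(hrir_left, hrir_right, n_samples=44100, seed=0):
    """Convolve white noise with a left/right HRIR pair (hypothetical helper).

    Returns an array of shape (2, n_samples + len(hrir) - 1): row 0 is the
    left-ear signal, row 1 the right-ear signal.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_samples)      # white-noise excitation
    left = np.convolve(noise, hrir_left)        # left-ear virtual sound
    right = np.convolve(noise, hrir_right)      # right-ear virtual sound
    return np.stack([left, right])

# Toy HRIR pair: the right ear receives the same sound 5 samples later.
hl = np.zeros(8); hl[0] = 1.0
hr = np.zeros(8); hr[5] = 1.0
virtual = make_virtual_sound(hl, hr, n_samples=100)
```

With measured HRIRs, one such two-channel signal would be generated per training azimuth.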
1. Preprocessing module, corresponding to the preprocessing in steps 12) and 21):
The acoustic signal captured by the acquisition device may contain considerable electronic noise and background noise. To limit the effect of this noise on the subsequent signal analysis, the signal is preprocessed. Preprocessing in this method comprises amplitude normalization, pre-emphasis, framing, and windowing. The invention uses a frame length of 30 ms and a frame shift of 10 ms.
Pre-emphasis uses the first-order digital filter $H(z) = 1 - \mu z^{-1}$ with $\mu = 0.97$. The framed signal is windowed with a Hamming window, and the n-th windowed frame can be expressed as $x_n(m) = w_H(m)\,x(nN + m)$, $0 \le m < N$, where N is the number of samples in one frame, here 1323,
and $w_H(m) = \begin{cases} 0.54 - 0.46\cos[2\pi m/(N-1)], & 0 \le m < N \\ 0, & m \ge N \end{cases}$ is the Hamming window.
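A minimal sketch of this preprocessing chain follows. The sampling rate of 44.1 kHz is an assumption inferred from N = 1323 samples per 30 ms frame; the function name `preprocess` is hypothetical; and the frames are stepped by the stated 10 ms frame shift rather than by N.

```python
import numpy as np

def preprocess(x, fs=44100, frame_ms=30, shift_ms=10, mu=0.97):
    """Normalize, pre-emphasize, frame, and Hamming-window a 1-D signal."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak                                  # amplitude normalization
    y = np.append(x[0], x[1:] - mu * x[:-1])          # H(z) = 1 - mu * z^-1
    N = int(round(fs * frame_ms / 1000))              # 1323 samples at 44.1 kHz
    hop = int(round(fs * shift_ms / 1000))            # 441 samples at 44.1 kHz
    if len(y) < N:
        return np.empty((0, N))
    n_frames = 1 + (len(y) - N) // hop
    idx = np.arange(N)[None, :] + hop * np.arange(n_frames)[:, None]
    window = 0.54 - 0.46 * np.cos(2 * np.pi * np.arange(N) / (N - 1))
    return y[idx] * window                            # shape (n_frames, N)

frames = preprocess(np.ones(44100))                   # one second of signal
```

Each row of the result is one windowed frame, ready for endpoint detection.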
2. Endpoint detection module, corresponding to the endpoint detection in steps 13) and 22):
The purpose of endpoint detection is to find the start and end points of the useful signal within a received segment of acoustic signal, so that only the useful signal is processed. Accurate endpoint detection not only reduces the amount of data stored and the processing time, but also excludes interference from silent segments and noise. The method combines the short-time energy and zero-crossing rate features and applies the detection to a single-ear signal. Endpoint detection by combining short-time energy and zero-crossing rate is prior art and is only briefly described here:
Short-time energy is the average energy of one frame of the signal, computed as
$E_n = \sum_{m=0}^{N-1} |x_n(m)|^2 = \sum_{k=0}^{N-1} |X_n(k)|^2$
where $x_n(m)$, m = 0, 1, ..., N-1, is the n-th preprocessed frame of the acquired acoustic signal and $X_n(k)$, k = 0, 1, ..., N-1, is the corresponding frequency-domain signal. The short-time energy threshold can be a fixed value, or the multi-frame mean energy can be used as the decision threshold.
The short-time zero-crossing rate is the number of times the waveform of one frame crosses the zero level, as a fraction of the frame length. For a discrete signal it suffices to compare the signs of adjacent samples:
$Z_n = \frac{1}{2N} \sum_{m=1}^{N-1} \left| \mathrm{sgn}\{x_n(m)\} - \mathrm{sgn}\{x_n(m-1)\} \right|$
where sgn(x) is the sign function. The decision thresholds used in the invention are $Z_{min} = 0.01$ and $Z_{max} = 0.4$; the lower limit $Z_{min}$ is set to filter out some of the silent frames.
A frame whose short-time energy and zero-crossing rate fall within the decision thresholds is treated as useful signal, which determines the start and end positions of the sound segment.
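The two features and the frame decision can be sketched as below. The function names are hypothetical, and the rule that a valid frame must have energy at or above the threshold is an assumption about how "within the decision thresholds" applies to energy; the zero-crossing limits are the stated 0.01 and 0.4.

```python
import numpy as np

def short_time_energy(frames):
    """E_n: sum of squared samples of each frame (rows are frames)."""
    return np.sum(np.abs(frames) ** 2, axis=1)

def zero_crossing_rate(frames):
    """Z_n = (1 / 2N) * sum |sgn x(m) - sgn x(m-1)| for each frame."""
    signs = np.sign(frames)
    return np.sum(np.abs(np.diff(signs, axis=1)), axis=1) / (2 * frames.shape[1])

def valid_frames(frames, z_min=0.01, z_max=0.4, energy_threshold=None):
    """Boolean mask of frames kept as useful signal (assumed decision rule)."""
    e = short_time_energy(frames)
    z = zero_crossing_rate(frames)
    if energy_threshold is None:
        energy_threshold = e.mean()       # multi-frame mean energy as threshold
    return (e >= energy_threshold) & (z >= z_min) & (z <= z_max)

m = np.arange(100)
demo = np.stack([
    np.zeros(100),                        # silent frame
    np.sin(0.1 + np.pi * m / 10),         # tonal frame, 9 zero crossings
    np.where(m % 2 == 0, 1.0, -1.0),      # sign flips on every sample
])
mask = valid_frames(demo, energy_threshold=1.0)
```

In this toy input only the tonal frame passes: the silent frame fails the energy test and the alternating frame exceeds the upper zero-crossing limit.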
3. Sub-band SNR estimation module, corresponding to step 23):
The useful signal found by endpoint detection is transformed to the frequency domain and divided into several sub-bands, and the SNR is estimated within each sub-band. The sub-bands are divided uniformly; the invention uses 7 to 13 sub-bands. The formulas are as follows.
The frequency-domain signal model can be written in vector form:
$X(k) = S(k) + N(k)$
$X(k) = \{X_l(k), X_r(k)\}^T$
$S(k) = \{S_l(k), S_r(k)\}^T$
where X(k) is the noisy signal, S(k) is the clean acoustic signal, N(k) is the noise, and k is the frequency index. The subscripts l and r denote the left and right channels, respectively.
For two-channel acoustic signals, the propagation path attenuates different frequencies differently. Moreover, unlike multi-microphone array methods, binaural localization has only two channels. The method therefore estimates the SNR per sub-band: one frame is divided into several sub-bands in the frequency domain, the covariance matrix of each sub-band is estimated, and the SNR of each band is then computed from the covariance matrix. From the vector form of the frequency-domain model, the covariance matrix of the i-th sub-band is
$R = \begin{bmatrix} R_1 & R_2 \\ R_3 & R_4 \end{bmatrix} = E\{X_i(k) X_i^T(k)\} = \begin{bmatrix} P_{li} + \sigma^2 & \sqrt{P_{li} P_{ri}} \\ \sqrt{P_{li} P_{ri}} & P_{ri} + \sigma^2 \end{bmatrix} = \begin{bmatrix} P_{li} + \sigma^2 & \sqrt{IID}\,P_{li} \\ \sqrt{IID}\,P_{li} & IID \cdot P_{li} + \sigma^2 \end{bmatrix}$
where $X_i(k)$ is the frequency-domain vector formed by the left and right signals of the i-th sub-band; $P_{li}$, $P_{ri}$, and $\sigma^2$ are the energies of the left and right signals of the i-th sub-band and the noise power spectral density, respectively; and IID is the interaural intensity difference of the sub-band signal.
The speech and noise power spectral densities of the i-th sub-band follow from this matrix:
$P_{li}$ is obtained from the equation $P_{li}^2 + (R_4 - R_1) P_{li} - R_2^2 = 0$, and then
$\sigma^2 = R_1 - P_{li}$, $P_{ri} = R_4 - \sigma^2$
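As a check on the quadratic for $P_{li}$, a short derivation is given here. It assumes the off-diagonal covariance entry is $R_2 = \sqrt{P_{li} P_{ri}}$, the form consistent with the stated equation; the closed-form root is not written out in the source and is added for illustration.

```latex
\begin{aligned}
R_1 &= P_{li} + \sigma^2, \qquad R_4 = P_{ri} + \sigma^2, \qquad R_2^2 = P_{li} P_{ri} \\
R_4 - R_1 &= P_{ri} - P_{li} \;\Rightarrow\; P_{ri} = P_{li} + (R_4 - R_1) \\
R_2^2 &= P_{li}\,\bigl[P_{li} + (R_4 - R_1)\bigr] \;\Rightarrow\; P_{li}^2 + (R_4 - R_1)\,P_{li} - R_2^2 = 0 \\
P_{li} &= \tfrac{1}{2}\Bigl[(R_1 - R_4) + \sqrt{(R_4 - R_1)^2 + 4 R_2^2}\,\Bigr]
\end{aligned}
```

The product of the two roots is $-R_2^2 \le 0$, so only the root shown is non-negative and can represent an energy.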
From these, the SNR of the i-th sub-band is obtained:
Figure BDA0000126267080000062
In sub-band SNR estimation, the binaural signal itself exhibits an interaural intensity difference that varies across the spectrum of the different sub-bands. The sub-band size therefore determines the performance of the algorithm.
The choice of the number of sub-bands depends on factors such as the type of source signal and the SNR level. The number must be moderate. On the one hand, if there are too many sub-bands, each contains too few frequency bins, and at low SNR more unreliable bins are admitted, which degrades the algorithm. On the other hand, because a low average SNR causes the frequency data of an entire sub-band to be discarded, the number of sub-bands should not be too small either. The invention was tested in simulation environments with different parameters and, weighing the test results, adopts 7 to 13 sub-bands.
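The sub-band SNR estimate of this module can be sketched as follows. Several points are assumptions rather than the patent's stated method: the covariance entries are estimated by averaging over the bins of each sub-band, the cross term uses the complex conjugate, and the SNR is taken as the mean of the left and right signal energies divided by the noise power, since the source's own SNR formula is an image that is not reproduced in the text. The names `subband_snr` and `snr_index` are hypothetical.

```python
import numpy as np

def subband_snr(X_left, X_right, n_subbands=10, eps=1e-12):
    """Estimate one linear-scale SNR per uniformly divided sub-band."""
    edges = np.linspace(0, len(X_left), n_subbands + 1).astype(int)
    snr = np.zeros(n_subbands)
    for i in range(n_subbands):
        xl = X_left[edges[i]:edges[i + 1]]
        xr = X_right[edges[i]:edges[i + 1]]
        R1 = np.mean(np.abs(xl) ** 2)                 # P_l + sigma^2
        R4 = np.mean(np.abs(xr) ** 2)                 # P_r + sigma^2
        R2 = np.abs(np.mean(xl * np.conj(xr)))        # sqrt(P_l * P_r)
        # Non-negative root of P^2 + (R4 - R1) P - R2^2 = 0
        P_l = 0.5 * ((R1 - R4) + np.sqrt((R4 - R1) ** 2 + 4 * R2 ** 2))
        sigma2 = max(R1 - P_l, eps)                   # noise power, floored
        P_r = max(R4 - sigma2, 0.0)
        snr[i] = (P_l + P_r) / (2 * sigma2)           # assumed SNR definition
    return edges, snr

def snr_index(edges, snr, threshold):
    """Per-bin 0/1 flag: 1 where the bin's sub-band SNR exceeds the threshold."""
    mask = np.zeros(edges[-1])
    for i, s in enumerate(snr):
        if s > threshold:
            mask[edges[i]:edges[i + 1]] = 1.0
    return mask

s = np.ones(8) * (1 + 1j)
edges, snr_clean = subband_snr(s, 2 * s, n_subbands=2)    # fully coherent input
```

For fully coherent left and right spectra the estimated noise power collapses to the floor `eps`, so the SNR is very large; for uncorrelated channels of equal power it falls to zero.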
4. ITD feature extraction module, corresponding to the computation of the ITD feature parameter in steps 14) and 24):
After preprocessing and endpoint detection, the binaural signal is input to the ITD feature extraction module together with the SNR of each sub-band of each frame. A constant SNR threshold is used, and the frequency bands whose SNR exceeds the threshold are selected for the ITD computation. When extracting the localization cue, the high-SNR spectral components are used for ITD estimation and the low-SNR components are discarded. This effectively improves the accuracy of cue extraction from noisy signals and thus the localization performance.
The ITD estimation procedure and formulas for the i-th frame of the acoustic signal are as follows:
(1) From the sub-band SNRs and the threshold, compute the SNR flag SNRIndex of each frequency bin:
(2) Correct the left and right spectra according to the SNR flag. In the binaural spectrum, the spectrum of the sub-bands whose SNR is below the threshold is set to 0:
$P_{ll} = P_l \mathbin{.*} \mathrm{SNRIndex}$
$P_{rr} = P_r \mathbin{.*} \mathrm{SNRIndex}$
where $P_l$ and $P_r$ are the left and right spectra, and $P_{ll}$ and $P_{rr}$ are the corrected left and right spectra.
(3) Estimate the ITD with the generalized cross-correlation method.
The cross-spectral density $P_{lr}$ of the left and right signals is computed as $P_{lr} = P_{ll}^{*} P_{rr}$. Applying the IFFT to $P_{lr}$ yields the cross-correlation function $R_{lr}(k)$, where $R_{lr}(k)$ is the cross-correlation of the binaural signal at a time difference of k samples.
The ITD estimate of the i-th frame of the acoustic signal is thus
Figure BDA0000126267080000071
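The three ITD steps can be sketched as below. The sketch assumes that the ITD is the lag at which the cross-correlation peaks (the source's own ITD formula is an image not reproduced in the text), that the cross-spectrum is the conjugate of the left spectrum times the right spectrum, and that a positive result means the right channel lags the left. The function name `estimate_itd`, the sampling rate, and the per-bin 0/1 mask argument are hypothetical.

```python
import numpy as np

def estimate_itd(frame_left, frame_right, snr_index=None, fs=44100):
    """Estimate the ITD (seconds) of one frame by masked cross-correlation."""
    n = len(frame_left)
    P_l = np.fft.fft(frame_left)
    P_r = np.fft.fft(frame_right)
    if snr_index is not None:
        P_l = P_l * snr_index            # zero the bins of low-SNR sub-bands
        P_r = P_r * snr_index
    cross = np.conj(P_l) * P_r           # cross-spectral density
    r = np.real(np.fft.ifft(cross))      # circular cross-correlation R_lr(k)
    lags = np.arange(n)
    lags[lags > n // 2] -= n             # map FFT indices to signed lags
    return lags[np.argmax(r)] / fs       # lag of the correlation peak

rng = np.random.default_rng(0)
left = rng.standard_normal(256)
right = np.roll(left, 3)                 # right channel delayed by 3 samples
itd_samples = estimate_itd(left, right, fs=1.0)
```

With `fs=1.0` the result is expressed in samples; the mask passed as `snr_index` should have the same length as the FFT, and should be symmetric about the Nyquist bin if a strictly real correlation is wanted.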
5. Training module, corresponding to step 15):
The training module builds the statistical model of the localization feature. Its input is acoustic signals of known azimuth; through feature extraction, the ITD parameter of the signal at each azimuth is estimated. The mean of the ITD over the multiple frames at each azimuth is used as the parameter of the VQ model of the ITD for that azimuth.
The invention uses as training data the virtual sounds generated by convolving white noise with the HRIR data measured by the MIT Media Lab. The virtual sound signals for training use the KEMAR small-pinna HRIR data for 37 azimuths on the right side of the horizontal plane (θ from 0° to 180°); the angular spacing of this data is 5°.
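Building the azimuth mapping model then reduces to one mean per azimuth. A minimal sketch, with a hypothetical function name and made-up ITD values used only to show the data layout:

```python
import numpy as np

def train_itd_model(itd_samples_by_azimuth):
    """Map each azimuth (degrees) to the mean of its per-frame ITD samples."""
    return {azimuth: float(np.mean(samples))
            for azimuth, samples in itd_samples_by_azimuth.items()}

# Illustrative input: per-frame ITD estimates (arbitrary units) for two azimuths.
model = train_itd_model({0: [0.0, 0.0], 90: [1.0, 3.0]})
```

In the setting described above the dictionary would hold 37 entries, one per 5° step from 0° to 180°.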
6. Localization module, corresponding to step 25):
The localization module matches the acoustic signal under test one by one against the azimuth feature models established by the training module and finds the azimuth with the maximum likelihood. Localization proceeds as follows:
(1) Compute the SNR of each sub-band of each frame of the signal to be located;
(2) Apply the FFT to the signal to be located and set the amplitude of the frequency bands below the SNR threshold to 0;
(3) Extract the ITD feature parameter of the signal to be located;
(4) Using the ITD feature parameter, search for the minimum Euclidean distance over the azimuth ranges 0° to 90° and 270° to 360°, and output the located azimuth: $p^{*} = \arg\min_{1 \le p \le P} d(x, \lambda_p)$
In this formula, $\lambda_p$ (p = 1, 2, ..., P, where P is the number of azimuths) is the ITD value of the model, x is the measured ITD value, and $p^{*}$ is the output frontal sound source azimuth.
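The minimum-distance match can be sketched as follows, assuming a scalar ITD per azimuth so that the Euclidean distance is an absolute difference. The function name `localize` and the model values are hypothetical.

```python
import numpy as np

def localize(itd_value, model):
    """Return the azimuth whose model ITD is closest to the measured ITD."""
    azimuths = np.array(list(model.keys()))
    centroids = np.array(list(model.values()))
    distances = np.abs(centroids - itd_value)    # Euclidean distance in 1-D
    return azimuths[np.argmin(distances)].item()

# Illustrative model: azimuth in degrees -> mean ITD in arbitrary units.
toy_model = {0: 0.0, 45: 0.4, 90: 0.7}
best = localize(0.35, toy_model)
```

Restricting the search to particular azimuth ranges amounts to passing a model that contains only those azimuths.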
A localization system was built according to the framework above; data training is carried out first, and the system is then used for binaural sound source localization. Experimental comparison shows that, compared with existing two-channel localization techniques, the proposed method markedly improves sound source localization performance in noise. At an SNR of 0 dB, the localization accuracy of the invention reaches 89%, versus only 63% for the prior-art method; at an SNR of 10 dB, it reaches 94%, versus 82% for the prior-art method.

Claims (1)

1. A binaural sound source localization method based on sub-band SNR estimation, characterized in that data training is carried out first; the training data are acoustic signals of known azimuth; through feature extraction, the interaural time difference (ITD) parameter of the acoustic signal at each azimuth is estimated, and the mean of the ITD parameters over the multiple frames at each azimuth is used as the parameter of the vector quantization (VQ) model of the ITD for that azimuth, thereby establishing the azimuth mapping model; during actual localization, a two-channel acoustic signal is input and first transformed to the frequency domain, where it is divided into several sub-bands; the SNR of each sub-band is estimated and compared with a preset SNR threshold, the sub-bands whose SNR exceeds the threshold are selected, and the sub-band ITD feature parameter is computed; this feature is matched one by one against the azimuth mapping model established by training, and the azimuth is estimated and output based on the Euclidean distance; the specific steps comprising:
1) Data training:
11) Using the head-related impulse response (HRIR) data of the KEMAR small pinna for 37 azimuths on the right side of the horizontal plane, i.e., θ = 0° to 180°, convolve the HRIRs with white noise to generate virtual sounds of known azimuth;
12) Preprocess the virtual sounds obtained in step 11), including amplitude normalization, pre-emphasis, framing, and windowing, so that each frame of the acoustic signal at each azimuth yields a stationary single-frame signal;
13) Perform endpoint detection on the stationary single-frame signals of step 12) to obtain the valid single-frame signals;
14) Compute the ITD feature parameter of each single-frame signal to obtain the ITD training samples;
15) From the ITD training samples of step 14), take the mean of the ITD training samples over the multiple frames at each azimuth as the parameter of the VQ model of the ITD for the corresponding azimuth, thereby establishing the azimuth mapping model;
2) Localization of the sound source to be located:
21) Preprocess the acquired acoustic signal, including amplitude normalization, pre-emphasis, framing, and windowing, so that each frame yields a stationary single-frame signal;
22) Perform endpoint detection on the single-frame signals of step 21) to obtain the valid single-frame signals;
23) Apply the FFT to the valid single-frame signals of step 22), divide the spectrum into several sub-bands, and estimate the SNR of each sub-band; the sub-bands are divided uniformly, into 7 to 13 sub-bands;
24) Compare the SNR of each sub-band with the preset SNR threshold, set the amplitude of the sub-bands below the threshold to 0, select the sub-bands whose SNR exceeds the threshold, and compute the sub-band ITD feature parameter;
25) Match the sub-band ITD feature parameter one by one against the azimuth mapping model established by training, and output the azimuth estimate according to the Euclidean distance.
CN 201110448129 2011-12-29 2011-12-29 Binaural sound source localization method based on sub-band signal to noise ratio estimation Expired - Fee Related CN102565759B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110448129 CN102565759B (en) 2011-12-29 2011-12-29 Binaural sound source localization method based on sub-band signal to noise ratio estimation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110448129 CN102565759B (en) 2011-12-29 2011-12-29 Binaural sound source localization method based on sub-band signal to noise ratio estimation

Publications (2)

Publication Number Publication Date
CN102565759A (en) 2012-07-11
CN102565759B (en) 2013-10-30

Family

ID=46411648

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110448129 Expired - Fee Related CN102565759B (en) 2011-12-29 2011-12-29 Binaural sound source localization method based on sub-band signal to noise ratio estimation

Country Status (1)

Country Link
CN (1) CN102565759B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103901400B * 2014-04-10 2016-08-17 北京大学深圳研究生院 Binaural sound source localization method based on delay compensation and binaural consistency
CN103901401B * 2014-04-10 2016-08-17 北京大学深圳研究生院 Binaural sound source localization method based on a binaural matched filter
CN104464750B * 2014-10-24 2017-07-07 东南大学 Speech separation method based on binaural sound source localization
CN104468576A * 2014-12-10 2015-03-25 深圳市彩煌通信技术有限公司 Protocol conversion method based on passive optical network and protocol converter
CN105204002B * 2015-10-19 2019-01-04 Tcl集团股份有限公司 Sound source localization method and system
CN106226739A * 2016-07-29 2016-12-14 太原理工大学 Dual sound source localization method incorporating sub-band analysis
CN106373589B * 2016-09-14 2019-07-26 东南大学 Binaural mixed speech separation method based on an iterative structure
CN107799124A * 2017-10-12 2018-03-13 安徽咪鼠科技有限公司 VAD detection method applied to an intelligent voice mouse
CN108122559B * 2017-12-21 2021-05-14 北京工业大学 Binaural sound source positioning method based on deep learning in digital hearing aid
CN109164415B * 2018-09-07 2022-09-16 东南大学 Binaural sound source positioning method based on convolutional neural network
CN109298642B * 2018-09-20 2021-08-27 三星电子(中国)研发中心 Method and device for monitoring by adopting intelligent sound box
CN110133596B * 2019-05-13 2023-06-23 江苏第二师范学院(江苏省教育科学研究院) Array sound source positioning method based on frequency point signal-to-noise ratio and bias soft decision
CN110221249A * 2019-05-16 2019-09-10 西北工业大学 Compressed-sensing-based broadband sound source localization method
CN111707990B * 2020-08-19 2021-05-14 东南大学 Binaural sound source positioning method based on dense convolutional network
CN116316706B * 2023-05-08 2023-07-21 湖南大学 Oscillation positioning method and system based on complementary average inherent time scale decomposition

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6178245B1 (en) * 2000-04-12 2001-01-23 National Semiconductor Corporation Audio signal generator to emulate three-dimensional audio signals
EP1600791B1 (en) * 2004-05-26 2009-04-01 Honda Research Institute Europe GmbH Sound source localization based on binaural signals
CN101982793B (en) * 2010-10-20 2012-07-04 武汉大学 Mobile sound source positioning method based on stereophonic signals

Also Published As

Publication number Publication date
CN102565759A (en) 2012-07-11

Similar Documents

Publication Publication Date Title
CN102565759B (en) Binaural sound source localization method based on sub-band signal to noise ratio estimation
CN102438189B (en) Dual-channel acoustic signal-based sound source localization method
CN109839612B (en) Sound source direction estimation method and device based on time-frequency masking and deep neural network
CN106373589B Binaural mixed speech separation method based on an iterative structure
CN104464750B Speech separation method based on binaural sound source localization
Mandel et al. An EM algorithm for localizing multiple sound sources in reverberant environments
CN111429939B Sound signal separation method of double sound sources and pickup
EP1818909B1 Voice recognition system
CN107219512B Sound source positioning method based on sound transfer function
CN106504763A Microphone array multi-target speech enhancement method based on blind source separation and spectral subtraction
CN107346664A Binaural speech separation method based on critical bands
CN106226739A Dual sound source localization method incorporating sub-band analysis
Ren et al. A novel multiple sparse source localization using triangular pyramid microphone array
CN109188362A Microphone array sound source localization signal processing method
Cai et al. Multi-Channel Training for End-to-End Speaker Recognition Under Reverberant and Noisy Environment.
CN106019230B Sound source localization method based on i-vector speaker identification
CN103901400A Binaural sound source localization method based on delay compensation and binaural consistency
Wang et al. Pseudo-determined blind source separation for ad-hoc microphone networks
CN114822584B (en) Transmission device signal separation method based on integral improved generalized cross-correlation
Mandel et al. EM localization and separation using interaural level and phase cues
Plinge et al. Online multi-speaker tracking using multiple microphone arrays informed by auditory scene analysis
Talagala et al. Binaural localization of speech sources in the median plane using cepstral HRTF extraction
CN117711422A (en) Underdetermined voice separation method and device based on compressed sensing space information estimation
Hu et al. Robust binaural sound localisation with temporal attention
Wu et al. Binaural localization of speech sources in 3-D using a composite feature vector of the HRTF

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20131030

Termination date: 20161229