Gburrek et al., 2021

On source-microphone distance estimation using convolutional recurrent neural networks

Document ID: 15203320709659435288
Author: Gburrek T; Schmalenstroeer J; Haeb-Umbach R
Publication year: 2021
Publication venue: Speech Communication; 14th ITG Conference

Snippet

Several features computed from an audio signal have been shown to depend on the distance between the acoustic source and the receiver, but at the same time are heavily influenced by room characteristics and the microphone setup. While neural networks, if …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S3/00—Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
- G01S3/80—Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic or infrasonic waves
- G01S3/802—Systems for determining direction or deviation from predetermined direction
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/18—Methods or devices for transmitting, conducting, or directing sound
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K15/00—Acoustics not otherwise provided for

Similar Documents

Publication	Publication Date	Title
Adavanne et al.	2018	Direction of arrival estimation for multiple sound sources using convolutional recurrent neural network
Erdogan et al.	2016	Improved MVDR beamforming using single-channel mask prediction networks
US10334357B2 (en)	2019-06-25	Machine learning based sound field analysis
JP6129316B2 (en)	2017-05-17	Apparatus and method for providing information-based multi-channel speech presence probability estimation
Wang et al.	2016	Over-determined source separation and localization using distributed microphones
Lee et al.	2020	Sound source localization based on GCC-PHAT with diffuseness mask in noisy and reverberant environments
EP3387648A1 (en)	2018-10-17	Localization algorithm for sound sources with known statistics
Wang et al.	2020	Time difference of arrival estimation based on a Kronecker product decomposition
Taseska et al.	2017	Blind source separation of moving sources using sparsity-based source detection and tracking
Gelderblom et al.	2021	Synthetic data for DNN-based DOA estimation of indoor speech
Zhang et al.	2021	Microphone array generalization for multichannel narrowband deep speech enhancement
Mack et al.	2020	Single-Channel Blind Direct-to-Reverberation Ratio Estimation Using Masking
KR20210137146A (en)	2021-11-17	Speech augmentation using clustering of queues
Kindt et al.	2021	2D acoustic source localisation using decentralised deep neural networks on distributed microphone arrays
Klein et al.	2012	Direction-of-arrival estimation using a microphone array with the multichannel cross-correlation method
Aarabi et al.	2003	Robust sound localization using conditional time–frequency histograms
Mane et al.	2019	Localization of steady sound source and direction detection of moving sound source using CNN
Schwartz et al.	2023	Array Configuration Mismatch in Deep DOA Estimation: Towards Robust Training
Hübner et al.	2021	Efficient training data generation for phase-based DOA estimation
Kato et al.	2017	TDOA estimation based on phase-voting cross correlation and circular standard deviation
Moore et al.	2018	Room identification using frequency dependence of spectral decay statistics
ÇATALBAŞ et al.	2017	3D moving sound source localization via conventional microphones
Firoozabadi et al.	2012	Combination of nested microphone array and subband processing for multiple simultaneous speaker localization
Cirillo et al.	2008	Sound mapping in reverberant rooms by a robust direct method