On source-microphone distance estimation using convolutional recurrent neural networks
Gburrek et al., 2021
- Document ID
- 15203320709659435288
- Author
- Gburrek T
- Schmalenstroeer J
- Haeb-Umbach R
- Publication year
- 2021
- Publication venue
- Speech Communication; 14th ITG Conference
Snippet
Several features computed from an audio signal have been shown to depend on the distance between the acoustic source and the receiver, but at the same time are heavily influenced by room characteristics and the microphone setup. While neural networks, if …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S3/00—Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
- G01S3/80—Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic or infrasonic waves
- G01S3/802—Systems for determining direction or deviation from predetermined direction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/18—Methods or devices for transmitting, conducting, or directing sound
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K15/00—Acoustics not otherwise provided for
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Adavanne et al. | Direction of arrival estimation for multiple sound sources using convolutional recurrent neural network | |
Erdogan et al. | Improved MVDR beamforming using single-channel mask prediction networks | |
US10334357B2 (en) | Machine learning based sound field analysis | |
JP6129316B2 (en) | Apparatus and method for providing information-based multi-channel speech presence probability estimation | |
Wang et al. | Over-determined source separation and localization using distributed microphones | |
Lee et al. | Sound source localization based on GCC-PHAT with diffuseness mask in noisy and reverberant environments | |
EP3387648A1 (en) | Localization algorithm for sound sources with known statistics | |
Wang et al. | Time difference of arrival estimation based on a Kronecker product decomposition | |
Taseska et al. | Blind source separation of moving sources using sparsity-based source detection and tracking | |
Gelderblom et al. | Synthetic data for DNN-based DOA estimation of indoor speech | |
Zhang et al. | Microphone array generalization for multichannel narrowband deep speech enhancement | |
Mack et al. | Single-channel blind direct-to-reverberation ratio estimation using masking | |
KR20210137146A (en) | Speech augmentation using clustering of queues | |
Kindt et al. | 2D acoustic source localisation using decentralised deep neural networks on distributed microphone arrays | |
Klein et al. | Direction-of-arrival estimation using a microphone array with the multichannel cross-correlation method | |
Gburrek et al. | On source-microphone distance estimation using convolutional recurrent neural networks | |
Aarabi et al. | Robust sound localization using conditional time–frequency histograms | |
Mane et al. | Localization of steady sound source and direction detection of moving sound source using CNN | |
Schwartz et al. | Array configuration mismatch in deep DOA estimation: Towards robust training | |
Hübner et al. | Efficient training data generation for phase-based DOA estimation | |
Kato et al. | TDOA estimation based on phase-voting cross correlation and circular standard deviation | |
Moore et al. | Room identification using frequency dependence of spectral decay statistics | |
Çatalbaş et al. | 3D moving sound source localization via conventional microphones | |
Firoozabadi et al. | Combination of nested microphone array and subband processing for multiple simultaneous speaker localization | |
Cirillo et al. | Sound mapping in reverberant rooms by a robust direct method |