Speaker front‐back disambiguity using multi‐channel speech signals
Qian et al., 2022
- Document ID
- 11607703097124362362
- Author
- Qian X
- Yang J
- Brutti A
- Publication year
- 2022
- Publication venue
- Electronics Letters
Snippet
This paper tackles the front‐back disambiguity problem in speaker localization when the audio signals are captured by a symmetric microphone array. To this end, a deep neural network is proposed with an attention‐based mechanism designed to assign different …
Classifications
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
        - G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04S—STEREOPHONIC SYSTEMS
      - H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
        - H04S2400/15—Aspects of sound capture and related signal processing for recording or reproduction
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L17/00—Speaker identification or verification
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Grumiaux et al. | A survey of sound source localization with deep learning methods | |
US9008329B1 (en) | Noise reduction using multi-feature cluster tracker | |
Zhang et al. | Ambiear: mmwave based voice recognition in nlos scenarios | |
Yang et al. | Deepear: Sound localization with binaural microphones | |
Jiang et al. | Deep and CNN fusion method for binaural sound source localisation | |
Wang et al. | Deep-learning-assisted sound source localization from a flying drone | |
Dai et al. | Blind source separation‐based IVA‐Xception model for bird sound recognition in complex acoustic environments | |
Paikrao et al. | Consumer Personalized Gesture Recognition in UAV Based Industry 5.0 Applications | |
Malek et al. | Block‐online multi‐channel speech enhancement using deep neural network‐supported relative transfer function estimates | |
Liu et al. | Head‐related transfer function–reserved time‐frequency masking for robust binaural sound source localization | |
Ding et al. | Joint estimation of binaural distance and azimuth by exploiting deep neural networks | |
Liu et al. | Wavoice: An mmWave-Assisted Noise-Resistant Speech Recognition System | |
Wu et al. | Sound source localization based on multi-task learning and image translation network | |
Koizumi et al. | Informative acoustic feature selection to maximize mutual information for collecting target sources | |
Qian et al. | Speaker front‐back disambiguity using multi‐channel speech signals | |
US20230352040A1 (en) | Audio source feature separation and target audio source generation | |
Jayaram et al. | HRTF Estimation in the Wild | |
Liu et al. | Binaural sound source localization based on weighted template matching | |
Deleforge et al. | Audio-motor integration for robot audition | |
CN116868267A (en) | Multi-channel speech compression system and method | |
Liu et al. | Reverberation aware deep learning for environment tolerant microphone array DOA estimation | |
Wu et al. | Multi-speaker DoA Estimation Using Audio and Visual Modality | |
Jahanirad et al. | Blind source computer device identification from recorded VoIP calls for forensic investigation | |
Mathews | Development and evaluation of spherical microphone array-enabled systems for immersive multi-user environments | |
Chen et al. | Hearable devices with sound bubbles |