Qian et al., 2022 - Google Patents

Speaker front‐back disambiguity using multi‐channel speech signals

Document ID: 11607703097124362362
Authors: Qian X, Yang J, Brutti A
Publication year: 2022
Publication venue: Electronics Letters

Snippet

This paper tackles the front‐back disambiguity problem in speaker localization when the audio signals are captured by a symmetric microphone array. To this end, a deep neural network is proposed with an attention‐based mechanism designed to assign different …

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/008: Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S2400/00: Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/15: Aspects of sound capture and related signal processing for recording or reproduction
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification

Similar Documents

Publication | Title
Grumiaux et al. | A survey of sound source localization with deep learning methods
US9008329B1 (en) | Noise reduction using multi-feature cluster tracker
Zhang et al. | Ambiear: mmwave based voice recognition in nlos scenarios
Yang et al. | Deepear: Sound localization with binaural microphones
Jiang et al. | Deep and CNN fusion method for binaural sound source localisation
Wang et al. | Deep-learning-assisted sound source localization from a flying drone
Dai et al. | Blind source separation‐based IVA‐Xception model for bird sound recognition in complex acoustic environments
Paikrao et al. | Consumer Personalized Gesture Recognition in UAV Based Industry 5.0 Applications
Malek et al. | Block‐online multi‐channel speech enhancement using deep neural network‐supported relative transfer function estimates
Liu et al. | Head‐related transfer function–reserved time‐frequency masking for robust binaural sound source localization
Ding et al. | Joint estimation of binaural distance and azimuth by exploiting deep neural networks
Liu et al. | Wavoice: An mmWave-Assisted Noise-Resistant Speech Recognition System
Wu et al. | Sound source localization based on multi-task learning and image translation network
Koizumi et al. | Informative acoustic feature selection to maximize mutual information for collecting target sources
Qian et al. | Speaker front‐back disambiguity using multi‐channel speech signals
US20230352040A1 (en) | Audio source feature separation and target audio source generation
Jayaram et al. | HRTF Estimation in the Wild
Liu et al. | Binaural sound source localization based on weighted template matching
Deleforge et al. | Audio-motor integration for robot audition
CN116868267A (en) | Multi-channel speech compression system and method
Liu et al. | Reverberation aware deep learning for environment tolerant microphone array DOA estimation
Wu et al. | Multi-speaker DoA Estimation Using Audio and Visual Modality
Jahanirad et al. | Blind source computer device identification from recorded VoIP calls for forensic investigation
Mathews | Development and evaluation of spherical microphone array-enabled systems for immersive multi-user environments
Chen et al. | Hearable devices with sound bubbles