Sheeja et al., 2022 - Google Patents
CNN-QTLBO: an optimal blind source separation and blind dereverberation scheme using lightweight CNN-QTLBO and PCDP-LDA for speech mixtures
- Document ID
- 4256450248431505968
- Authors
- Sheeja J
- Sankaragomathi B
- Publication year
- 2022
- Publication venue
- Signal, Image and Video Processing
Snippet
A microphone positioned far away observes speech signals with little acoustic interference, in terms of both reverberation and noise. As a result, the quality of the speech degrades; blind source separation (BSS) from the obtained speech samples and blind dereverberation (BD) …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
Similar Documents
Publication | Title
---|---
Sawada et al. | Multichannel extensions of non-negative matrix factorization with complex-valued data
Wang | Time-frequency masking for speech separation and its potential for hearing aid design
Venkataramani et al. | Adaptive front-ends for end-to-end source separation
Sheeja et al. | CNN-QTLBO: an optimal blind source separation and blind dereverberation scheme using lightweight CNN-QTLBO and PCDP-LDA for speech mixtures
Do et al. | Speech source separation using variational autoencoder and bandpass filter
Zhang | Deep ad-hoc beamforming
Rivet et al. | Visual voice activity detection as a help for speech source separation from convolutive mixtures
Paikrao et al. | Consumer Personalized Gesture Recognition in UAV Based Industry 5.0 Applications
Do et al. | Speech Separation in the Frequency Domain with Autoencoder
Sainath et al. | Raw multichannel processing using deep neural networks
Yang et al. | Deep ad-hoc beamforming based on speaker extraction for target-dependent speech separation
Luo et al. | Real-time implementation and explainable AI analysis of delayless CNN-based selective fixed-filter active noise control
Sheeja et al. | Speech dereverberation and source separation using DNN-WPE and LWPR-PCA
Liu et al. | A separation and interaction framework for causal multi-channel speech enhancement
Liu et al. | Use of bimodal coherence to resolve the permutation problem in convolutive BSS
Agrawal et al. | Unsupervised modulation filter learning for noise-robust speech recognition
Giacobello et al. | Speech dereverberation based on convex optimization algorithms for group sparse linear prediction
Albataineh et al. | A RobustICA-based algorithmic system for blind separation of convolutive mixtures
CN118212929A | Personalized Ambiosonic voice enhancement method
Arberet et al. | A tractable framework for estimating and combining spectral source models for audio source separation
Kemiha et al. | Single-Channel Blind Source Separation using Adaptive Mode Separation-Based Wavelet Transform and Density-Based Clustering with Sparse Reconstruction
Čmejla et al. | Independent vector analysis exploiting pre-learned banks of relative transfer functions for assumed target's positions
Koteswararao et al. | Single channel source separation using time–frequency non-negative matrix factorization and sigmoid base normalization deep neural networks
Miyazaki et al. | Environmental sound processing and its applications
Al-Ali et al. | Enhanced forensic speaker verification performance using the ICA-EBM algorithm under noisy and reverberant environments