CNN-QTLBO: an optimal blind source separation and blind dereverberation scheme using lightweight CNN-QTLBO and PCDP-LDA for speech mixtures

Sheeja et al., 2022

Document ID
4256450248431505968
Authors
Sheeja J
Sankaragomathi B
Publication year
2022
Publication venue
Signal, Image and Video Processing

Snippet

A microphone positioned far away from the speaker observes speech signals with considerable acoustic interference, in terms of both reverberation and noise. As a result, speech quality degrades; blind source separation (BSS) from the observed speech samples and blind dereverberation (BD) …
Continue reading at link.springer.com
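For orientation, here is a minimal sketch of classical instantaneous blind source separation using FastICA from scikit-learn. This is not the paper's CNN-QTLBO / PCDP-LDA scheme and it ignores reverberation entirely; the synthetic sources, the 2×2 mixing matrix, and the sample rate are illustrative assumptions only.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
fs = 16_000                       # assumed sample rate (Hz), illustrative only
t = np.arange(2 * fs) / fs        # 2 seconds of signal

# Two synthetic stand-ins for speech sources: a slow chirp and smoothed noise.
s1 = np.sin(2 * np.pi * (100.0 + 50.0 * t) * t)
s2 = np.convolve(rng.standard_normal(t.size), np.ones(8) / 8, mode="same")
S = np.c_[s1, s2]                 # true sources, shape (n_samples, 2)

# Instantaneous mixing by an unknown 2x2 matrix (no reverberation modeled).
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = S @ A.T                       # observed two-microphone mixture

# FastICA estimates the sources from the mixtures alone (blind separation),
# up to an arbitrary scaling and permutation of the outputs.
ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)

# Each estimated component should correlate strongly with exactly one source.
corr = np.abs(np.corrcoef(np.c_[S, S_hat].T)[:2, 2:])
print(np.round(corr, 2))
```

The recovered components carry the usual ICA ambiguities of scale and ordering, and real far-field recordings are convolutive rather than instantaneous mixtures, which is why the paper couples separation with blind dereverberation.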

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • G10L21/0272 Voice signal separating
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G10L15/08 Speech classification or search
    • G10L17/00 Speaker identification or verification
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06K9/6232 Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
    • G06K9/624 Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on a separation criterion, e.g. independent component analysis

Similar Documents

CN109830245B (en) A method and system for multi-speaker speech separation based on beamforming
Sawada et al. Multichannel extensions of non-negative matrix factorization with complex-valued data
Wang Time-frequency masking for speech separation and its potential for hearing aid design
Schädler et al. Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition
Sheeja et al. CNN-QTLBO: an optimal blind source separation and blind dereverberation scheme using lightweight CNN-QTLBO and PCDP-LDA for speech mixtures
Do et al. Speech source separation using variational autoencoder and bandpass filter
Paikrao et al. Consumer personalized gesture recognition in UAV-based industry 5.0 applications
Rivet et al. Visual voice activity detection as a help for speech source separation from convolutive mixtures
Do et al. Speech Separation in the Frequency Domain with Autoencoder.
Liu et al. A separation and interaction framework for causal multi-channel speech enhancement
Luo et al. Real-time implementation and explainable AI analysis of delayless CNN-based selective fixed-filter active noise control
Giacobello et al. Speech dereverberation based on convex optimization algorithms for group sparse linear prediction
Selvi et al. Hybridization of spectral filtering with particle swarm optimization for speech signal enhancement
Sheeja et al. Speech dereverberation and source separation using DNN-WPE and LWPR-PCA
Albataineh et al. A RobustICA-based algorithmic system for blind separation of convolutive mixtures
CN118212929A (en) A personalized Ambisonics speech enhancement method
Čmejla et al. Independent vector analysis exploiting pre-learned banks of relative transfer functions for assumed target’s positions
Koteswararao et al. Single channel source separation using time–frequency non-negative matrix factorization and sigmoid base normalization deep neural networks
Zdunek Improved convolutive and under-determined blind audio source separation with MRF smoothing
Kemiha et al. Single-channel blind source separation using adaptive mode separation-based wavelet transform and density-based clustering with sparse reconstruction
Minhas et al. A hybrid algorithm for blind source separation of a convolutive mixture of three speech sources
Fontaine et al. Multichannel audio modeling with elliptically stable tensor decomposition
Al-Ali et al. Enhanced forensic speaker verification performance using the ICA-EBM algorithm under noisy and reverberant environments
Jang et al. Independent vector analysis using non-spherical joint densities for the separation of speech signals
Watanabe et al. DNN-based frequency component prediction for frequency-domain audio source separation