Wang et al., 2022 - Google Patents
Low-latency real-time independent vector analysis using convolutive transfer function
- Document ID
- 3039719325420601514
- Author
- Wang T
- Yang F
- Li N
- Zhang C
- Yang J
- Publication year
- 2022
- Publication venue
- Applied Acoustics
Snippet
Most blind source separation (BSS) methods use a long short-time Fourier transform (STFT) frame size in highly reverberant environments, which leads to a high algorithmic latency. In addition, the separated signal using the back-projection technology in most BSS methods …
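The snippet's core point is that the algorithmic latency of a frame-based STFT pipeline is bounded below by the analysis frame length, since a frame must be fully buffered before it can be transformed. A minimal sketch of that trade-off (illustrative only; the frame sizes and sample rate are assumptions, not values from the paper):

```python
def stft_algorithmic_latency_ms(frame_size: int, sample_rate: int) -> float:
    """Lower bound on algorithmic latency: one full analysis frame
    must be buffered before any frequency-domain processing can run."""
    return 1000.0 * frame_size / sample_rate

# A long frame, as typically chosen for highly reverberant rooms:
print(stft_algorithmic_latency_ms(4096, 16000))  # 256.0 ms
# A short frame, as needed for low-latency real-time use:
print(stft_algorithmic_latency_ms(512, 16000))   # 32.0 ms
```

This is why methods like the convolutive transfer function approach above aim to model long reverberation across multiple short frames rather than inside one long frame.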
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
Similar Documents
Publication | Title
---|---
Kameoka et al. | Supervised determined source separation with multichannel variational autoencoder
Ono | Stable and fast update rules for independent vector analysis based on auxiliary function technique
EP3257044B1 (en) | Audio source separation
EP3329488B1 (en) | Keystroke noise canceling
Kameoka et al. | Semi-blind source separation with multichannel variational autoencoder
Krueger et al. | Model-based feature enhancement for reverberant speech recognition
Wang et al. | Convolutive transfer function-based multichannel nonnegative matrix factorization for overdetermined blind source separation
Nakatani et al. | Blind and neural network-guided convolutional beamformer for joint denoising, dereverberation, and source separation
GB2510631A (en) | Sound source separation based on a Binary Activation model
Kang et al. | A low-complexity permutation alignment method for frequency-domain blind source separation
Wang et al. | Low-latency real-time independent vector analysis using convolutive transfer function
US11694707B2 (en) | Online target-speech extraction method based on auxiliary function for robust automatic speech recognition
Sawada et al. | Multi-frame full-rank spatial covariance analysis for underdetermined blind source separation and dereverberation
Albataineh et al. | A RobustICA-based algorithmic system for blind separation of convolutive mixtures
Čmejla et al. | Independent vector analysis exploiting pre-learned banks of relative transfer functions for assumed target's positions
Mirzaei et al. | Under-determined reverberant audio source separation using Bayesian non-negative matrix factorization
Delcroix et al. | Multichannel speech enhancement approaches to DNN-based far-field speech recognition
Liu et al. | A hybrid reverberation model and its application to joint speech dereverberation and separation
Kamo et al. | Regularized fast multichannel nonnegative matrix factorization with ILRMA-based prior distribution of joint-diagonalization process
Shin et al. | Statistical Beamformer Exploiting Non-Stationarity and Sparsity With Spatially Constrained ICA for Robust Speech Recognition
Radfar et al. | Monaural speech separation based on gain adapted minimum mean square error estimation
Adiloğlu et al. | A general variational Bayesian framework for robust feature extraction in multisource recordings
Inoue et al. | Sepnet: a deep separation matrix prediction network for multichannel audio source separation
Wang et al. | Multichannel Linear Prediction-Based Speech Dereverberation Considering Sparse and Low-Rank Priors
Liu et al. | Joint dereverberation and blind source separation using a hybrid autoregressive and convolutive transfer function-based model