Ding et al., 2022 - Google Patents
Ultraspeech: Speech enhancement by interaction between ultrasound and speechDing et al., 2022
- Document ID
- 618635670699285068
- Author
- Ding H
- Wang Y
- Li H
- Zhao C
- Wang G
- Xi W
- Zhao J
- Publication year
- Publication venue
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
External Links
Snippet
Speech enhancement can benefit lots of practical voice-based interaction applications, where the goal is to generate clean speech from noisy ambient conditions. This paper presents a practical design, namely UltraSpeech, to enhance speech by exploring the …
- 238000002604 ultrasonography 0 title abstract description 92
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Tan et al. | Real-time speech enhancement using an efficient convolutional recurrent network for dual-microphone mobile phones in close-talk scenarios | |
Zhang et al. | Deep learning for environmentally robust speech recognition: An overview of recent developments | |
Tu et al. | Speech enhancement based on teacher–student deep learning using improved speech presence probability for noise-robust speech recognition | |
Sun et al. | UltraSE: single-channel speech enhancement using ultrasound | |
Qian et al. | Speech Enhancement Using Bayesian Wavenet. | |
Parchami et al. | Recent developments in speech enhancement in the short-time Fourier transform domain | |
Alamdari et al. | Improving deep speech denoising by noisy2noisy signal mapping | |
Liu et al. | Bone-conducted speech enhancement using deep denoising autoencoder | |
Liao et al. | Noise adaptive speech enhancement using domain adversarial training | |
Zhang et al. | Sensing to hear: Speech enhancement for mobile devices using acoustic signals | |
Ding et al. | Ultraspeech: Speech enhancement by interaction between ultrasound and speech | |
US8655656B2 (en) | Method and system for assessing intelligibility of speech represented by a speech signal | |
Xiao et al. | The NTU-ADSC systems for reverberation challenge 2014 | |
Hussain et al. | Ensemble hierarchical extreme learning machine for speech dereverberation | |
Fu et al. | Svoice: Enabling voice communication in silence via acoustic sensing on commodity devices | |
Shah et al. | Novel MMSE DiscoGAN for cross-domain whisper-to-speech conversion | |
Saleem et al. | Multi-objective long-short term memory recurrent neural networks for speech enhancement | |
Gallardo et al. | I-vector speaker verification for speech degraded by narrowband and wideband channels | |
JP6268916B2 (en) | Abnormal conversation detection apparatus, abnormal conversation detection method, and abnormal conversation detection computer program | |
Guo et al. | Robust speaker identification via fusion of subglottal resonances and cepstral features | |
Chhetri et al. | Speech Enhancement: A Survey of Approaches and Applications | |
Mamun et al. | CFTNet: Complex-valued frequency transformation network for speech enhancement | |
Fu et al. | Ultrasr: Silent speech reconstruction via acoustic sensing | |
Du et al. | End-to-end model for speech enhancement by consistent spectrogram masking | |
Xiang et al. | A two-stage deep representation learning-based speech enhancement method using variational autoencoder and adversarial training |