
Ultraspeech: Speech enhancement by interaction between ultrasound and speech

Ding et al., 2022

Document ID
618635670699285068
Authors
Ding H
Wang Y
Li H
Zhao C
Wang G
Xi W
Zhao J
Publication year
2022
Publication venue
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies

External Links

Full text at dl.acm.org

Snippet

Speech enhancement can benefit many practical voice-based interaction applications, where the goal is to generate clean speech from noisy ambient conditions. This paper presents a practical design, namely UltraSpeech, to enhance speech by exploring the …
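
For context on the enhancement setting the snippet describes (recovering clean speech from a noisy recording), the sketch below shows a textbook single-channel spectral-subtraction baseline in Python. It is only a minimal illustration of that generic setting, not UltraSpeech's ultrasound-assisted method; the function name, the 0.5 s noise-only lead-in, and all parameter values are illustrative assumptions rather than details from the paper.

    # Generic spectral-subtraction baseline for single-channel speech
    # enhancement (illustrative only; not the method proposed in the paper).
    import numpy as np
    from scipy.signal import stft, istft

    def spectral_subtraction(noisy, fs, noise_seconds=0.5, nperseg=512, floor=0.05):
        # STFT of the noisy waveform (hop = nperseg // 2 by default).
        f, t, Z = stft(noisy, fs=fs, nperseg=nperseg)
        mag, phase = np.abs(Z), np.angle(Z)
        # Estimate the noise magnitude spectrum from the leading noise-only frames.
        noise_frames = max(1, int(noise_seconds * fs / (nperseg // 2)))
        noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)
        # Subtract the noise estimate, keeping a small spectral floor.
        clean_mag = np.maximum(mag - noise_mag, floor * noise_mag)
        # Resynthesize with the noisy phase.
        _, enhanced = istft(clean_mag * np.exp(1j * phase), fs=fs, nperseg=nperseg)
        return enhanced

    if __name__ == "__main__":
        fs = 16000
        rng = np.random.default_rng(0)
        tone = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)  # stand-in for speech
        noisy = np.concatenate([np.zeros(fs // 2), tone])
        noisy = noisy + 0.1 * rng.standard_normal(noisy.size)
        enhanced = spectral_subtraction(noisy, fs)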

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems

Similar Documents

Tan et al. Real-time speech enhancement using an efficient convolutional recurrent network for dual-microphone mobile phones in close-talk scenarios
Zhang et al. Deep learning for environmentally robust speech recognition: An overview of recent developments
Tu et al. Speech enhancement based on teacher–student deep learning using improved speech presence probability for noise-robust speech recognition
Sun et al. UltraSE: single-channel speech enhancement using ultrasound
Qian et al. Speech Enhancement Using Bayesian Wavenet.
Parchami et al. Recent developments in speech enhancement in the short-time Fourier transform domain
Alamdari et al. Improving deep speech denoising by noisy2noisy signal mapping
Liu et al. Bone-conducted speech enhancement using deep denoising autoencoder
Liao et al. Noise adaptive speech enhancement using domain adversarial training
Zhang et al. Sensing to hear: Speech enhancement for mobile devices using acoustic signals
Ding et al. Ultraspeech: Speech enhancement by interaction between ultrasound and speech
US8655656B2 (en) Method and system for assessing intelligibility of speech represented by a speech signal
Xiao et al. The NTU-ADSC systems for reverberation challenge 2014
Hussain et al. Ensemble hierarchical extreme learning machine for speech dereverberation
Fu et al. Svoice: Enabling voice communication in silence via acoustic sensing on commodity devices
Shah et al. Novel MMSE DiscoGAN for cross-domain whisper-to-speech conversion
Saleem et al. Multi-objective long-short term memory recurrent neural networks for speech enhancement
Gallardo et al. I-vector speaker verification for speech degraded by narrowband and wideband channels
JP6268916B2 (en) Abnormal conversation detection apparatus, abnormal conversation detection method, and abnormal conversation detection computer program
Guo et al. Robust speaker identification via fusion of subglottal resonances and cepstral features
Chhetri et al. Speech Enhancement: A Survey of Approaches and Applications
Mamun et al. CFTNet: Complex-valued frequency transformation network for speech enhancement
Fu et al. Ultrasr: Silent speech reconstruction via acoustic sensing
Du et al. End-to-end model for speech enhancement by consistent spectrogram masking
Xiang et al. A two-stage deep representation learning-based speech enhancement method using variational autoencoder and adversarial training