Huang et al., 2019

Intel Far-Field Speaker Recognition System for VOiCES Challenge 2019.

Document ID: 7184272683492917668
Authors: Huang J, Bocklet T
Publication year: 2019
Publication venue: Interspeech

Snippet

This paper describes Intel's speaker recognition systems for the VOiCES from a Distance Challenge 2019. Our submission consists of a ResNet50 system and four x-vector systems trained with different data augmentation and input features. Our novel contributions include the use …
Full text available at www.isca-archive.org (PDF).

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/144 Training of HMMs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/10 Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation

Similar Documents

Publication Title
Liu et al. GMM and CNN hybrid method for short utterance speaker recognition
Xie et al. Utterance-level aggregation for speaker recognition in the wild
Qin et al. Hi-mia: A far-field text-dependent speaker verification database and the baselines
Nugraha et al. Multichannel audio source separation with deep neural networks
Kwon et al. The ins and outs of speaker recognition: lessons from VoxSRC 2020
Heigold et al. End-to-end text-dependent speaker verification
Zhao et al. Wasserstein GAN and waveform loss-based acoustic model training for multi-speaker text-to-speech synthesis systems using a WaveNet vocoder
Huang et al. Intel Far-Field Speaker Recognition System for VOiCES Challenge 2019.
Cai et al. Within-sample variability-invariant loss for robust speaker recognition under noisy environments
Qin et al. The INTERSPEECH 2020 far-field speaker verification challenge
CN110047504B (en) Speaker identification method under identity vector x-vector linear transformation
Wang et al. Discriminative neural embedding learning for short-duration text-independent speaker verification
CN103794207A (en) Dual-mode voice identity recognition method
Hsu et al. Scalable factorized hierarchical variational autoencoder training
CN109427328A (en) A kind of multicenter voice recognition methods based on filter network acoustic model
Bai et al. Speaker verification by partial AUC optimization with mahalanobis distance metric learning
Pardede et al. Convolutional neural network and feature transformation for distant speech recognition
CN117746908A (en) Voice emotion recognition method based on time-frequency characteristic separation type transducer cross fusion architecture
Cai et al. The DKU system for the speaker recognition task of the 2019 VOiCES from a distance challenge
Kataria et al. Deep feature cyclegans: Speaker identity preserving non-parallel microphone-telephone domain adaptation for speaker verification
Wang et al. Cross-domain adaptation with discrepancy minimization for text-independent forensic speaker verification
Zhang et al. Multi-level transfer learning from near-field to far-field speaker verification
Dowerah et al. Joint optimization of diffusion probabilistic-based multichannel speech enhancement with far-field speaker verification
Zheng et al. The speakin speaker verification system for far-field speaker verification challenge 2022
Zheng et al. Unisound system for voxceleb speaker recognition challenge 2023