Sahidullah et al., 2016 - Google Patents
Robust speaker recognition with combined use of acoustic and throat microphone speechSahidullah et al., 2016
View PDF- Document ID
- 4672451994735923533
- Author
- Sahidullah M
- Gonzalez Hautamäki R
- Lehmann T
- Kinnunen T
- Tan Z
- Hautamäki V
- Parts R
- Pitkänen M
- Publication year
External Links
Snippet
Accuracy of automatic speaker recognition (ASV) systems degrades severely in the presence of background noise. In this paper, we study the use of additional side information provided by a body-conducted sensor, throat microphone. Throat microphone signal is much …
- 210000003800 Pharynx 0 title abstract description 34
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/10—Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Shiota et al. | Voice liveness detection algorithms based on pop noise caused by human breath for automatic speaker verification | |
Sahidullah et al. | Robust speaker recognition with combined use of acoustic and throat microphone speech | |
Sahidullah et al. | Robust voice liveness detection and speaker verification using throat microphones | |
Prabakaran et al. | A review on performance of voice feature extraction techniques | |
CN110767239A (en) | Voiceprint recognition method, device and equipment based on deep learning | |
Tapkir et al. | Novel spectral root cepstral features for replay spoof detection | |
Burgos | Gammatone and MFCC features in speaker recognition | |
Singh et al. | MFCC VQ based speaker recognition and its accuracy affecting factors | |
Nandyal et al. | MFCC based text-dependent speaker identification using BPNN | |
Bagul et al. | Text independent speaker recognition system using GMM | |
Choi et al. | Selective background adaptation based abnormal acoustic event recognition for audio surveillance | |
Costa et al. | Speech and phoneme segmentation under noisy environment through spectrogram image analysis | |
Venkatesan et al. | Binaural classification-based speech segregation and robust speaker recognition system | |
Guo et al. | Robust speaker identification via fusion of subglottal resonances and cepstral features | |
Sukor et al. | Speaker identification system using MFCC procedure and noise reduction method | |
Kalgaonkar et al. | Ultrasonic doppler sensor for speaker recognition | |
Kamble et al. | Emotion recognition for instantaneous Marathi spoken words | |
Zhu et al. | Multimodal speech recognition with ultrasonic sensors | |
Paul et al. | Speech recognition of throat microphone using MFCC approach | |
Zhang et al. | Articulatory movement features for short-duration text-dependent speaker verification | |
Tsuge et al. | Bone-and air-conduction speech combination method for speaker recognition | |
Islam et al. | A Novel Approach for Text-Independent Speaker Identification Using Artificial Neural Network | |
Camarena-Ibarrola et al. | Speaker identification through spectral entropy analysis | |
Jain et al. | Speech features analysis and biometric person identification in multilingual environment | |
Ishac et al. | A text-dependent speaker-recognition system |