Temko et al., 2008 - Google Patents
Fuzzy integral based information fusion for classification of highly confusable non-speech sounds
- Document ID
- 3594554618183830297
- Author
- Temko A
- Macho D
- Nadeu C
- Publication year
- 2008
- Publication venue
- Pattern Recognition
Snippet
Acoustic event classification may help to describe acoustic scenes and contribute to improve the robustness of speech technologies. In this work, fusion of different information sources with the fuzzy integral (FI), and the associated fuzzy measure (FM), are applied to the …
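The snippet names the fuzzy integral (FI), together with an associated fuzzy measure (FM), as the mechanism for fusing different information sources. As a rough illustration only, the sketch below fuses per-class confidence scores from two sources with a Choquet fuzzy integral; the source names, the hand-set fuzzy measure values, and the class labels are assumptions made for the example, not values or the exact FI variant used in the paper.

```python
# A minimal sketch of fuzzy-integral fusion, assuming the Choquet integral as
# the FI variant. The fuzzy measure g maps subsets of information sources to
# [0, 1] with g({}) = 0 and g(all sources) = 1; the values below are made up
# for illustration, not taken from the paper.

def choquet_integral(scores, g):
    """Fuse one class's confidence scores from several sources.

    scores: dict source -> score in [0, 1]
    g:      dict frozenset(sources) -> measure value in [0, 1]
    """
    # Sort sources by descending score: h(x_(1)) >= h(x_(2)) >= ...
    ordered = sorted(scores, key=scores.get, reverse=True)
    fused, subset = 0.0, frozenset()
    for i, src in enumerate(ordered):
        subset = subset | {src}                      # A_(i) = {x_(1), ..., x_(i)}
        h_i = scores[src]
        h_next = scores[ordered[i + 1]] if i + 1 < len(ordered) else 0.0
        fused += (h_i - h_next) * g[subset]          # (h_(i) - h_(i+1)) * g(A_(i))
    return fused

# Hypothetical usage: fuse two classifiers over three acoustic event classes.
g = {
    frozenset(): 0.0,
    frozenset({"spectral_svm"}): 0.6,
    frozenset({"perceptual_svm"}): 0.5,
    frozenset({"spectral_svm", "perceptual_svm"}): 1.0,
}
per_class = {
    "door_knock": {"spectral_svm": 0.7, "perceptual_svm": 0.4},
    "keyboard":   {"spectral_svm": 0.2, "perceptual_svm": 0.5},
    "paper_wrap": {"spectral_svm": 0.1, "perceptual_svm": 0.1},
}
fused = {c: choquet_integral(s, g) for c, s in per_class.items()}
print(max(fused, key=fused.get))  # class with the highest fused score
```

Unlike a plain weighted average, the fuzzy measure can assign a set of sources a weight different from the sum of their individual weights, which is what lets the fuzzy integral model interaction (redundancy or complementarity) between information sources.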
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Temko et al. | Fuzzy integral based information fusion for classification of highly confusable non-speech sounds | |
Barchiesi et al. | Acoustic scene classification: Classifying environments from the sounds they produce | |
Nasir et al. | Multimodal and multiresolution depression detection from speech and facial landmark features | |
Temko et al. | Classification of acoustic events using SVM-based clustering schemes | |
Deb et al. | Emotion classification using segmentation of vowel-like and non-vowel-like regions | |
Gelzinis et al. | Automated speech analysis applied to laryngeal disease categorization | |
Nalini et al. | Music emotion recognition: The combined evidence of MFCC and residual phase | |
Sharma et al. | Acoustic model adaptation using in-domain background models for dysarthric speech recognition | |
Hu et al. | Separation of singing voice using nonnegative matrix partial co-factorization for singer identification | |
Tirronen et al. | The effect of the MFCC frame length in automatic voice pathology detection | |
Sefara | The effects of normalisation methods on speech emotion recognition | |
Waldekar et al. | Classification of audio scenes with novel features in a fused system framework | |
Lampropoulos et al. | Evaluation of MPEG-7 descriptors for speech emotional recognition | |
Mehrabani et al. | Singing speaker clustering based on subspace learning in the GMM mean supervector space | |
Tulics et al. | Artificial neural network and svm based voice disorder classification | |
Sarria-Paja et al. | Fusion of bottleneck, spectral and modulation spectral features for improved speaker verification of neutral and whispered speech | |
Dumpala et al. | Significance of speaker embeddings and temporal context for depression detection | |
Ratanpara et al. | Singer identification using MFCC and LPC coefficients from Indian video songs | |
Deb et al. | Classification of speech under stress using harmonic peak to energy ratio | |
Latha et al. | Deep learning-based acoustic feature representations for dysarthric speech recognition | |
Karthikeyan et al. | A stacked convolutional neural network framework with multi-scale attention mechanism for text-independent voiceprint recognition | |
Pedro et al. | Quantile Acoustic Vectors vs. MFCC Applied to Speaker Verification | |
Jitendra et al. | An ensemble model of CNN with Bi-LSTM for automatic singer identification | |
Al Mojaly et al. | Detection and classification of voice pathology using feature selection | |
Chauhan et al. | Text-independent speaker recognition system using feature-level fusion for audio databases of various sizes |