Temko et al., 2008 - Google Patents

Fuzzy integral based information fusion for classification of highly confusable non-speech sounds

Document ID: 3594554618183830297
Authors: Temko A; Macho D; Nadeu C
Publication year: 2008
Publication venue: Pattern Recognition

Snippet

Acoustic event classification may help to describe acoustic scenes and contribute to improving the robustness of speech technologies. In this work, fusion of different information sources with the fuzzy integral (FI), and the associated fuzzy measure (FM), are applied to the …
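
The snippet names fuzzy-integral fusion but does not say which integral variant or how the fuzzy measure is obtained. As a minimal sketch only, assuming the Choquet form of the fuzzy integral and a hand-written measure (the function name `choquet_integral`, the two-source setup, and all numeric values are illustrative assumptions, not taken from the paper), fusing per-source scores for one class could look like this:

```python
from typing import Dict, FrozenSet, Sequence


def choquet_integral(
    scores: Sequence[float],
    measure: Dict[FrozenSet[int], float],
) -> float:
    """Fuse per-source scores with a Choquet integral (assumed variant).

    scores[i] is the score given by information source i.
    measure maps each needed subset of source indices to its fuzzy measure;
    the full set of sources should map to 1.0.
    """
    # Visit sources from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total = 0.0
    previous_mu = 0.0  # measure of the empty set
    coalition = set()
    for i in order:
        coalition.add(i)
        mu = measure[frozenset(coalition)]
        # Weight each score by the measure gained when its source joins.
        total += scores[i] * (mu - previous_mu)
        previous_mu = mu
    return total


# Hypothetical measure for two sources (values are made up for illustration).
example_measure = {
    frozenset({0}): 0.6,
    frozenset({1}): 0.5,
    frozenset({0, 1}): 1.0,
}

# 0.8 * 0.6 + 0.4 * (1.0 - 0.6), approximately 0.64
fused = choquet_integral([0.8, 0.4], example_measure)
```

When the measure is additive (the measure of a set equals the sum of its members' measures), this reduces to an ordinary weighted mean; a non-additive measure lets the fusion reward or discount particular combinations of sources.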

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes

Similar Documents

Publication	Publication Date	Title
Temko et al.	2008	Fuzzy integral based information fusion for classification of highly confusable non-speech sounds
Barchiesi et al.	2015	Acoustic scene classification: Classifying environments from the sounds they produce
Nasir et al.	2016	Multimodal and multiresolution depression detection from speech and facial landmark features
Temko et al.	2006	Classification of acoustic events using SVM-based clustering schemes
Deb et al.	2017	Emotion classification using segmentation of vowel-like and non-vowel-like regions
Gelzinis et al.	2008	Automated speech analysis applied to laryngeal disease categorization
Nalini et al.	2016	Music emotion recognition: The combined evidence of MFCC and residual phase
Sharma et al.	2013	Acoustic model adaptation using in-domain background models for dysarthric speech recognition
Hu et al.	2015	Separation of singing voice using nonnegative matrix partial co-factorization for singer identification
Tirronen et al.	2022	The effect of the MFCC frame length in automatic voice pathology detection
Sefara	2019	The effects of normalisation methods on speech emotion recognition
Waldekar et al.	2018	Classification of audio scenes with novel features in a fused system framework
Lampropoulos et al.	2012	Evaluation of MPEG-7 descriptors for speech emotional recognition
Mehrabani et al.	2013	Singing speaker clustering based on subspace learning in the GMM mean supervector space
Tulics et al.	2019	Artificial neural network and svm based voice disorder classification
Sarria-Paja et al.	2018	Fusion of bottleneck, spectral and modulation spectral features for improved speaker verification of neutral and whispered speech
Dumpala et al.	2021	Significance of speaker embeddings and temporal context for depression detection
Ratanpara et al.	2015	Singer identification using MFCC and LPC coefficients from Indian video songs
Deb et al.	2016	Classification of speech under stress using harmonic peak to energy ratio
Latha et al.	2023	Deep learning-based acoustic feature representations for dysarthric speech recognition
Karthikeyan et al.	2024	A stacked convolutional neural network framework with multi-scale attention mechanism for text-independent voiceprint recognition
Pedro et al.	2014	Quantile Acoustic Vectors vs. MFCC Applied to Speaker Verification
Jitendra et al.	2023	An ensemble model of CNN with Bi-LSTM for automatic singer identification
Al Mojaly et al.	2014	Detection and classification of voice pathology using feature selection
Chauhan et al.	2023	Text-independent speaker recognition system using feature-level fusion for audio databases of various sizes