Ganapathy et al., 2012 - Google Patents

Temporal resolution analysis in frequency domain linear prediction

Document ID
8900086826768209790
Author
Ganapathy S
Hermansky H
Publication year
2012
Publication venue
The Journal of the Acoustical Society of America

Snippet

Frequency domain linear prediction (FDLP) is a technique for auto-regressive modeling of Hilbert envelopes. In this letter, the resolution properties of the FDLP model are investigated using synthetic signals with impulses immersed in noise. The effect of various factors are …
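
The idea in the snippet can be illustrated with a minimal sketch, assuming the commonly described FDLP formulation: linear prediction is applied to the DCT of the signal, and the resulting all-pole model, evaluated along its frequency axis, approximates the temporal (squared Hilbert) envelope. The function names `dct_ii`, `levinson`, and `fdlp_envelope`, the model order, and the single-impulse test signal are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def dct_ii(x):
    """Orthonormal type-II DCT via an explicit cosine matrix (fine for short signals)."""
    n = len(x)
    k = np.arange(n)[:, None]          # DCT index (rows)
    t = np.arange(n)[None, :]          # time index (columns)
    basis = np.cos(np.pi * (t + 0.5) * k / n)
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return scale * (basis @ x)

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation r -> AR coefficients (a[0] = 1) and error."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                 # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def fdlp_envelope(x, order=20):
    """Assumed FDLP sketch: AR model of the DCT sequence, read out as a temporal envelope."""
    n = len(x)
    c = dct_ii(x)
    # Autocorrelation of the DCT sequence (autocorrelation method of linear prediction).
    r = np.array([np.dot(c[:n - lag], c[lag:]) for lag in range(order + 1)])
    a, err = levinson(r, order)
    # The model's "frequency" axis maps to time: w = pi * (t + 0.5) / n.
    w = np.pi * (np.arange(n) + 0.5) / n
    A = np.exp(-1j * np.outer(w, np.arange(order + 1))) @ a
    return err / np.abs(A) ** 2

# Synthetic test in the spirit of the snippet: one impulse immersed in noise.
rng = np.random.default_rng(0)
n, t0 = 512, 200                       # hypothetical length and impulse position
x = 0.1 * rng.standard_normal(n)
x[t0] += 10.0
env = fdlp_envelope(x, order=20)
peak = int(np.argmax(env))
print(peak)                            # expected to fall close to t0
```

An impulse at time `t0` becomes a cosine in the DCT domain, so the all-pole model places a sharp peak at the matching point on its frequency axis. Varying `order`, the noise level, or the spacing of two nearby impulses is one way to probe how finely the model separates events in time.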

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/90 Pitch determination of speech signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor

Similar Documents

Publication Title
Alku et al. Formant frequency estimation of high-pitched vowels using weighted linear prediction
Zahorian et al. A spectral/temporal method for robust fundamental frequency tracking
Das et al. Exploring different attributes of source information for speaker verification with limited test data
Ghosh et al. A generalized smoothness criterion for acoustic-to-articulatory inversion
Schädler et al. Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition
Dissen et al. Formant estimation and tracking: A deep learning approach
Sheikhan et al. Using DTW neural–based MFCC warping to improve emotional speech recognition
Mittal et al. Study of characteristics of aperiodicity in Noh voices
Ramakrishnan et al. Voice source characterization using pitch synchronous discrete cosine transform for speaker identification
Meer Automatic alignment for New Englishes: Applying state-of-the-art aligners to Trinidadian English
Hermus et al. Perceptual audio modeling with exponentially damped sinusoids
Williamson et al. Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality
Haridas et al. A novel approach to improve the speech intelligibility using fractional delta-amplitude modulation spectrogram
Guglani et al. Automatic speech recognition system with pitch dependent features for Punjabi language on KALDI toolkit
Hoang et al. Blind phone segmentation based on spectral change detection using Legendre polynomial approximation
Priyadarshani et al. Dynamic time warping based speech recognition for isolated Sinhala words
Meyer et al. Effect of speech-intrinsic variations on human and automatic recognition of spoken phonemes
Ganapathy et al. Modulation frequency features for phoneme recognition in noisy speech
Ganapathy et al. Temporal envelope compensation for robust phoneme recognition using modulation spectrum
Prathosh et al. Estimation of voice-onset time in continuous speech using temporal measures
Kadiri et al. Mel-frequency cepstral coefficients derived using the zero-time windowing spectrum for classification of phonation types in singing
Gowda et al. Quasi-closed phase forward-backward linear prediction analysis of speech for accurate formant detection and estimation
Mesgarani et al. Toward optimizing stream fusion in multistream recognition of speech
Jokinen et al. Estimating the spectral tilt of the glottal source from telephone speech using a deep neural network
Ganapathy et al. Temporal resolution analysis in frequency domain linear prediction