Kawahara, 2003 - Google Patents

Exemplar-based voice quality analysis and control using a high quality auditory morphing procedure based on STRAIGHT

Document ID: 2284381152354447013
Author: Kawahara H
Publication year: 2003
Publication venue: ISCA Tutorial and Research Workshop on Voice Quality: Functions, Analysis and Synthesis

Snippet

This paper tries to introduce a new strategy and tools for voice quality research that complements conventional approaches. A very high-quality speech analysis, modification and synthesis procedure STRAIGHT, which is basically a channel VOCODER based on a …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/90—Pitch determination of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions

Similar Documents

Publication	Publication Date	Title
Kawahara et al.	2005	Underlying principles of a high-quality speech manipulation system STRAIGHT and its application to speech segregation
Kawahara et al.	2001	Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
Morise et al.	2016	World: a vocoder-based high-quality speech synthesis system for real-time applications
Banbrook et al.	1999	Speech characterization and synthesis by nonlinear methods
Roux et al.	2017	Chronset: An automated tool for detecting speech onset
Kawahara et al.	1999	Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
Ye et al.	2006	Quality-enhanced voice morphing using maximum likelihood transformations
US20020173962A1 (en)	2002-11-21	Method for generating pesonalized speech from text
EP3042377B1 (en)	2023-01-11	Method and system for generating advanced feature discrimination vectors for use in speech recognition
Rao	2010	Voice conversion by mapping the speaker-specific features using pitch synchronous approach
Hosom et al.	2003	Intelligibility of modifications to dysarthric speech
Tamburini	2003	Automatic prosodic prominence detection in speech using acoustic features: an unsupervised system.
d'Alessandro et al.	1998	Effectiveness of a periodic and aperiodic decomposition method for analysis of voice sources
RU2427044C1 (en)	2011-08-20	Text-dependent voice conversion method
Sanchez et al.	2014	Hierarchical modeling of F0 contours for voice conversion
Gussenhoven et al.	1998	On the speaker-dependence of the perceived prominence of F0 peaks
Kiefte et al.	2013	Vowel perception in normal speakers
Kawahara	2003	Exemplar-based voice quality analysis and control using a high quality auditory morphing procedure based on STRAIGHT
CN108369803A (en)	2018-08-03	Method for forming the excitation signal of a parametric speech synthesis system based on a glottal pulse model
Wagner	2008	A comprehensive model of intonation for application in speech synthesis
Karjigi et al.	2012	Classification of place of articulation in unvoiced stops with spectro-temporal surface modeling
Tamburini	2005	Automatic prominence identification and prosodic typology.
Ru et al.	2003	The synergy between speech production and perception
Potisuk et al.	1995	Speaker-independent automatic classification of Thai tones in connected speech by analysis-synthesis method
JP3358139B2 (en)	2002-12-16	Voice pitch mark setting method