Phrase-level speech simulation with an airway modulation model of speech production (Story, 2013) - Google Patents
- Document ID
- 16104088194350065805
- Author
- Story B
- Publication year
- 2013
- Publication venue
- Computer Speech & Language
Snippet
Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes of the glottis …
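The snippet describes a model that simulates time-varying changes of the glottis. As a rough, hypothetical illustration of that idea (not the paper's actual formulation, and with made-up parameter values), a glottal-area trajectory can be sketched as a slow abduction/adduction posture with a fast vibratory oscillation superimposed:

```python
# Toy sketch only: a glottal-area signal built from a slow posturing gesture
# plus a fast vibratory component, illustrating time-varying glottal modulation.
# All numbers are assumptions, not values from Story (2013).
import numpy as np

fs = 44100                      # sample rate in Hz (assumed)
t = np.arange(0, 0.5, 1 / fs)   # half a second of simulated time

# Slow posturing: the glottis adducts from a breathy to a phonatory setting.
posture = 0.15 - 0.10 * np.clip(t / 0.2, 0.0, 1.0)   # area in cm^2 (hypothetical)

# Fast vibration: half-rectified oscillation at an assumed fundamental frequency.
f0 = 110.0
vibration = 0.05 * np.maximum(np.sin(2 * np.pi * f0 * t), 0.0)

# Total glottal area: posture plus vibration, constrained to be non-negative.
glottal_area = np.maximum(posture + vibration, 0.0)
```

In the paper's framing, such a glottal-area trajectory would drive an airway (vocal tract) model to produce phrase-level speech; the sketch above only shows the modulation idea, not the acoustic simulation.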
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B21/00—Teaching, or communicating with, the blind, deaf or mute
- G09B21/001—Teaching or communicating with blind persons
Similar Documents
Publication | Title
---|---
Story | Phrase-level speech simulation with an airway modulation model of speech production
Hanson et al. | Towards models of phonation
Story | A parametric model of the vocal tract area function for vowel and consonant simulation
Gobl et al. | Voice source variation and its communicative functions
O'Shaughnessy | Speech communications: Human and machine (IEEE)
Howard et al. | Modeling the development of pronunciation in infant speech acquisition
Birkholz et al. | Simulation of losses due to turbulence in the time-varying vocal system
Kochanski et al. | Quantitative measurement of prosodic strength in Mandarin
Elie et al. | Extension of the single-matrix formulation of the vocal tract: Consideration of bilateral channels and connection of self-oscillating models of the vocal folds with a glottal chink
Reddy et al. | Two-stage intonation modeling using feedforward neural networks for syllable based text-to-speech synthesis
Bellegarda et al. | Statistical prosodic modeling: from corpus design to parameter estimation
Story et al. | A model of speech production based on the acoustic relativity of the vocal tract
Toutios et al. | Articulatory synthesis based on real-time magnetic resonance imaging data
Rasilo et al. | Feedback and imitation by a caregiver guides a virtual infant to learn native phonemes and the skill of speech inversion
Story et al. | Relation of vocal tract shape, formant transitions, and stop consonant identification
Hill et al. | Low-level articulatory synthesis: A working text-to-speech solution and a linguistic tool
Kröger et al. | Articulatory synthesis of speech and singing: State of the art and suggestions for future research
Arnela et al. | Finite element simulation of /asa/ in a three-dimensional vocal tract using a simplified aeroacoustic source model
Teixeira et al. | Simulation of human speech production applied to the study and synthesis of European Portuguese
Elie et al. | Acoustic impact of the gradual glottal abduction degree on the production of fricatives: A numerical study
Story et al. | An acoustically-driven vocal tract model for stop consonant production
Story et al. | A preliminary study of voice quality transformation based on modifications to the neutral vocal tract area function
Scully et al. | Analysis and simulation of a speaker's aerodynamic and acoustic patterns for fricatives
Kröger | Computer-implemented articulatory models for speech production: A review
Birkholz et al. | How modeling entrance loss and flow separation in a two-mass model affects the oscillation and synthesis quality