WO2004072951A1 - Multiple speech synthesizer using pitch alteration method - Google Patents
Multiple speech synthesizer using pitch alteration method
- Publication number
- WO2004072951A1 (application PCT/KR2003/001238, KR0301238W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pitch
- speech
- voice
- signal
- synthesis
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims description 30
- 230000004075 alteration Effects 0.000 title claims 2
- 230000015572 biosynthetic process Effects 0.000 claims abstract description 10
- 238000003786 synthesis reaction Methods 0.000 claims abstract description 10
- 238000001514 detection method Methods 0.000 claims description 12
- 238000004458 analytical method Methods 0.000 claims description 2
- 238000001308 synthesis method Methods 0.000 claims 1
- 230000000694 effects Effects 0.000 abstract description 6
- 238000004519 manufacturing process Methods 0.000 abstract description 5
- 230000000737 periodic effect Effects 0.000 abstract description 5
- 230000005284 excitation Effects 0.000 abstract description 4
- 230000005236 sound signal Effects 0.000 abstract description 4
- 210000001260 vocal cord Anatomy 0.000 abstract description 3
- 230000001755 vocal effect Effects 0.000 abstract description 3
- 239000011295 pitch Substances 0.000 description 53
- 238000004891 communication Methods 0.000 description 2
- 230000003111 delayed effect Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 238000013139 quantization Methods 0.000 description 2
- 230000002194 synthesizing effect Effects 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 1
- 230000008451 emotion Effects 0.000 description 1
- 210000004704 glottis Anatomy 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 230000001360 synchronised effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Electrophonic Musical Instruments (AREA)
Abstract
This invention concerns a speech synthesizer that alters pitch, an important characteristic parameter of speech, so that a single voice captured by a microphone can be synthesized into several different voices. In the speech production model, a voice signal is generally described as an excitation passed through a filter, and the filter is characterized by its formant components, whose shape changes with the geometry of the vocal tract. The pitch is produced by periodic vibration of the vocal cords and is a parameter to which the human auditory system is highly sensitive. Because of these characteristics, pitch is used to distinguish speakers and strongly affects the naturalness of the sound signal. Accurate pitch analysis is therefore essential to the tone quality of speech synthesis and speech coding.
Description
MULTIPLE SPEECH SYNTHESIZER USING PITCH ALTERATION METHOD
Technical Field
This invention belongs to the field of speech communication technology or audio signal processing, because it converts a single voice into a group of voices by altering pitch. Existing techniques synthesize one voice into one voice with a different pitch and therefore cannot synthesize diverse voices. This invention remedies that disadvantage and produces diverse voices.
Background Art
This invention proposes converting a single speech signal into multiple speech signals by changing pitch, an important characteristic parameter of sound. Fig. 1 shows the sound production model. The input that travels from the lungs through the vocal cords to the vocal tract can be divided into two types: voiced sound is modeled as an impulse train with the pitch period, and unvoiced sound is modeled as random noise.
The signal selected by switching between these two sources is multiplied by the energy of the input signal and passed through a filter to produce the voice signal. Interpreted according to this voice production model, the voice signal contains excitation information, which reflects the speaker's characteristics and emotions, and formant information from the vocal tract filter, which carries the content of the communication. The pitch, which represents the excitation information, is produced by vibration of the vocal cords and is a parameter to which the human auditory system is highly sensitive. It is therefore used to distinguish speakers, and it strongly affects the naturalness of the voice signal. Altering the pitch makes it possible to produce diverse synthesized sounds.
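The production model described above can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the all-pole filter form, and the coefficient list `a` are assumptions chosen only to show an excitation (impulse train or noise) scaled by a gain and passed through a filter.

```python
import random

def excitation(n_samples, voiced, pitch_period, seed=0):
    """Impulse train for voiced frames, uniform random noise for unvoiced frames."""
    if voiced:
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n_samples)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def all_pole_filter(x, a):
    """y[n] = x[n] - sum_k a[k] * y[n-1-k]; stands in for the vocal tract filter."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * y[n - 1 - k]
        y.append(acc)
    return y

def synthesize(n_samples, voiced, pitch_period, gain, a):
    """Select the source, scale it by the gain, and filter it (hypothetical helper)."""
    e = excitation(n_samples, voiced, pitch_period)
    return all_pole_filter([gain * v for v in e], a)
```

Changing `pitch_period` while leaving `a` fixed changes the perceived pitch without changing the filter, which is the property the invention relies on.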
Disclosure of the Invention
The pitch alteration system is organized as shown in Fig. 2. The analysis part of the system detects the pitch of the original signal from the microphone input and the pitch of the target signal, and sends them to the alteration rule generation part. The alteration rule generation part uses them to determine the pitch change rate and to find a suitable alteration rule.
This pitch alteration rule is supplied to the actual pitch alteration part, which changes the pitch of the original signal by the pitch change rate. The synthesis part then uses the result to produce the altered synthetic voice. This process requires accurate pitch detection and pitch alteration with little error. Many methods of detecting the pitch of a voice signal have been proposed. A well-known example is the autocorrelation method, which computes the correlation function between neighboring speech waveforms to detect the period of the waveform (see References).
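As an illustration of the autocorrelation method mentioned above, the following minimal sketch picks the lag with the largest autocorrelation inside a search range. The function name and the `min_lag` and `max_lag` parameters are assumptions for this example and do not come from the patent.

```python
import math

def autocorr_pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return best_lag

# Usage: a sine wave with a period of 50 samples.
frame = [math.sin(2 * math.pi * i / 50) for i in range(400)]
period = autocorr_pitch_period(frame, 20, 120)
```

The unnormalized sum uses fewer terms at longer lags, which tends to favor the fundamental period over its multiples. Practical detectors usually add normalization and a voicing check.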
Pitch alteration must be carried out after pitch detection is complete. Many methods of altering pitch have also been proposed, for example PSOLA (Pitch Synchronous Overlap and Add) [3]. The PSOLA method divides the speech waveform in the time domain into units of one pitch period and reconstructs the waveform by overlapping and adding them.
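The overlap-and-add idea can be sketched as follows. This is a simplified illustration under a strong assumption, namely a constant, already known pitch period with pitch marks at multiples of that period. A real PSOLA implementation uses detected pitch marks that vary over time. The function name and parameters are hypothetical.

```python
import math

def psola_shift(signal, period, factor):
    """Scale pitch by `factor`, keeping duration, for a constant pitch period."""
    new_period = max(1, round(period / factor))
    win_len = 2 * period
    # Periodic Hann window: two copies offset by `period` sum to one.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / win_len) for i in range(win_len)]
    out = [0.0] * len(signal)
    last_mark = (len(signal) // period - 1) * period
    t = period  # synthesis pitch mark (centre of the next output grain)
    while t + period <= len(signal):
        # Choose the analysis pitch mark closest to the synthesis mark.
        a = min(max(1, round(t / period)) * period, last_mark)
        for i in range(win_len):
            src, dst = a - period + i, t - period + i
            if 0 <= src < len(signal) and 0 <= dst < len(out):
                out[dst] += signal[src] * window[i]
        t += new_period
    return out
```

With `factor = 1.0` the interior of the signal is reproduced unchanged. With `factor = 2.0` the grains are placed twice as densely, so an impulse train with a period of 50 samples comes out with a period of 25 samples.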
Brief Description of the Drawings
FIG. 1 is a block diagram of the existing speech production model;
FIG. 2 is a block diagram of a general pitch alteration system;
FIG. 3 is a block diagram of the pitch alteration system;
FIG. 4 is a block diagram of pitch point detection;
FIG. 5 is a block diagram of the pitch alteration system;
FIG. 6 shows the organization of the multiple-speech synthesizer hardware;
FIG. 7 is a flow chart of the multiple-speech synthesizer software.
Best Mode for Carrying Out the Invention
Fig. 3 is a block diagram of the pitch alteration system applied in this invention. For pitch detection, this invention uses the method shown in Fig. 4.
First, the sound is passed through a pre-emphasis filter, which emphasizes the high-frequency region, and then through a filter given by the linear predictive coefficients; the periodic characteristic and the amplitude characteristic of the glottis in each analysis interval are then used to carry out pitch detection. The detected pitch is used with the PSOLA process to synthesize altered sounds: pitches extended to 140% and 120% and pitches shrunk to 80% and 60% are produced. Synthesizing these pitch-altered sounds together with small time differences produces a multiple-voice synthesized sound.
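The final mixing step can be sketched as below. The sketch assumes the pitch-altered voices (for example at the 140%, 120%, 80% and 60% settings) have already been produced. The function name, the per-voice delays in samples, and the equal weighting are assumptions made for illustration.

```python
def mix_voices(voices, delays):
    """Sum the voices after delaying each by delays[i] samples, with equal weights."""
    if len(voices) != len(delays):
        raise ValueError("one delay per voice is required")
    n = max(len(v) + d for v, d in zip(voices, delays))
    out = [0.0] * n
    weight = 1.0 / len(voices)  # equal weighting keeps the sum within range
    for voice, delay in zip(voices, delays):
        for i, sample in enumerate(voice):
            out[delay + i] += weight * sample
    return out

# Usage: two short voices, the second delayed by one sample.
mixed = mix_voices([[1.0, 1.0], [1.0, 1.0]], [0, 1])
```

Small, different delays keep the voices from lining up exactly, which is what gives the impression of several speakers rather than one louder speaker.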
Hardware Organization
Fig. 6 shows the equipment that receives the analog voice signal (600) from a microphone, changes its pitch, and synthesizes voices. The recorded analog voice signal (600) is amplified by an amplifier (601) and passed through a low-pass filter to prevent aliasing. It then passes through an analog-to-digital converter for quantization and coding, which converts the voice into a PCM digital signal. The remaining processing is carried out in software or firmware on a CPU or DSP.
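The quantization step can be illustrated with a short sketch. The patent does not state a bit depth, so the 16-bit signed format and the normalized input range of -1.0 to 1.0 are assumptions, as is the function name.

```python
def to_pcm16(samples):
    """Clip samples to [-1.0, 1.0] and quantize them to signed 16-bit integers."""
    pcm = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # avoid overflow on loud input
        pcm.append(int(round(clipped * 32767)))
    return pcm

# Usage: the last value is out of range and is clipped.
codes = to_pcm16([0.0, 1.0, -1.0, 2.0])
```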
During digital processing, the processor can use other external devices, and it can use external memory to store processing results or the input digital signal.
The multiple-voice digital signal synthesized by the pitch alteration software in the CPU is converted into a sampled analog signal. Passing this signal through a low-pass filter yields an analog signal free of quantization noise. Amplifying that signal with a suitable gain yields an analog signal that can be heard through a speaker.
Software Processing
The multiple speech synthesizer using the pitch alteration method adds software or firmware that performs multiple pitch alteration rather than the original single pitch alteration. Fig. 7 shows the flow chart of the multiple speech synthesizer used in this invention.
The data samples (701) from the analog-to-digital converter (ADC) are processed one frame at a time. First, each frame is classified as voiced or not; if it is not voiced, the buffer rate (703) is calculated. The memory buffer that holds processed data waiting for output is called the ring buffer (710).
The buffer rate represents the amount of processed data held in the ring buffer. If the current frame is not voiced and the delay accumulated in the ring buffer exceeds a set value (e.g., BT = 1.5), the program shortens the processing of that frame to reduce the handling time. In this way, the delay accumulated during multiple pitch alteration is recovered. In voiced sections, processing proceeds slowly so that the pitch is altered correctly, whereas in non-voiced sections processing is sped up to make up the delay.
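The buffering decision can be sketched as follows. The patent gives the example threshold BT = 1.5 but not its unit, so this sketch assumes the delay is counted in frames. The class and function names are hypothetical.

```python
from collections import deque

class FrameRing:
    """Fixed-capacity queue of processed frames waiting for output."""

    def __init__(self, capacity_frames):
        # With maxlen set, appending to a full deque discards the oldest frame.
        self.frames = deque(maxlen=capacity_frames)

    def push(self, frame):
        self.frames.append(frame)

    def pop(self):
        return self.frames.popleft() if self.frames else None

    def buffered_frames(self):
        return len(self.frames)

def should_speed_up(ring, frame_is_voiced, threshold=1.5):
    """Shorten processing only in non-voiced frames when the backlog is too large."""
    return (not frame_is_voiced) and ring.buffered_frames() > threshold
```

Catching up only in non-voiced frames leaves the voiced frames, where pitch alteration is audible, untouched.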
Many methods have been proposed for deciding whether the current frame is voiced; a simple approach uses the energy level. If the average energy of the current frame falls below a threshold, the frame is classified as non-voiced.
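The energy test described above amounts to a few lines. The threshold value is not given in the patent, so it is left as a parameter. The function name is an assumption.

```python
def is_voiced(frame, threshold):
    """Classify a frame as voiced when its average energy reaches the threshold."""
    if not frame:
        return False
    energy = sum(s * s for s in frame) / len(frame)
    return energy >= threshold

# Usage: a loud frame and a quiet frame against the same threshold.
loud = is_voiced([0.5, -0.5, 0.5, -0.5], 0.1)
quiet = is_voiced([0.01, -0.01, 0.01, -0.01], 0.1)
```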
If the data lies in a voiced section, the pitch period must be detected by the pitch point detection process. Many pitch point detection methods for sound signals have been proposed over the past 40 years. For example, the autocorrelation method is commonly used for pitch detection; it detects the period of the periodic waveform by calculating the correlation between neighboring speech waveforms.
In addition, to restrict the range of intonation change within a voiced section (e.g., to a maximum of 1.3x), the pitch period is detected over the continuous voiced section and its rate of change per frame is obtained. If the rate of change exceeds the limit, the voice is moderated by altering the pitch period (706). Pitch period alteration depends on pitch period detection. Many methods of altering the pitch period have also been proposed. This invention uses PSOLA, which divides the speech waveform in the time domain into units of one pitch period and overlaps and adds them to reconstruct the waveform, to achieve multiple pitch alteration.
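One way to read the restriction on the per-frame rate of change is as a clamp on the pitch period relative to the previous frame. The sketch below takes that reading. The symmetric limit (at most 1.3 times longer or shorter) and the function name are assumptions, since the patent only gives "Max 1.3x" as an example.

```python
def limit_period(prev_period, new_period, max_ratio=1.3):
    """Clamp new_period so it differs from prev_period by at most max_ratio."""
    lowest = prev_period / max_ratio
    highest = prev_period * max_ratio
    return min(max(new_period, lowest), highest)

# Usage: a jump from 100 to 200 samples is limited to 1.3 times the old period.
limited = limit_period(100, 200)
```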
The fully processed voice data is stored in the ring buffer (709) and output through the digital-to-analog converter in the order in which it was stored. The multiple speech synthesizer operates in real time. Processing of one frame of data must therefore finish after that frame has been read from the analog-to-digital converter and before the next frame is read.
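The real-time constraint can be expressed as a time budget per frame: the frame length divided by the sample rate. The sketch below computes that budget and checks a processing function against it. The frame length of 160 samples and the sample rate of 8000 Hz in the usage lines are illustrative values, not figures from the patent, and the function names are hypothetical.

```python
import time

def frame_budget_seconds(frame_len, sample_rate):
    """Time available to process one frame before the next one arrives."""
    return frame_len / sample_rate

def meets_deadline(process, frame, sample_rate):
    """Run process(frame) once and report whether it finished within the budget."""
    start = time.perf_counter()
    process(frame)
    elapsed = time.perf_counter() - start
    return elapsed <= frame_budget_seconds(len(frame), sample_rate)

# Usage: 160 samples at 8000 Hz leave 20 ms per frame.
budget = frame_budget_seconds(160, 8000)
on_time = meets_deadline(sum, [0.0] * 160, 8000)
```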
References
MyungJin Bae, SangHyo Lee, 'Digital Speech Analysis', Published by Dong-Young, 1998
MyungJin Bae, 'Digital Speech Synthesis', Published by Dong-Young, 1999
MyungJin Bae, 'Digital Speech Coding', Published by Dong-Young, 2000
Rabiner and Schafer, 'Digital Processing of Speech Signals', Prentice Hall, 1978
HyungBin Park, MyungJin Bae, "On a Detection of Pitch Point for Voice Color Conversion", J. Acoust. Society, Korea, Vol. 19, No. 1, pp. 149-152, July 7-8, 2000.
Industrial Applicability
As explained above, this invention synthesizes multiple voices from a single voice by altering pitch, an important parameter. Voice information technology was selected as one of the top 10 technologies of the 21st century by M.I.T. and as one of the top 10 most promising technologies by the Samsung Economic Research Institute. Apart from its technical importance, the market for voice technology is expected to grow at the highest rate. At present, the domestic voice technology market is at an early stage, and its size is estimated at about 20 billion dollars. However, its growth rate is consistently over 50 percent, so by 2005 the domestic voice technology market is expected to exceed 100 billion dollars.
Accordingly, this invention can be used in products such as a cheer synthesizer that produces the effect of a group cheer from a single person's voice, a congratulation synthesizer for birthdays or parties, a toy that sings in rounds, and so on. It could also be used to produce sound effects for movies or plays and home security systems for working people. Sound modulation could also help imitate the voices of famous actors or cartoon characters, such as Mask-man. The expected effect is large.
Claims
A multiple speech synthesizer using a pitch alteration method, which synthesizes multiple voices from a single voice by altering pitch, an important parameter, wherein, to control the sound in real time in the time domain while keeping the characteristics of the voice and accuracy, a pitch point detection method using linear predictive analysis is used, which detects pitch points based on the fundamental pitch of the voice, and wherein the PSOLA synthesis method is applied in the time domain to alter pitch in real time, so that multiple voices altered from a single voice are synthesized.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2003-0009198 | 2003-02-13 | ||
KR1020030009198A KR20030031936A (en) | 2003-02-13 | 2003-02-13 | Mutiple Speech Synthesizer using Pitch Alteration Method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2004072951A1 true WO2004072951A1 (en) | 2004-08-26 |
Family
ID=29578508
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2003/001238 WO2004072951A1 (en) | 2003-02-13 | 2003-06-24 | Multiple speech synthesizer using pitch alteration method |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR20030031936A (en) |
WO (1) | WO2004072951A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100912339B1 (en) * | 2007-05-10 | 2009-08-14 | 주식회사 케이티 | Apparatus and Method of training a minority speech data |
CN109712634A (en) * | 2018-12-24 | 2019-05-03 | 东北大学 | A kind of automatic sound conversion method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08202395A (en) * | 1995-01-31 | 1996-08-09 | Matsushita Electric Ind Co Ltd | Pitch converting method and its device |
US5787398A (en) * | 1994-03-18 | 1998-07-28 | British Telecommunications Plc | Apparatus for synthesizing speech by varying pitch |
KR20020084765A (en) * | 2001-05-03 | 2002-11-11 | (주)디지텍 | Method for synthesizing voice |
-
2003
- 2003-02-13 KR KR1020030009198A patent/KR20030031936A/en not_active Application Discontinuation
- 2003-06-24 WO PCT/KR2003/001238 patent/WO2004072951A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
PATENT ABSTRACTS OF JAPAN * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2498812A (en) * | 2012-01-30 | 2013-07-31 | China Ind Ltd | Providing an time delayed and pitched shifted accompaniment to a sound produced by a user |
US11043226B2 (en) | 2017-11-10 | 2021-06-22 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for encoding and decoding an audio signal using downsampling or interpolation of scale parameters |
US11127408B2 (en) | 2017-11-10 | 2021-09-21 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Temporal noise shaping |
US11217261B2 (en) | 2017-11-10 | 2022-01-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Encoding and decoding audio signals |
US11315583B2 (en) | 2017-11-10 | 2022-04-26 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoders, audio decoders, methods and computer programs adapting an encoding and decoding of least significant bits |
US11315580B2 (en) | 2017-11-10 | 2022-04-26 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio decoder supporting a set of different loss concealment tools |
US11380339B2 (en) | 2017-11-10 | 2022-07-05 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoders, audio decoders, methods and computer programs adapting an encoding and decoding of least significant bits |
US11380341B2 (en) | 2017-11-10 | 2022-07-05 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Selecting pitch lag |
US11386909B2 (en) | 2017-11-10 | 2022-07-12 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoders, audio decoders, methods and computer programs adapting an encoding and decoding of least significant bits |
US11462226B2 (en) | 2017-11-10 | 2022-10-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Controlling bandwidth in encoders and/or decoders |
US11545167B2 (en) | 2017-11-10 | 2023-01-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Signal filtering |
US11562754B2 (en) | 2017-11-10 | 2023-01-24 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Analysis/synthesis windowing function for modulated lapped transformation |
US12033646B2 (en) | 2017-11-10 | 2024-07-09 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Analysis/synthesis windowing function for modulated lapped transformation |
Also Published As
Publication number | Publication date |
---|---|
KR20030031936A (en) | 2003-04-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cook | Real sound synthesis for interactive applications | |
US6298322B1 (en) | Encoding and synthesis of tonal audio signals using dominant sinusoids and a vector-quantized residual tonal signal | |
JP2763322B2 (en) | Audio processing method | |
WO2004072951A1 (en) | Multiple speech synthesizer using pitch alteration method | |
JP2018004870A (en) | Speech synthesis device and speech synthesis method | |
JPH02201500A (en) | Voice synthesizing device | |
JP3966074B2 (en) | Pitch conversion device, pitch conversion method and program | |
JP5560769B2 (en) | Phoneme code converter and speech synthesizer | |
JP5360489B2 (en) | Phoneme code converter and speech synthesizer | |
JPS58147798A (en) | Voice synthesizer | |
EP1543497A1 (en) | Method of synthesis for a steady sound signal | |
Kim et al. | On a speech multiple system implementation for speech synthesis | |
JP7088403B2 (en) | Sound signal generation method, generative model training method, sound signal generation system and program | |
JP2005539261A (en) | Method for controlling time width in speech synthesis | |
JPS62102294A (en) | Voice coding system | |
Zieliński et al. | Speech Compression and Recognition | |
JP5481957B2 (en) | Speech synthesizer | |
JPS59176782A (en) | Digital sound apparatus | |
Yazu et al. | The speech synthesis system for an unlimited Japanese vocabulary | |
CN114974271A (en) | Voice reconstruction method based on sound channel filtering and glottal excitation | |
Kwan et al. | Pitch-excited ARMA lattice model for speech synthesis | |
JPS60113299A (en) | Voice synthesizer | |
Kim et al. | On the Implementation of Gentle Phone’s Function Based on PSOLA Algorithm | |
JP2010160289A (en) | Midi (r) karaoke system which automatically corrects interval | |
JP2003255930A (en) | Encoding method for sound signal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AT US |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
122 | Ep: pct application non-entry in european phase |