US3335225A - Formant period tracker - Google Patents
- Publication number
- US3335225A
- Authority
- US
- United States
- Prior art keywords
- formant
- larynx
- excitation
- deriving
- response
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
Definitions
- Formant frequency measurements are provided, for formant tracking in a speech compression system, by passing the frequency range of the formant of interest in the incoming speech wave to produce a periodic waveform representative of that formant, and detecting the length of the time interval over which an integral number of consecutive half cycles of the periodic waveform occur, only within the time segment between successive pitch excitation discontinuities of the speech wave.
- the present invention relates generally to speech analyzing systems. More particularly, the invention relates to a device for deriving information regarding the frequency of energy in one of the speech formants as measured by the period of the damped sinusoid following each larynx excitation.
- This set of waves, which occurs for voiced utterances, includes frequency components that generally lie in three ranges or formants; these ranges for the average male being 200 to 1,000 c.p.s., 800 to 2,300 c.p.s., and 2,300 to 3,800 c.p.s.
- Each time the larynx is reexcited the previous set of sinusoidal waves is usually completely damped because the Q of the previously existing resonant cavity drops virtually to zero in response to opening of the glottis.
- there is virtually no phase interference between waves deriving from adjacent larynx excitations and the damped sinusoids are easily identified by filters that segment the frequency ranges occupied by the formants.
- formant frequencies can be ascertained by measuring the period of the damped sinusoid following each larynx excitation in the formant of interest. This period is inversely proportional to the formant frequency. The period can be measured as a function of the time it takes a predetermined number of half cycles of the damped sinusoid to be completed. The length of each half cycle is measured in response to the time duration between adjacent crossings of the sinusoid as it crosses a zero reference. As long as the number of predetermined half cycles is such that the damped sinusoid is still being generated while the period measurement is being conducted, this approach is satisfactory.
- the larynx is excited while the time duration of the half cycles is being measured, the information derived would be meaningless because there is no phase relation between the waves of adjacent larynx excitations.
- the number of half cycles can frequently be two or more. But for the first formant of high pitched female speakers, who frequently excite their larynx before the completion of a full formant frequency period, after the main larynx or pitch excitation, the number must be limited to one.
- Another object of the invention is to provide a system for measuring formant period in response to only the continuous portions of the damped sinusoid deriving from a larynx excitation.
- An additional object is to provide a formant period measuring device that provides accurate information for low and high pitched speakers and is insensitive to phase relations of adjacent damped sinusoidal segments deriving from successive larynx excitations.
- FIGURE 1 is a block diagram of a preferred embodiment of the present invention wherein formant period is measured in response to the completion of a full cycle of the damped sinusoid following larynx excitation;
- FIGURES 2A-2F are wave forms to aid in describing the operation of FIGURE 1;
- FIGURE 3 is a block diagram of a portion of the circuitry of FIGURE 1;
- FIGURE 4 is a block diagram of a further embodiment of the invention wherein formant period is measured in response to the completion of a half cycle of the damped sinusoid following larynx excitation;
- FIGURES 5A-5F are wave forms to aid in describing the operation of FIGURE 4.
- FIGURE 1 of the drawings wherein speech source 11 feeds amplifier 12 having slow response AGC which normalizes, to a certain extent, the signal amplitude applied in parallel to pitch pulse extractor 13 and formant filter 14.
- Filter 14 is a band pass filter having a center frequency and width commensurate with the formant being analyzed; for the first formant it is a filter having a pass band between 200 and 1000 c.p.s.
- the waveform deriving from filter 14 is a series of damped sinusoids centered about axis 15, as indicated in FIGURE 2A. In response to each larynx excitation deriving from speech source 11, there is produced a large amplitude positive wave 16.
- the time separation between peak values of adjacent ones of waves 16 is generally termed the pitch period. Following each positive wave 16 there is produced an exponentially damped sinusoid 17 having a repetition rate equal to the formant frequency of the wave passing through filter 14.
- the time separation between axis crossings of adjacent negative going segments of wave 17, i.e., between points 18 and 19, is inversely proportional to the formant frequency of interest. According to the present invention the time between these axis crossings, or a predetermined number of other crossings, is measured to provide an indication of formant frequency.
- the waveform of FIGURE 2A deriving from formant filter 14 is applied to infinite clipper 21 which generates the rectangular wave of FIGURE 2B. In response to any segment of the wave of FIGURE 2A being above axis 15, clipper 21 derives a positive voltage of constant amplitude. Clipper 21 produces a negative voltage of constant amplitude as soon as the wave of FIGURE 2A goes below axis 15. Thereby, the wave derived by clipper 21 consists of a series of constant amplitude positive and negative voltages which are in phase with the variations of the damped sinusoid illustrated in FIGURE 2A.
- the output of clipper 21, FIGURE 2B, is applied to RC differentiator 22 that derives positive and negative going pulses in response to the positive and negative going edges, respectively, of the infinitely clipped wave. These pulses are applied to half wave rectifier 23 comprising diode 24 and load resistor 25. Diode 24 is poled such that only the negative pulses are passed to resistor 25. In consequence, the waveform across resistor 25 comprises a series of negative going pulses, FIGURE 2C, each of the pulses being derived simultaneously with a negative going crossing of the damped sinusoid across axis 15, FIGURE 2A.
- the output of rectifier 23 is selectively coupled through bi-stable gate 26 to the input of bistable flip flop 27.
- pitch pulse extractor 13 derives the positive pulse indicated in FIGURE 2D. Extractor 13 may take any conventional form, such as described by Gruenz, Jr. et al. in an article published in the Journal of Acoustic Society of America, September 1949, p. 487.
- the positive pulses deriving from extractor 13 are applied to one input of gate 26.
- gate 26 opens and stays open to enable the axis crossing pulses, FIGURE 2C, to be applied to flip flop 27.
- flip flop 27 is switched so it derives the positively going rectangular wave 28, FIGURE 2E.
- Flip flop 27 remains in this state until the next pulse is generated by rectifier 23.
- flip flop 27 is restored to its initial state so its output voltage goes negatively and voltage level 29 is reached.
- sample and hold integrator 32 and reciprocal function generator 33 are provided to convert the wave of FIGURE 2E into a voltage directly proportional to the formant frequency between each larynx excitation that causes wave segments 16.
- the output of flip flop 27 on lead 34 is applied to pitch sync, sample and hold integrator 32.
- circuit 32 includes a resettable integrator 35 having a time constant selected such that rectangular wave 28 is converted into a substantially linear sawtooth for all frequencies in the formant range of interest.
- the output of reset integrator 35 is coupled to sample and hold circuit 36 that is retriggered in response to each pitch period impulse deriving from extractor 13.
- the pitch period impulses are also applied as reset pulses to integrator 35 via delay element 37.
- the length of time introduced by delay 37 is such that the output of integrator 35 is at a level indicative of the time duration of the preceding wave 28 and is then reset to a zero level prior to the occurrence of the leading edge of the next wave 28.
- the output of circuit 36 comprises a series of varying amplitude steps, the level of each being proportional to the preceding formant period.
- the voltage deriving from circuit 32 is coupled to function generator 33 that derives a signal equal in value to the reciprocal of its input amplitude, hence directly proportional to the formant frequency.
- While the circuit of FIGURE 1 is suitable for many speakers in the first and second formants, it does not provide an accurate measurement for some people who speak at a very high pitch. The inaccuracy is caused in such cases because the larynx may be excited before occurrence of the second negative going axis crossing 19, FIGURE 2A. To cure this situation, the system of FIGURE 4, wherein the period between adjacent axis crossings is used to measure formant frequency, was developed.
- phase splitter 61 and full wave rectifier 62 have been substituted for half wave rectifier 23.
- the opposite polarity outputs of phase splitter 61 are applied to the cathodes of diodes 63 and 64 in rectifier 62 so that the voltage across load resistor 65 appears as a series of negative pulses, one pulse being derived in response to each axis crossing of the wave deriving from filter 14.
- the waveforms of FIGURES 5A-5F. It is assumed that the speech analysis is being performed for a high pitched speaker so that a larynx excitation, indicated by large positive wave segment 16, occurs before a pair of negative going axis crossings.
- the rectangular wave deriving from flip flop 27 is coupled to circuits 32 and 33.
- the wave is converted into a series of analog voltages, each of which represents the formant frequency characteristic of the particular larynx excitation.
- Function generator 33 is calibrated with appropriate valued resistors so that the voltage deriving from it is directly proportional to formant frequency, rather than twice the formant frequency.
- a system for measuring the formant frequency of speech resulting from larynx excitations followed by damped sinusoidal waves comprising means responsive to a speech formant for deriving therefrom an indication of the length of time required for a predetermined number of half cycles of the damped sinusoidal wave following the larynx excitation, and means for enabling said deriving means in response to each larynx excitation and for disabling said indicating means before the next larynx excitation.
- a system for measuring formant frequency of speech resulting from larynx excitations followed by damped sinusoidal waves comprising means responsive to speech formants for deriving therefrom an indication of the length of time required for a predetermined number of half cycles of the damped sinusoidal wave following the larynx excitation, and means for enabling said deriving means in response to each larynx excitation and for disabling said indicating means in response to completion of said predetermined number of half cycles.
- a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an indication of the length of time required for a predetermined number of axis crossings of said sinusoid following each larynx excitation, and means for enabling said deriving means in response to each larynx excitation and for disabling said indicating means before the next larynx excitation.
- a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an indication of the length of time required for a predetermined number of axis crossings of said sinusoid following each larynx excitation, and means for enabling said deriving means in response to each larynx excitation and for disabling said indicating means in response to completion of said predetermined number of half cycles.
- a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an impulse in response to each axis crossing of said sinusoid, means for measuring the time duration between adjacent ones of said impulses, means for deriving an indication in response to each larynx excitation, means for coupling said indication to said measuring means to enable said measuring means, and means for disabling said measuring means in response to the second impulse occurring after said indication.
- in a system for measuring the formant frequency of speech energy, the combination comprising a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an impulse in response to alternate axis crossings of said sinusoid, means for measuring the time duration between adjacent ones of said impulses, means for deriving an indication in response to each larynx excitation, means for coupling said indication to said measuring means to enable said measuring means, and means for disabling said measuring means in response to the second impulse occurring after said indication.
- a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an impulse in response to each cycle of said sinusoid attaining a predetermined phase position, means for measuring the time duration between adjacent ones of said impulses, means for deriving an indication in response to each larynx excitation, means for coupling said indication to said measuring means to enable said measuring means, and means for disabling said measuring means in response to the second impulse occurring after said indication.
- measuring means includes means for deriving a signal of constant value between adjacent larynx excitations, said signal being proportional to the time between said adjacent ones of said impulses occurring during the previous pair of larynx excitations.
- said measuring means comprising filter means responsive to incoming speech signal for passing the frequency range of the formant of interest to derive therefrom a periodic waveform representative of said formant,
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Prostheses (AREA)
Description
Aug. 8, 1967. S. J. CAMPANELLA ET AL. 3,335,225
FORMANT PERIOD TRACKER. Filed Feb. 20, 1964. 2 Sheets.
[Drawing sheet 1: block diagrams. Legible block labels: speech source, pitch pulse extractor, formant filter, infinite clipper, axis-crossing diff., gate (enable and disable inputs), FF, pitch sync sample-hold integrator, function generator, reset integrator, sample and hold, delay, pitch period impulse, formant period.]
[Drawing sheet 2: block diagram. Legible block labels: formant filter, infinite clipper, axis-crossing diff., pitch pulse extractor, 1/2 formant period.]
INVENTORS: S. Joseph Campanella and David C. Coulter
United States Patent 3,335,225
FORMANT PERIOD TRACKER
Samuel Joseph Campanella and David C. Coulter, Springfield, Va., assignors to Melpar, Inc., Falls Church, Va., a corporation of Delaware
Filed Feb. 20, 1964, Ser. No. 346,185
13 Claims. (Cl. 179-1)
ABSTRACT OF THE DISCLOSURE
Formant frequency measurements are provided, for formant tracking in a speech compression system, by passing the frequency range of the formant of interest in the incoming speech wave to produce a periodic waveform representative of that formant, and detecting the length of the time interval over which an integral number of consecutive half cycles of the periodic waveform occur, only within the time segment between successive pitch excitation discontinuities of the speech wave.
The present invention relates generally to speech analyzing systems. More particularly, the invention relates to a device for deriving information regarding the frequency of energy in one of the speech formants as measured by the period of the damped sinusoid following each larynx excitation.
In speech bandwidth compression and recognition systems, an important but frequently overlooked parameter is the frequency of each formant that arises in response to each larynx excitation. As is well known, each time the larynx is excited it produces a set of exponentially damped sinusoidal waves. This set of waves, which occurs for voiced utterances, includes frequency components that generally lie in three ranges or formants; these ranges for the average male being 200 to 1,000 c.p.s., 800 to 2,300 c.p.s., and 2,300 to 3,800 c.p.s. Each time the larynx is reexcited the previous set of sinusoidal waves is usually completely damped because the Q of the previously existing resonant cavity drops virtually to zero in response to opening of the glottis. Thus, there is virtually no phase interference between waves deriving from adjacent larynx excitations and the damped sinusoids are easily identified by filters that segment the frequency ranges occupied by the formants.
We have found that formant frequencies can be ascertained by measuring the period of the damped sinusoid following each larynx excitation in the formant of interest. This period is inversely proportional to the formant frequency. The period can be measured as a function of the time it takes a predetermined number of half cycles of the damped sinusoid to be completed. The length of each half cycle is measured in response to the time duration between adjacent crossings of the sinusoid as it crosses a zero reference. As long as the number of predetermined half cycles is such that the damped sinusoid is still being generated while the period measurement is being conducted, this approach is satisfactory. If, however, the larynx is excited while the time duration of the half cycles is being measured, the information derived would be meaningless because there is no phase relation between the waves of adjacent larynx excitations. Thus, for low pitched speakers, the number of half cycles can frequently be two or more. But for the first formant of high pitched female speakers, who frequently excite their larynx before the completion of a full formant frequency period, after the main larynx or pitch excitation, the number must be limited to one.
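The relation between the measured interval and the formant frequency can be written out as a short sketch. The function name, argument names, and units are illustrative assumptions and do not appear in the disclosure; the arithmetic only restates that one period equals two half cycles.

```python
def formant_frequency_hz(interval_s: float, half_cycles: int) -> float:
    """Return the frequency implied by `half_cycles` half cycles lasting `interval_s`.

    One full period is two half cycles, so T = 2 * interval_s / half_cycles
    and f = 1 / T = half_cycles / (2 * interval_s).
    """
    if interval_s <= 0.0:
        raise ValueError("interval_s must be positive")
    if half_cycles < 1:
        raise ValueError("half_cycles must be at least 1")
    return half_cycles / (2.0 * interval_s)


# Two half cycles (one full period) lasting 2 ms and one half cycle
# lasting 1 ms both describe the same 500 c.p.s. formant.
full_cycle = formant_frequency_hz(0.002, 2)
half_cycle = formant_frequency_hz(0.001, 1)
```

Counting a single half cycle halves the time the measurement needs, which is why it suits speakers whose next larynx excitation arrives early.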
Because all measurements according to the present invention are made in response to measurements on continuous waves, i.e. the damped sinusoid, the problem of pitch harmonic interference is avoided. Pitch harmonic interference arises in prior art devices because the damped sinusoid will not generally be crossing the reference axis at the same time a new larynx excitation occurs. Hence, the waves of adjacent damped sinusoids are not usually continuous and the average number of axis crossings does not provide an accurate measure of each half cycle duration. A similar problem arises in attempts to measure formant frequency location from spectrum analysis data since the closest that a spectrum sample can fall to a formant location is a frequency equal to an integral multiple of the fundamental pitch frequency.
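The spectrum analysis limitation can be illustrated numerically. The sketch below is not from the disclosure: the function name and the example pitch and formant values are hypothetical, chosen only to show how far the nearest pitch harmonic can sit from the formant.

```python
def nearest_harmonic_hz(formant_hz: float, pitch_hz: float) -> float:
    """Return the integral multiple of the pitch frequency closest to the formant."""
    if formant_hz <= 0.0 or pitch_hz <= 0.0:
        raise ValueError("frequencies must be positive")
    # The lowest usable spectrum sample is the fundamental itself (harmonic 1).
    harmonic_number = max(1, round(formant_hz / pitch_hz))
    return harmonic_number * pitch_hz


# Hypothetical example: a 500 c.p.s. formant excited at a 120 c.p.s. pitch.
example_formant = 500.0
example_pitch = 120.0
closest = nearest_harmonic_hz(example_formant, example_pitch)  # fourth harmonic
error = abs(example_formant - closest)
```

In this example the best a harmonic spectrum sample can do is 480 c.p.s., an error of 20 c.p.s., and the worst case error grows with the pitch frequency.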
It is accordingly an object of the present invention to provide a new and improved system for deriving formant period information.
Another object of the invention is to provide a system for measuring formant period in response to only the continuous portions of the damped sinusoid deriving from a larynx excitation.
An additional object is to provide a formant period measuring device that provides accurate information for low and high pitched speakers and is insensitive to phase relations of adjacent damped sinusoidal segments deriving from successive larynx excitations.
It is a further object of the invention to provide a device for measuring formant period in response to the time duration between a predetermined number of axis crossings of the damped sinusoid following a larynx excitation.
The above and still further objects, features and advantages of the present invention will become apparent upon consideration of the following detailed description of several specific embodiments thereof, especially when taken in conjunction with the accompanying drawings, wherein:
FIGURE 1 is a block diagram of a preferred embodiment of the present invention wherein formant period is measured in response to the completion of a full cycle of the damped sinusoid following larynx excitation;
FIGURES 2A-2F are wave forms to aid in describing the operation of FIGURE 1;
FIGURE 3 is a block diagram of a portion of the circuitry of FIGURE 1;
FIGURE 4 is a block diagram of a further embodiment of the invention wherein formant period is measured in response to the completion of a half cycle of the damped sinusoid following larynx excitation; and
FIGURES 5A-5F are wave forms to aid in describing the operation of FIGURE 4.
Reference is now made to FIGURE 1 of the drawings wherein speech source 11 feeds amplifier 12 having slow response AGC which normalizes, to a certain extent, the signal amplitude applied in parallel to pitch pulse extractor 13 and formant filter 14. Filter 14 is a band pass filter having a center frequency and width commensurate with the formant being analyzed; for the first formant it is a filter having a pass band between 200 and 1000 c.p.s. The waveform deriving from filter 14 is a series of damped sinusoids centered about axis 15, as indicated in FIGURE 2A. In response to each larynx excitation deriving from speech source 11, there is produced a large amplitude positive wave 16. The time separation between peak values of adjacent ones of waves 16 is generally termed the pitch period. Following each positive wave 16 there is produced an exponentially damped sinusoid 17 having a repetition rate equal to the formant frequency of the wave passing through filter 14. Thus, the time separation between axis crossings of adjacent negative going segments of wave 17, i.e., between points 18 and 19, is inversely proportional to the formant frequency of interest. According to the present invention the time between these axis crossings, or a predetermined number of other crossings, is measured to provide an indication of formant frequency.
The waveform of FIGURE 2A deriving from formant filter 14 is applied to infinite clipper 21 which generates the rectangular wave of FIGURE 2B. In response to any segment of the wave of FIGURE 2A being above axis 15, clipper 21 derives a positive voltage of constant amplitude. Clipper 21 produces a negative voltage of constant amplitude as soon as the wave of FIGURE 2A goes below axis 15. Thereby, the wave derived by clipper 21 consists of a series of constant amplitude positive and negative voltages which are in phase with the variations of the damped sinusoid illustrated in FIGURE 2A.
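A discrete-time sketch of the clipping step may help. It is an illustration under stated assumptions, not the disclosed circuit: the sample rate, decay rate, and function names are hypothetical, the damped sinusoid is modelled as a decaying cosine, and a sample lying exactly on the axis is treated as negative.

```python
import math


def damped_sinusoid(freq_hz, decay_per_s, sample_rate_hz, duration_s):
    """Sample exp(-decay_per_s * t) * cos(2 * pi * freq_hz * t)."""
    count = int(round(duration_s * sample_rate_hz))
    return [
        math.exp(-decay_per_s * i / sample_rate_hz)
        * math.cos(2.0 * math.pi * freq_hz * i / sample_rate_hz)
        for i in range(count)
    ]


def infinite_clip(samples, axis=0.0):
    """Map every sample to +1 above the axis and -1 otherwise.

    The amplitude (and hence the damping) is discarded; only the timing
    of the axis crossings survives, as in the rectangular wave of FIGURE 2B.
    """
    return [1 if sample > axis else -1 for sample in samples]


# Hypothetical values: a 500 c.p.s. formant sampled at 48 kHz for 4 ms.
wave = damped_sinusoid(500.0, 300.0, 48000.0, 0.004)
clipped = infinite_clip(wave)
```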
The output of clipper 21, FIGURE 2B, is applied to RC differentiator 22 that derives positive and negative going pulses in response to the positive and negative going edges, respectively, of the infinitely clipped wave. These pulses are applied to half wave rectifier 23 comprising diode 24 and load resistor 25. Diode 24 is poled such that only the negative pulses are passed to resistor 25. In consequence, the waveform across resistor 25 comprises a series of negative going pulses, FIGURE 2C, each of the pulses being derived simultaneously with a negative going crossing of the damped sinusoid across axis 15, FIGURE 2A. The output of rectifier 23 is selectively coupled through bi-stable gate 26 to the input of bistable flip flop 27.
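The differentiator and half wave rectifier can be approximated in discrete time by a first difference followed by a sign test. This is a sketch only: the function name and the +1/-1 sample convention are assumptions, and a first difference stands in for the RC differentiator.

```python
def negative_going_crossings(clipped):
    """Return the sample indices where a +1/-1 clipped wave steps from +1 to -1."""
    pulses = []
    for i in range(1, len(clipped)):
        edge = clipped[i] - clipped[i - 1]  # first difference, in place of RC differentiator 22
        if edge < 0:  # like diode 24, keep the negative pulses only
            pulses.append(i)
    return pulses


# A short hand-written clipped wave with two negative going edges.
clipped_wave = [1, 1, -1, -1, 1, 1, -1, -1, 1]
pulse_indices = negative_going_crossings(clipped_wave)  # [2, 6]
```

The spacing between consecutive indices, divided by the sample rate, is one full period of the clipped wave.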
For the initial positive going segment 16 of the damped sinusoid, gate 26 is closed, in a manner seen infra, to prevent coupling between the output of rectifier 23 and the input of flip flop 27. When the peak value of wave segment 16 is reached, pitch pulse extractor 13 derives the positive pulse indicated in FIGURE 2D. Extractor 13 may take any conventional form, such as described by Gruenz, Jr. et al. in an article published in the Journal of Acoustic Society of America, September 1949, p. 487.
The positive pulses deriving from extractor 13 are applied to one input of gate 26. Thereby, gate 26 opens and stays open to enable the axis crossing pulses, FIGURE 2C, to be applied to flip flop 27. In response to the first pulse immediately following opening of gate 26, indicative of axis crossing 18, flip flop 27 is switched so it derives the positively going rectangular wave 28, FIGURE 2E. Flip flop 27 remains in this state until the next pulse is generated by rectifier 23. In response to the next pulse, derived in response to axis crossing 19, flip flop 27 is restored to its initial state so its output voltage goes negatively and voltage level 29 is reached.
In response to flip flop 27 being restored to its initial state, it derives on lead 31 the pulse indicated in FIGURE 2F. This pulse is coupled from flip flop 27 to the disable input of gate 26, and causes the gate to close. Thereby, the negative going pulses in the waveform of FIGURE 2C occurring between axis crossing 19 and the next large wave segment 16 have no effect on flip flop 27. It is thus seen that the time interval during which flip flop 27 is activated to generate wave segment 28 is exactly equal to the period between axis crossings 18 and 19, hence inversely proportional to the formant frequency passing through filter 14.
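The interplay of gate 26 and flip flop 27 amounts to a small state machine, sketched here over lists of event times. The function name, the event representation, and the example times are assumptions made for illustration; the sketch only mirrors the behaviour described above (a pitch pulse opens the gate, the first passed pulse sets the flip flop, the second resets it and closes the gate).

```python
def gated_intervals(pitch_pulse_times, crossing_times):
    """Return one (start, stop) pair per completed flip flop pulse."""
    # Merge both pulse trains in time order; a pitch pulse sorts ahead of a
    # crossing pulse that carries the same time stamp.
    events = sorted(
        [(t, 0, "pitch") for t in pitch_pulse_times]
        + [(t, 1, "crossing") for t in crossing_times]
    )
    gate_open = False
    start = None  # time at which the flip flop was set; None while reset
    intervals = []
    for time, _, kind in events:
        if kind == "pitch":
            gate_open = True  # enable input of the gate
        elif gate_open:
            if start is None:
                start = time  # first passed pulse sets the flip flop
            else:
                intervals.append((start, time))  # second pulse resets it
                start = None
                gate_open = False  # disable pulse closes the gate
    return intervals


# Hypothetical times in milliseconds: pitch pulses every 8 ms and negative
# going axis crossings every 2 ms.
intervals = gated_intervals([0.0, 8.0], [0.5, 2.5, 4.5, 6.5, 8.5, 10.5])
```

Here the crossings at 4.5 and 6.5 ms are ignored because the gate has already closed, so each excitation yields exactly one 2 ms interval.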
To convert the wave of FIGURE 2E into a voltage directly proportional to the formant frequency between each larynx excitation that causes wave segments 16, sample and hold integrator 32 and reciprocal function generator 33 are provided. The output of flip flop 27 on lead 34, as depicted in FIGURE 2E, is applied to pitch sync, sample and hold integrator 32. As indicated in FIGURE 3, circuit 32 includes a resettable integrator 35 having a time constant selected such that rectangular wave 28 is converted into a substantially linear sawtooth for all frequencies in the formant range of interest. The output of reset integrator 35 is coupled to sample and hold circuit 36 that is retriggered in response to each pitch period impulse deriving from extractor 13. The pitch period impulses are also applied as reset pulses to integrator 35 via delay element 37. The length of time introduced by delay 37 is such that the output of integrator 35 is at a level indicative of the time duration of the preceding wave 28 and is then reset to a zero level prior to the occurrence of the leading edge of the next wave 28. Thus, the output of circuit 36 comprises a series of varying amplitude steps, the level of each being proportional to the preceding formant period. The voltage deriving from circuit 32 is coupled to function generator 33 that derives a signal equal in value to the reciprocal of its input amplitude, hence directly proportional to the formant frequency.
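The integrator, sample and hold, and reciprocal stages can be sketched sample by sample. The function name, the boolean input representation, and the choice to output zero before the first measurement are assumptions; the delayed reset is modelled simply as a reset performed immediately after the sample is taken.

```python
def track_formant(ff_high, pitch_pulse, dt_s):
    """Return a per-sample output proportional to formant frequency.

    ff_high[i] is True while the flip flop pulse (wave 28) is present and
    pitch_pulse[i] is True on samples carrying a pitch period impulse.
    """
    integrator = 0.0  # ramps while wave 28 is high
    held = 0.0  # sample and hold output, proportional to the period
    output = []
    for high, pulse in zip(ff_high, pitch_pulse):
        if pulse:
            held = integrator  # sample the ramp left by the preceding wave 28
            integrator = 0.0  # reset after sampling, standing in for the delay
        if high:
            integrator += dt_s
        # Reciprocal function generator; zero until a period has been held.
        output.append(1.0 / held if held > 0.0 else 0.0)
    return output


# Hypothetical 1 ms steps: pitch impulses at samples 0 and 6, flip flop high
# for samples 1 and 2 (a 2 ms period).
dt = 0.001
pitch = [i in (0, 6) for i in range(10)]
high = [i in (1, 2) for i in range(10)]
output = track_formant(high, pitch, dt)
```

The output stays at zero until the second pitch impulse and then holds a value of about 500, the reciprocal of the 2 ms period measured during the preceding pitch interval.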
While the circuit of FIGURE 1 is suitable for many speakers in the first and second formants, it does not provide an accurate measurement for some people who speak at a very high pitch. The inaccuracy is caused in such cases because the larynx may be excited before occurrence of the second negative going axis crossing 19, FIGURE 2A. To cure this situation, the system of FIGURE 4, wherein the period between adjacent axis crossings is used to measure formant frequency, was developed.
The system of FIGURE 4 is identical to the one of FIGURE 1 except that phase splitter 61 and full wave rectifier 62 have been substituted for half wave rectifier 23. The opposite polarity outputs of phase splitter 61 are applied to the cathodes of diodes 63 and 64 in rectifier 62 so that the voltage across load resistor 65 appears as a series of negative pulses, one pulse being derived in response to each axis crossing of the wave deriving from filter 14.
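The effect of the phase splitter and full wave rectifier, one pulse per axis crossing regardless of direction, can be sketched on a +1/-1 clipped wave. The function name and sample convention are assumptions, and the first difference again stands in for the differentiator.

```python
def all_axis_crossings(clipped):
    """Return the sample indices of every sign change in a +1/-1 clipped wave."""
    pulses = []
    for i in range(1, len(clipped)):
        edge = clipped[i] - clipped[i - 1]  # differentiated edge
        normal, inverted = edge, -edge  # the two phase splitter outputs
        # Each diode passes only a negative pulse, so a crossing in either
        # direction produces one negative pulse across the load.
        if min(normal, inverted) < 0:
            pulses.append(i)
    return pulses


# The same hand-written clipped wave now yields a pulse at every crossing.
clipped_wave = [1, 1, -1, -1, 1, 1, -1, -1, 1]
pulse_indices = all_axis_crossings(clipped_wave)  # [2, 4, 6, 8]
```

Adjacent pulses are now half a period apart, so a measurement completes in half the time required when only negative going crossings are used.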
To provide a better understanding of the operation of FIGURE 4, reference is now made to the waveforms of FIGURES 5A-5F. It is assumed that the speech analysis is being performed for a high pitched speaker so that a larynx excitation, indicated by large positive wave segment 16, occurs before a pair of negative going axis crossings.
In response to each crossing of axis 15 about which generates a pulse at the beginning and end of each half cycle of the damped sinusoid except at the beginning of large amplitude segment 16. The first and second of these pulses for each larynx excitation are coupled to flip flop 27 since gate 26 is then open in response to the pitch period impulse, FIGURE 5D, deriving from circuit 13. The second pulse coupled to flip flop 27 causes the flip flop to change state to prevent gate 26 from passing additional pulses until the next pitch period impulse is generated. Thus, flip flop 27 stays in the switched state whereby wave 28 is generated for a time interval equal to one half the formant period.
The rectangular wave deriving from flip flop 27 is coupled to circuits 32 and 33. The wave is converted into a series of analog voltages, each of which represents the formant frequency characteristic of the particular larynx excitation. Function generator 33 is calibrated with appropriate valued resistors so that the voltage deriving from it is directly proportional to formant frequency, rather than twice the formant frequency.
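The difference between the two measurements can be illustrated with a short simulation. The Python sketch below is an assumption-laden illustration, not the disclosed circuit: the damped sinusoid model, the decay constant, the sample rate, the function names, and the 300 Hz formant with a 250 Hz pitch are all hypothetical values chosen for the example. It compares the interval between two negative going axis crossings (as in FIGURE 1) with the interval between two adjacent axis crossings (as in FIGURE 4), the latter being doubled before the reciprocal is taken.

```python
import math


def axis_crossings(formant_hz, duration_s, sample_rate_hz=100_000,
                   decay_per_s=200.0):
    """Return (time, direction) for each axis crossing of a damped sinusoid.

    The sinusoid starts at zero phase at t = 0, standing in for the formant
    filter response to one larynx excitation. The crossing at t = 0 itself
    (the start of the large amplitude segment) is not reported.
    """
    crossings = []
    prev = None
    for i in range(1, int(duration_s * sample_rate_hz) + 1):
        t = i / sample_rate_hz
        value = math.exp(-decay_per_s * t) * math.sin(2 * math.pi * formant_hz * t)
        if prev is not None:
            t0, v0 = prev
            if v0 > 0 >= value or v0 < 0 <= value:
                # Linear interpolation between the two samples.
                tc = t0 + (t - t0) * v0 / (v0 - value)
                crossings.append((tc, "neg" if v0 > 0 else "pos"))
        prev = (t, value)
    return crossings


def full_period_estimate(crossings):
    """FIGURE 1 style: time between two negative going crossings."""
    neg = [t for t, direction in crossings if direction == "neg"]
    if len(neg) < 2:
        return None  # next larynx excitation arrived too soon
    return 1.0 / (neg[1] - neg[0])


def half_period_estimate(crossings):
    """FIGURE 4 style: time between adjacent crossings, scaled by two."""
    if len(crossings) < 2:
        return None
    return 1.0 / (2.0 * (crossings[1][0] - crossings[0][0]))


# Hypothetical high pitched speaker: 250 Hz pitch, 300 Hz formant.
segment = axis_crossings(formant_hz=300.0, duration_s=1 / 250.0)
print(full_period_estimate(segment), half_period_estimate(segment))
```

In this example only one negative going crossing falls within the pitch period, so the full period measurement has nothing to report, while the adjacent crossing measurement still recovers the formant frequency. The factor of two in `half_period_estimate` plays the role of the calibration applied to function generator 33.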
While we have described and illustrated several specific embodiments of our invention, it will be clear that variations of the details of construction which are specifically illustrated and described may be resorted to without departing from the true spirit and scope of the invention as defined in the appended claims.
We claim:
1. In a system for measuring the formant frequency of speech resulting from larynx excitations followed by damped sinusoidal waves, the combination comprising means responsive to a speech formant for deriving therefrom an indication of the length of time required for a predetermined number of half cycles of the damped sinusoidal wave following the larynx excitation, and means for enabling said deriving means in response to each larynx excitation and for disabling said indicating means before the next larynx excitation.
2. In a system for measuring formant frequency of speech resulting from larynx excitations followed by damped sinusoidal waves, the combination comprising means responsive to speech formants for deriving therefrom an indication of the length of time required for a predetermined number of half cycles of the damped sinusoidal wave following the larynx excitation, and means for enabling said deriving means in response to each larynx excitation and for disabling said indicating means in response to completion of said predetermined number of half cycles.
3. The system of claim 2 wherein said number equals 1.
4. The system of claim 2 wherein said number equals 2.
5. In a system for measuring the formant frequency of speech energy, the combination comprising a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an indication of the length of time required for a predetermined number of axis crossings of said sinusoid following each larynx excitation, and means for enabling said deriving means in response to each larynx excitation and for disabling said indicating means before the next larynx excitation.
6. In a system for measuring the formant frequency of speech energy, the combination comprising a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an indication of the length of time required for a predetermined number of axis crossings of said sinusoid following each larynx excitation, and means for enabling said deriving means in response to each larynx excitation and for disabling said indicating means in response to completion of said predetermined number of half cycles.
7. The system of claim 6 wherein said predetermined number equals 1.
8. The system of claim 6 wherein said predetermined number equals 2.
9. In a system for measuring the formant frequency of speech energy, the combination comprising a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an impulse in response to each axis crossing of said sinusoid, means for measuring the time duration between adjacent ones of said impulses, means for deriving an indication in response to each larynx excitation, means for coupling said indication to said measuring means to enable said measuring means, and means for disabling said measuring means in response to the second impulse occurring after said indication.
10. In a system for measuring the formant frequency of speech energy, the combination comprising a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an impulse in response to alternate axis crossings of said sinusoid, means for measuring the time duration between adjacent ones of said impulses, means for deriving an indication in response to each larynx excitation, means for coupling said indication to said measuring means to enable said measuring means, and means for disabling said measuring means in response to the second impulse occurring after said indication.
11. In a system for measuring the formant frequency of speech energy, the combination comprising a formant filter responsive to said speech energy, said filter deriving a damped sinusoid in response to each larynx excitation causing said speech energy, said sinusoid oscillating about a reference axis, means for deriving an impulse in response to each cycle of said sinusoid attaining a predetermined phase position, means for measuring the time duration between adjacent ones of said impulses, means for deriving an indication in response to each larynx excitation, means for coupling said indication to said measuring means to enable said measuring means, and means for disabling said measuring means in response to the second impulse occurring after said indication.
12. The system of claim 11 wherein said measuring means includes means for deriving a signal of constant value between adjacent larynx excitations, said signal being proportional to the time between said adjacent ones of said impulses occurring during the previous pair of larynx excitations.
13. In a speech compression system,
means for measuring the frequency of speech formants,
said measuring means comprising filter means responsive to incoming speech signal for passing the frequency range of the formant of interest to derive therefrom a periodic waveform representative of said formant,
means responsive to said waveform for detecting the length of the time interval encompassed by an integral number of consecutive half cycles of said waveform,
means further responsive to said incoming speech signal for activating said detecting means only between successive pitch excitation discontinuities of said speech signal, and
means responsive to the detected time interval length for deriving therefrom an indication of formant frequency.
References Cited
UNITED STATES PATENTS
3,020,344 2/1962 Prestigiacomo 1791
KATHLEEN H. CLAFFY, Primary Examiner.
R. MURRAY, Assistant Examiner.
Claims (1)
1. IN A SYSTEM FOR MEASURING THE FORMANT FREQUENCY OF SPEECH RESULTING FROM LARYNX EXCITATIONS FOLLOWED BY DAMPED SINUSOIDAL WAVES, THE COMBINATION COMPRISING MEANS RESPONSIVE TO A SPEECH FORMANT FOR DERIVING THEREFROM AN INDICATION OF THE LENGTH OF TIME REQUIRED FOR A PREDETERMINED NUMBER OF HALF CYCLES OF THE DAMPED SINUSOIDAL WAVE FOLLOWING THE LARYNX EXCITATION, AND MEANS FOR ENABLING SAID DERIVING MEANS IN RESPONSE TO EACH LARYNX EXCITATION AND FOR DISABLING SAID INDICATING MEANS BEFORE THE NEXT LARYNX EXCITATION.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US346185A US3335225A (en) | 1964-02-20 | 1964-02-20 | Formant period tracker |
Publications (1)
Publication Number | Publication Date |
---|---|
US3335225A true US3335225A (en) | 1967-08-08 |
Family
ID=23358317
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US346185A Expired - Lifetime US3335225A (en) | 1964-02-20 | 1964-02-20 | Formant period tracker |
Country Status (1)
Country | Link |
---|---|
US (1) | US3335225A (en) |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3020344A (en) * | 1960-12-27 | 1962-02-06 | Bell Telephone Labor Inc | Apparatus for deriving pitch information from a speech wave |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3376386A (en) * | 1963-05-08 | 1968-04-02 | Fant Gunnar | Circuit arrangement for varying the band width of a filter in dependence of the voice fundamental frequency |
US3553372A (en) * | 1965-11-05 | 1971-01-05 | Int Standard Electric Corp | Speech recognition apparatus |
US3488442A (en) * | 1966-09-28 | 1970-01-06 | Philco Ford Corp | Single equivalent formant speech analysis system |
US3499986A (en) * | 1966-09-28 | 1970-03-10 | Philco Ford Corp | Speech synthesizer |
US3499987A (en) * | 1966-09-30 | 1970-03-10 | Philco Ford Corp | Single equivalent formant speech recognition system |
US3546584A (en) * | 1966-11-30 | 1970-12-08 | Standard Telephones Cables Ltd | Apparatus for analyzing a complex waveform containing pitch synchronous information |
US3573612A (en) * | 1967-11-16 | 1971-04-06 | Standard Telephones Cables Ltd | Apparatus for analyzing complex waveforms containing pitch synchronous information |
US3528011A (en) * | 1967-12-22 | 1970-09-08 | Gen Electric | Limited energy speech transmission and receiving system |
US3632877A (en) * | 1969-04-10 | 1972-01-04 | Singer Co | Helium voice translator utilizing either a glottal synchronous or a memory full reset signal |
US3920907A (en) * | 1974-07-03 | 1975-11-18 | Us Navy | Periodic signal detector |
WO1981003392A1 (en) * | 1980-05-19 | 1981-11-26 | J Reid | Improvements in signal processing |
US4833717A (en) * | 1985-11-21 | 1989-05-23 | Ricoh Company, Ltd. | Voice spectrum analyzing system and method |
US4862503A (en) * | 1988-01-19 | 1989-08-29 | Syracuse University | Voice parameter extractor using oral airflow |
US6093929A (en) * | 1997-05-16 | 2000-07-25 | Mds Inc. | High pressure MS/MS system |
US6219635B1 (en) | 1997-11-25 | 2001-04-17 | Douglas L. Coulter | Instantaneous detection of human speech pitch pulses |
US20110125483A1 (en) * | 2009-11-20 | 2011-05-26 | Manuel-Devadoss Johnson Smith Johnson | Automated Speech Translation System using Human Brain Language Areas Comprehension Capabilities |
USH2269H1 (en) * | 2009-11-20 | 2012-06-05 | Manuel-Devadoss Johnson Smith Johnson | Automated speech translation system using human brain language areas comprehension capabilities |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US3335225A (en) | Formant period tracker | |
US3553372A (en) | Speech recognition apparatus | |
US4039754A (en) | Speech analyzer | |
GB1586417A (en) | Method of and apparatus for the inductive measurment of fluid flow | |
US4359604A (en) | Apparatus for the detection of voice signals | |
GB1012765A (en) | Apparatus for the analysis of waveforms | |
US3617636A (en) | Pitch detection apparatus | |
US4217808A (en) | Determination of pitch | |
US3546584A (en) | Apparatus for analyzing a complex waveform containing pitch synchronous information | |
US3350651A (en) | Waveform converters | |
US3020344A (en) | Apparatus for deriving pitch information from a speech wave | |
US3743420A (en) | Method and apparatus for measuring the period of electrical signals | |
GB981383A (en) | Sound analyzing system | |
US3573612A (en) | Apparatus for analyzing complex waveforms containing pitch synchronous information | |
GB1260735A (en) | Vocoder speech transmission system | |
US3019387A (en) | Method and apparatus for damped wave analysis | |
US3127477A (en) | Automatic formant locator | |
US2901699A (en) | Frequency measuring instrument | |
US4253373A (en) | Tuning device for musical instruments | |
US4181949A (en) | Method of and apparatus for phase-sensitive detection | |
US3395345A (en) | Method and means for detecting the period of a complex electrical signal | |
US3776024A (en) | Densitometer components | |
US3471781A (en) | Apparatus for detecting the echoes of transmitted signals | |
SU1166011A1 (en) | Digital meter of q-factor of resonance system | |
SU1087918A1 (en) | Low-frequency device for conversion of phase shift to digital code |