US2635146A - Speech analyzing and synthesizing communication system - Google Patents

Speech analyzing and synthesizing communication system

Info

Publication number
US2635146A
Authority
US
United States
Prior art keywords
frequency
wave
circuit
formant
energy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US133131A
Inventor
John C Steinberg
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AT&T Corp
Original Assignee
Bell Telephone Laboratories Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bell Telephone Laboratories Inc
Priority to US133131A
Application granted
Publication of US2635146A
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S84/00 Music
    • Y10S84/09 Filtering

Definitions

  • This invention relates to the analysis and synthesis of speech or similar complex signal waves.
  • the invention also relates to methods and means for transmitting with reduced frequency-range the intelligence that is contained in a train of such waves from one point where the waves are analyzed to another point where the waves are reconstructed.
  • Patent No. 2,243,527 patented May 27, 1941, to Homer W. Dudley.
  • the present invention is an improvement on this type of system and makes use of an analyzer and a synthesizer that are different from those shown in that patent.
  • speech signals may be viewed as relatively high-frequency sound waves that have been modulated at a relatively low-frequency rate. They are commonly divided into two main classifications according to their voiced or unvoiced characteristics. In the voiced class, the speaker's vocal cords are set into vibration by his breath stream, and there is produced a sound wave in which the components are in harmonic frequency relation to a fundamental wave component which may or may not actually be present in the sound. In this class, the majority of the signal energy is concentrated in the lower frequencies.
  • the audible sound is produced by forcing the speaker's breath through one or more constrictions in the vocal passageway to cause the air flow to become turbulent in such manner that the wave's energy is distributed more or less continuously throughout its frequency spectrum, but with the majority of the energy at the higher end of its spectrum.
  • the audible-frequency components that are thus superimposed upon the speaker's breath stream are then modulated in the vocal passage extending from the glottis to the mouth and nose openings in a manner which may be termed cavity modulation.
  • This cavity modulation process controls the shape of the modulated envelope of the sound wave and usually causes several such formants or vocal resonances to exist in a speech signal wave.
  • the spectra of the different speech sounds can be adequately represented as regards intelligibility by three, at most, of such frequency regions of wave reenforcement.
  • the cavity modulation process being the result of physical movements of the body components which bound the vocal passageway, occurs at relatively low rates which are definable by frequency components in the region below 40 or 50 cycles per second.
  • definable as here used is intended to mean that the wave form of the modulation may be reconstructed with an accuracy that is sufficient for most signal reproduction purposes by frequency components that are not greater than the specified 40 or 50 cycles per second.
  • the complex signal wave is analyzed in much the same manner that the human ear is thought to analyze such waves. That is to say, the wave is analyzed on a time-frequency-intensity basis to determine the frequency position and the relative intensity of each of its principal formants or vocal resonances. Indicia of these frequency locations and relative intensities are derived and are utilized in controlling the synthesis or reconstruction of the original signal.
  • This analyzing process differs materially from the previously known analyzing processes in that it subdivides the original signal wave into only two or three principal bands to determine the positions of the vocal resonances instead of employing a great number of narrow band-width filters to subdivide the signal wave on a frequency basis. It also differs in that in addition to deriving indicia of the main energy groupings in a signal wave, there are derived indicia of the relative intensity, or energy content, of each vocal resonance or formant.
  • This synthesizing, or reconstructing, process differs from the prior art in that the energy of the reconstructed signal is not only grouped or concentrated at the same spectrum location as in the original signal, but they are individually controlled to the same relative order of magnitude as existed in the original signal.
  • the formant damping of all reconstructed signals is caused to change in substantial accordance with the way the damping of the formants changes with frequency in an average signal source. Average as here used is to be understood as being synonymous with the mean of a large number of representative sources.
  • Fig. 1 is a combined schematic and block diagram of the transmitting or analyzing apparatus of a system in accordance with the invention
  • Fig. 2 is a similar diagram of the receiving or synthesizing apparatus of the same system
  • Fig. 3 is an explanatory graph of the frequency-attenuation characteristics of certain resonant elements which are included in the above-referred to transmission paths and which will be later described;
  • Fig. 4 is a graph of the energy-frequency distribution of a characteristic vowel sound to which reference is made in the description herein.
  • pick-up device 10 which may be a microphone or any other suitable transducer for converting sound vibrations, mechanical vibrations, or light vibrations into electric vibrations.
  • the pick-up device 10 feeds into the volume-operated gain-adjusting device 12 which may be an amplifier of the type that is disclosed in Patent 2,019,577, patented November 5, 1935, to D. Mitchell et al.
  • the upper branch, including the frequency counter 14 and full-wave rectifier 16, constitutes the pitch determining branch in which are produced indications of the frequency and the relative amplitude of the fundamental wave component when the input signal is of the voiced class.
  • the amplitude indication from this branch is also used in conjunction with the similar indication from the highest frequency formant-determining branch to distinguish between voiced and unvoiced input signals.
  • the three lower circuit branches including filters F2, F3, and F4 constitute the vocal resonance or formant-indicating portion of the system. They are substantially identical and for this reason, the arrangement of only one of them has been shown in detail.
  • the pitch determining or fundamental frequency control branch of the circuit extends from the output of amplifier 12 through wave filter F1 and an isolating amplifier.
  • Filter F1 may be so constructed that it passes the voiced frequencies below about 400 cycles per second.
  • Equalizer E1 is connected to the output of the amplifier. This equalizer E1, as disclosed in R. R. Riesz, Patent 2,183,243, December 12, 1939, has its loss increasing with frequency so as to insure that the fundamental frequency which may vary from about to 400 cycles is transmitted at a high power level compared to any upper harmonics that may be present.
  • Equalizer E1 is not an essential part of the invention, but its use is here indicated since it increases the certainty that frequency counter 14 will be actuated by the fundamental wave component and not by a prominent upper harmonic component.
  • the equalized wave from E1 is supplied to the input of frequency counter 14.
  • Rectifier 16 is also connected to the output of the isolating amplifier. It may be any suitable arrangement such as the indicated full-wave rectifying device and 50-cycle low-pass filter for producing in connecting circuit 30 a direct-current, the magnitude of which is substantially linearly proportional to the energy in the band of wave components.
  • This direct-current is subject to the same amplitude fluctuations that occur in the band of signal wave components, which fluctuations do not contain components that are greatly in excess of those equivalent to 50 cycles per second. Variations in the amplitude of this direct-current control the transmission loss through the variable attenuator portion of variable amplifier 34 in the synthesizing apparatus in a manner and for the purpose which will be later described.
  • Frequency counter 14, which is connected to the output of equalizer E1, consists of a pulse producing circuit comprising the gas-filled tubes 38 and 40 and a rectifying circuit which includes the diodes 54 and 56.
  • the input transformer T1 is a three winding transformer having two secondary windings 36, 37, the polarity senses of which are reversed so that the control grid end of one winding is positive when the signal voltage that is applied to the primary winding is in the positive half of its cycle and the control grid end of the other winding is positive when the signal voltage applied to the primary winding is in the negative half of its cycle.
  • Bias battery 43 is provided to bias the control electrodes of the gas-filled tubes by a suitable amount.
  • Resistors 42 and 46 are included in the cathode circuit of the gas-filled tubes 38, 40, as is indicated.
  • Capacitor 44 is connected between the cathodes of the two gas-filled tubes and is shunted by the capacitor-resistor combination comprising resistors 45, 47, and capacitors 48, 50.
  • the anode electrodes of diodes 54, 56 are connected to the junction of respective resistor-capacitor combinations, as shown, and
  • the cathode circuit of these diodes including load resistor 52 is connected to the mid-point of resistors 45, 47.
  • Inductors 57, 58, and capacitor 59 constitute a conventional 50-cycle low-pass filter.
  • the output of frequency counter 14 is connected over connecting pair 60 to the relaxation oscillator 62 in the synthesizing apparatus of Fig. 2. It is the function of frequency counter 14 to produce in load resistor 52 a pulse of direct-current each time that the equalized signal wave from E1 crosses its time axis.
  • the formant or vocal resonance-determining portion of the analyzing circuit proceeds from the output of amplifier 12 to the inputs of wave filters F2, F3, and F4, the attenuation characteristics of which are so arranged that they pass wave components that normally contain the first, second, and third formants, respectively, of the signal wave.
  • filter F2 may transmit the frequency band from about 300 to 800 cycles per second.
  • Filter F3 may transmit frequencies in the band from 800 to 2300 cycles per second, and filter F4 may transmit wave frequencies in excess of 2300 cycles per second.
  • The respective filtered wave components may be amplified before they are supplied to the inputs of frequency counter 18, 22, or 26 and the inputs of full-wave rectifier 20, 24, or 28.
  • the connecting transmission paths between the analyzing apparatus of Fig. 1 and the synthesizing apparatus of Fig. 2 are here shown as a plurality of conductor pairs which may be limited-frequency lines or which may be any one of a number of other suitable types of transmission means.
  • the eight individual control currents, each of which has a band width that is equivalent to about 50 cycles per second, may be shifted into a single continuous-frequency band for transmission by radio or any other suitable means if it is desirable to so do.
  • a relaxation oscillator 62 provides a source of discontinuous electric waves of rectangular or saw tooth shape in which a great number of wave components are in harmonic frequency relation to the fundamental wave component. The frequency distribution of these wave components resembles the distribution of the components of the vocal-cord waves or "voiced" speech signals.
  • This source of rectangular waves is occasionally referred to as the buzz source.
  • It may be a conventional multivibrator oscillator of which several suitable types are known in the art, or it may be a relaxation oscillator such as is shown in Fig. 3 of Patent 2,183,248, December 12, 1939, to R. R. Riesz.
  • the frequency of the fundamental wave of this oscillator is determined by the magnitude of the direct-current that flows through a resistor in the oscillator's grid circuit.
  • This resistor terminates connecting pair 60 and the direct-current which flows through the grid resistor is the current that is produced by the equal amplitude current pulses in load resistor 52 in the fundamental frequency control branch of the analyzing circuit.
  • each wave source 62, 64 is connected through isolating amplifiers to the contacts of the unbiased polar relay 66.
  • The intensity or energy-level indicating currents from the fundamental frequency branch and the high frequency or third formant branch of the analyzing circuit act in opposition in the windings of this relay.
  • the upper relay winding is connected across connecting circuit 68 and receives energy from the third formant branch of the analyzer circuit as detected by rectifier 28 (Fig. 1).
  • the lower relay winding is connected across connecting pair 30 and receives energy from the fundamental frequency branch of the analyzer circuit of Fig. 1.
  • relay 66 will move its armatures in accordance with whichever type of signal energy is predominant in the original signal wave at any given instant, and connect the energy from either relaxation oscillator 62 or noise generator 64 to the synthesizing circuit branches of the receiving apparatus.
  • the fundamental frequency reconstruction branch of the circuit comprises filter F5 which is a low-pass filter having substantially the same frequency attenuation characteristic as filter F1 in the analyzing apparatus.
  • the output from this filter is connected through an isolating amplifier to the input of a variable amplification unit 34.
  • Variable amplifier 34 comprises a substantially constant gain amplifier section including vacuum tube 94 and a variable-attenuation unit 32 of the type that is commonly designated as a vario-losser. Together, these components constitute a volume control circuit, the output level of which is regulated or controlled in accordance with the magnitude of the fundamental signal energy that is detected or rectified in full-wave rectifier 16 of Fig. 1.
  • This detected energy is received over connecting pair 30 and is impressed across the unilaterally-conducting elements 93, 95 in vario-losser 32.
  • This vario-losser may be any one of several well-known types and may suitably be as shown in which the unilateral devices 93, 95 may be composed of any suitable non-ohmic substance such as tellurium-copper.
  • Such unilateral devices are commonly called varistors. They are arranged in a bridge-connected circuit between input transformer T2 and output transformer T3. The resistance of these varistor units to alternating current is inversely proportional, over a wide range of current values, to the magnitude of the direct-current that flows through them.
  • the transmission loss between input transformer T2 and output transformer T3 is directly controlled by the magnitude of the current in connecting pair 30 and varies in inverse relation to changes in the magnitude of this current.
  • the amplifier section comprising vacuum tube 94 is substantially a constant-gain device and, therefore, the overall transmission equivalent between input transformer T2 and output transformer T4 depends upon the magnitude of the energy detected in the fundamental frequency branch of the analyzing circuit.
  • the first, second, and third formant control branches of the synthesizing circuit comprise shaping networks 70, 72, and 74 together with their variable amplifying units 96, 98, and 100, respectively.
  • the second and third formant control branches may be identical with the first formant control branch.
  • Each shaping network comprises a direct coupled amplifier, a variable resonance unit, and a fixed gain amplifying section.
  • the direct coupled amplifier comprises vacuum tubes 84, 86 and their connecting circuits.
  • a potentiometer 88 forms a termination for connecting circuit 90 and is included in the control electrode-cathode circuit of vacuum tube 86.
  • this potentiometer is connected to the cathode of this tube and the movable contact is connected to the control electrode of this tube.
  • the control electrode of tube 86 is made more negative with respect to its cathode. This action increases the anode potential of this tube and also increases the control electrode potential of tube 84 and the anode current flow in this tube. This increased anode current flows through the bias control windings 8
  • the biasing current is varied in inductor 80, the alternating current resistance of winding 82 varies and hence the figure of merit of this coil varies. Variations in this figure of merit when combined with damping resistor 76 cause a broadening or sharpening of the resonant peak and hence a variation in the damping factor of these resonances.
  • the value of resistor 76 in each network may be chosen such that the damping that is imparted to the formant produced by the network varies with changes in the frequency position of the formant in substantially the same manner as the mean damping characteristic of the corresponding formant of a large number of representative vocal tracts.
  • the formant damping resembles that of an average voice over the entire range of the synthesized signal.
  • it may occasionally be desirable to include a small resistor (not shown) in series with the coil 82 and capacitor 78 of the anti-resonant circuit. This resistor will ordinarily not exceed two or three ohms.
  • the insertion-loss of the variable resonant unit of shaping network 70 is indicated by curve 106 of Fig. 4 for the condition when the anode current flowing in bias control windings 81 is adjusted so that the resonant frequency of this combination is about 750 cycles per second.
  • Curves 108 and 110 of this same figure show the insertion-loss of similar resonant combinations in shaping networks 72 and 74, respectively, when they are adjusted for resonant frequencies of about 1500 and 2550 cycles, respectively.
  • the effect of resistor 16 and the varying figure of merit in this circuit in controlling the damping factor may be seen from the following observed results on one tested embodiment of the invention when the circuit constants were chosen to simulate speech signals.
  • the damping of the resonant unit in the first formant-shaping network varied from 1300 decibels per second to 3000 decibels per second as the resonant frequency was varied from 300 to 900 cycles per second.
  • the damping of the resonant unit in the second formant-shaping network varied from 8000 decibels per second to 11,000 decibels per second as the resonant frequency was changed from 900 to 2700 cycles per second.
  • the damping of the resonant unit in the third formant-shaping network changed from 10,000 decibels to 15,000 decibels per second as the resonant frequency was changed from 3000 to 9000 cycles per second.
  • the resonant units in shaping networks 70, 72, and 74 are substantially the same except that the values of damping resistor 76, the capacitor 78 and inductance 82 are varied to cause the frequency of resonance of each resonant combination to coincide with the minimum frequency of the frequency range that is to be covered by that unit.
  • the electron discharge device 19 was a conventional type 608G vacuum tube and in which the undesignated circuit components were of conventional values, the following values were found to be suitable:
  • Inductance 82 is so constructed that when the minimum anode current flows through its bias control windings 8
  • the anode current in vacuum tube 84 is suitably controlled to produce this desired inductance value for each of the shaping networks. As the current output from the respective frequency counter in the analyzer circuit increases, corresponding to an increase in the frequency of the analyzed formant, the inductive value of the winding 82 is decreased and the resonant frequency of the resonant combination is suitably increased.
  • the selectively-shaped wave components from shaping network may be transmitted through an isolating amplifier to the variable amplifying unit 96, the circuit details of which are identical to those that have been previously described in connection with the variable amplifying unit 34 in the fundamental frequency reproduction branch of the circuit. Except for the above-noted differences in the values of the damping resistors and resonant combination, the formant-control branches for the second and third formants are identical with the above-described first formant-control branch.
  • a conventional mixing amplifier 102 is connected to the output of each of the three formant control branches and to the output of the fundamental frequency reproduction branch to suitably combine the shaped wave components before they are further amplified and applied to a conventional sound reproducer 104.
  • the magnitude of one of these currents in each formant-analyzing branch indicates the frequency of the maximum-amplitude wave component in the respective frequency subband that corresponds to a vocal resonance
  • the magnitude of the other current in this formant-analyzing branch indicates the total signal energy that is contained in the respective frequency subband.
  • the fundamental-frequency indicating direct-current in connecting pair 60 controls the grid bias in relaxation oscillator 62 in such fashion that this oscillator produces a discontinuous wave having a fundamental frequency of about 150 cycles per second.
  • the energy or intensity indicating current in connecting pair 30 is supplied to the lower winding of relay 66 where it opposes and overcomes the effect in the upper winding of this relay of the energy indicating current in the third formant connecting pair 68.
  • Relay 66, therefore, moves its armatures to their lower contacts and the output of oscillator 62 is simultaneously supplied to the input of the 400-cycle low-pass filter F5 and the formant-shaping networks 70, 72, and 74. In each of these shaping networks, the frequency indicating current as received over connecting pair 90, etc.
  • damping factors while not exact replicas of corresponding damping factors in the original signal wave, closely resemble these factors.
  • the damping of each resonant unit was observed to change from a specified value to a relatively much higher value as the inductance of winding 82 was changed to cause the resonant frequency of the parallel combination to change from its lowest to its highest operating condition.
  • the exact amounts by which these damping factors changed has been previously described in connection with the circuit details of these shaping networks.
  • the formant-control currents are supplied to the shaping networks 70, 72, and 74, the intensity or energy-level control currents from the respective analyzer branches are supplied over the individual connecting pairs 30, 92, etc. to the variable amplification units 34, 96, 98, and 100.
  • the resistance to alternating currents of the unilaterally-conducting devices 93, 95 in the vario-losser 32 decreases as the direct current that flows through these devices increases.
  • the transmission loss through the appropriate vario-losser 32 is decreased and the amplitude of the shaped synthesizing wave component is increased.
  • the wave components from all branches are united or mixed in mixing amplifier 102. These combined wave components may be further amplified, if desired, before they actuate the sound reproducer 104 to construct a synthesized sound signal in which the frequency-intensity distribution is substantially the same as is shown in Fig. 3 for the original sound signal.
  • analyzing means productive from each formant of a first electrical quantity which varies in accordance with which wave component is of maximum amplitude in said formant and a second electrical quantity which varies in accordance with the amount of signal energy contained in said formant, a source of electric waves having a plurality of wave components, a sound reproducer responsive to said wave components, a plurality of variable transmission paths interconnecting said wave source and said reproducer, each of said paths including variable frequency-selective and damping means and variable volume-control means that limit the transmission of wave components therethrough, said variable frequency-selective and damping means comprising a resistor in series with said path and a parallel connected inductor-capacitor shunted across said path, the effective inductance and the resistance to alternating components of said inductor being variable in response to said first electric quantity, said resistor and inductor being so proportioned that their combined resistance
  • analyzing means productive from each formant of an electrical quantity the magnitude of which varies in accordance with the varying frequency of the wave component of maximum amplitude in said formant, a source of electric waves having a plurality of wave components, a sound reproducer responsive to said wave components, a circuit of variable resonance frequency and variable damping between said source of electric waves and said reproducer for each formant, each of said variable resonance frequency circuits comprising a variable inductor, the effective inductance value and the resistance to alternating wave components of which is responsive to said produced electric quantity, and said variable damping means including a resistor connected to said variable inductor and being so proportioned relative to said inductor that their combined resistance to alternating wave components varies in a preassigned manner as the magnitude of said controlling electric quantity is varied.
  • an analyzing station which comprises means for analyzing speech to provide indicia of the frequency and energy of the fundamental component of said speech and similar indicia of the frequency and energy of each of a plurality of higher frequency formants of said speech, means for transmitting said indicia to a synthesizing station and, at said synthesizing station, means for differentially combining the indicia of the speech fundamental energy and of the highest formant energy to provide a control signal, a source of harmonically related sound frequency oscillations having a fundamental frequency, a source of noise energy, a sound reproducer adapted to be energized from said sources, frequency selective means equal in number to the speech fundamental and formants connected intermediate said sources and said reproducer, amplifying means associated with each frequency selective means, means for selectively connecting said two sources alternatively to said frequency selective means under control of said control signal, means for varying each of said frequency selective means under control of one of said frequency indicia, and means for varying
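Each analyzing branch described in the excerpts above yields two slowly varying quantities: a frequency indication built from pulses at the time-axis crossings, and a level indication from a full-wave rectifier followed by a 50-cycle low-pass filter. The following is a minimal digital sketch of that idea, not the patent's circuit. The function name `analyze_band`, the sampled-signal representation, and the one-pole smoothing filter are assumptions made for illustration.

```python
import math


def analyze_band(samples, sample_rate, cutoff_hz=50.0):
    """Return (frequency_hz, level) for one band-limited signal.

    frequency_hz: half the number of time-axis crossings per second,
                  a stand-in for the pulse-counting frequency counter.
    level:        full-wave-rectified signal smoothed by a one-pole
                  low-pass filter (assumed substitute for the 50-cycle filter).
    """
    # Count sign changes between successive samples.
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev < 0.0) != (cur < 0.0):
            crossings += 1
    duration = (len(samples) - 1) / sample_rate
    frequency_hz = crossings / (2.0 * duration)

    # One-pole low-pass coefficient for the chosen cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    level = 0.0
    for x in samples:
        level += alpha * (abs(x) - level)  # rectify, then smooth
    return frequency_hz, level
```

For a steady sine wave the first value approaches the wave's frequency and the second approaches its mean rectified amplitude; both change only as fast as the smoothing allows, which is what lets them travel over narrow-band control channels.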

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Electrophonic Musical Instruments (AREA)

Description

April 14, 1953. J. C. Steinberg. 2,635,146
SPEECH ANALYZING AND SYNTHESIZING COMMUNICATION SYSTEM
Filed Dec. 15, 1949. 3 Sheets.
[Drawing sheets 1 to 3 omitted; the graphs on Sheet 3 carry the axis label "Frequency in cycles per second".]
Patented Apr. 14, 1953
John C. Steinberg, Short Hills, N. J., assignor to Bell Telephone Laboratories, Incorporated, New York, N. Y., a corporation of New York
Application December 15, 1949, Serial No. 133,131
3 Claims.
This invention relates to the analysis and synthesis of speech or similar complex signal waves. The invention also relates to methods and means for transmitting with reduced frequency-range the intelligence that is contained in a train of such waves from one point where the waves are analyzed to another point where the waves are reconstructed.
A system of this general type is disclosed and claimed in Patent No. 2,243,527, patented May 27, 1941, to Homer W. Dudley. The present invention is an improvement on this type of system and makes use of an analyzer and a synthesizer that are different from those shown in that patent.
It is well recognized that speech signals may be viewed as relatively high-frequency sound waves that have been modulated at a relatively low-frequency rate. They are commonly divided into two main classifications according to their voiced or unvoiced characteristics. In the voiced class, the speaker's vocal cords are set into vibration by his breath stream, and there is produced a sound wave in which the components are in harmonic frequency relation to a fundamental wave component which may or may not actually be present in the sound. In this class, the majority of the signal energy is concentrated in the lower frequencies. In the unvoiced class, the audible sound is produced by forcing the speaker's breath through one or more constrictions in the vocal passageway to cause the air flow to become turbulent in such manner that the wave's energy is distributed more or less continuously throughout its frequency spectrum, but with the majority of the energy at the higher end of its spectrum. The audible-frequency components that are thus superimposed upon the speaker's breath stream are then modulated in the vocal passage extending from the glottis to the mouth and nose openings in a manner which may be termed cavity modulation. Depending upon the positions of the larynx, soft palate, lower jaw, tongue, and lips, certain of the frequencies are transmitted through the vocal passage with greater efficiency than are other frequencies; and as a result, certain of the frequency components or overtones appear to be reenforced relative to the others. These frequency regions of wave reenforcement are sometimes called vocal resonances or formants. For the purpose of this description, these terms shall be taken to be synonymous and to be indicative of the frequency locations of the major portions of the wave's energy.
The formants shift in frequency and strength from sound to sound, and may even shift during some sounds, because of changes in the sizes and shapes of the vocal passage cavities, which changes are caused by movements of the tongue, lips, and other components. This cavity modulation process controls the shape of the modulated envelope of the sound wave and usually causes several such formants or vocal resonances to exist in a speech signal wave. In general, the spectra of the different speech sounds can be adequately represented as regards intelligibility by three, at most, of such frequency regions of wave reenforcement. The cavity modulation process, being the result of physical movements of the body components which bound the vocal passageway, occurs at relatively low rates which are definable by frequency components in the region below 40 or 50 cycles per second. The term "definable," as here used, is intended to mean that the wave form of the modulation may be reconstructed with an accuracy that is sufficient for most signal reproduction purposes by frequency components that are not greater than the specified 40 or 50 cycles per second.
In accordance with the subject invention, the complex signal wave is analyzed in much the same manner that the human ear is thought to analyze such waves. That is to say, the wave is analyzed on a time-frequency-intensity basis to determine the frequency position and the relative intensity of each of its principal formants or vocal resonances. Indicia of these frequency locations and relative intensities are derived and are utilized in controlling the synthesis or reconstruction of the original signal. This analyzing process differs materially from the previously known analyzing processes in that it subdivides the original signal wave into only two or three principal bands to determine the positions of the vocal resonances instead of employing a great number of narrow band-width filters to subdivide the signal wave on a frequency basis. It also differs in that, in addition to deriving indicia of the main energy groupings in a signal wave, there are derived indicia of the relative intensity, or energy content, of each vocal resonance or formant.
In the synthesizing process, electric energy having frequency components that are distributed over the spectrum of the signal wave is transmitted through a plurality of variable and selective parallel transmission paths. In each of these paths, this electric energy is shaped to resemble one of the formants or vocal resonances of the original signal wave. The energy within each shaped formant is also adjusted to the same amount relative to the energy of the other formants as existed in the original signal wave. This frequency-shaped and intensity-controlled electric energy is then supplied to a conventional sound reproducer to reconstruct the original sound wave in the usual manner. This synthesizing, or reconstructing, process differs from the prior art in that the energy of the reconstructed signal is not only grouped or concentrated at the same spectrum location as in the original signal, but they are individually controlled to the same relative order of magnitude as existed in the original signal. In addition, the formant damping of all reconstructed signals is caused to change in substantial accordance with the way the damping of the formants changes with frequency in an average signal source. Average as here used is to be understood as being synonymous with the mean of a large number of representative sources.
A more complete understanding of the invention may be gained from the following description of one of its preferred embodiments, when considered in conjunction with the drawing, in which:
Fig. 1 is a combined schematic and block diagram of the transmitting or analyzing apparatus of a system in accordance with the invention;
Fig. 2 is a similar diagram of the receiving or synthesizing apparatus of the same system;
Fig. 3 is an explanatory graph of the frequency-attenuation characteristics of certain resonant elements which are included in the above-referred to transmission paths and which will be later described; and
Fig. 4 is a graph of the energy-frequency distribution of a characteristic vowel sound to which reference is made in the description herein.
Referring now to Fig. 1, speech or other sounds that are to be analyzed enter the system through pick-up device 10, which may be a microphone or any other suitable transducer for converting sound vibrations, mechanical vibrations, or light vibrations into electric vibrations. The pick-up device 10 feeds into the volume-operated gain-adjusting device 12, which may be an amplifier of the type that is disclosed in Patent 2,019,577, patented November 5, 1935, to D. Mitchell et al. The output from amplifier 12, which is substantially constant irrespective of variations in the level of the signals received from microphone 10, passes into four separate circuit branches. The upper branch, including the frequency counter 14 and full-wave rectifier 16, constitutes the pitch determining branch in which are produced indications of the frequency and the relative amplitude of the fundamental wave component when the input signal is of the voiced class. The amplitude indication from this branch is also used in conjunction with the similar indication from the highest frequency formant-determining branch to distinguish between voiced and unvoiced input signals. The three lower circuit branches including filters F2, F3, and F4 constitute the vocal resonance or formant-indicating portion of the system. They are substantially identical and for this reason, the arrangement of only one of them has been shown in detail. The remaining two channels are symbolically indicated.
The pitch determining or fundamental frequency control branch of the circuit extends from the output of amplifier 12 through wave filter F1 and an isolating amplifier. Filter F1 may be so constructed that it passes the voiced frequencies below about 400 cycles per second. Equalizer E1 is connected to the output of the amplifier. This equalizer E1, as disclosed in R. R. Riesz, Patent 2,183,243, December 12, 1939, has its loss increasing with frequency so as to insure that the fundamental frequency which may vary from about to 400 cycles is transmitted at a high power level compared to any upper harmonics that may be present. Equalizer E1 is not an essential part of the invention, but its use is here indicated since it increases the certainty that frequency counter 14 will be actuated by the fundamental wave component and not by a prominent upper harmonic component. The equalized wave from E1 is supplied to the input of frequency counter 14. Rectifier 16 is also connected to the output of the isolating amplifier. It may be any suitable arrangement such as the indicated full-wave rectifying device and 50-cycle low-pass filter for producing in connecting circuit 30 a direct-current, the magnitude of which is substantially linearly proportional to the energy in the band of wave components. This direct-current is subject to the same amplitude fluctuations that occur in the band of signal wave components, which fluctuations do not contain components that are greatly in excess of those equivalent to 50 cycles per second. Variations in the amplitude of this direct-current control the transmission loss through the variable attenuator portion of variable amplifier 34 in the synthesizing apparatus in a manner and for the purpose which will be later described.
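The action of rectifier 16 and its 50-cycle low-pass filter can be illustrated numerically. The following is a minimal sketch, not the circuit of the patent: it assumes a sampled signal and substitutes a simple one-pole digital smoother for the low-pass filter. The function name `envelope`, the sample rate, and the 200-cycle test tone are all hypothetical choices made for illustration.

```python
import math

def envelope(samples, sample_rate, cutoff_hz=50.0):
    """Full-wave rectify the samples, then smooth them with a one-pole low-pass.

    The one-pole smoother is an assumed stand-in for the 50-cycle low-pass
    filter; the patent does not specify the filter in these terms.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    level = 0.0
    out = []
    for x in samples:
        level += alpha * (abs(x) - level)  # abs() is the full-wave rectifier
        out.append(level)
    return out

# Hypothetical test tone: 200 cycles per second, one second at 8000 samples per second.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
quiet = envelope(tone, rate)
loud = envelope([2.0 * x for x in tone], rate)
```

In this sketch the slowly varying output follows the level of the band: doubling the input amplitude doubles the output, which is the property the control current in connecting circuit 30 relies upon.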
Frequency counter 14, which is connected to the output of equalizer E1, consists of a pulse producing circuit comprising the gas-filled tubes 38 and 40 and a rectifying circuit which includes the diodes 54 and 56. The input transformer T1 is a three winding transformer having two secondary windings 36, 37, the polarity senses of which are reversed so that the control grid end of one winding is positive when the signal voltage that is applied to the primary winding is in the positive half of its cycle and the control grid end of the other winding is positive when the signal voltage applied to the primary winding is in the negative half of its cycle. Bias battery 43 is provided to bias the control electrodes of the gas-filled tubes by a suitable amount. Resistors 42 and 46 are included in the cathode circuit of the gas-filled tubes 38, 40, as is indicated. Capacitor 44 is connected between the cathodes of the two gas-filled tubes and is shunted by the capacitor-resistor combination comprising resistors 45, 47, and capacitors 48, 50. The anode electrodes of diodes 54, 56 are connected to the junction of respective resistor-capacitor combinations, as shown, and the cathode circuit of these diodes including load resistor 52 is connected to the mid-point of resistors 45, 47. Inductors 57, 58, and capacitor 59 constitute a conventional 50-cycle low-pass filter. The output of frequency counter 14 is connected over connecting pair 60 to the relaxation oscillator 62 in the synthesizing apparatus of Fig. 2. It is the function of frequency counter 14 to produce in load resistor 52 a pulse of direct-current each time that the equalized signal wave from E1 crosses its time axis.
The manner in which this circuit accomplishes this purpose may be visualized if it is assumed that at any given instant the upper gas-filled tube 38 is conducting saturation current and the lower gas-filled tube 40 is non-conductive. Under these conditions, there is a potential drop across cathode resistor 42, which potential difference charges capacitor 44 and the shunting capacitors 48, 50. If the polarity of the signal wave in the primary winding is now reversed, the potential of the control electrode of tube 38 becomes negative and that of the control electrode of tube 40 becomes positive. When the positive potential of this latter control electrode reaches such a value that saturation current starts to flow in tube 40, a large potential difference is created across cathode resistor 46. Because capacitors 44, 48, and 50 cannot instantaneously adjust their charges to this revised potential condition, the cathode of tube 38 is momentarily raised to a potential which is positive with respect to its anode, and since its control electrode is now negative with respect to the cathode, current conduction in this tube is extinguished. Extinction of current conduction in tube 38 removes the potential difference across cathode resistor 42 and reverses potential differences which cause current conduction in one or the other of diodes 54, 56 with a consequent current pulse in load resistor 52. Since the charge on these capacitors is reversed each time that current conduction in gas tubes 38, 40 is changed, which change occurs twice for each cycle of the applied signal wave, it follows that two current pulses are produced in load resistor 52 for each cycle of the signal wave. Furthermore, since the same value of saturation current flows in gas tubes 38, 40, the potential charge of capacitors 48, 50 and the current that flows through resistors 45, 47 are of the same magnitude each time the polarity of the charge is reversed. It follows, therefore, that the current pulses in load resistor 52 are each of the same magnitude, and the value of these pulses, as averaged by the 50-cycle low-pass filter, is a direct indication of the number of reversals in, and hence the frequency of, the fundamental wave component. This averaged current is supplied over connecting pair 60 to control the frequency of oscillation of relaxation oscillator 62 in the synthesizing apparatus (Fig. 2).
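The counting principle just described, one equal pulse per axis crossing with the pulses averaged over time, may be sketched in software. The example below is an illustrative assumption and not a model of the gas-tube circuit: it counts sign changes in a sampled wave and divides by two, since two crossings occur per cycle. The name `zero_crossing_frequency`, the sample rate, and the test tone are hypothetical.

```python
import math

def zero_crossing_frequency(samples, sample_rate):
    """Estimate frequency from the average rate of axis crossings.

    Each sign change stands in for one equal-sized pulse in the load
    resistor; dividing the count by the window length plays the role of
    the averaging filter. Two crossings occur per cycle of the wave.
    """
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev < 0 <= cur) or (prev >= 0 > cur):
            crossings += 1
    duration = (len(samples) - 1) / sample_rate
    return crossings / (2.0 * duration)

# Hypothetical test tone: 150 cycles per second, one second at 8000 samples per second.
rate = 8000
tone = [math.sin(2 * math.pi * 150 * n / rate + 0.1) for n in range(rate + 1)]
estimate = zero_crossing_frequency(tone, rate)
```

A count of this kind is reliable only when the fundamental dominates the wave, which is consistent with the stated purpose of equalizer E1 ahead of the counter.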
The formant or vocal resonance-determining portion of the analyzing circuit proceeds from the output of amplifier 12 to the inputs of wave filters F2, F3, and F4, the attenuation characteristics of which are so arranged that they pass wave components that normally contain the first, second, and third formants, respectively, of the signal wave. For example, filter F2 may transmit the frequency band from about 300 to 800 cycles per second. Filter F3 may transmit frequencies in the band from 800 to 2300 cycles per second, and filter F4 may transmit wave frequencies in excess of 2300 cycles per second. The respective filtered wave components may be amplified before they are supplied to the inputs of frequency counter 18, 22, or 26 and the inputs of full-wave rectifier 20, 24, or 28. These frequency counters and the rectifiers may be identical to the corresponding units 14 and 16 in the previously described pitch determining or fundamental frequency control branch of the system. For this reason the circuit details of only the first formant-determining branch of the circuit are shown. It will be understood that the circuit arrangements of the various units in the second and third formant-determining circuit branches are the same as those shown for the first branch and which have been described in connection with the fundamental frequency branch of the analyzer. The slowly fluctuating control current outputs from these circuit branches, which currents are indicative of either the frequency position of a respective formant or the energy content of the wave components in that formant, are supplied over connecting transmission paths 90, 92, etc., as shown, to the synthesizing apparatus of Fig. 2 where they exercise control functions in the production of a synthetic signal wave which is substantially a replica of the original signal wave. For simplicity of disclosure, the connecting transmission paths between the analyzing apparatus of Fig. 1 and the synthesizing apparatus of Fig. 2 are here shown as a plurality of conductor pairs which may be limited-frequency lines or which may be any one of a number of other suitable types of transmission means. Thus, the eight individual control currents, each of which has a band width that is equivalent to about 50 cycles per second, may be shifted into a single continuous-frequency band for transmission by radio or any other suitable means if it is desirable to so do.
Referring now to Fig. 2, a relaxation oscillator 62 provides a source of discontinuous electric waves of rectangular or saw tooth shape in which a great number of wave components are in harmonic frequency relation to the fundamental wave component. The frequency distribution of these wave components resembles the distribution of the components of the vocal-cord waves or "voiced" speech signals. This source of rectangular waves is occasionally referred to as the buzz source. It may be a conventional multivibrator oscillator of which several suitable types are known in the art, or it may be a relaxation oscillator such as is shown in Fig. 3 of Patent 2,183,248, December 12, 1939, to R. R. Riesz. As is disclosed therein, the frequency of the fundamental wave of this oscillator is determined by the magnitude of the direct-current that flows through a resistor in the oscillator's grid circuit. This resistor terminates connecting pair 60, and the direct-current which flows through the grid resistor is the current that is produced by the equal amplitude current pulses in load resistor 52 in the fundamental frequency control branch of the analyzing circuit.
patented May 27, 1941.
The output of each wave source 62, 64 is connected through isolating amplifiers to the contacts of the unbiased polar relay 66. The intensity or energy-level indicating currents from the fundamental frequency branch and the high frequency or third formant branch of the analyzing circuit act in opposition in the windings of this relay. The upper relay winding is connected across connecting circuit 68 and receives energy from the third formant branch of the analyzer circuit as detected by rectifier 28 (Fig. 1). The lower relay winding is connected across connecting pair 30 and receives energy from the fundamental frequency branch of the analyzer circuit of Fig. 1. Because the major portion of the signal energies of the voiced and unvoiced types of sound signals reside at opposite ends of the frequency spectrum, it is clear that relay 66 will move its armatures in accordance with whichever type of signal energy is predominant in the original signal wave at any given instant, and connect the energy from either relaxation oscillator 62 or noise generator 64 to the synthesizing circuit branches of the receiving apparatus.
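The decision made by relay 66 amounts to a comparison of two slowly varying levels. A minimal sketch of that comparison follows; it is an illustration under stated assumptions, not a description of the relay itself. The labels "buzz" and "hiss", the function name `select_source`, the behavior when the two levels balance, and the example level values are all hypothetical.

```python
def select_source(low_band_level, high_band_level, previous="buzz"):
    """Choose the excitation in the way the polar relay is described as doing.

    low_band_level:  rectified level from the fundamental frequency branch
    high_band_level: rectified level from the third formant branch
    previous:        position held when the two levels balance exactly
                     (an assumption; the text does not discuss this case)
    """
    if low_band_level > high_band_level:
        return "buzz"   # relaxation oscillator 62, voiced sounds
    if high_band_level > low_band_level:
        return "hiss"   # noise generator 64, unvoiced sounds
    return previous

# Hypothetical sequence of (low, high) levels, one pair per analysis instant.
frames = [(0.9, 0.1), (0.7, 0.2), (0.1, 0.6), (0.3, 0.3)]
state = "buzz"
decisions = []
for low, high in frames:
    state = select_source(low, high, state)
    decisions.append(state)
```

Holding the previous position on a balance is one plausible reading of an unbiased relay; a biased comparison would be an equally simple alternative.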
The fundamental frequency reconstruction branch of the circuit comprises filter F5 which is a low-pass filter having substantially the same frequency attenuation characteristic as filter F1 in the analyzing apparatus. The output from this filter is connected through an isolating amplifier to the input of a variable amplification unit 34. Variable amplifier 34 comprises a substantially constant gain amplifier section including vacuum tube 94 and a variable-attenuation unit 32 of the type that is commonly designated as a vario-losser. Together, these components constitute a volume control circuit, the output level of which is regulated or controlled in accordance with the magnitude of the fundamental signal energy that is detected or rectified in full-wave rectifier 16 of Fig. 1. This detected energy is received over connecting pair 30 and is impressed across the unilaterally-conducting elements 93, 95 in vario-losser 32. This vario-losser may be any one of several well-known types and may suitably be as shown, in which the unilateral devices 93, 95 may be composed of any suitable non-ohmic substance such as tellurium-copper. Such unilateral devices are commonly called varistors. They are arranged in a bridge-connected circuit between input transformer T2 and output transformer T3. The resistance of these varistor units to alternating current is inversely proportional, over a wide range of current values, to the magnitude of the direct-current that flows through them. Therefore, the transmission loss between input transformer T2 and output transformer T3 is directly controlled by the magnitude of the current in connecting pair 30 and varies in inverse relation to changes in the magnitude of this current.
The amplifier section comprising vacuum tube 94 is substantially a constant-gain device and, therefore, the overall transmission equivalent between input transformer T2 and output transformer T4 depends upon the magnitude of the energy detected in the fundamental frequency branch of the analyzing circuit.
The first, second, and third formant control branches of the synthesizing circuit comprise shaping networks 70, 72, and 74 together with their variable amplifying units 96, 98, and 100, respectively. As in the case of the analyzing portion of the circuit, the details of only the first formant control circuit are shown and described herein. It will be understood that, except as noted, the circuit details of the second and third formant control branches may be identical with those of the first formant control branch. Each shaping network comprises a direct coupled amplifier, a variable resonance unit, and a fixed gain amplifying section. The direct coupled amplifier comprises vacuum tubes 84, 86 and their connecting circuits. A potentiometer 88 forms a termination for connecting circuit 90 and is included in the control electrode-cathode circuit of vacuum tube 86. One end of this potentiometer is connected to the cathode of this tube and the movable contact is connected to the control electrode of this tube. As the output current from frequency counter 18 (Fig. 1) increases, corresponding to an increase in the frequency of the first signal wave formant, the control electrode of tube 86 is made more negative with respect to its cathode. This action increases the anode potential of this tube and also increases the control electrode potential of tube 84 and the anode current flow in this tube. This increased anode current flows through the bias control windings 81 of the variable inductor to change the effective inductive value of winding 82 which is included in a resonant circuit combination with capacitor 78. Increased current flow in bias control winding 81 increases the flux saturation in the coil core of inductor 80 and reduces the effective inductance of winding 82.
This resonant circuit is shunted between the control electrode and cathode electrode of the fixed gain amplifier 79 in such manner that the loss-frequency characteristic of these combined units, that is, the resonant combination and amplifier 79, is controlled by the resonant frequency of the capacitor 78-inductance 82 combination. At the frequency of resonance, this combination presents maximum impedance between the control grid and cathode electrodes and hence the overall loss is at a minimum. Resistor 76 provides a damping factor which changes as the resonant frequency of the combination is changed. This result arises by virtue of the fact that as the biasing current is varied in inductor 80, the alternating current resistance of winding 82 varies and hence the figure of merit of this coil varies. Variations in this figure of merit when combined with damping resistor 76 cause a broadening or sharpening of the resonant peak and hence a variation in the damping factor of these resonances. The value of resistor 76 in each network may be chosen of such value that the damping that is imparted to the formant produced by the network varies with changes in the frequency position of the formant in substantially the same manner as the mean damping characteristic of the corresponding formant of a large number of representative vocal tracts. Thus, the formant damping resembles that of an average voice over the entire range of the synthesized signal. In order to simulate a particular damping characteristic, it may occasionally be desirable to include a small resistor (not shown) in series with the coil 82 and capacitor 78 of the anti-resonant circuit. This resistor will ordinarily not exceed two or three ohms.
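The dependence of the resonant frequency on the effective inductance follows the familiar relation for a capacitor-inductor combination, f = 1/(2π√(LC)). The sketch below merely evaluates that relation. The component values are hypothetical, chosen only so that the result falls near 750 cycles per second; they are not the values of the tested embodiment, which do not appear in this text.

```python
import math

def resonant_frequency(inductance_h, capacitance_f):
    """Resonant frequency, in cycles per second, of an ideal L-C combination."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))

def inductance_for(frequency_hz, capacitance_f):
    """Inductance, in henries, needed to resonate with a given capacitor."""
    return 1.0 / ((2.0 * math.pi * frequency_hz) ** 2 * capacitance_f)

# Hypothetical values: a 0.1 microfarad capacitor with a 0.45 henry winding.
C = 0.1e-6
f_low = resonant_frequency(0.45, C)    # roughly 750 cycles per second
f_high = resonant_frequency(0.20, C)   # less inductance, higher resonance
L_for_900 = inductance_for(900.0, C)
```

Because the frequency varies as the inverse square root of the inductance, raising the resonance by a factor of three (for instance from 300 to 900 cycles per second) requires the effective inductance to fall by a factor of nine.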
The insertion-loss of the variable resonant unit of shaping network 70 is indicated by curve 106 of Fig. 4 for the condition when the anode current flowing in bias control windings 81 is adjusted so that the resonant frequency of this combination is about 750 cycles per second. Curves 108 and 110 of this same figure show the insertion-loss of similar resonant combinations in shaping networks 72 and 74, respectively, when they are adjusted for resonant frequencies of about 1500 and 2550 cycles, respectively. The effect of resistor 76 and the varying figure of merit in this circuit in controlling the damping factor may be seen from the following observed results on one tested embodiment of the invention when the circuit constants were chosen to simulate speech signals. The damping of the resonant unit in the first formant-shaping network varied from 1300 decibels per second to 3000 decibels per second as the resonant frequency was varied from 300 to 900 cycles per second. The damping of the resonant unit in the second formant-shaping network varied from 8000 decibels per second to 11,000 decibels per second as the resonant frequency was changed from 900 to 2700 cycles per second. The damping of the resonant unit in the third formant-shaping network changed from 10,000 decibels to 15,000 decibels per second as the resonant frequency was changed from 3000 to 9000 cycles per second.
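Damping figures stated in decibels per second can be related to the width of the resonance. The following sketch assumes a simple second-order resonance whose free oscillation decays as exp(-πBt), where B is the half-power band width in cycles per second; under that assumption the decay rate in decibels per second is 20·log10(e)·π·B, or about 27.3 times B. The function names are hypothetical, and the assumption of a simple resonance is not drawn from the patent.

```python
import math

DB_PER_NEPER = 20.0 * math.log10(math.e)  # about 8.686 decibels per neper

def bandwidth_from_damping(damping_db_per_s):
    """Half-power band width (cycles per second) of an assumed simple resonance."""
    return damping_db_per_s / (DB_PER_NEPER * math.pi)

def figure_of_merit(resonant_hz, damping_db_per_s):
    """Ratio of resonant frequency to band width under the same assumption."""
    return resonant_hz / bandwidth_from_damping(damping_db_per_s)

# First formant-shaping network, using the end points quoted in the text.
bw_at_300 = bandwidth_from_damping(1300.0)   # about 47.6 cycles per second
bw_at_900 = bandwidth_from_damping(3000.0)   # about 110 cycles per second
q_at_300 = figure_of_merit(300.0, 1300.0)
q_at_900 = figure_of_merit(900.0, 3000.0)
```

On this reading, the first formant resonance is roughly 48 cycles wide at the bottom of its range and roughly 110 cycles wide at the top.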
The resonant units in shaping networks 70, 72, and 74 are substantially the same except that the values of damping resistor 76, the capacitor 78 and inductance 82 are varied to cause the frequency of resonance of each resonant combination to coincide with the minimum frequency of the frequency range that is to be covered by that unit. In one tested embodiment of the invention in which the electron discharge device 79 was a conventional type 608G vacuum tube and in which the undesignated circuit components were of conventional values, the following values were found to be suitable:
Inductance 82 is so constructed that when the minimum anode current flows through its bias control windings 81, which windings are reversed so that current fluctuations in them do not produce induced fluctuations in winding 82, the winding 82 has the correct inductive value corresponding to a desired minimum resonant frequency. The anode current in vacuum tube 84 is suitably controlled to produce this desired inductance value for each of the shaping networks. As the current output from the respective frequency counter in the analyzer circuit increases, corresponding to an increase in the frequency of the analyzed formant, the inductive value of the winding 82 is decreased and the resonant frequency of the resonant combination is suitably increased.
The selectively-shaped wave components from shaping network 70 may be transmitted through an isolating amplifier to the variable amplifying unit 96, the circuit details of which are identical to those that have been previously described in connection with the variable amplifying unit 34 in the fundamental frequency reproduction branch of the circuit. Except for the above-noted differences in the values of the damping resistors and resonant combination, the formant-control branches for the second and third formants are identical with the above-described first formant-control branch. A conventional mixing amplifier 102 is connected to the output of each of the three formant control branches and to the output of the fundamental frequency reproduction branch to suitably combine the shaped wave components before they are further amplified and applied to a conventional sound reproducer 104.
The combined operation of the circuits of Figs. 1 and 2 will now be described, assuming that the vowel "a" as in "at" is being analyzed and synthesized. The instantaneous energy-frequency distribution of this sound may be approximately as is shown by the frequency-intensity graph of Fig. 3. From this graph, it will be noted that the fundamental frequency of this complex wave is about 150 cycles per second; that the maximum energy is centered in the low-frequency or first formant at about 750 cycles per second and that second and third vocal resonances or formants occur at about 1500 and 2550 cycles per second, respectively. The damping of the first, second, and third formants is approximately 4100, 9300, and 15,000 decibels per second, respectively. This wave, when analyzed in the circuit of Fig. 1, produces at the output of the fundamental frequency counter 14 and in connecting pair 60 a current the magnitude of which is representative of the 150-cycle per second fundamental wave component and at the output of the energy detecting rectifier 16 and in connecting pair 30 a current the magnitude of which is representative of the signal energy that is contained in the signal frequency spectrum below 400 cycles per second. Similarly, direct currents that are identifying of the frequency position of and the energy level of the first, second, and third formants are produced in connecting pairs 90, 92 and the other respective connected pairs by the action of frequency counters 18, 22, and 26, and rectifiers 20, 24, and 28 in the formant-analyzing branches of the circuit. As was previously explained, the magnitude of one of these currents in each formant-analyzing branch indicates the frequency of the maximum-amplitude wave component in the respective frequency subband that corresponds to a vocal resonance, and the magnitude of the other current in this formant-analyzing branch indicates the total signal energy that is contained in the respective frequency subband.
The fundamental-frequency indicating direct-current in connecting pair 60 controls the grid bias in relaxation oscillator 62 in such fashion that this oscillator produces a discontinuous wave having a fundamental frequency of about 150 cycles per second. The energy or intensity indicating current in connecting pair 30 is supplied to the lower winding of relay 66 where it opposes and overcomes the effect in the upper winding of this relay of the energy indicating current in the third formant connecting pair 68. Relay 66, therefore, moves its armatures to their lower contacts and the output of oscillator 62 is simultaneously supplied to the input of the 400-cycle low-pass filter F5 and the formant-shaping networks 70, 72, and 74. In each of these shaping networks, the frequency indicating current as received over connecting pair 90, etc. from the associated analyzer branch controls the grid bias of vacuum tube 86 in the direct coupled amplifier and hence the effective value of inductance 82 in the resonant frequency combination comprising capacitor 78 and inductance 82. Inductance 82 in each shaping network is controlled in such manner that together with its capacitor 78 they form a resonant circuit and present maximum shunting impedance to wave components of the same values as those that produced the control currents in the formant analyzing branches of the circuit. In this described instance, in which the signal energy distribution is as shown in the graph of Fig. 3, the insertion-loss of the resonant circuits in the respective shaping networks as measured between the input and output of the shaping network is as shown by curves 106, 108, and 110, respectively, of Fig. 4. These curves show the insertion-loss of the resonant combination and are, therefore, inverted from the shapes that are imparted to the synthesizing wave components when they are passed through the shaping networks.
An inspection of these curves shows that they insert minimum loss at frequencies which correspond to the vocal resonance points in the signal wave of Fig. 3.
It will also be noted that the damping factors, while not exact replicas of corresponding damping factors in the original signal wave, closely resemble these factors. In this described embodiment, the damping of each resonant unit was observed to change from a specified value to a relatively much higher value as the inductance of winding 82 was changed to cause the resonant frequency of the parallel combination to change from its lowest to its highest operating condition. The exact amounts by which these damping factors changed have been previously described in connection with the circuit details of these shaping networks. At the same time that the formant-control currents are supplied to the shaping networks 70, 72, and 74, the intensity or energy-level control currents from the respective analyzer branches are supplied over the individual connecting pairs 30, 92, etc. to the variable amplification units 34, 96, 98, and 100.
As was previously stated, the resistance to alternating currents of the unilaterally-conducting devices 93, 95 in the vario-losser 32 decreases as the direct current that flows through these devices increases. Hence, as the energy in the analyzed signal subband increases, the transmission loss through the appropriate vario-losser 32 is decreased and the amplitude of the shaped synthesizing wave component is increased. After the synthesizing wave components from oscillator 62 have been selected in accordance with the desired frequency and shaped or damped by the respective shaping network to resemble a particular voice resonance or formant in the original signal wave, and the energy content of each such shaped formant has been adjusted in the respective variable amplifying circuit 34, 96, etc. to a value relative to the other formants that is substantially the same as the relative energy level of the corresponding original formant, the wave components from all branches are united or mixed in mixing amplifier 102. These combined wave components may be further amplified, if desired, before they actuate the sound reproducer 104 to construct a synthesized sound signal in which the frequency-intensity distribution is substantially the same as is shown in Fig. 3 for the original sound signal.
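The level control and mixing step can be sketched as follows. This is a minimal Python illustration, not the patented circuit: it assumes the alternating-current resistance of each unilaterally-conducting device is inversely proportional to the direct control current, and that the device works into a fixed load resistance. The constant `k`, the load value, and the function names are hypothetical.

```python
def vario_losser_gain(control_current, load_r=10_000.0, k=0.026):
    """Transmission factor (0..1) that rises with the direct control current.

    Assumed law: device resistance = k / control_current, in series with
    a fixed load resistance load_r. Zero or negative current blocks the path.
    """
    if control_current <= 0.0:
        return 0.0
    device_r = k / control_current
    return load_r / (load_r + device_r)


def mix_branches(branches, control_currents):
    """Scale each shaped branch by its own gain and sum them sample by sample.

    branches: list of equal-length sample lists, one per branch.
    control_currents: one energy-indicating current per branch.
    """
    gains = [vario_losser_gain(i) for i in control_currents]
    length = len(branches[0])
    return [
        sum(g * branch[t] for g, branch in zip(gains, branches))
        for t in range(length)
    ]


if __name__ == "__main__":
    # Two toy branches; the second carries no energy, so it is suppressed.
    print(mix_branches([[1.0, -1.0], [0.5, 0.5]], [1e-3, 0.0]))
```

A larger control current lowers the device resistance and so raises the transmitted amplitude, mirroring the statement that the transmission loss falls as the energy in the analyzed subband increases.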
The foregoing is a description of the operation of the system when a voiced sound signal is being analyzed and synthesized. If now the analyzed sound signal is changed to the unvoiced variety, the signal energy in the third or high-frequency formant will exceed that in the fundamental frequency band below 400 cycles per second and the detected energy in connecting pair 68 will exceed that in pair 30. This reversal of energy distribution will cause polar relay 66 to operate its armatures to its upper set of contacts and to connect the output of noise generator 64 to the inputs of the fundamental frequency and formant-control branches of the synthesizing apparatus (Fig. 2). Under these circumstances it is possible that the frequency counter M in the fundamental frequency or pitch control branch of the analyzer circuit (Fig. 1) may be intermittently actuated to produce a variable direct current in connecting circuit 60. This variable direct current will control oscillator 62 (Fig. 2) in the previously described manner, but because the polar relay has transferred its armature connections to the upper set of contacts, the spurious output from oscillator 62 is not connected to the synthesizing equipment. In all other respects the operation of the synthesizing apparatus is the same as in the case of the voiced type of sound signal.
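The source-selection rule performed by the polar relay can be expressed as a simple comparison. The Python sketch below is illustrative only: the sawtooth stands in for the discontinuous wave of the relaxation oscillator, the seeded uniform noise stands in for the noise generator, and the sample rate, function names, and the handling of exactly equal energies are assumptions not stated in the patent.

```python
import random


def select_source(low_band_energy, third_formant_energy):
    """Mimic the polar relay: the larger of the two energy currents wins.

    Equal energies are sent to the noise source here; that tie rule is an
    arbitrary choice made for this sketch.
    """
    return "buzz" if low_band_energy > third_formant_energy else "noise"


def buzz(pitch_hz, sample_rate, n):
    """Sawtooth wave, a stand-in for the relaxation oscillator output."""
    return [2.0 * ((t * pitch_hz / sample_rate) % 1.0) - 1.0 for t in range(n)]


def noise(n, seed=0):
    """Uniform random samples, a stand-in for the noise generator."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def excitation(low_band_energy, third_formant_energy, pitch_hz,
               sample_rate=8000, n=160):
    """Return the excitation samples fed to all synthesizer branches."""
    if select_source(low_band_energy, third_formant_energy) == "buzz":
        return buzz(pitch_hz, sample_rate, n)
    # In the unvoiced case the pitch value is ignored, just as the spurious
    # oscillator output is disconnected by the relay.
    return noise(n)


if __name__ == "__main__":
    print(select_source(2.0, 1.0), select_source(1.0, 2.0))
```

Because the pitch input is simply unused on the noise path, an erratic pitch reading during unvoiced sounds has no effect on the output, which corresponds to the relay disconnecting oscillator 62.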
From the foregoing description it is evident that the successful practice of this invention does not necessarily depend upon the use of three formant analyzing and reconstructing circuit branches as was described in the foregoing preferred embodiment of the invention. In some instances the intelligence of the message signal may be suitably transmitted through the use of only two such circuit branches. It should also be understood that the formant-shaping circuits 70, 72 and 74 need not employ the type of variable resonant circuit that has been described in this preferred embodiment. In this described embodiment the simple resonant shunt circuit has been utilized to facilitate the description, since it employs only one bias control circuit. If desirable, a variable band-pass type of filter may be used in this portion of the circuit without departing in any way from the scope of this invention. Similarly, a different type of frequency counting circuit from that which was described in the analyzer branches may be employed with efficiency equal to that of the described type. Therefore, it is to be expected that circuit variations which do not depart from the spirit and scope of the described invention will occur to those skilled in the related art.
What is claimed is:
1. In a signal analyzing and synthesizing system, an input for signal waves in which the energy is grouped in a plurality of formants, analyzing means productive from each formant of a first electrical quantity which varies in accordance with which wave component is of maximum amplitude in said formant and a second electrical quantity which varies in accordance with the amount of signal energy contained in said formant, a source of electric waves having a plurality of wave components, a sound reproducer responsive to said wave components, a plurality of variable transmission paths interconnecting said wave source and said reproducer, each of said paths including variable frequency-selective and damping means and variable volume-control means that limit the transmission of wave components therethrough, said variable frequency-selective and damping means comprising a resistor in series with said path and a parallel connected inductor-capacitor shunted across said path, the effective inductance and the resistance to alternating components of said inductor being variable in response to said first electric quantity, said resistor and inductor being so proportioned that their combined resistance to alternating wave components varies in a preassigned manner as the frequency of said components changes, and means responsive to said second electric quantity to control said variable volume-control means and regulate the energy content of the wave components transmitted through said path.
2. In a signal analyzing and synthesizing system, an input for signal waves in which the energy is grouped in a plurality of formants, analyzing means productive from each formant of an electrical quantity the magnitude of which varies in accordance with the varying frequency of the wave component of maximum amplitude in said formant, a source of electric waves having a plurality of wave components, a sound reproducer responsive to said wave components, a circuit of variable resonance frequency and variable damping between said source of electric waves and said reproducer for each formant, each of said variable resonance frequency circuits comprising a variable inductor, the effective inductance value and the resistance to alternating wave components of which is responsive to said produced electric quantity, and said variable damping means including a resistor connected to said variable inductor and being so proportioned relative to said inductor that their combined resistance to alternating wave components varies in a preassigned manner as the magnitude of said controlling electric quantity is varied.
3. In a system for the transmission of speech by indicia of speech characteristics, an analyzing station which comprises means for analyzing speech to provide indicia of the frequency and energy of the fundamental component of said speech and similar indicia of the frequency and energy of each of a plurality of higher frequency formants of said speech, means for transmitting said indicia to a synthesizing station and, at said synthesizing station, means for differentially combining the indicia of the speech fundamental energy and of the highest formant energy to provide a control signal, a source of harmonically related sound frequency oscillations having a fundamental frequency, a source of noise energy, a sound reproducer adapted to be energized from said sources, frequency selective means equal in number to the speech fundamental and formants connected intermediate said sources and said reproducer, amplifying means associated with each frequency selective means, means for selectively connecting said two sources alternatively to said frequency selective means under control of said control signal, means for varying each of said frequency selective means under control of one of said frequency indicia, and means for varying each of said amplifying means under control of one of said energy indicia.
JOHN C. STEINBERG.
References Cited in the file of this patent
UNITED STATES PATENTS
Riesz, Sept. 19, 1950
US133131A 1949-12-15 1949-12-15 Speech analyzing and synthesizing communication system Expired - Lifetime US2635146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US133131A US2635146A (en) 1949-12-15 1949-12-15 Speech analyzing and synthesizing communication system

Publications (1)

Publication Number Publication Date
US2635146A true US2635146A (en) 1953-04-14

Family

ID=22457154

Family Applications (1)

Application Number Title Priority Date Filing Date
US133131A Expired - Lifetime US2635146A (en) 1949-12-15 1949-12-15 Speech analyzing and synthesizing communication system

Country Status (1)

Country Link
US (1) US2635146A (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2810787A (en) * 1952-05-22 1957-10-22 Itt Compressed frequency communication system
US2817711A (en) * 1954-05-10 1957-12-24 Bell Telephone Labor Inc Band compression system
US2857465A (en) * 1955-11-21 1958-10-21 Bell Telephone Labor Inc Vocoder transmission system
US2866001A (en) * 1957-03-05 1958-12-23 Caldwell P Smith Automatic voice equalizer
US2890285A (en) * 1955-10-25 1959-06-09 Bell Telephone Labor Inc Narrow band transmission of speech
US2891111A (en) * 1957-04-12 1959-06-16 Flanagan James Loton Speech analysis
US2927969A (en) * 1954-10-20 1960-03-08 Bell Telephone Labor Inc Determination of pitch frequency of complex wave
US2928902A (en) * 1957-05-14 1960-03-15 Vilbig Friedrich Signal transmission
US2943152A (en) * 1957-11-07 1960-06-28 Joseph C R Licklider Audio pitch control
US3012102A (en) * 1957-04-01 1961-12-05 North Electric Co Automatic telephone system
US3094586A (en) * 1960-02-12 1963-06-18 Ibm Signal conversion circuits
US3102928A (en) * 1960-12-23 1963-09-03 Bell Telephone Labor Inc Vocoder excitation generator
US3171406A (en) * 1961-09-26 1965-03-02 Melpar Inc Heart beat frequency analyzer
US3176073A (en) * 1961-12-04 1965-03-30 Gen Dynamics Corp Buzz-hiss decision system for a channel vocoder
US3499986A (en) * 1966-09-28 1970-03-10 Philco Ford Corp Speech synthesizer
US3668294A (en) * 1969-07-16 1972-06-06 Tokyo Shibaura Electric Co Electronic synthesis of sounds employing fundamental and formant signal generating means
US4640134A (en) * 1984-04-04 1987-02-03 Bio-Dynamics Research & Development Corporation Apparatus and method for analyzing acoustical signals
US4783807A (en) * 1984-08-27 1988-11-08 John Marley System and method for sound recognition with feature selection synchronized to voice pitch

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2151091A (en) * 1935-10-30 1939-03-21 Bell Telephone Labor Inc Signal transmission
US2183248A (en) * 1939-12-12 Wave translation
US2243526A (en) * 1940-03-16 1941-05-27 Bell Telephone Labor Inc Production of artificial speech
US2243089A (en) * 1939-05-13 1941-05-27 Bell Telephone Labor Inc System for the artificial production of vocal or other sounds
US2243527A (en) * 1940-03-16 1941-05-27 Bell Telephone Labor Inc Production of artificial speech
US2458227A (en) * 1941-06-20 1949-01-04 Hartford Nat Bank & Trust Co Device for artificially generating speech sounds by electrical means
US2466880A (en) * 1946-12-17 1949-04-12 Bell Telephone Labor Inc Speech analysis and synthesis system
US2522539A (en) * 1948-07-02 1950-09-19 Bell Telephone Labor Inc Frequency control for synthesizing systems

Similar Documents

Publication Publication Date Title
US2635146A (en) Speech analyzing and synthesizing communication system
US2151091A (en) Signal transmission
Riesz Differential intensity sensitivity of the ear for pure tones
US3030450A (en) Band compression system
US2243527A (en) Production of artificial speech
US2181265A (en) Signaling system
US2458227A (en) Device for artificially generating speech sounds by electrical means
US2243526A (en) Production of artificial speech
US2570701A (en) Harmonic-selecting apparatus
US2315248A (en) Pseudo-extension of frequency bands
US2824906A (en) Transmission and reconstruction of artificial speech
US2315249A (en) Pseudo-extension of frequency bands
US3197560A (en) Frequency measuring system
US3668322A (en) Dynamic presence equalizer
US3787778A (en) Electrical filters enabling independent control of resonance of transisition frequency and of band-pass, especially for speech synthesizers
US2593694A (en) Wave analyzer for determining fundamental frequency of a complex wave
US3268660A (en) Synthesis of artificial speech
US2819341A (en) Transmission and reconstruction of artificial speech
US2287401A (en) All-frequency generator
US2522539A (en) Frequency control for synthesizing systems
US2235733A (en) Auditory masking method
US2406825A (en) Privacy system for speech transmission
US2593698A (en) Apparatus for determining pitch frequency in a complex wave
Colpitts Scientific research applied to the telephone transmitter and receiver
US2370385A (en) Method of acoustic measurement and apparatus therefor