
US5029509A - Musical synthesizer combining deterministic and stochastic waveforms - Google Patents

Musical synthesizer combining deterministic and stochastic waveforms

Publication number
US5029509A
Authority
US
United States
Prior art keywords
generating
sound
stochastic
sequence
waveform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US07/431,594
Inventor
Xavier Serra
Julius Smith
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Leland Stanford Junior University
Original Assignee
Leland Stanford Junior University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Leland Stanford Junior University
Priority to US07/431,594 (patent US5029509A)
Assigned to THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY (assignment of assignors interest). Assignors: SERRA, XAVIER
Assigned to THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY (assignment of assignors interest). Assignors: SMITH, JULIUS
Priority to PCT/US1990/002200 (patent WO1990013887A1)
Priority to AU55328/90A (patent AU5532890A)
Application granted
Publication of US5029509A
Anticipated expiration
Expired - Lifetime

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 - Details of electrophonic musical instruments
    • G10H1/02 - Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos
    • G10H1/06 - Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour
    • G10H1/12 - Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour by filtering complex waveforms
    • G10H1/125 - Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour by filtering complex waveforms using a digital filter
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H7/00 - Instruments in which the tones are synthesised from a data store, e.g. computer organs
    • G10H7/08 - Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform
    • G10H7/10 - Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform using coefficients or parameters stored in a memory, e.g. Fourier coefficients
    • G10H7/105 - Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform using coefficients or parameters stored in a memory, e.g. Fourier coefficients using Fourier coefficients
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 - Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/025 - Envelope processing of music signals in, e.g. time domain, transform domain or cepstrum domain
    • G10H2250/031 - Spectrum envelope processing
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 - Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/055 - Filters for musical processing or musical effects; Filter responses, filter architecture, filter coefficients or control parameters therefor
    • G10H2250/061 - Allpass filters
    • G10H2250/065 - Lattice filter, Zobel network, constant resistance filter or X-section filter, i.e. balanced symmetric all-pass bridge network filter exhibiting constant impedance over frequency
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 - Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/131 - Mathematical functions for musical analysis, processing, synthesis or composition
    • G10H2250/211 - Random number generators, pseudorandom generators, classes of functions therefor
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 - Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/131 - Mathematical functions for musical analysis, processing, synthesis or composition
    • G10H2250/215 - Transforms, i.e. mathematical transforms into domains appropriate for musical signal processing, coding or compression
    • G10H2250/235 - Fourier transform; Discrete Fourier Transform [DFT]; Fast Fourier Transform [FFT]
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 - Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/131 - Mathematical functions for musical analysis, processing, synthesis or composition
    • G10H2250/261 - Window, i.e. apodization function or tapering function amounting to the selection and appropriate weighting of a group of samples in a digital signal within some chosen time interval, outside of which it is zero valued
    • G10H2250/291 - Kaiser windows; Kaiser-Bessel Derived [KBD] windows, e.g. for MDCT
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 - TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S - TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S84/00 - Music
    • Y10S84/09 - Filtering

Definitions

  • the present invention relates generally to musical synthesizers and particularly to methods and systems for analyzing sound signals and for synthesizing new sound signals.
  • a shortcoming of prior art musical synthesizers is that such synthesizers generally try to use a single model to represent all musical sounds. It is very difficult to get a single model to faithfully represent the wide range of musical sounds. It is also important to provide a model for representing sounds which makes it possible and practical to reproduce and transform the sounds generated by the synthesizer.
  • the present invention uses a model with two very different types of elements to represent two different aspects of musical sounds.
  • the present invention is a musical sound analyzer and synthesizer which is based on a model that considers a sound to be composed of two types of elements: a deterministic component plus a stochastic component.
  • the deterministic component is represented as a series of sinusoids, with an amplitude and a frequency function for each sinusoid.
  • the stochastic component is represented as a series of magnitude spectral envelopes. From this representation sounds can be synthesized that, in the absence of modifications, can behave as perceptual identities, that is, they are perceptually equal to the original sound.
  • stored representations of sounds can be easily modified in a musical synthesizer to create a wide variety of new sounds.
  • FIG. 1 is a block diagram of a musical sound analyzer in accordance with the present invention.
  • FIG. 2 is a block diagram of a musical sound synthesizer in accordance with the present invention.
  • FIG. 3 is a block diagram of a second preferred embodiment of a musical sound analyzer in accordance with the present invention.
  • FIG. 4 is a block diagram of a second preferred embodiment of a musical sound synthesizer in accordance with the present invention.
  • FIG. 5 is a block diagram of a third preferred embodiment of a musical sound synthesizer in accordance with the present invention.
  • the present invention's analysis and synthesis technique is based on the short-time Fourier transform (STFT), from which the relevant magnitude peaks are detected and assigned to a number of frequency trajectories.
  • STFT short-time Fourier transform
  • the deterministic component is obtained from these trajectories with an additive synthesis technique. More specifically, the deterministic component is a set of sound partials which represent the deterministic component of a limited time sample of the waveform being analyzed.
  • the spectra of the deterministic component are subtracted from the spectra of the original waveform.
  • the result is a set of residual spectra which, in turn, can be approximated by a series of amplitude envelopes. These envelopes represent the stochastic component.
  • the stochastic component is synthesized by multiplying the spectrum of white noise with these frequency envelopes and performing an inverse-STFT.
  • the model used by the present invention assumes that the input sound s(t) is the sum of a series of sinusoids plus a noise signal e(t): $$s(t) = \sum_{r=1}^{R} A_r(t)\cos[\theta_r(t)] + e(t) \qquad (1)$$ where A_r(t) and θ_r(t) are the instantaneous amplitude and phase of each sinusoid and e(t) is the noise signal. R is the number of sinusoids used in the series to represent the sound.
  • the model used in the present invention also assumes that the sinusoids are stable partials of the sound s(t) and that each one can be characterized by its amplitude and frequency.
  • the instantaneous phase is then taken to be the integral of the instantaneous frequency ω_r(t), and therefore satisfies $$\theta_r(t) = \int_0^t \omega_r(\tau)\,d\tau$$ where ω(t) is the frequency in radians, and r is the sinusoid number.
  • The residual e(t) in Equation 1 is also simplified by assuming it is a stochastic signal. Such an assumption allows us to model the residual as filtered white noise: $$e(t) = \int_0^t h(t-\tau)\,u(\tau)\,d\tau$$ where u(t) is white noise and h(t) is the impulse response of a slowly time varying filter. That is, the residual is modeled by the convolution of white noise with a frequency shaping filter.
  • the analysis, transformation and synthesis techniques of the present invention are based on the above model which combines deterministic and stochastic elements for representing sounds.
  • FIG. 1 shows a sound analyzer 100 in accordance with the present invention.
  • the first step in analyzing a sound signal is to break it into a series of time frames, sometimes called windows.
  • a clock generator 102 generates a sequence of window signals which are used by gate 104 to divide the sound waveform into separate time frames.
  • the time frames are analyzed by a fast Fourier Transformer (FFT) so as to generate a set of complex spectra values.
  • FFT fast Fourier Transformer
  • the FFT 106 uses the short-time Fourier Transform because this technique uses relatively short time frames (e.g. 50 milliseconds per time frame).
  • a "Kaiser window" is used to smooth the outer edges of each time frame.
  • the length (i.e., duration) of the windows depends on the lowest frequency ω_r(t) that is being tracked.
  • the window has a duration of at least four or five cycles of the lowest frequency that is to be tracked, in order to accommodate the time-frequency trade-off associated with the STFT.
  • the size of the sample buffer used by the STFT should be at least double the size of the window (i.e., double the number of samples collected during each window) because a big "zero-padding" in the buffer improves the performance of the technique.
  • a complex to real number converter 108 converts the complex spectra generated by the FFT 106 into a set of magnitude spectra for each time frame.
  • a peak detector and sound partial analyzer 110 finds the highest peaks in the magnitude spectra and performs a parabolic interpolation to refine the frequency and amplitude values generated. Each identified peak has a frequency and a magnitude value. The peaks from a series of time frames are then organized into pairs of frequency and magnitude trajectories, each pair of which represents a sound partial. Thus the analyzer 110 extracts the stable sinusoids present in the original sound (the deterministic component). The frequency and magnitude trajectories are typically stored for use in a music synthesizer, as will be described below.
  • the stochastic part of the waveform is generated as follows. First, the deterministic component of the original waveform is regenerated from the frequency and magnitude trajectories by reversing the process that was used to generate them. In particular, a sinewave generator 120 converts the frequency and magnitude trajectories into a "deterministic waveform".
  • the deterministic waveform is then gated by gate 122 with the window signals from clock generator 102.
  • the Fourier Transform of the deterministic waveform is then generated by a fast Fourier Transform 124 using the same STFT technique as was used to analyze the original waveform.
  • the FFT 124 generates a set of complex spectra, which are converted into magnitude spectra by a complex to real number converter 126.
  • the magnitude spectrum of the deterministic signal is then subtracted from the magnitude spectrum of the original waveform by subtractor 128, yielding a residual spectrum.
  • an envelope generator 130 generates a line segment approximation 132 of the residual signal's spectral envelope, i.e., the envelope of the residual power spectrum output by the magnitude spectra subtractor 128. These envelopes represent the stochastic signal portion of the original waveform.
  • FIG. 2 shows a sound synthesizer 200 in accordance with the present invention.
  • Various sets of sound signals as represented by the sound analyzer shown in FIG. 1, are stored in memories 202 and 204.
  • Memory 202 stores pairs of magnitude and frequency trajectories, each pair representing a sound partial.
  • Memory 204 stores residual spectral envelopes corresponding to the magnitude and frequency trajectories in memory 202.
  • these memories 202 and 204 each store a series of values for producing sound signals in a corresponding series of time frames.
  • memory 202 which govern the deterministic waveform to be generated
  • spectral envelope i.e., a set of frequency and magnitude values
  • the deterministic or sinusoidal component of the synthesized sound is generated using selected ones of the magnitude and frequency trajectories stored in memory 202.
  • the trajectories may be transformed or manipulated by a frequency trajectory transformer 206 and a magnitude trajectory transformer 208.
  • These transformers 206 and 208 may stretch a trajectory in time, perform linear or even nonlinear transformations, or may add, subtract and weight various partials from the database of partials in the memory 202.
  • the transformers 206 and 208 alter the acoustic qualities of the deterministic waveform generated by the synthesizer 200, and thereby add to the range and quality of sounds that can be generated.
  • each trajectory output by the transformers 206 and 208 is converted into a sine wave by one of a set of sine wave generators 210.
  • sine wave generators are provided so that several partials can be generated simultaneously. These sine waves are combined by sine wave adder 212, resulting in the generation of the deterministic portion of the synthesized waveform.
  • the stochastic part of the synthesized sound is generated by creating complex spectra out of the spectral envelope of the magnitude spectra residual, or its modification, and doing an inverse STFT.
  • the stored spectral envelopes in memory 204 may be transformed by a spectral envelope transformer 220.
  • the resulting envelope becomes the magnitude portion of the stochastic signal.
  • the transformer 220 alters the acoustic qualities of the stochastic waveform generated by the synthesizer 200, and thereby adds to the range and quality of sounds that can be generated.
  • the STFT of a windowed white noise signal is computed using a noise generator 222, signal gate 224 for windowing or gating the noise signal, and an FFT 226.
  • a phase generator converts the complex spectra output by the FFT into phase spectra values. These phase spectra and the magnitude values representing the spectral envelope are expressed in polar coordinates (i.e., real values).
  • the polar coordinate values are converted into complex spectra by a polar-to-rectangular coordinate converter 230.
  • the resulting complex spectra are then inverse Fourier transformed by an inverse-FFT 232 to generate the stochastic waveform.
  • the process of generating the stochastic waveform corresponds to the filtering of white noise by a filter with a frequency response equal to the spectral envelope.
  • the stochastic signal circuitry 222-232 is essentially a white noise filter.
  • the stochastic and deterministic waveforms are added by adder 240 to generate the complete synthesized waveform.
  • By proper selection of input trajectories and transformations, one can generate a very wide range of sounds using the synthesizer 200.
  • FIG. 3 shows a second and somewhat more complicated signal analyzer 300 than the one shown in FIG. 1.
  • the signal model used by this second analyzer assumes that the input sound s(t) is the sum of a series of sinusoids plus a noise signal e(t): $$s(t) = \sum_{r=1}^{R} A_r(t)\cos[\theta_r(t)] + e(t)$$ where R is the number of sinusoids used to represent the deterministic portion of the sound, A_r(t) is the instantaneous amplitude and θ_r(t) is the instantaneous phase of each sinusoid.
  • the residual signal e(t) is the difference between the signal and the sinusoidal or deterministic part.
  • the instantaneous phase is defined by $$\theta_r(t) = \int_0^t \omega_r(\tau)\,d\tau + \theta_r(0) + \phi_r$$ where ω(t) is the frequency in radians, r is the sinusoid number, θ_r(0) is the initial phase value, and φ_r is a fixed phase offset.
  • a clock generator 302 generates a sequence of window signals which are used by gate 304 to divide the sound waveform into separate time frames.
  • the time frames are analyzed by a fast Fourier Transformer (FFT) so as to generate a set of complex spectra values.
  • FFT 306 uses the short-time Fourier Transform, as described above with reference to FIG. 1.
  • a rectangular to polar coordinate converter 308 converts the complex spectra generated by the FFT 306 into a set of magnitude spectra for each time frame. Then a peak detector and sound partial analyzer 310 finds the highest peaks in the magnitude spectra and performs a parabolic interpolation to refine the frequency and amplitude values generated. Each identified peak has a frequency, phase and a magnitude value. The peaks from a series of time frames are then organized into sets of frequency, phase and magnitude trajectories, each set of which represents a sound partial. Thus the analyzer 310 extracts the stable sinusoids present in the original sound (the deterministic component). The frequency, phase and magnitude trajectories may be stored for use in a music synthesizer, as described above.
  • the deterministic portion of the sound signal is regenerated by using a phase interpolator 312 to generate the instantaneous phase of the regenerated deterministic signal, and a linear interpolator 314 to generate the instantaneous magnitude of the regenerated deterministic signal.
  • the instantaneous phase signal is used to control the shape of a sinusoidal signal generated by a sine wave generator 316, and then a multiplier 318 amplifies the resulting sine wave to match the amplitude indicated by the instantaneous amplitude output by interpolator 314.
  • This waveform generation process is performed on several sound partials simultaneously by a corresponding number of interpolators 312-314, sine wave generators 316, and multipliers 318. These sound partials are combined by sine wave adder 320 to generate the deterministic element of the input waveform.
  • the deterministic signal is subtracted from the input waveform by subtractor 330 to generate a residual signal on line 332.
  • the residual signal may be modeled as a stochastic signal using the same technique as in the first signal analyzer: by performing an STFT on the residual signal, computing the magnitude spectra, and then generating an envelope approximation of the magnitude spectra.
  • FIG. 4 shows a second and somewhat simpler sound synthesizer 400 than the one shown in FIG. 2.
  • synthesizer 400 uses the same apparatus for generating the deterministic portion of the synthesized sound as shown in FIG. 2; only the stochastic waveform circuitry has been changed from that shown in FIG. 2.
  • the noise generator circuitry 222-228 in FIG. 2 is replaced with a simple random number generator 402 that produces a set of phase values between -π and π.
  • the random number generator 402 provides a set of values θ(k), each of which is equal to a randomly selected number between -π and π, where the number of data points for each time frame corresponds to the number of input values needed by the inverse FFT 232.
  • the spectral envelope transformer 220 provides a set of interpolated values A(k) which represent the interpolated magnitudes of the spectral envelope at each of the data points (i.e., frequency points) needed by the inverse FFT 232.
  • interpolated values are calculated from the stored spectral envelope obtained from memory 204. Note that frequency magnitudes in the stored spectral envelope from memory 204 may not correspond exactly to the data points needed by the inverse FFT 232, requiring the calculation of interpolated values for those data points.
  • the random number generator 402 and the transformer 220 provide a set of values {A(1), θ(1)}, {A(2), θ(2)}, ..., {A(n), θ(n)}, where n is the number of data points needed by the inverse FFT 232.
  • the values for each time frame are converted from polar coordinates to rectangular coordinates by converter 230, because the inverse FFT 232 requires complex data values as its input values.
  • the resulting complex spectra are converted into a sequence of sampled data values by an inverse FFT 232.
  • These sampled data values are the time domain signal that represents the stochastic part of the synthesized signal for one time frame.
  • the data samples generated by the inverse FFT 232 are windowed by a windowing buffer 404.
  • This windowing buffer 404 typically overlaps and mathematically adds data samples from neighboring windows (i.e., time frames) with appropriate weighting factors.
  • the time domain data samples for each time frame could be used for four time frames, with the values output by the windowing buffer 404 being equal to one fourth of the data sample values from the current time frame, plus one fourth of the data sample values from each of the previous three time frames.
  • the weighting factors could correspond to a Gaussian or a Hanning window.
  • the resulting data values output by the windowing buffer 404 comprise a stochastic waveform that is combined with the deterministic waveform to form a synthesized waveform.
  • the noise synthesis system and method shown in FIG. 4 is very flexible in terms of being able to manipulate the shape of the stochastic waveform and is easier to implement in a real time system than the synthesizer of FIG. 2 because the FFT 226 in FIG. 2 has been eliminated.
  • FIG. 5 shows a third and even simpler sound-synthesizer 500 than the ones shown in FIGS. 2 and 4.
  • the spectral envelopes for the residual signals were effectively represented by a line segment approximation of the spectral envelope. This is because the spectral envelopes were represented by a set of magnitude values for a number of discrete frequency values. In a typical implementation of the synthesizer in FIG. 4, a set of perhaps fifteen values would be stored to represent the magnitude of the spectral envelope at fifteen frequencies. The remainder of the spectral envelope is formed or computed by linearly interpolating between the stored values.
  • the spectral envelope is represented using a LPC (linear predictive coding) model instead of a set of magnitude values.
  • LPC linear predictive coding
  • any spectral envelope can be approximated or represented by a set of LPC coefficients.
  • any set of LPC coefficients which correspond to an all-pole filter (also known as an IIR or infinite impulse response filter), can be converted into lattice filter coefficients using well known conversion algorithms. See, for example, Markel, J. D. and Gray, A. H. Linear Prediction of Speech, Springer-Verlag, New York (1976), which is hereby incorporated by reference.
  • memory 502 stores the spectral envelopes for each of a series of time frames in the form of lattice filter coefficients (shown as k1 through kp in FIG. 5).
  • Transformer 504 performs a windowing type of function by interpolating the lattice coefficient values between time frames so as to provide smooth transitions over time.
  • the resulting lattice coefficients are loaded into a lattice filter 506.
  • the lattice filter 506 filters white noise generated by a noise generator 508 and outputs the stochastic waveform that is combined with the deterministic waveform to form a synthesized waveform.
  • This embodiment of the present invention has the advantage of requiring less data storage than the other embodiments, and also substitutes a lattice filter for the inverse FFT in those embodiments, all of which makes this embodiment less expensive and simpler to implement than the other embodiments.
  • the primary tradeoff is that this embodiment is less flexible in terms of its ability to manipulate the stored spectral envelopes for generating a modified stochastic waveform.
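
The FIG. 5 arrangement, in which white noise is passed through a lattice filter loaded with stored coefficients, can be sketched as follows. This is a minimal illustration, not the patented circuit: the function name `lattice_allpole`, the sign convention for the reflection coefficients, and the example coefficient values are all assumptions. Sign conventions for lattice coefficients differ between references, so coefficients produced by another tool may need their signs flipped.

```python
import numpy as np

def lattice_allpole(x, k):
    """Filter x through an all-pole lattice with reflection coefficients k[0..p-1].

    Assumed convention: the matching analysis lattice is
        f_m[n] = f_{m-1}[n] + k_m * b_{m-1}[n-1]
        b_m[n] = k_m * f_{m-1}[n] + b_{m-1}[n-1]
    and this function runs the inverse (synthesis) recursion.
    The filter is stable when every |k_m| < 1.
    """
    k = np.asarray(k, dtype=float)
    p = len(k)
    b_prev = np.zeros(p + 1)          # backward signals from the previous sample
    y = np.empty(len(x))
    for n, sample in enumerate(x):
        f = float(sample)             # forward signal entering the top stage
        b_new = np.empty(p + 1)
        for m in range(p, 0, -1):     # walk from stage p down to stage 1
            f = f - k[m - 1] * b_prev[m - 1]
            b_new[m] = k[m - 1] * f + b_prev[m - 1]
        b_new[0] = f                  # stage 0: forward and backward signals coincide
        y[n] = f
        b_prev = b_new
    return y

# Usage: shape one second of white noise with two hypothetical coefficients.
rng = np.random.default_rng(0)
noise = rng.standard_normal(8000)
stochastic = lattice_allpole(noise, [0.5, 0.25])
```

Smooth transitions between time frames, as attributed to transformer 504, could be approximated by linearly interpolating the coefficient vector between frames before each call; that step is left out of the sketch.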

Landscapes

  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

A musical sound analyzer and synthesizer uses a model that considers a sound to be composed of two types of elements: a deterministic component plus a stochastic component. The deterministic component is represented as a series of sinusoids, with an amplitude and a frequency function for each sinusoid. The stochastic component is represented as a series of magnitude spectral envelopes. From this representation, sounds can be synthesized that, in the absence of modifications, can behave as perceptual identities, that is, they are perceptually equal to the original sound. In addition, stored representations of sounds can be easily modified in a musical synthesizer to create a wide variety of new sounds.

Description

This application is a continuation in part of application Ser. No. 07/350,114, filed May 10, 1989 and now abandoned.
The present invention relates generally to musical synthesizers and particularly to methods and systems for analyzing sound signals and for synthesizing new sound signals.
BACKGROUND OF THE INVENTION
A shortcoming of prior art musical synthesizers is that such synthesizers generally try to use a single model to represent all musical sounds. It is very difficult to get a single model to faithfully represent the wide range of musical sounds. It is also important to provide a model for representing sounds which makes it possible and practical to reproduce and transform the sounds generated by the synthesizer. The present invention uses a model with two very different types of elements to represent two different aspects of musical sounds.
SUMMARY OF THE INVENTION
In summary, the present invention is a musical sound analyzer and synthesizer which is based on a model that considers a sound to be composed of two types of elements: a deterministic component plus a stochastic component. The deterministic component is represented as a series of sinusoids, with an amplitude and a frequency function for each sinusoid. The stochastic component is represented as a series of magnitude spectral envelopes. From this representation sounds can be synthesized that, in the absence of modifications, can behave as perceptual identities, that is, they are perceptually equal to the original sound. In addition, stored representations of sounds can be easily modified in a musical synthesizer to create a wide variety of new sounds.
BRIEF DESCRIPTION OF THE DRAWINGS
Additional objects and features of the invention will be more readily apparent from the following detailed description and appended claims when taken in conjunction with the drawings, in which:
FIG. 1 is a block diagram of a musical sound analyzer in accordance with the present invention.
FIG. 2 is a block diagram of a musical sound synthesizer in accordance with the present invention.
FIG. 3 is a block diagram of a second preferred embodiment of a musical sound analyzer in accordance with the present invention.
FIG. 4 is a block diagram of a second preferred embodiment of a musical sound synthesizer in accordance with the present invention.
FIG. 5 is a block diagram of a third preferred embodiment of a musical sound synthesizer in accordance with the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention's analysis and synthesis technique is based on the short-time Fourier transform (STFT), from which the relevant magnitude peaks are detected and assigned to a number of frequency trajectories. The deterministic component is obtained from these trajectories with an additive synthesis technique. More specifically, the deterministic component is a set of sound partials which represent the deterministic component of a limited time sample of the waveform being analyzed.
Then, in order to obtain the stochastic component, the spectra of the deterministic component are subtracted from the spectra of the original waveform. The result is a set of residual spectra which, in turn, can be approximated by a series of spectral envelopes. These envelopes represent the stochastic component. When synthesizing new sounds, the stochastic component is synthesized by multiplying the spectrum of white noise with these spectral envelopes and performing an inverse-STFT.
The model used by the present invention assumes that the input sound s(t) is the sum of a series of sinusoids plus a noise signal e(t):

$$s(t) = \sum_{r=1}^{R} A_r(t) \cos[\theta_r(t)] + e(t) \qquad (1)$$

where $A_r(t)$ and $\theta_r(t)$ are the instantaneous amplitude and phase of each sinusoid and e(t) is the noise signal. R is the number of sinusoids used in the series to represent the sound.
The model used in the present invention also assumes that the sinusoids are stable partials of the sound s(t) and that each one can be characterized by its amplitude and frequency. The instantaneous phase is then taken to be the integral of the instantaneous frequency $\omega_r(t)$, and therefore satisfies

$$\theta_r(t) = \int_0^t \omega_r(\tau)\, d\tau \qquad (2)$$

where $\omega_r(t)$ is the frequency in radians per second, and r is the sinusoid number.
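The relationship between instantaneous frequency and phase can be illustrated in discrete time, where the integral becomes a running sum over samples. The following is a minimal sketch only; the function name `phase_from_frequency`, the 8 kHz sample rate and the 440 Hz test partial are assumptions chosen for illustration and do not appear in the specification.

```python
import numpy as np

def phase_from_frequency(omega, sample_rate):
    """Approximate theta(t) = integral_0^t omega(tau) dtau by a running sum.

    omega holds the instantaneous frequency in radians per second,
    one value per sample.
    """
    omega = np.asarray(omega, dtype=float)
    # Left Riemann sum: the phase at sample n accumulates omega over samples 0..n-1.
    return np.concatenate(([0.0], np.cumsum(omega[:-1]))) / sample_rate

sample_rate = 8000                                   # assumed sample rate
omega = np.full(sample_rate, 2 * np.pi * 440.0)      # constant 440 Hz for one second
theta = phase_from_frequency(omega, sample_rate)
partial = 0.5 * np.cos(theta)                        # A_r(t) cos(theta_r(t)) with A_r = 0.5
```

Because the phase is accumulated rather than recomputed from the frequency at each sample, a frequency trajectory that changes over time still yields a continuous sinusoid.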
The residual e(t) in Equation 1 is also simplified by assuming it is a stochastic signal. Such an assumption allows us to model the residual as filtered white noise:

$$e(t) = \int_0^t h(t, \tau)\, u(\tau)\, d\tau \qquad (3)$$

where u(t) is white noise and $h(t, \tau)$ is the impulse response of a slowly time-varying filter. That is, the residual is modeled by the convolution of white noise with a frequency shaping filter.
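As an illustration of modeling the residual as filtered white noise, the sketch below convolves a white noise sequence with a fixed impulse response. It is a simplification: the filter here is time-invariant, whereas the model allows the filter to vary slowly over time, and the Hann-shaped low-pass impulse response is a hypothetical choice made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(4096)         # white noise u(t)

# Hypothetical frequency shaping filter: a short low-pass impulse response.
h = np.hanning(33)
h /= h.sum()

# Residual model: convolution of the white noise with the filter.
e = np.convolve(u, h, mode="same")
```

The spectrum of `e` follows the magnitude response of `h`, which is the sense in which the filter shapes the noise.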
The analysis, transformation and synthesis techniques of the present invention are based on the above model which combines deterministic and stochastic elements for representing sounds.
FIG. 1 shows a sound analyzer 100 in accordance with the present invention. The first step in analyzing a sound signal is to break it into a series of time frames, sometimes called windows. In particular, a clock generator 102 generates a sequence of window signals which are used by gate 104 to divide the sound waveform into separate time frames. The time frames are analyzed by a fast Fourier Transformer (FFT) 106 so as to generate a set of complex spectra values. The FFT 106 uses the short-time Fourier Transform because this technique uses relatively short time frames (e.g., 50 milliseconds per time frame).
When computing the Fourier Transform, a "Kaiser window" is used to smooth the outer edges of each time frame. The length (i.e., duration) of the windows depends on the lowest frequency $\omega_r(t)$ that is being tracked. In particular, the window has a duration of at least four or five cycles of the lowest frequency that is to be tracked, in order to accommodate the time-frequency trade-off associated with the STFT. Furthermore, the size of the sample buffer used by the STFT should be at least double the size of the window (i.e., double the number of samples collected during each window), because substantial zero-padding of the buffer improves the performance of the technique.
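The window sizing rules above can be worked through numerically. The sketch below is illustrative only: the function name `analysis_frame`, the Kaiser shape parameter `beta = 8.0`, the 44.1 kHz sample rate, and the rounding of the buffer up to a power of two are all assumptions, since the specification states only the minimum number of cycles and the minimum buffer size.

```python
import numpy as np

def analysis_frame(lowest_hz, sample_rate, cycles=4, beta=8.0):
    """Return a Kaiser window and an FFT buffer size for one time frame."""
    # Window long enough to span `cycles` periods of the lowest tracked frequency.
    window_len = int(np.ceil(cycles * sample_rate / lowest_hz))
    # Buffer at least double the window; rounded up to a power of two (assumption).
    fft_size = 1 << (2 * window_len - 1).bit_length()
    return np.kaiser(window_len, beta), fft_size

window, fft_size = analysis_frame(lowest_hz=100.0, sample_rate=44100)
print(len(window), fft_size)   # 1764 4096
```

For a lowest tracked frequency of 100 Hz, four cycles last 40 milliseconds, which is consistent with the time frame lengths of roughly 50 milliseconds mentioned in connection with FIG. 1.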
A complex to real number converter 108 converts the complex spectra generated by the FFT 106 into a set of magnitude spectra for each time frame.
A peak detector and sound partial analyzer 110 finds the highest peaks in the magnitude spectra and performs a parabolic interpolation to refine the frequency and amplitude values generated. Each identified peak has a frequency and a magnitude value. The peaks from a series of time frames are then organized into pairs of frequency and magnitude trajectories, each pair of which represents a sound partial. Thus the analyzer 110 extracts the stable sinusoids present in the original sound (the deterministic component). The frequency and magnitude trajectories are typically stored for use in a music synthesizer, as will be described below.
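One common way to carry out the peak refinement described above is to fit a parabola through each local maximum of the magnitude spectrum and its two neighbors. The sketch below assumes the magnitude spectrum is expressed in decibels and uses a Hann window for the test signal; the function name `find_peaks_parabolic` and these choices are illustrative assumptions, not details taken from the specification.

```python
import numpy as np

def find_peaks_parabolic(mag_db, sample_rate, fft_size, max_peaks=10):
    """Return (freq_hz, height_db) of the highest local maxima, refined by a parabola."""
    m = np.asarray(mag_db, dtype=float)
    # Interior bins that are strictly higher than both neighbors.
    k = np.where((m[1:-1] > m[:-2]) & (m[1:-1] > m[2:]))[0] + 1
    # Keep only the highest peaks, highest first.
    k = k[np.argsort(m[k])[::-1][:max_peaks]]
    a, b, c = m[k - 1], m[k], m[k + 1]
    # Vertex of the parabola through (-1, a), (0, b), (1, c), in fractional bins.
    offset = 0.5 * (a - c) / (a - 2 * b + c)
    freq = (k + offset) * sample_rate / fft_size
    height = b - 0.25 * (a - c) * offset
    return freq, height

sample_rate, fft_size, frame_len = 8000, 2048, 512
n = np.arange(frame_len)
frame = np.cos(2 * np.pi * 1003.0 * n / sample_rate) * np.hanning(frame_len)
mag_db = 20 * np.log10(np.abs(np.fft.rfft(frame, fft_size)) + 1e-12)
freq, height = find_peaks_parabolic(mag_db, sample_rate, fft_size)
```

The first entry of `freq` should lie close to the 1003 Hz test tone even though that frequency falls between FFT bins. Linking the peaks of successive frames into trajectories is a separate step that this sketch does not attempt.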
The stochastic part of the waveform is generated as follows. First, the deterministic component of the original waveform is regenerated from the frequency and magnitude trajectories by reversing the process that was used to generate them. In particular, a sinewave generator 120 converts the frequency and magnitude trajectories into a "deterministic waveform".
The deterministic waveform is then gated by gate 122 with the window signals from clock generator 102. The Fourier Transform of the deterministic waveform is then generated by a fast Fourier Transformer 124 using the same STFT technique as was used to analyze the original waveform. Thus the FFT 124 generates a set of complex spectra, which are converted into magnitude spectra by a complex to real number converter 126. The magnitude spectrum of the deterministic signal is then subtracted from the magnitude spectrum of the original waveform by subtractor 128, yielding a residual spectrum.
Finally, an envelope generator 130 generates a line segment approximation 132 of the residual signal's spectral envelope--i.e., the envelope of the residual power spectrum output by the magnitude spectra subtractor 128. These envelopes represent the stochastic signal portion of the original waveform.
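The subtraction and envelope approximation for one time frame might be sketched as follows. The specification does not say how the line segment breakpoints are chosen, so the approach here (taking the maximum of the residual in each of fifteen equal-width bands, and clamping negative differences to zero) is purely an assumption, as are the function names.

```python
import numpy as np

def residual_envelope(orig_mag, det_mag, n_points=15):
    """Subtract magnitude spectra and reduce the residual to a few breakpoints."""
    # Clamp at zero so the residual magnitude is never negative (assumption).
    residual = np.maximum(np.asarray(orig_mag) - np.asarray(det_mag), 0.0)
    edges = np.linspace(0, len(residual), n_points + 1).astype(int)
    centers = 0.5 * (edges[:-1] + edges[1:] - 1)           # center bin of each band
    heights = np.array([residual[lo:hi].max()
                        for lo, hi in zip(edges[:-1], edges[1:])])
    return centers, heights

def envelope_at_bins(centers, heights, n_bins):
    """Rebuild the line segment approximation at every frequency bin."""
    return np.interp(np.arange(n_bins), centers, heights)

rng = np.random.default_rng(1)
det = np.zeros(513)
det[100] = 5.0                                   # one strong partial
orig = det + rng.uniform(0.1, 1.0, 513)          # partial plus a noise floor
centers, heights = residual_envelope(orig, det)
env = envelope_at_bins(centers, heights, 513)
```

Only the breakpoints `centers` and `heights` would need to be stored for each time frame; the full envelope is recovered by linear interpolation when needed.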
FIG. 2 shows a sound synthesizer 200 in accordance with the present invention. Various sets of sound signals, as represented by the sound analyzer shown in FIG. 1, are stored in memories 202 and 204. Memory 202 stores pairs of magnitude and frequency trajectories, each pair representing a sound partial. Memory 204 stores residual spectral envelopes corresponding to the magnitude and frequency trajectories in memory 202.
More particularly, these memories 202 and 204 each store a series of values for producing sound signals in a corresponding series of time frames. Thus for each separate time frame there is a set of frequency and magnitude values stored in memory 202 which govern the deterministic waveform to be generated, and a spectral envelope (i.e., a set of frequency and magnitude values) stored in memory 204 which governs the stochastic waveform to be generated.
The deterministic or sinusoidal component of the synthesized sound is generated using selected ones of the magnitude and frequency trajectories stored in memory 202. The trajectories may be transformed or manipulated by a frequency trajectory transformer 206 and a magnitude trajectory transformer 208. These transformers 206 and 208 may stretch a trajectory in time, perform linear or even nonlinear transformations, or may add, subtract and weight various partials from the database of partials in the memory 202. The transformers 206 and 208 alter the acoustic qualities of the deterministic waveform generated by the synthesizer 200, and thereby add to the range and quality of sounds that can be generated.
Of course, the original trajectories may be used untransformed. Each trajectory output by the transformers 206 and 208 is converted into a sine wave by one of a set of sine wave generators 210. Several sine wave generators are provided so that several partials can be generated simultaneously. These sine waves are combined by sine wave adder 212, resulting in the generation of the deterministic portion of the synthesized waveform.
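The deterministic path (trajectories, optional transformation, a bank of sine wave generators, and an adder) can be sketched as below. This is an illustration under assumptions: trajectories are stored as one frequency and one magnitude value per frame and per partial, values between frames are linearly interpolated, and the function name `synthesize_partials`, the hop size and the sample rate are hypothetical.

```python
import numpy as np

def synthesize_partials(freqs_hz, mags, hop, sample_rate):
    """Sum one sinusoid per partial; inputs have shape (n_frames, n_partials)."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    mags = np.asarray(mags, dtype=float)
    n_frames, n_partials = freqs_hz.shape
    frame_pos = np.arange(n_frames) * hop
    n = np.arange((n_frames - 1) * hop + 1)
    out = np.zeros(len(n))
    for r in range(n_partials):
        # Interpolate the frame-rate trajectories up to the sample rate.
        f = np.interp(n, frame_pos, freqs_hz[:, r])
        a = np.interp(n, frame_pos, mags[:, r])
        # Phase is the running sum of the instantaneous frequency.
        phase = 2 * np.pi * np.cumsum(f) / sample_rate
        out += a * np.cos(phase)          # one sine wave generator, then the adder
    return out

freqs = np.full((20, 1), 440.0)           # one partial held at 440 Hz
mags = np.full((20, 1), 0.5)
original = synthesize_partials(freqs, mags, hop=256, sample_rate=8000)
# Example transformation: scale the frequency trajectory (transpose up a fifth).
transposed = synthesize_partials(freqs * 1.5, mags, hop=256, sample_rate=8000)
```

Because the stored representation is a set of trajectories rather than samples, transformations such as transposition or time stretching reduce to simple arithmetic on the arrays before synthesis.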
The stochastic part of the synthesized sound is generated by creating complex spectra out of the spectral envelope of the magnitude spectra residual, or its modification, and doing an inverse STFT. The stored spectral envelopes in memory 204 may be transformed by a spectral envelope transformer 220. The resulting envelope becomes the magnitude portion of the stochastic signal. The transformer 220 alters the acoustic qualities of the stochastic waveform generated by the synthesizer 200, and thereby adds to the range and quality of sounds that can be generated.
In order to generate the phase part of the spectrum for the stochastic signal, the STFT of a windowed white noise signal is computed using a noise generator 222, signal gate 224 for windowing or gating the noise signal, and an FFT 226. A phase generator converts the complex spectra output by the FFT into phase spectra values. These phase spectra and the magnitude values representing the spectral envelope are expressed in polar coordinates (i.e., real values). The polar coordinate values are converted into complex spectra by a polar-to-rectangular coordinate converter 230. The resulting complex spectra are then inverse Fourier transformed by an inverse-FFT 232 to generate the stochastic waveform. The process of generating the stochastic waveform corresponds to the filtering of white noise by a filter with a frequency response equal to the spectral envelope. Thus the stochastic signal circuitry 222-232 is essentially a white noise filter.
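The stochastic signal path of FIG. 2 can be sketched for a single time frame as follows. The function name `stochastic_frame`, the Hann window used for gating the noise, and the frame length are assumptions made for the example; the specification does not prescribe them.

```python
import numpy as np

def stochastic_frame(envelope_mag, rng):
    """Generate one frame of noise whose magnitude spectrum equals envelope_mag."""
    n_bins = len(envelope_mag)
    frame_len = 2 * (n_bins - 1)
    # Noise generator and gate: one windowed frame of white noise.
    noise = rng.standard_normal(frame_len) * np.hanning(frame_len)
    # FFT followed by the phase generator: keep only the phase spectrum.
    phase = np.angle(np.fft.rfft(noise))
    # Polar-to-rectangular conversion of (envelope magnitude, noise phase).
    spectrum = np.asarray(envelope_mag) * np.exp(1j * phase)
    # Inverse FFT yields the time domain stochastic waveform for this frame.
    return np.fft.irfft(spectrum, n=frame_len)

rng = np.random.default_rng(0)
envelope = np.linspace(1.0, 0.0, 257)      # hypothetical spectral envelope
frame = stochastic_frame(envelope, rng)
```

Taking the FFT of `frame` returns a magnitude spectrum equal to `envelope`, which confirms that the procedure acts as a filter whose frequency response is the spectral envelope.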
Finally, the stochastic and deterministic waveforms are added by adder 240 to generate the complete synthesized waveform. By proper selection of input trajectories and transformations, one can generate a very wide range of sounds using the synthesizer 200.
Second Preferred Embodiment of Signal Analyzer
FIG. 3 shows a second and somewhat more complicated signal analyzer 300 than the one shown in FIG. 1. Like the signal model used by the first analyzer, the signal model used by this second analyzer assumes that the input sound s(t) is the sum of a series of sinusoids plus a noise signal e(t):

$$s(t) = \sum_{r=1}^{R} A_r(t) \cos[\theta_r(t)] + e(t) \qquad (4)$$

where R is the number of sinusoids used to represent the deterministic portion of the sound, $A_r(t)$ is the instantaneous amplitude and $\theta_r(t)$ is the instantaneous phase of each sinusoid. The residual signal e(t) is the difference between the signal and the sinusoidal or deterministic part.
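However, in this model, the instantaneous phase is defined by

$$\theta_r(t) = \int_0^t \omega_r(\tau)\, d\tau + \theta_r(0) + \phi_r \qquad (5)$$

where $\omega_r(t)$ is the frequency in radians per second, r is the sinusoid number, $\theta_r(0)$ is the initial phase value, and $\phi_r$ is a fixed phase offset.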
A clock generator 302 generates a sequence of window signals which are used by gate 304 to divide the sound waveform into separate time frames. The time frames are analyzed by a fast Fourier Transformer (FFT) so as to generate a set of complex spectra values. The FFT 306 uses the short-time Fourier Transform, as described above with reference to FIG. 1.
A rectangular to polar coordinate converter 308 converts the complex spectra generated by the FFT 306 into a set of magnitude spectra for each time frame. Then a peak detector and sound partial analyzer 310 finds the highest peaks in the magnitude spectra and performs a parabolic interpolation to refine the frequency and amplitude values generated. Each identified peak has a frequency, phase and a magnitude value. The peaks from a series of time frames are then organized into sets of frequency, phase and magnitude trajectories, each set of which represents a sound partial. Thus the analyzer 310 extracts the stable sinusoids present in the original sound (the deterministic component). The frequency, phase and magnitude trajectories may be stored for use in a music synthesizer, as described above.
Next, the deterministic portion of the sound signal is regenerated by using a phase interpolator 312 to generate the instantaneous phase of the regenerated deterministic signal, and a linear interpolator 314 to generate the instantaneous magnitude of the regenerated deterministic signal. The instantaneous phase signal is used to control the shape of a sinusoidal signal generated by a sine wave generator 316, and then a multiplier 318 amplifies the resulting sine wave to match the amplitude indicated by the instantaneous amplitude output by interpolator 314. This waveform generation process is performed on several sound partials simultaneously by a corresponding number of interpolators 312-314, sine wave generators 316, and multipliers 318. These sound partials are combined by sine wave adder 320 to generate the deterministic element of the input waveform.
Finally, the deterministic signal is subtracted from the input waveform by subtractor 330 to generate a residual signal on line 332. Thus the deterministic and residual portions of the input signal have been separated, and these two, if recombined, will be perceptually indistinguishable from the input waveform. Further, the residual signal may be modeled as a stochastic signal using the same technique as in the first signal analyzer: by performing an STFT on the residual signal, computing the magnitude spectra, and then generating an envelope approximation of the magnitude spectra.
Second Preferred Embodiment of Sound Synthesizer
FIG. 4 shows a second and somewhat simpler sound synthesizer 400 than the one shown in FIG. 2. In particular, synthesizer 400 uses the same apparatus for generating the deterministic portion of the synthesized sound as shown in FIG. 2; only the stochastic waveform circuitry has been changed from that shown in FIG. 2.
The noise generator circuitry 222-228 in FIG. 2 is replaced with a simple random number generator 402 that produces a set of phase values between π and -π. In other words, for each time frame in which sound is to be synthesized, the random number generator 402 provides a set of values θ(k), each of which is equal to a randomly selected number between π and -π, and where the number of data points for each time frame corresponds to the number of input values needed by the inverse FFT 232. Similarly, the spectral envelope transformer 220 provides a set of interpolated values A(k) which represent the interpolated magnitudes of the spectral envelope at each of the data points (i.e., frequency points) needed by the inverse FFT 232. These interpolated values are calculated from the stored spectral envelope obtained from memory 204. Note that frequency magnitudes in the stored spectral envelope from memory 204 may not correspond exactly to the data points needed by the inverse FFT 232, requiring the calculation of interpolated values for those data points.
Together, the random number generator 402 and the transformer 220 provide a set of values {A(1), θ(1)}, {A(2), θ(2)}, ..., {A(n), θ(n)}, where n is the number of data points needed by the inverse FFT 232.
Next, the values for each time frame are converted from polar coordinates to rectangular coordinates by converter 230, because the inverse FFT 232 requires complex data values as its input values. The resulting complex spectra are converted into a sequence of sampled data values by an inverse FFT 232. These sampled data values are the time domain signal that represents the stochastic part of the synthesized signal for one time frame.
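The simplified stochastic path of FIG. 4 can be sketched for one time frame as below. The function name `random_phase_frame`, the fifteen-point example envelope and the sample rate are hypothetical, and forcing the phase to zero at the DC and Nyquist bins is an assumption made here so that the inverse transform of the real-valued signal preserves the magnitudes exactly.

```python
import numpy as np

def random_phase_frame(env_freqs, env_mags, fft_size, sample_rate, rng):
    """Return one stochastic frame and the interpolated magnitudes A(k)."""
    bin_freqs = np.arange(fft_size // 2 + 1) * sample_rate / fft_size
    # A(k): stored envelope interpolated at the frequency points of the inverse FFT.
    A = np.interp(bin_freqs, env_freqs, env_mags)
    # theta(k): random phases between -pi and pi.
    theta = rng.uniform(-np.pi, np.pi, size=A.shape)
    theta[0] = 0.0     # DC and Nyquist bins must be real for a real signal (assumption)
    theta[-1] = 0.0
    # Polar to rectangular conversion, then the inverse FFT.
    spectrum = A * np.exp(1j * theta)
    return np.fft.irfft(spectrum, n=fft_size), A

rng = np.random.default_rng(0)
env_freqs = np.linspace(0.0, 4000.0, 15)      # hypothetical stored envelope
env_mags = np.linspace(1.0, 0.1, 15)
frame, A = random_phase_frame(env_freqs, env_mags, 1024, 8000, rng)
```

No forward FFT of a noise signal is required in this arrangement, since the random phases are drawn directly.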
However, to provide for smooth transitions between time frames, the data samples generated by the inverse FFT 232 are windowed by a windowing buffer 404. This windowing buffer 404 typically overlaps and mathematically adds data samples from neighboring windows (i.e., time frames) with appropriate weighting factors. For example, the time domain data samples for each time frame could be used for four time frames, with the values output by the windowing buffer 404 being equal to one fourth of the data sample values from the current time frame, plus one fourth of the data sample values from the previous three time frames. In another embodiment the weighting factors could correspond to a Gaussian or a Hanning window.
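The Hanning window variant of the windowing buffer can be sketched as an overlap-add operation in which each frame contributes to four output hops. The function name `overlap_add`, the frame length and the use of a periodic form of the Hanning window are assumptions for illustration.

```python
import numpy as np

def overlap_add(frames, hop):
    """Weight each frame with a Hanning window and add overlapping frames."""
    frames = np.asarray(frames, dtype=float)
    n_frames, frame_len = frames.shape
    # Periodic Hanning window, so that shifted copies sum to a constant.
    window = np.hanning(frame_len + 1)[:-1]
    out = np.zeros((n_frames - 1) * hop + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + frame_len] += window * frame
    return out

frames = np.ones((8, 64))                 # eight constant frames of 64 samples
out = overlap_add(frames, hop=16)         # each frame spans four hops
print(out[64:112].min(), out[64:112].max())
```

With a hop of one quarter of the frame length, the overlapping windows sum to a constant value of 2 wherever four frames overlap, so dividing the output by 2 gives unity gain in the steady state.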
The resulting data values output by the windowing buffer 404 comprise a stochastic waveform that is combined with the deterministic waveform to form a synthesized waveform.
The noise synthesis system and method shown in FIG. 4 is very flexible in terms of being able to manipulate the shape of the stochastic waveform and is easier to implement in a real time system than the synthesizer of FIG. 2 because the FFT 226 in FIG. 2 has been eliminated.
Third Preferred Embodiment of Sound Synthesizer
FIG. 5 shows a third and even simpler sound-synthesizer 500 than the ones shown in FIGS. 2 and 4. In the previous embodiments, the spectral envelopes for the residual signals were effectively represented by a line segment approximation of the spectral envelope. This is because the spectral envelopes were represented by a set of magnitude values for a number of discrete frequency values. In a typical implementation of the synthesizer in FIG. 4, a set of perhaps fifteen values would be stored to represent the magnitude of the spectral envelope at fifteen frequencies. The remainder of the spectral envelope is formed or computed by linearly interpolating between the stored values.
In this synthesizer 500, the spectral envelope is represented using an LPC (linear predictive coding) model instead of a set of magnitude values. As is well known to those skilled in the art, any spectral envelope can be approximated or represented by a set of LPC coefficients. Furthermore, any set of LPC coefficients, which correspond to an all-pole filter (also known as an IIR or infinite impulse response filter), can be converted into lattice filter coefficients using well known conversion algorithms. See, for example, Markel, J. D. and Gray, A. H., Linear Prediction of Speech, Springer-Verlag, New York (1976), which is hereby incorporated by reference.
Thus, in FIG. 5, memory 502 stores the spectral envelopes for each of a series of time frames in the form of lattice filter coefficients (shown as $k_1$ through $k_p$ in FIG. 5). One advantage of storing a spectral envelope in the form of lattice filter coefficients is that fewer data points are needed (i.e., for each time frame), and therefore less storage is required. Transformer 504 performs a windowing type of function by interpolating the lattice coefficient values between time frames so as to provide smooth transitions over time. The resulting lattice coefficients are loaded into a lattice filter 506. The lattice filter 506 filters white noise generated by a noise generator 508 and outputs the stochastic waveform that is combined with the deterministic waveform to form a synthesized waveform.
This embodiment of the present invention has the advantage of requiring less data storage than the other embodiments, and also substitutes a lattice filter for the inverse FFT in those embodiments, all of which makes this embodiment less expensive and simpler to implement than the other embodiments. The primary tradeoff is that this embodiment is less flexible in terms of its ability to manipulate the stored spectral envelopes for generating a modified stochastic waveform.
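An all-pole lattice filter driven by white noise can be sketched as follows. The function name `lattice_allpole`, the sign convention of the recursion and the example coefficient values are assumptions; the coefficients are held fixed here, whereas the synthesizer described above interpolates them between time frames.

```python
import numpy as np

def lattice_allpole(x, k):
    """Filter x with an all-pole lattice defined by reflection coefficients k."""
    k = np.asarray(k, dtype=float)
    p = len(k)
    b = np.zeros(p + 1)                  # b[m] holds the backward signal of stage m, delayed
    y = np.empty(len(x))
    for n, xn in enumerate(x):
        f = xn                           # forward signal entering the highest stage
        new_b = np.empty(p + 1)
        for m in range(p, 0, -1):
            f = f - k[m - 1] * b[m - 1]              # forward path of stage m
            new_b[m] = k[m - 1] * f + b[m - 1]       # backward path of stage m
        new_b[0] = f
        b = new_b
        y[n] = f
    return y

rng = np.random.default_rng(0)
noise = rng.standard_normal(2048)        # white noise source
k = [0.5, -0.3]                          # hypothetical coefficients, each |k| < 1
stochastic = lattice_allpole(noise, k)
```

Keeping every coefficient strictly between -1 and 1 guarantees a stable filter, which is one reason the lattice form is convenient when coefficients are interpolated over time.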
While the present invention has been described with reference to a few specific embodiments, the description is illustrative of the invention and is not to be construed as limiting the invention. Various modifications may occur to those skilled in the art without departing from the true spirit and scope of the invention as defined by the appended claims.

Claims (17)

What is claimed is:
1. A sound waveform synthesizer, comprising:
storage means for storing data denoting a sequence of sound partials and data denoting a corresponding sequence of spectral envelopes;
sinusoidal waveform generator means coupled to said storage means for generating a sequence of first waveforms during a sequence of time frames, including means for generating sinusoidal waveforms during each said time frame corresponding to a selected one of said sound partials denoted by data stored in said storage means;
stochastic waveform generator means coupled to said storage means for generating a sequence of stochastic waveforms during said sequence of time frames, including means for generating stochastic waveforms during each said time frame having a spectral envelope corresponding to a selected one of said spectral envelopes denoted by data stored in said storage means; and
means for generating a synthesized sound waveform, including means for combining said first waveforms and said stochastic waveforms;
said stochastic waveform generator means including
noise generating means for generating a noise signal; and
filter means coupled to said storage means and said noise generating means for generating a stochastic waveform, including means for filtering said noise signal with a time varying frequency response during said sequence of time frames, said frequency response during each said time frame corresponding to a selected one of said spectral envelopes denoted by data stored in said storage means.
2. A sound waveform synthesizer as set forth in claim 1, wherein said data denoting a sequence of spectral envelopes includes data denoting a set of lattice filter coefficients for each of a sequence of time frames;
said filter means in said stochastic waveform generator means comprising
lattice filter means for filtering said noise signal with a time varying frequency response during said sequence of time frames, said frequency response during each said time frame corresponding to a selected one of said sets of lattice filter coefficients denoted by data stored in said storage means.
3. A sound waveform synthesizer as set forth in claim 1,
said noise generating means comprising random number generating means for generating a set of random phase values for each said time frame;
said filter means including:
stochastic spectra means for generating a set of complex spectral values for each said time frame, including means for combining said set of random phase values for each said time frame with a selected one of said spectral envelopes denoted by data stored in said storage means; and
inverse Fourier transform means coupled to said stochastic spectra means for generating a stochastic waveform for each said time frame by inverse fourier transforming said complex spectral values.
4. A sound waveform synthesizer as set forth in claim 1, further including
transform means coupling said storage means with said sinusoidal waveform generator means, including means for transforming selected ones of said sound partials stored in said trajectory storage means, thereby altering the acoustic qualities of said sequence of first waveforms.
5. A sound waveform synthesizer as set forth in claim 1, further including
envelope transform means coupling said storage means with said stochastic waveform generator means, including means for transforming selected ones of said spectral envelopes stored in said storage means, thereby altering the acoustic qualities of said sequence of stochastic waveforms.
6. A sound waveform synthesizer, comprising:
trajectory storage means for storing sound partials, including means for storing corresponding sets of magnitude and frequency trajectories, each set representing a sound partial;
envelope storage means for storing spectral envelopes, each spectral envelope corresponding to the stochastic portion of a predefined sound;
sinusoidal waveform generator means coupled to said trajectory storage means for generating a first waveform corresponding to selected sound partials stored in said trajectory storage means;
noise generating means for generating a noise signal;
filter means coupled to said envelope storage means and said noise generating means for generating a stochastic waveform, including means for filtering said noise signal with a frequency response equal to a selected spectral envelope stored in said envelope storage means; and
means for generating a synthesized sound waveform, including means for combining said first waveform and said stochastic waveform.
7. A sound waveform synthesizer as set forth in claim 6, further including
transform means coupling said trajectory storage means with said sinusoidal waveform generator means, including means for transforming selected ones of said sound partials stored in said trajectory storage means, thereby altering the acoustic qualities of said first waveform.
8. A sound waveform synthesizer as set forth in claim 6, further including
envelope transform means coupling said envelope storage means with said filter means, including means for transforming selected ones of said spectral envelopes stored in said envelope storage means, thereby altering the acoustic qualities of said stochastic waveform.
9. A method of generating sound waveforms, the steps of the method comprising:
storing data denoting a sequence of sound partials and data denoting a corresponding sequence of spectral envelopes;
generating a sequence of first waveforms during a sequence of time frames, including generating a plurality of sinusoidal waveforms during each said time frame corresponding to a selected one of said stored sound partials; and
generating a sequence of stochastic waveforms during said sequence of time frames, including generating stochastic waveforms during each said time frame having a spectral envelope corresponding to a selected one of said stored spectral envelopes; and
combining said first waveforms and said stochastic waveforms to generate a synthesized sound waveform;
said second generating step including the steps of
generating a noise signal; and
filtering said noise signal with a time varying frequency response during said sequence of time frames, said frequency response during each said time frame corresponding to a selected one of said stored spectral envelopes.
10. A method of generating sound waveforms, as set forth in claim 9, wherein said stored data denoting a sequence of spectral envelopes includes data denoting a set of lattice filter coefficients for each of a sequence of time frames;
said noise filtering step including the step of filtering said noise signal with a lattice filter employing time varying lattice filter coefficients corresponding to a sequence of said sets of lattice filter coefficients.
11. A method of generating sound waveforms, as set forth in claim 9, said second generating step including the steps of:
said noise generating step including generating a set of random phase values for each said time frame;
said noise filtering step including the steps of:
generating a set of complex spectral values by combining said set of random phase values for each said time frame with a selected one of said spectral envelopes denoted by said stored data; and
inverse fourier transforming said complex spectral values for each said time frame.
12. A method of generating sound waveforms, as set forth in claim 9, said first generating step including the step of transforming selected ones of said stored sound partials and thereby altering the acoustic qualities of said sequence of first waveforms.
13. A method of generating sound waveforms, as set forth in claim 9, said second generating step including the step of transforming selected ones of said stored spectral envelopes and thereby altering the acoustic qualities of said sequence of stochastic waveforms.
14. A sound waveform synthesizer, comprising:
storage means for storing data denoting a sequence of sound partials and data denoting a corresponding sequence of spectral envelopes;
sinusoidal component generator means coupled to said storage means for generating a sequence of sinusoidal waveform components during a sequence of time frames, including means for generating sinusoidal waveform components during each of said time frame corresponding to a selected one of said sound partials denoted by data stored in said storage means;
stochastic component generator means coupled to said storage means for generating a sequence of stochastic waveform components during said sequence of time frames, including means for generating stochastic waveform components during each said time frame having a spectral envelope corresponding to a selected one of said spectral envelopes denoted by data stored in said storage means; and
means for generating a synthesized sound waveform, including means for combining said sinusoidal waveform and stochastic waveform components;
said stochastic component generator means including:
noise generating means for generating a noise signal; and
noise shaping means coupled to said storage means and said noise generating means for combining said noise signal with selected ones of said spectral envelopes denoted by data stored in said storage means so as to generate spectrally shaped stochastic waveform components.
15. A sound waveform synthesizer as set forth in claim 14, wherein said noise shaping means comprises inverse fourier transforming means for generating a stochastic waveform for each said time frame by inverse fourier transforming said noise signal combined with selected ones of said spectral envelopes.
16. A sound waveform synthesizer as set forth in claim 14, further including
transform means coupling said storage means with said sinusoidal waveform generator means, including means for transforming selected ones of said sound partials stored in said trajectory storage means, thereby altering the acoustic qualities of said sequence of first waveforms.
17. A sound waveform synthesizer as set forth in claim 14, further including
envelope transform means coupling said storage means with said stochastic waveform generator means, including means for transforming selected ones of said spectral envelopes stored in said storage means, thereby altering the acoustic qualities of said sequence of stochastic waveforms.
US07/431,594 1989-05-10 1989-11-03 Musical synthesizer combining deterministic and stochastic waveforms Expired - Lifetime US5029509A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US07/431,594 US5029509A (en) 1989-05-10 1989-11-03 Musical synthesizer combining deterministic and stochastic waveforms
PCT/US1990/002200 WO1990013887A1 (en) 1989-05-10 1990-04-26 Musical signal analyzer and synthesizer
AU55328/90A AU5532890A (en) 1989-05-10 1990-04-26 Musical signal analyzer and synthesizer

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US35011489A 1989-05-10 1989-05-10
US07/431,594 US5029509A (en) 1989-05-10 1989-11-03 Musical synthesizer combining deterministic and stochastic waveforms

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US35011489A Continuation-In-Part 1989-05-10 1989-05-10

Publications (1)

Publication Number Publication Date
US5029509A true US5029509A (en) 1991-07-09

Family

ID=26996487

Family Applications (1)

Application Number Title Priority Date Filing Date
US07/431,594 Expired - Lifetime US5029509A (en) 1989-05-10 1989-11-03 Musical synthesizer combining deterministic and stochastic waveforms

Country Status (3)

Country Link
US (1) US5029509A (en)
AU (1) AU5532890A (en)
WO (1) WO1990013887A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5567901A (en) * 1995-01-18 1996-10-22 Ivl Technologies Ltd. Method and apparatus for changing the timbre and/or pitch of audio signals
US6046395A (en) * 1995-01-18 2000-04-04 Ivl Technologies Ltd. Method and apparatus for changing the timbre and/or pitch of audio signals
US6336092B1 (en) * 1997-04-28 2002-01-01 Ivl Technologies Ltd Targeted vocal transformation
US8706496B2 (en) 2007-09-13 2014-04-22 Universitat Pompeu Fabra Audio signal transforming by utilizing a computational cost function
US9159337B2 (en) 2009-10-21 2015-10-13 Dolby International Ab Apparatus and method for generating a high frequency audio signal using adaptive oversampling

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4114498A (en) * 1975-10-23 1978-09-19 Nippon Gakki Seizo Kabushiki Kaisha Electronic musical instrument having an electronic filter with time variant slope
US4466325A (en) * 1981-04-30 1984-08-21 Kabushiki Kaisha Kawai Gakki Seisakusho Tone synthesizing system for electronic musical instrument
US4502361A (en) * 1983-12-08 1985-03-05 Allen Organ Company Method and apparatus for dynamic reproduction of transient and steady state voices in an electronic musical instrument
WO1986005617A1 (en) * 1985-03-18 1986-09-25 Massachusetts Institute Of Technology Processing of acoustic waveforms
US4700603A (en) * 1985-04-08 1987-10-20 Kabushiki Kaisha Kawai Gakki Seisakusho Formant filter generator for an electronic musical instrument
EP0285276A2 (en) * 1987-04-02 1988-10-05 Massachusetts Institute Of Technology Coding of acoustic waveforms
WO1989009985A1 (en) * 1988-04-08 1989-10-19 Massachusetts Institute Of Technology Computationally efficient sine wave synthesis for acoustic waveform processing

Cited By (52)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5256830A (en) * 1989-09-11 1993-10-26 Yamaha Corporation Musical tone synthesizing apparatus
US5187314A (en) * 1989-12-28 1993-02-16 Yamaha Corporation Musical tone synthesizing apparatus with time function excitation generator
US5498834A (en) * 1990-11-30 1996-03-12 Yamaha Corporation Electronic musical instrument capable of generating a resonance tone together with a musical tone
US5401897A (en) * 1991-07-26 1995-03-28 France Telecom Sound synthesis process
US5412152A (en) * 1991-10-18 1995-05-02 Yamaha Corporation Device for forming tone source data using analyzed parameters
US5317104A (en) * 1991-11-16 1994-05-31 E-Musystems, Inc. Multi-timbral percussion instrument having spatial convolution
US5543578A (en) * 1993-09-02 1996-08-06 Mediavision, Inc. Residual excited wave guide
WO1995006859A1 (en) * 1993-09-02 1995-03-09 Media Vision, Inc. Residual excited wave guide
US5455868A (en) * 1994-02-14 1995-10-03 Edward W. Sergent Gunshot detector
US5684260A (en) * 1994-09-09 1997-11-04 Texas Instruments Incorporated Apparatus and method for generation and synthesis of audio
US5686683A (en) * 1995-10-23 1997-11-11 The Regents Of The University Of California Inverse transform narrow band/broad band sound synthesis
US5880392A (en) * 1995-10-23 1999-03-09 The Regents Of The University Of California Control structure for sound synthesis
WO1997017692A1 (en) * 1995-11-07 1997-05-15 Euphonics, Incorporated Parametric signal modeling musical synthesizer
US5744742A (en) * 1995-11-07 1998-04-28 Euphonics, Incorporated Parametric signal modeling musical synthesizer
US6542857B1 (en) 1996-02-06 2003-04-01 The Regents Of The University Of California System and method for characterizing synthesizing and/or canceling out acoustic signals from inanimate sound sources
EP2264696A1 (en) 1998-06-15 2010-12-22 Yamaha Corporation Voice converter with extraction and modification of attribute data
EP2450887A1 (en) 1998-06-15 2012-05-09 Yamaha Corporation Voice converter with extraction and modification of attribute data
US8447585B2 (en) 1998-12-02 2013-05-21 Lawrence Livermore National Security, Llc. System and method for characterizing, synthesizing, and/or canceling out acoustic signals from inanimate sound sources
US20030149553A1 (en) * 1998-12-02 2003-08-07 The Regents Of The University Of California Characterizing, synthesizing, and/or canceling out acoustic signals from sound sources
US7191105B2 (en) 1998-12-02 2007-03-13 The Regents Of The University Of California Characterizing, synthesizing, and/or canceling out acoustic signals from sound sources
US20080004861A1 (en) * 1998-12-02 2008-01-03 The Regents Of The University Of California System and method for characterizing, synthesizing, and/or canceling out acoustic signals from inanimate sound sources
US6298322B1 (en) 1999-05-06 2001-10-02 Eric Lindemann Encoding and synthesis of tonal audio signals using dominant sinusoids and a vector-quantized residual tonal signal
US6284964B1 (en) * 1999-09-27 2001-09-04 Yamaha Corporation Method and apparatus for producing a waveform exhibiting rendition style characteristics on the basis of vector data representative of a plurality of sorts of waveform characteristics
US7016841B2 (en) 2000-12-28 2006-03-21 Yamaha Corporation Singing voice synthesizing apparatus, singing voice synthesizing method, and program for realizing singing voice synthesizing method
US20030009336A1 (en) * 2000-12-28 2003-01-09 Hideki Kenmochi Singing voice synthesizing apparatus, singing voice synthesizing method, and program for realizing singing voice synthesizing method
US20060081119A1 (en) * 2004-10-18 2006-04-20 Yamaha Corporation Tone data generation method and tone synthesis method, and apparatus therefor
US7626113B2 (en) * 2004-10-18 2009-12-01 Yamaha Corporation Tone data generation method and tone synthesis method, and apparatus therefor
CN1763841B (en) * 2004-10-18 2011-01-26 雅马哈株式会社 Tone data generation method and tone synthesis method, and apparatus therefor
KR101207325B1 (en) 2005-02-10 2012-12-03 코닌클리케 필립스 일렉트로닉스 엔.브이. Device and method for sound synthesis
US20080184871A1 (en) * 2005-02-10 2008-08-07 Koninklijke Philips Electronics, N.V. Sound Synthesis
US20080250913A1 (en) * 2005-02-10 2008-10-16 Koninklijke Philips Electronics, N.V. Sound Synthesis
US7649135B2 (en) * 2005-02-10 2010-01-19 Koninklijke Philips Electronics N.V. Sound synthesis
US7781665B2 (en) * 2005-02-10 2010-08-24 Koninklijke Philips Electronics N.V. Sound synthesis
WO2006085244A1 (en) * 2005-02-10 2006-08-17 Koninklijke Philips Electronics N.V. Sound synthesis
JP2008530608A (en) * 2005-02-10 2008-08-07 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Speech synthesis method
US20080256136A1 (en) * 2007-04-14 2008-10-16 Jerremy Holland Techniques and tools for managing attributes of media content
US20080255687A1 (en) * 2007-04-14 2008-10-16 Aaron Eppolito Multi-Take Compositing of Digital Media Assets
US8751022B2 (en) 2007-04-14 2014-06-10 Apple Inc. Multi-take compositing of digital media assets
US8275475B2 (en) * 2007-08-30 2012-09-25 Texas Instruments Incorporated Method and system for estimating frequency and amplitude change of spectral peaks
US20090062945A1 (en) * 2007-08-30 2009-03-05 Steven David Trautmann Method and System for Estimating Frequency and Amplitude Change of Spectral Peaks
CN101308652B (en) * 2008-07-17 2011-06-29 安徽科大讯飞信息科技股份有限公司 Synthesizing method of personalized singing voice
US10452996B2 (en) 2011-08-10 2019-10-22 Konlanbi Generating dynamically controllable composite data structures from a plurality of data segments
US10860946B2 (en) 2011-08-10 2020-12-08 Konlanbi Dynamic data structures for data-driven modeling
US9147166B1 (en) 2011-08-10 2015-09-29 Konlanbi Generating dynamically controllable composite data structures from a plurality of data segments
US20130301839A1 (en) * 2012-04-19 2013-11-14 Peter Vogel Instruments Pty Ltd Sound synthesiser
US9269369B2 (en) * 2012-06-18 2016-02-23 Goertek, Inc. Method and device for dereverberation of single-channel speech
US20150149160A1 (en) * 2012-06-18 2015-05-28 Goertek, Inc. Method And Device For Dereverberation Of Single-Channel Speech
US20170138721A1 (en) * 2015-10-28 2017-05-18 University Of Kent At Canterbury Apparatus and method for processing the signal in master slave interferometry and apparatus and method for master slave optical coherence tomography with any number of sampled depths
US10760893B2 (en) * 2015-10-28 2020-09-01 University Of Kent Apparatus and method for processing the signal in master slave interferometry and apparatus and method for master slave optical coherence tomography with any number of sampled depths
US11127387B2 (en) 2016-09-21 2021-09-21 Roland Corporation Sound source for electronic percussion instrument and sound production control method thereof
EP3719795A4 (en) * 2017-11-29 2021-08-11 Yamaha Corporation Audio synthesizing method, audio synthesizing device, and program
US20210350783A1 (en) * 2019-02-01 2021-11-11 Yamaha Corporation Sound signal synthesis method, neural network training method, and sound synthesizer

Also Published As

Publication number Publication date
WO1990013887A1 (en) 1990-11-15
AU5532890A (en) 1990-11-29

Similar Documents

Publication Publication Date Title
US5029509A (en) Musical synthesizer combining deterministic and stochastic waveforms
McAulay et al. Speech analysis/synthesis based on a sinusoidal representation
Serra et al. Spectral modeling synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition
Dolson The phase vocoder: A tutorial
Smith et al. PARSHL: An analysis/synthesis program for non-harmonic sounds based on a sinusoidal representation
Malah Time-domain algorithms for harmonic bandwidth reduction and time scaling of speech signals
US5485543A (en) Method and apparatus for speech analysis and synthesis by sampling a power spectrum of input speech
Evangelista Pitch-synchronous wavelet representations of speech and music signals
JP4527287B2 (en) A signal processing technique for changing the time scale and / or fundamental frequency of an audio signal
US4885790A (en) Processing of acoustic waveforms
US6115684A (en) Method of transforming periodic signal using smoothed spectrogram, method of transforming sound using phasing component and method of analyzing signal using optimum interpolation function
US5536902A (en) Method of and apparatus for analyzing and synthesizing a sound by extracting and controlling a sound parameter
CA1065490A (en) Emphasis controlled speech synthesizer
JP3098031B2 (en) Music synthesis method
JPH0863197A (en) Method of decoding voice signal
US20050065781A1 (en) Method for analysing audio signals
Abe et al. Sinusoidal model based on instantaneous frequency attractors
US5686683A (en) Inverse transform narrow band/broad band sound synthesis
Bonada et al. Sample-based singing voice synthesizer by spectral concatenation
US6111183A (en) Audio signal synthesis system based on probabilistic estimation of time-varying spectra
McAulay et al. Mid-rate coding based on a sinusoidal representation of speech
Serra Introducing the phase vocoder
Dubnov et al. Investigation of phase coupling phenomena in sustained portion of musical instruments sound
JPH05281996A (en) Pitch extracting device
Goodwin et al. Atomic decompositions of audio signals

Legal Events

Date Code Title Description
AS Assignment

Owner name: BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST.;ASSIGNOR:SMITH, JULIUS;REEL/FRAME:005209/0729

Effective date: 19891218

Owner name: BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST.;ASSIGNOR:SERRA, XAVIER;REEL/FRAME:005209/0726

Effective date: 19891218

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: PAT HLDR NO LONGER CLAIMS SMALL ENT STAT AS NONPROFIT ORG (ORIGINAL EVENT CODE: LSM3); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

REMI Maintenance fee reminder mailed

FPAY Fee payment

Year of fee payment: 12

SULP Surcharge for late payment

Year of fee payment: 11