Hantrakul et al., 2019 - Google Patents

Fast and Flexible Neural Audio Synthesis.

Hantrakul et al., 2019

Document ID: 7212199334235573685
Author: Hantrakul L; Engel J; Roberts A; Gu C
Publication year: 2019
Publication venue: Ismir

External Links

Cited by

Snippet

Autoregressive neural networks, such as WaveNet, have opened up new avenues for expressive audio synthesis. High-quality speech synthesis utilizes detailed linguistic features for conditioning, but comparable levels of control have yet to be realized for neural …

Continue reading at archives.ismir.net (PDF) (other versions)

230000002194 synthesizing 0 title abstract description 31

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0033—Recording/reproducing or transmission of music for electrophonic musical instruments
- G10H1/0041—Recording/reproducing or transmission of music for electrophonic musical instruments in coded form
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/471—General musical sound synthesis principles, i.e. sound category-independent synthesis methods
- G10H2250/511—Physical modelling or real-time simulation of the acoustomechanical behaviour of acoustic musical instruments using, e.g. waveguides or looped delay lines
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H7/00—Instruments in which the tones are synthesised from a data store, e.g. computer organs
- G10H7/08—Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform
- G10H7/10—Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform using coefficients or parameters stored in a memory, e.g. Fourier coefficients
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/36—Accompaniment arrangements
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/121—Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods
- G10H2240/131—Library retrieval, i.e. searching a database or selecting a specific musical piece, segment, pattern, rule or parameter set
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H5/00—Instruments in which the tones are generated by means of electronic generators
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H3/00—Instruments in which the tones are generated by electromechanical means
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2230/00—General physical, ergonomic or hardware implementation of electrophonic musical tools or instruments, e.g. shape or architecture
- G10H2230/045—Special instrument [spint], i.e. mimicking the ergonomy, shape, sound or other characteristic of a specific acoustic musical instrument category
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K15/00—Acoustics not otherwise provided for

Similar Documents

Publication	Publication Date	Title
Hantrakul et al.	2019	Fast and Flexible Neural Audio Synthesis.
Engel et al.	2020	DDSP: Differentiable digital signal processing
Wu et al.	2021	MIDI-DDSP: Detailed control of musical performance via hierarchical modeling
Hawthorne et al.	2022	Multi-instrument music synthesis with spectrogram diffusion
Cogliati et al.	2016	Context-dependent piano music transcription with convolutional sparse coding
Schulze-Forster et al.	2023	Unsupervised music source separation using differentiable parametric source models
Hayes et al.	2024	A review of differentiable digital signal processing for music and speech synthesis
Zhang et al.	2023	Symbolic music representations for classification tasks: A systematic evaluation
Borin et al.	2013	Musical signal synthesis
Mirbeygi et al.	2022	Speech and music separation approaches-a survey
Wang et al.	2024	Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt
Simionato et al.	2024	Physics-informed differentiable method for piano modeling
Shier et al.	2023	Differentiable modelling of percussive audio with transient and spectral synthesis
Loscos	2007	Spectral processing of the singing voice.
Zou et al.	2021	Non-parallel and Many-to-One Musical Timbre Morphing using DDSP-Autoencoder and Spectral Feature Interpolation
Wang	2023	Text to music audio generation using latent diffusion model: A re-engineering of audioldm model
Mehrabi et al.	2018	Vocal imitation for query by vocalisation
Nizami et al.	2021	A DT-Neural Parametric Violin Synthesizer
Landarech Mayo	2024	From voice to virtuosity: DDSP-based timbre transfer
Wu	2023	Controllable music performance synthesis via hierarchical modelling
Colonel	2018	Autoencoding neural networks as musical audio synthesizers
Richard et al.	2024	Model-Based Deep Learning for Music Information Research
Grumiaux et al.	2023	Efficient bandwidth extension of musical signals using a differentiable harmonic plus noise model
Xiao et al.	2024	Music performance style transfer for learning expressive musical performance
US20240347037A1 (en)	2024-10-17	Method and apparatus for synthesizing unified voice wave based on self-supervised learning