GitHub - cordutie/ddsp_textures_thesis: Experiments related to my Master thesis in Sound and Music Computing (SMC) at the Music Technology Group (MTG)

ddsp_textures_thesis

Esteban Gutiérrez$^1$ and Lonce Wyse$^1$

$^1$ Department of Information and Communications Technologies, Universitat Pompeu Fabra

1. Introduction

This repository contains the experiments conducted for the thesis titled "Statistics-Driven Texture Sound Synthesis Using Differentiable Digital Signal Processing-Based Architectures" authored by Esteban Gutiérrez and advised by Lonce Wyse at the Universitat Pompeu Fabra. The experiments are based on the GitHub repositories resulting from this thesis: ddsp_textures and VAE_SubEnv.

DDSP architecture

Figure 1. DDSP architecture modified to synthesize texture sounds.

The thesis explores adapting Differentiable Digital Signal Processing (DDSP) architectures, first introduced by Engel et al. in [1], to the synthesis and control of texture sounds, which are complex and noisy compared to traditional pitched instrument timbres. It introduces two synthesizers: the $\texttt{SubEnv\ Synth}$, which applies amplitude envelopes to subband decompositions of white noise, and the $\texttt{P-VAE\ Synth}$, which integrates a Poisson process with a Variational Autoencoder (VAE) to handle the time- and event-based aspects of texture sounds, following the early conception of a texture sound introduced by Saint-Arnaud in [2]. Additionally, the $\texttt{TextStat}$ loss function is presented. It is inspired by McDermott and Simoncelli's work [3] and designed to evaluate texture sounds based on their statistical properties rather than short-term perceptual similarity. The thesis demonstrates the application of these synthesizers and the loss function within DDSP-based frameworks, reports mixed success in resynthesizing texture sounds, and identifies challenges, particularly with the $\texttt{P-VAE\ Synth}$. Future work will focus on optimizing the $\texttt{TextStat}$ loss function, reassessing the VAE component, and exploring real-time implementations. This research lays the groundwork for further theoretical and practical work on texture sound synthesis.
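The idea of applying amplitude envelopes to subband decompositions of white noise can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function name `subenv_synth`, the brick-wall FFT band masks, the band edges, and the hand-written envelopes are all assumptions made for illustration. An actual DDSP model would presumably use a differentiable framework and a perceptually motivated filterbank, with the envelopes predicted by a network.

```python
import numpy as np

def subenv_synth(envelopes, band_edges_hz, sample_rate, seed=0):
    """Shape white noise with one amplitude envelope per frequency band.

    envelopes:     array of shape (n_bands, n_samples), one envelope per band.
    band_edges_hz: n_bands + 1 increasing band edges in Hz (hypothetical values).
    """
    envelopes = np.asarray(envelopes, dtype=float)
    n_bands, n_samples = envelopes.shape

    # White noise source, fixed by the seed so the result is reproducible.
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_samples)

    spectrum = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)

    out = np.zeros(n_samples)
    for b in range(n_bands):
        lo, hi = band_edges_hz[b], band_edges_hz[b + 1]
        # Ideal (brick-wall) band-pass: an assumption, not the thesis filterbank.
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(spectrum * mask, n=n_samples)
        # Apply the band's amplitude envelope and sum the bands.
        out += envelopes[b] * band
    return out

# Example: three bands, each with a different (hand-written) envelope.
sample_rate = 16000
n_samples = 16000
t = np.arange(n_samples) / sample_rate
band_edges_hz = [0.0, 500.0, 2000.0, 8001.0]   # last edge includes Nyquist
envelopes = np.stack([
    np.ones(n_samples),                         # steady low band
    0.5 * (1.0 + np.sin(2 * np.pi * 3.0 * t)),  # slowly pulsing mid band
    np.exp(-3.0 * t),                           # decaying high band
])
texture = subenv_synth(envelopes, band_edges_hz, sample_rate)
```

With all envelopes set to one and band edges covering the full spectrum, the bands sum back to the original noise, which is a convenient sanity check for the decomposition.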

Latent space exploration

Figure 2. Latent space exploration.

2. Experiments

The experiments folder contains several Jupyter notebooks detailing the experiments conducted for this thesis. Below is a description of each subfolder:

2.1. Signal Processors

This section introduces two differentiable signal processors explored in this thesis. Details and experiments related to these processors can be found in the corresponding Jupyter notebooks.
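For the event-based processor, the role of the Poisson process can be sketched separately from the VAE. The snippet below is a hedged illustration only: the function names `poisson_event_times` and `place_events`, the event rate, and the decaying noise burst used as the sound event are assumptions. In the thesis the events are tied to a VAE, which is not reproduced here.

```python
import numpy as np

def poisson_event_times(rate_hz, duration_s, rng):
    """Onset times of a homogeneous Poisson process on [0, duration_s).

    Inter-arrival times of a Poisson process are exponentially distributed
    with mean 1 / rate_hz.
    """
    times = []
    t = rng.exponential(1.0 / rate_hz)
    while t < duration_s:
        times.append(t)
        t += rng.exponential(1.0 / rate_hz)
    return np.array(times)

def place_events(onsets_s, atom, duration_s, sample_rate):
    """Overlap-add one copy of `atom` at every onset time."""
    n_samples = int(round(duration_s * sample_rate))
    out = np.zeros(n_samples)
    for t in onsets_s:
        start = int(t * sample_rate)
        end = min(start + len(atom), n_samples)
        out[start:end] += atom[: end - start]  # truncate events at the end
    return out

rng = np.random.default_rng(0)
sample_rate = 16000
duration_s = 2.0

# Hypothetical sound event: a short decaying noise burst standing in for
# whatever a trained decoder would generate.
atom_len = 800
atom = rng.standard_normal(atom_len) * np.exp(-np.arange(atom_len) / 100.0)

onsets = poisson_event_times(rate_hz=20.0, duration_s=duration_s, rng=rng)
texture = place_events(onsets, atom, duration_s, sample_rate)
```

Raising `rate_hz` makes events overlap more densely, which moves the result from sparse, crackle-like events toward a more continuous noisy texture.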

2.2. Loss Functions

In this section, a new loss function is introduced and preliminary research is conducted to assess its effectiveness. The relevant notebooks provide insights into the performance and efficiency of this loss function.
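To make the idea of a statistics-based loss concrete, the following NumPy sketch compares two signals through summary statistics of their subband envelopes instead of sample-by-sample. It is a simplified illustration in the spirit of McDermott and Simoncelli [3], not the $\texttt{TextStat}$ implementation: the names `subband_envelopes`, `texture_statistics`, and `stat_loss`, the brick-wall bands, and the chosen statistics (mean, coefficient of variation, skewness, cross-band correlation) are assumptions. A loss used for training would also need to be written in a differentiable framework.

```python
import numpy as np

EPS = 1e-8

def subband_envelopes(x, band_edges_hz, sample_rate):
    """Envelope of each band as the magnitude of its analytic signal.

    Bands are assumed to lie strictly between DC and Nyquist.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(n, d=1.0 / sample_rate)
    envs = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        mask = (freqs >= lo) & (freqs < hi)      # positive frequencies only
        analytic = np.fft.ifft(2.0 * spectrum * mask)
        envs.append(np.abs(analytic))
    return np.stack(envs)

def texture_statistics(envs):
    """Per-band moments plus cross-band envelope correlations."""
    mean = envs.mean(axis=1)
    std = envs.std(axis=1)
    centered = envs - mean[:, None]
    skew = (centered ** 3).mean(axis=1) / (std ** 3 + EPS)
    cov = centered @ centered.T / envs.shape[1]
    corr = cov / (np.outer(std, std) + EPS)
    upper = np.triu_indices(len(envs), k=1)      # each band pair once
    return np.concatenate([mean, std / (mean + EPS), skew, corr[upper]])

def stat_loss(x, y, band_edges_hz, sample_rate):
    """Mean squared difference between the two statistics vectors."""
    sx = texture_statistics(subband_envelopes(x, band_edges_hz, sample_rate))
    sy = texture_statistics(subband_envelopes(y, band_edges_hz, sample_rate))
    return float(np.mean((sx - sy) ** 2))

sample_rate = 16000
n = 16000
band_edges_hz = [100.0, 400.0, 1600.0, 6400.0]   # hypothetical band layout
t = np.arange(n) / sample_rate

noise_a = np.random.default_rng(1).standard_normal(n)
noise_b = np.random.default_rng(2).standard_normal(n)
# Same noise source with a slow 4 Hz amplitude modulation.
pulsed = np.random.default_rng(3).standard_normal(n) * (1.0 + 0.9 * np.sin(2 * np.pi * 4.0 * t))

loss_same_texture = stat_loss(noise_a, noise_b, band_edges_hz, sample_rate)
loss_diff_texture = stat_loss(noise_a, pulsed, band_edges_hz, sample_rate)
```

Two independent realizations of white noise differ at every sample, yet their envelope statistics are close, so `loss_same_texture` is small. The amplitude-modulated signal has more variable and more strongly correlated band envelopes, so `loss_diff_texture` is larger.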

2.3. DDSP

This subfolder contains Jupyter notebooks on models based on Differentiable Digital Signal Processing (DDSP) using the previously defined signal processors and loss function. These notebooks cover the training, exploration, and evaluation of these DDSP-based models.

2.4. New Models

Here, you’ll find notebooks detailing the training and evaluation of new model variants derived from the ones introduced in the thesis. These models are examined for their performance and effectiveness in the experiments.

3. References

[1] J. Engel, L. Hantrakul, C. Gu, and A. Roberts, “DDSP: Differentiable digital signal processing,” in International Conference on Learning Representations, 2020.
[2] N. Saint-Arnaud, “Classification of Sound Textures,” Master’s thesis, Massachusetts Institute of Technology, Cambridge, MA, 1995.
[3] J. H. McDermott and E. P. Simoncelli, “Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis,” Neuron, vol. 71, pp. 926–940, 2011.
