AudioVision Repository
Team : Sight&Sound
Relevant Directories : final/ and sahil/
Code for Sound to Sound Architecture : final/models.py (VAE) and final/simplemodel.py (Autoencoder)
Files which load data and run these experiments : final/vaeExp.py (VAE) and final/autoEncoderExp.py (Autoencoder)
These models take sound in the form of a spectrogram (48x128, downsampled from 192x512) and try to regenerate the same spectrogram after compressing the input into a 128-dimensional latent space.
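For orientation, here is a minimal sketch of the autoencoder idea. The real architectures live in final/models.py and final/simplemodel.py; the layer sizes below are illustrative, and only the 48x128 input shape and 128-dimensional latent come from the description above.

import torch
import torch.nn as nn

# Illustrative sound-to-sound autoencoder (not the repo's exact architecture).
class SoundAutoencoder(nn.Module):
    def __init__(self, latent_dim=128):
        super().__init__()
        # Input: (batch, 1, 48, 128) spectrogram, downsampled from 192x512.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),   # -> (16, 24, 64)
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # -> (32, 12, 32)
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 12 * 32, latent_dim),        # 128-dim latent
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 12 * 32),
            nn.ReLU(),
            nn.Unflatten(1, (32, 12, 32)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),  # -> (16, 24, 64)
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),   # -> (1, 48, 128)
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Reconstruction objective: regenerate the input spectrogram.
model = SoundAutoencoder()
spec = torch.randn(8, 1, 48, 128)
recon, z = model(spec)
loss = nn.functional.mse_loss(recon, spec)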
Code for Image to Sound Architecture : imsimple.py
Files which load data and run this experiment : final/im2sound.py and final/im2soundPretrain.py
The first file uses a pretrained ResNet to extract image features and then trains with losses that do the following (a minimal sketch follows the list):
- Minimise MSE between the latent-space representations of a paired image and sound
- Minimise MSE between the ground-truth spectrogram and the spectrogram reconstructed from the input sound spectrogram
- Minimise MSE between the ground-truth spectrogram and the spectrogram reconstructed from the image features
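A minimal sketch of how these three terms might combine, assuming the spectrogram is the reconstruction target on both paths; every name below is illustrative, and the actual training loop is in final/im2sound.py.

import torch.nn.functional as F

# Hypothetical combined objective for the image-to-sound experiment.
def im2sound_loss(img_feats, spec, img_head, sound_encoder, decoder):
    z_img = img_head(img_feats)    # latent from pretrained ResNet features
    z_snd = sound_encoder(spec)    # latent from the paired spectrogram
    recon_snd = decoder(z_snd)     # spectrogram rebuilt from the sound latent
    recon_img = decoder(z_img)     # spectrogram rebuilt from the image latent
    return (F.mse_loss(z_img, z_snd)        # align the paired latents
            + F.mse_loss(recon_snd, spec)   # sound-path reconstruction
            + F.mse_loss(recon_img, spec))  # image-path reconstruction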
The second file starts from a pretrained sound encoder and decoder (trained using autoEncoderExp.py).
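A minimal warm-start sketch, reusing the SoundAutoencoder class from the sketch above; the checkpoint filename and the choice to freeze the pretrained weights are assumptions, not the repo's confirmed behaviour.

import torch

# Hypothetical warm start for final/im2soundPretrain.py: load weights saved
# by the autoencoder experiment (checkpoint name is illustrative).
sound_model = SoundAutoencoder()
sound_model.load_state_dict(torch.load("sound_autoencoder.pt", map_location="cpu"))

# Optionally freeze the pretrained sound side so early training only adapts
# the image branch (an assumption about the training schedule).
for param in sound_model.parameters():
    param.requires_grad = False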
Code to run : python exp2.py (to use custom data, edit the data variable on line 21 of exp2.py (sahil/exp2.py))
Dependencies :
- pytorch
- librosa
- progress
- numpy, pickle and some standard libraries
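The spectrogram preprocessing itself is not shown in this README; below is one plausible way to produce the 48x128 inputs with librosa. The mel parameters and the block-averaging downsample are assumptions, not the repo's actual pipeline.

import librosa
import numpy as np

# Hypothetical preprocessing: load audio and build a log-mel spectrogram
# with 192 mel bands to match the stated 192x512 pre-downsampling shape.
y, sr = librosa.load("clip.wav", sr=22050)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=192)
log_mel = librosa.power_to_db(mel)

# Pad/crop the time axis to 512 frames, then downsample 192x512 -> 48x128
# by averaging 4x4 blocks (one plausible way to match the stated shapes).
T = 512
if log_mel.shape[1] < T:
    log_mel = np.pad(log_mel, ((0, 0), (0, T - log_mel.shape[1])))
log_mel = log_mel[:, :T]
small = log_mel.reshape(48, 4, 128, 4).mean(axis=(1, 3))  # -> (48, 128)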