Deep neural architectures for dialect classification with single frequency filtering and zero-time windowing feature representations

Pre-requisites:

Install MATLAB for feature extraction and Python 3.8 for classification.
Install the required Python packages with: pip install -r requirements.txt

Corpus: UT-Podcast

UT-Podcast is a speech corpus collected from podcasts. It covers three dialects of English (US, UK, AU). Please download it from here. For more details refer

Corpus: VoxCeleb

The train, validation, and test splits of the VoxCeleb corpus are provided in the voxceleb_corpus folder. The VoxCeleb1 corpus can be downloaded from here

Feature Extraction

MATLAB is used to extract the features (STFT, SFF, and ZTW based features). The feature extraction code will be added to feature_extraction/ soon.
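Until the MATLAB code is available, the idea behind single frequency filtering (SFF) can be illustrated with a minimal Python sketch. It assumes the commonly described formulation: the signal is multiplied by a complex exponential so that the frequency of interest moves to half the sampling rate, and the result is passed through a single-pole filter with its pole at z = -r, close to the unit circle. The function name sff_envelope, the default r = 0.995, and the test tone are illustrative assumptions and are not taken from this repository, whose MATLAB implementation may differ.

```python
import cmath
import math


def sff_envelope(signal, fs, freq_hz, r=0.995):
    """Amplitude envelope of `signal` at `freq_hz` (hypothetical helper).

    Assumed formulation, not the repository's code:
      1. shift the frequency of interest to fs/2,
      2. filter with H(z) = 1 / (1 + r * z^-1), pole at z = -r,
      3. take the magnitude of the complex output.
    """
    # Angular shift that maps freq_hz onto pi (i.e. fs/2).
    shift = math.pi - 2.0 * math.pi * freq_hz / fs
    y_prev = 0j
    envelope = []
    for n, x in enumerate(signal):
        shifted = x * cmath.exp(1j * shift * n)
        y = shifted - r * y_prev  # y[n] = x_shifted[n] - r * y[n-1]
        envelope.append(abs(y))
        y_prev = y
    return envelope


# Illustrative input: a 1 kHz tone sampled at 8 kHz (0.1 s).
fs = 8000
tone = [math.cos(2.0 * math.pi * 1000 * n / fs) for n in range(800)]

# One envelope per analysis frequency; stacking them gives a
# time-frequency representation with one value per sample.
envelopes = {f: sff_envelope(tone, fs, f) for f in (500, 1000, 2000)}
```

The envelope at the analysis frequency that matches the tone grows much larger than the envelopes at the other frequencies, which is what makes the stacked envelopes usable as a spectrum-like feature.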

Neural Network Architectures for Dialect Classification

This project implements the following neural architectures:

The code for the Convolutional Neural Network (CNN) architecture can be found in main_cnn.py
The code for the CNN with an embedded spectral filter as a convolution layer can be found in cnn_spectral_layer.py
The code for the Temporal Convolutional Neural Network (TCNN) architecture can be found in main_tcnn.py
The code for the Time-Delay Neural Network (TDNN) architecture can be found in main_tdnn.py

NOTE: Please find the pre-trained models at: https://drive.google.com/drive/folders/1O4ZK1c8I5Vkglyka2fniUTpolyokTAsL?usp=sharing

Classification metric

Unweighted Average Recall (UAR) is used as the classification metric. Evaluation results will be updated soon.
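UAR is the mean of the per-class recalls, so every dialect counts equally regardless of how many utterances it has. The sketch below shows one way to compute it. The function name unweighted_average_recall and the toy labels are illustrative assumptions, not code or data from this repository.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, with every class weighted equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly classified samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels: 6 US, 2 UK, and 2 AU utterances.
y_true = ["US"] * 6 + ["UK"] * 2 + ["AU"] * 2
y_pred = ["US"] * 6 + ["UK", "US"] + ["US", "US"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case the per-class recalls are 1.0 (US), 0.5 (UK), and 0.0 (AU), so the UAR is 0.5 while plain accuracy is 0.7. The gap shows why UAR is the more informative metric when the dialect classes are imbalanced.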

Citation

@article{dialect_class,
title = {Deep neural architectures for dialect classification with single frequency filtering and zero-time windowing feature representations},
author={Kethireddy, Rashmi and Kadiri, Sudarsana Reddy and Gangashetty, Suryakanth V},
journal = {JASA},
volume = {151},
number = {2},
pages = {1077-1092},
year = {2022}
}
