Xeno-Canto organizer

A Python tool to prepare Xeno-Canto audio files for machine learning projects

Summary

Xeno-Canto (XC) (https://www.xeno-canto.org) is a data treasure for ecology and bioacoustics applications.
However, the mp3 files cannot be used directly for machine learning (ML).
This tool downloads XC data and prepares it for ML projects.
It is a single class with a few methods for download, conversion, data segmentation, and spectrogram extraction.
Thus, the complete download and preparation process can be handled in a small Python script; see the sample code below and main.py.
All intermediate and final items are written to files in a single directory tree.
⚠️ Running the code can download many mp3 files ⚠️

Status

🚧 Still under development 🚧

Features

Summaries can be checked before the actual download
Explicit selection of mp3 duration, quality, country, and species gives fine control over what is included
XC metadata is also stored in PKL files that are easy to integrate with Python
Spectrogram parameters can be flexibly adjusted
Spectrograms are stored as PNG images for easy exploration and ingestion by established CNNs
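
Because the metadata is stored as pickle files, it can be read back with the Python standard library. The sketch below is a minimal example only: the helper name load_metadata and the file name in the usage comment are hypothetical, and the type of the stored object (for example a pandas DataFrame) is an assumption that this README does not confirm.

```python
import pickle
from pathlib import Path


def load_metadata(pkl_path):
    """Load one pickled metadata object from disk and return it."""
    with Path(pkl_path).open("rb") as fh:
        return pickle.load(fh)


# Usage (hypothetical file name, adjust to a file the tool actually wrote):
# meta = load_metadata("C:/<path where data will be stored>/<metadata file>.pkl")
# print(type(meta))
```

If the stored object is a pandas DataFrame, pandas must be installed for unpickling to work. Only unpickle files that you created yourself, since loading a pickle can execute arbitrary code.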

Usage

Download this repo as a zip and initialize a new Git repository to track your personal changes in main.py.
Make sure ffmpeg and the Python packages are installed (see Dependencies and installation).
Open main.py and set a start_path, see sample code below.
Make a template JSON to define the download, see sample code.
Edit the JSON according to your needs (species, countries, recording duration and quality).
Run main.py line by line, check the files that are generated, and adjust the parameters of your data preparation.
Once main.py is ready, run the complete main.py.
Result: metadata, mp3, wav, and spectrograms should be ready in their respective directories.
😆 😏 Now you can throw your PyTorch magic at those PNGs (not covered in this codebase 😉)
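
Since the mp3-to-wav conversion depends on ffmpeg, it can be useful to check that the executable is reachable before running the workflow. The snippet below is a small sketch using only the standard library; the helper name ffmpeg_available is hypothetical and not part of this tool.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the given executable can be found on the system PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found on PATH")
    else:
        print("ffmpeg not found, install it before converting mp3 files")
```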

Sample code

Example of how the preparation of data for an ML project can be handled with a very short Python script

#----------------------
# minimalistic example
import xco 
# Make an instance of the XCO class and define the start path 
xc = xco.XCO(start_path = 'C:/<path where data will be stored>')
# Create a template json parameter file (to be edited)
xc.make_param(filename = 'download_criteria.json', template = "mini")
# Get information of what will be downloaded
xc.get_summary(params_json = 'download_criteria.json')
# Check the size of the summary table before downloading
print(xc.df_recs.shape)
# Download the files 
xc.download()
# Convert mp3s to wav with a specific sampling rate (requires ffmpeg to be installed)
xc.mp3_to_wav(conversion_fs = 24000)
# Extract spectrograms from segments and store as PNG
xc.extract_spectrograms(
    fs_tag = 24000,
    segm_duration = 1.0,
    segm_step = 0.5,
    win_siz = 512,
    win_olap = 192,
    max_segm_per_file = 12,
    equalize = True,
    colormap = 'viridis',
)
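
To get a feel for what the segmentation and window parameters above imply, the arithmetic can be sketched in plain Python. This is an illustration under stated assumptions, not the tool's actual implementation: it assumes that segments must fit entirely inside the file, that the first max_segm_per_file segments are kept, and that the spectrogram uses unpadded windows with a hop of win_siz - win_olap samples. The function names segment_starts and spectrogram_shape are hypothetical.

```python
def segment_starts(file_duration, segm_duration=1.0, segm_step=0.5,
                   max_segm_per_file=12):
    """Start times (seconds) of fixed-length segments that fit in the file.

    Assumption: segments slide by segm_step and the first
    max_segm_per_file segments are kept.
    """
    starts = []
    k = 0
    while (k * segm_step + segm_duration <= file_duration
           and len(starts) < max_segm_per_file):
        starts.append(k * segm_step)
        k += 1
    return starts


def spectrogram_shape(fs, segm_duration, win_siz, win_olap):
    """(frequency bins, time frames) of one segment, assuming no padding."""
    n_samples = int(round(fs * segm_duration))
    hop = win_siz - win_olap                 # samples between window starts
    n_freq = win_siz // 2 + 1                # one-sided spectrum
    n_frames = 1 + (n_samples - win_siz) // hop
    return n_freq, n_frames


if __name__ == "__main__":
    print(segment_starts(3.0))
    print(spectrogram_shape(24000, 1.0, 512, 192))
```

Under these assumptions, a 1.0 s segment at 24000 Hz with win_siz = 512 and win_olap = 192 yields 257 frequency bins and 74 time frames.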

Illustration

The figure below is a snapshot of a few spectrograms obtained with this tool
MP3 files were converted to WAV files with a fixed sampling frequency
WAV files were cut into segments and spectrograms were extracted
Spectrograms were equalized, log10-transformed, and mapped to [0, 255]
Spectrograms can be exported as 1-channel or 3-channel images
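
The log10 transform and the mapping to [0, 255] described above can be sketched with NumPy as follows. This is a minimal illustration, not the tool's code: the function name to_uint8_log and the small offset eps (added to avoid log10 of zero) are assumptions, min-max scaling is one possible way to map to [0, 255], and the equalization step is left out because its exact definition is not given here.

```python
import numpy as np


def to_uint8_log(power, eps=1e-10):
    """Log10-transform a non-negative spectrogram and scale it to 0..255."""
    log_spec = np.log10(np.asarray(power, dtype=float) + eps)
    lo, hi = log_spec.min(), log_spec.max()
    if hi == lo:
        # Constant input: return an all-zero image instead of dividing by zero
        return np.zeros(log_spec.shape, dtype=np.uint8)
    scaled = (log_spec - lo) / (hi - lo)     # values now in [0, 1]
    return np.round(scaled * 255).astype(np.uint8)


if __name__ == "__main__":
    demo = np.array([[1.0, 10.0], [100.0, 1000.0]])
    print(to_uint8_log(demo))
```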

Why save spectrograms of sounds as PNG images

It is handy because many PyTorch models and data augmentation procedures can directly ingest PNGs
It is handy because images can be easily visualized with standard software
Yes, 3-channel is overkill, but it is easier to feed into image CNNs such as ResNet and co
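
As an illustration of why a 3-channel image carries no extra information here, the simplest way to turn a 1-channel spectrogram into a 3-channel array is to copy the same values into each channel. This is a sketch with a hypothetical helper gray_to_rgb; the tool itself has a colormap option and may instead map values through a color palette.

```python
import numpy as np


def gray_to_rgb(gray):
    """Copy a 2-D (height, width) image into 3 identical channels."""
    gray = np.asarray(gray)
    return np.repeat(gray[:, :, np.newaxis], 3, axis=2)


if __name__ == "__main__":
    img = np.arange(6, dtype=np.uint8).reshape(2, 3)
    print(gray_to_rgb(img).shape)
```

The result has shape (height, width, 3), the layout that common image libraries expect for RGB data.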

Dependencies and installation

Needs internet access to download data from the XC API https://www.xeno-canto.org/api/2/recordings
Developed under Python 3.12.8
Install ffmpeg (see for example https://ffmpeg.org)
Make a fresh venv and install the Python packages with pip:

pip install -r requirements.txt
