Sripathi Sridhar sripathisridhar

🏠

Mountain climb

Teaching machines to understand audio. PhD student @sinc-laboratory, NJIT

17 followers · 27 following

Stars

juliawilkins / py-lightning-wandb-tutorial

Python 9 Updated Feb 27, 2025

NVIDIA / audio-flamingo

PyTorch implementation of Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities.

Python 465 27 Updated Apr 29, 2025

haoheliu / AudioLDM-training-finetuning

AudioLDM training, finetuning, evaluation and inference.

Python 248 47 Updated Dec 13, 2024

JindongJiang / latent-slot-diffusion

Official Release of NeurIPS 2023 Spotlight paper "Object-Centric Slot Diffusion"

Python 65 17 Updated Mar 9, 2024

AudioLLMs / Awesome-Audio-LLM

Audio Large Language Models

Python 514 30 Updated Mar 9, 2025

gzhu06 / Cacophony

Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986

Python 45 4 Updated Oct 13, 2024

BlackHC / neural_net_checklist

Python 150 8 Updated Aug 14, 2024

bytedance / uss

PyTorch implementation of "Universal Source Separation with Weakly Labelled Data".

Python 349 19 Updated Sep 1, 2023

SarthakYadav / audio-mamba-official

Official implementation for our paper "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations"

Python 38 Updated Jun 6, 2024

haoheliu / SemantiCodec-inference

Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.

Python 198 15 Updated Mar 7, 2025

haoheliu / kmeans_pytorch

Forked from subhadarship/kmeans_pytorch

kmeans using PyTorch

Jupyter Notebook 6 1 Updated Mar 9, 2024

tuhinpal / imdb-api

Serverless IMDB API powered by Cloudflare Worker

JavaScript 289 318 Updated Jul 6, 2024

ShoufaChen / DiffusionDet

[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)

Python 2,168 168 Updated Dec 22, 2022

neuml / paperai

📄 🤖 Semantic search and workflows for medical/scientific papers

Python 1,396 110 Updated Apr 21, 2025

lucidrains / slot-attention

Implementation of Slot Attention from GoogleAI

Python 426 33 Updated Aug 20, 2024

SarthakYadav / mwmae-jax-official

Official implementation of MW-MAE in Jax

Python 4 1 Updated Feb 14, 2024

csteinmetz1 / auraloss

Collection of audio-focused loss functions in PyTorch

Python 774 72 Updated Jul 30, 2024

object-understanding / SLASH

Python 23 2 Updated Aug 26, 2023

reachjason / Web3-Operator-Handbook

Actionable and opinionated no-bs ideas, frameworks and resources from successful operators in crypto to help build, grow and scale web3 products

16 Updated Jun 18, 2024

ollama / ollama

Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.

Go 139,677 11,661 Updated May 6, 2025

binpash / pash

PaSh: Light-touch Data-Parallel Shell Processing

Shell 572 44 Updated Apr 14, 2025

facebookincubator / submitit

Python 3.8+ toolbox for submitting jobs to Slurm

Python 1,420 137 Updated Apr 28, 2025

haoheliu / ontology-aware-audio-tagging

Python 13 1 Updated Nov 22, 2022

facebookresearch / detr

End-to-End Object Detection with Transformers

Python 14,300 2,553 Updated Mar 12, 2024

state-spaces / mamba

Mamba SSM architecture

Python 14,774 1,287 Updated Apr 1, 2025

haiciyang / Remixing

Official repo of ICASSP 2022 paper - Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization

Python 16 2 Updated Jan 7, 2025

YuanGongND / ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Python 431 40 Updated Apr 24, 2024

vdumoulin / conv_arithmetic

A technical report on convolution arithmetic in the context of deep learning

TeX 14,315 2,291 Updated Jun 8, 2023

EmulationAI / awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

672 39 Updated Aug 3, 2024

XinhaoMei / WavCaps

This repository contains metadata for the WavCaps dataset and code for downstream tasks.

Python 225 12 Updated Jul 25, 2024

Lists (12)

Audio Captioning

audio generation

BDA

DCASE 2022

educational

MLOS

music generative models

self-supervised

source-level-embeddings

universal source separators

website-templates

workflow tools
