Stars
PyTorch implementation of Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities.
AudioLDM training, finetuning, evaluation and inference.
Official Release of NeurIPS 2023 Spotlight paper "Object-Centric Slot Diffusion"
Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986
PyTorch implementation of "Universal Source Separation with Weakly Labelled Data".
Official implementation for our paper "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations"
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.
haoheliu / kmeans_pytorch
Forked from subhadarship/kmeans_pytorch. kmeans using PyTorch
Serverless IMDB API powered by Cloudflare Worker
[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)
📄 🤖 Semantic search and workflows for medical/scientific papers
Implementation of Slot Attention from GoogleAI
Official implementation of MW-MAE in Jax
Collection of audio-focused loss functions in PyTorch
Actionable and opinionated no-bs ideas, frameworks and resources from successful operators in crypto to help build, grow and scale web3 products
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
Python 3.8+ toolbox for submitting jobs to Slurm
End-to-End Object Detection with Transformers
Official repo of ICASSP 2022 paper - Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
A technical report on convolution arithmetic in the context of deep learning
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
This repository contains metadata for the WavCaps dataset and code for downstream tasks.