Music and Audio Computing Lab
- Daejeon, South Korea
- https://www.kirak.kim
- @_kirak_kim
Stars
Open-source Multi-agent Poster Generation from Papers
A Blender add-on for importing a sequence of OBJ meshes as frames
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
TorchCFM: a Conditional Flow Matching library
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
Sync Toolbox - Python package with reference implementations for efficient, robust, and accurate music synchronization based on dynamic time warping (DTW)
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
[NeurIPS'24 Spotlight] Official repo for AcoustiX, used in "Acoustic Volume Rendering for Neural Impulse Response Fields".
[NeurIPS'24 Spotlight] Official code for "Acoustic Volume Rendering for Neural Impulse Response Fields"
High-quality, training-free inpainting for every Stable Diffusion model.
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
Official PyTorch Code and Models of "RePaint: Inpainting using Denoising Diffusion Probabilistic Models", CVPR 2022
A Python 3 version of the MATLAB mTRF-Toolbox by Lalor Lab: https://mtrfpy.readthedocs.io/
Audio To Body Dynamics, CVPR 2018
Flexible audio loudness meter in Python with implementation of ITU-R BS.1770-4 loudness algorithm
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
Perform transfer learning for MIR using Jukebox!
Code release for "TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis" (ICLR 2023), https://openreview.net/pdf?id=ju_Uqw384Oq
ATEPP is a dataset of expressive piano performances by virtuoso pianists. (ISMIR 2022)
High-resolution models for human tasks.
Generating Talking Face Landmarks from Speech
A toolkit for robust markerless 3D pose estimation
MANO hand model in PyTorch (anatomy consistent, anchors, etc)
Large dataset of hand-object contact, hand- and object-pose, and 2.9 M RGB-D grasp images.