🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…

Python 62,105 6,289 Updated Jul 20, 2025

All-Hands-AI / OpenHands

🙌 OpenHands: Code Less, Make More

Python 61,067 7,206 Updated Jul 24, 2025

CorentinJ / Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python 54,746 9,038 Updated May 30, 2025

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 41,585 5,419 Updated Aug 16, 2024

LC044 / WeChatMsg

39,810 4,375 Updated Apr 26, 2025

suno-ai / bark

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 38,245 4,563 Updated Aug 19, 2024

myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model.

Python 33,424 3,561 Updated Apr 19, 2025

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 31,664 6,574 Updated Jun 10, 2025

waydabber / BetterDisplay

Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!

25,918 452 Updated Jun 30, 2025

Genesis-Embodied-AI / Genesis

A generative world for general-purpose robotics & embodied AI learning.

Python 25,883 2,360 Updated Jul 23, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,121 2,341 Updated Jul 21, 2025

QwenLM / Qwen3

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 23,068 1,561 Updated Jul 23, 2025

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 21,568 2,644 Updated Jul 3, 2025

amusi / CVPR2025-Papers-with-Code

A collection of CVPR 2025 papers and open-source projects

20,554 2,713 Updated Jul 2, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 18,523 1,826 Updated Jul 24, 2025

CopyTranslator / CopyTranslator

🔠Foreign language reading and translation assistant based on copy and translate.

TypeScript 17,434 1,935 Updated Nov 29, 2024

eriklindernoren / PyTorch-GAN

PyTorch implementations of Generative Adversarial Networks.

Python 17,175 4,098 Updated Jun 18, 2024

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

15,897 1,037 Updated Jul 11, 2025

datawhalechina / leedl-tutorial

Hung-yi Lee Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

Jupyter Notebook 15,503 3,052 Updated Jun 13, 2025

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 15,364 1,629 Updated Jul 16, 2025

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 15,193 3,010 Updated Jul 24, 2025

AliaksandrSiarohin / first-order-model

This repository contains the source code for the paper First Order Motion Model for Image Animation

Jupyter Notebook 14,895 3,282 Updated Nov 14, 2024

eugeneyan / open-llms

📋 A list of open LLMs available for commercial use.

12,211 885 Updated Feb 13, 2025

Rudrabha / Wav2Lip

This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For HD commercial model, please try out Sync Labs

Python 12,209 2,605 Updated Jun 22, 2025

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 11,659 1,179 Updated Jul 23, 2025

facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 11,607 1,143 Updated Nov 14, 2024

magic-research / magic-animate

[CVPR 2024] Official repository for "MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model"

Python 10,816 1,104 Updated Jun 21, 2024

Jiaxin Ye Jiaxin-Ye

Lists (8)

Affective Computing 🤓

AIGC 🫨

Diffusion-based Method 🫡

FaceTTS 😊🎙️

Mamba 🐍

Speech Generation 🎤

Talking Head Generation 🤖️

Toolkit 👍

Stars