jianzhu (steve) / Starred · GitHub
🎯
Focusing


Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

Python 73 5 Updated May 29, 2025

✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Python 141 7 Updated May 9, 2025

Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities

878 40 Updated Apr 20, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 9,115 1,177 Updated Jun 8, 2025

Fully open reproduction of DeepSeek-R1

Python 24,704 2,286 Updated Jun 2, 2025

Understanding R1-Zero-Like Training: A Critical Perspective

Python 974 45 Updated May 24, 2025

PyTorch implementation of AWR.

Python 4 1 Updated Apr 29, 2020

[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training

Python 205 14 Updated May 19, 2025
Python 102 8 Updated Apr 8, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,806 278 Updated May 15, 2025

s1: Simple test-time scaling

Python 6,429 749 Updated May 19, 2025

Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Python 280 15 Updated Apr 28, 2025

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Jupyter Notebook 454 46 Updated Oct 20, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 41,720 6,958 Updated Dec 9, 2024
Python 185 11 Updated Dec 2, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,744 376 Updated Jun 5, 2025

sangmichaelxie / doremi

PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets

HTML 328 33 Updated Dec 26, 2023
Python 936 106 Updated Jan 23, 2025

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,334 283 Updated Nov 5, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 22,097 2,348 Updated Mar 13, 2025

Hung-yi Lee's Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍, known as the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

Jupyter Notebook 15,212 3,038 Updated Jun 7, 2025
Jupyter Notebook 44 4 Updated Jan 18, 2024

Efficient Triton Kernels for LLM Training

Python 5,163 346 Updated Jun 7, 2025

[AAAI 2024] Code for CTX-vec2wav in UniCATS

Python 129 16 Updated Jun 11, 2024

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,591 94 Updated Sep 27, 2024

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 11,543 1,139 Updated Nov 14, 2024

A fast multimodal LLM for real-time voice

Python 3,995 304 Updated Feb 14, 2025