Stars
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
TradingAgents: Multi-Agents LLM Financial Trading Framework
Solve Visual Understanding with Reinforced VLMs
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.
AI SYSTEMS TRANSPARENCY FOR ALL! - LEAKED SYSTEM PROMPTS FOR CHATGPT, GEMINI, GROK, CLAUDE, PERPLEXITY, CURSOR, WINDSURF, DEVIN, REPLIT, AND MORE!
slime is an LLM post-training framework aimed at scaling RL.
Training teachers with reinforcement learning that can make LLMs learn how to reason for test-time scaling.
Develop LLM Chat Applications with Electron.
Awesome papers & datasets specifically focused on long-term videos.
Distillation pipeline from pretrained Transformers to customized FLA models
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
Code for the paper "VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use"
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
Open-source coding LLM for software engineering tasks
ByteCheckpoint: A Unified Checkpointing Library for LFMs
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
Train your Agent model via our easy and efficient framework
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models