- Soochow University
- https://yyding1.github.io
Stars
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
[TKDE] Code implementation of Paper "Towards DS-NER: Unveiling and Addressing Latent Noise in Distant Annotations"
A comprehensive collection of process reward models.
A series of math-specific large language models of our Qwen2 series.
verl: Volcano Engine Reinforcement Learning for LLMs
Integrate the DeepSeek API into popular software
DeepEP: an efficient expert-parallel communication library
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
Ongoing research training transformer models at scale
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"
Official code and data repository of MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions
[ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!
Train transformer language models with reinforcement learning.
Robust recipes to align language models with human and AI preferences
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"
[ACL-24 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"
Official implementation for the paper: "Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering"
Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)