- Jane Street
- New York
- https://fanpu.io
- @FanPu_Zeng
Stars
The official implementation of Self-Play Fine-Tuning (SPIN)
A curated list of human preference datasets for LLM fine-tuning, RLHF, and evaluation.
A quick guide to trending instruction fine-tuning datasets.
A PyTorch library for flow matching, with both continuous and discrete implementations and practical examples for text and image modalities.
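As a rough illustration of the continuous case, here is a minimal conditional flow-matching training step with a linear probability path; the `VelocityNet` model and the stand-in data batch are hypothetical, not the library's API.

```python
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Hypothetical velocity field v_theta(x_t, t); any regressor works."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(), nn.Linear(256, dim))

    def forward(self, x, t):
        return self.net(torch.cat([x, t[:, None]], dim=-1))

def fm_loss(model, x1):
    """Conditional flow matching with the linear path x_t = (1-t) x0 + t x1,
    whose target velocity is simply x1 - x0."""
    x0 = torch.randn_like(x1)                 # noise endpoint
    t = torch.rand(x1.shape[0])               # uniform time in [0, 1]
    xt = (1 - t[:, None]) * x0 + t[:, None] * x1
    return ((model(xt, t) - (x1 - x0)) ** 2).mean()

model = VelocityNet(dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x1 = torch.randn(64, 2)                       # stand-in batch of data samples
opt.zero_grad(); fm_loss(model, x1).backward(); opt.step()
```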
A toolkit to run Ray applications on Kubernetes
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
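"Fine-grained scaling" here means quantizing in small blocks, each with its own scale factor, rather than one scale per tensor. The sketch below emulates per-block FP8 (e4m3) quantization in plain PyTorch purely to show the idea; it is not DeepGEMM's kernel code.

```python
import torch

FP8_E4M3_MAX = 448.0  # largest finite value representable in e4m3

def quantize_blockwise(x, block=128):
    """Emulate fine-grained FP8 scaling: one scale per `block` elements.
    Real kernels store the FP8 values plus the per-block scales."""
    xb = x.view(x.shape[0], -1, block)
    scale = xb.abs().amax(dim=-1, keepdim=True).clamp(min=1e-12) / FP8_E4M3_MAX
    q = (xb / scale).clamp(-FP8_E4M3_MAX, FP8_E4M3_MAX)
    if hasattr(torch, "float8_e4m3fn"):       # apply actual fp8 rounding if available
        q = q.to(torch.float8_e4m3fn).to(torch.float32)
    return q, scale

def dequantize_blockwise(q, scale):
    return (q * scale).reshape(q.shape[0], -1)

x = torch.randn(4, 256)
q, s = quantize_blockwise(x)
err = (dequantize_blockwise(q, s) - x).abs().max()
print(f"max round-trip error: {err:.2e}")     # small, since each block gets its own scale
```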
Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models
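The asynchrony means generation and policy updates overlap instead of alternating. A toy version of that producer/consumer split, with stub functions standing in for rollout generation and the RL update:

```python
import queue
import threading
import time

rollouts = queue.Queue(maxsize=8)  # bounded queue limits how stale rollouts can get

def generate_rollout(step):
    """Stand-in for sampling completions from a (possibly stale) policy copy."""
    time.sleep(0.01)
    return {"step": step, "tokens": [1, 2, 3]}

def train_step(batch):
    """Stand-in for the policy update; asynchronous rollouts are slightly
    off-policy, which the real method must correct for."""
    time.sleep(0.01)

def actor():
    for step in range(100):
        rollouts.put(generate_rollout(step))   # producer never waits on training

def learner():
    for _ in range(100):
        train_step(rollouts.get())             # consumer never waits on generation

t = threading.Thread(target=actor)
t.start()
learner()
t.join()
```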
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
A flexible, high-performance serving system for machine learning models
Code for CRATE (Coding RAte reduction TransformEr).
[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
Official Repository of Absolute Zero Reasoner
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
OLMoE: Open Mixture-of-Experts Language Models
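Skywork-MoE and OLMoE share the same core moving part: a learned router that sends each token to its top-k experts. A minimal, generic top-k gating layer (not any one repo's implementation) might look like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Generic top-k mixture-of-experts layer: a linear router picks k experts
    per token and their outputs are combined with renormalized gate weights."""
    def __init__(self, dim, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                      # x: (tokens, dim)
        gates = F.softmax(self.router(x), dim=-1)
        weights, idx = gates.topk(self.k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.k):             # dense loop; real kernels dispatch sparsely
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

layer = TopKMoE(dim=64)
y = layer(torch.randn(16, 64))                 # -> (16, 64)
```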
[ICLR 2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"
Efficient Triton Kernels for LLM Training
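In the same spirit as these kernels, a minimal fused elementwise Triton kernel (here SwiGLU: silu(a) * b) looks like the sketch below; it is written from scratch, not taken from the repo, and needs a CUDA GPU to run.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def swiglu_kernel(a_ptr, b_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements
    a = tl.load(a_ptr + offsets, mask=mask)
    b = tl.load(b_ptr + offsets, mask=mask)
    # Fused SwiGLU: silu(a) * b in one pass, with no intermediate tensors.
    tl.store(out_ptr + offsets, a * tl.sigmoid(a) * b, mask=mask)

def swiglu(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(a)
    n = a.numel()
    swiglu_kernel[(triton.cdiv(n, 1024),)](a, b, out, n, BLOCK=1024)
    return out

a = torch.randn(4096, device="cuda")
b = torch.randn(4096, device="cuda")
torch.testing.assert_close(swiglu(a, b), torch.nn.functional.silu(a) * b)
```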