Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A framework for few-shot evaluation of language models.
Building DeepSeek R1 from Scratch
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
An Open Large Reasoning Model for Real-World Solutions
A benchmark suite containing 1 million compilable programs, mined from the largest public C repositories on GitHub.
MutAP: A prompt-based learning technique to automatically generate test cases with Large Language Models
Decision Making in Non-Stationary Environments with Policy-Augmented Search
Bandit is a tool designed to find common security issues in Python code.
The official repo for the paper "Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation" (AAAI'24).
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
Reformer, the efficient Transformer, in PyTorch
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
Hugging Face-compatible implementation of RetNet (Retentive Networks, https://arxiv.org/pdf/2307.08621.pdf) including parallel, recurrent, and chunkwise forward.
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
A project-structure-aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% of tasks (pass@1) in SWE-bench lite and 46.2% of tasks (pass@1) in SWE-bench verified with…