Stars
Lord of Large Language and Multimodal Systems Web User Interface
LostRuins / koboldcpp
Forked from ggml-org/llama.cpp
Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
LLM UI with advanced features, easy setup, and multiple backend support.
Official implementation for 'Extending LLMs’ Context Window with 100 Samples'
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
Modeling, training, eval, and inference code for OLMo
An innovative library for efficient LLM inference via low-bit quantization
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
High-speed Large Language Model Serving for Local Deployment
Letta (formerly MemGPT) is the stateful agents framework with memory, reasoning, and context management.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model’s information retrieval cap…
Tools for merging pretrained large language models.
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
Explore large language models in 512MB of RAM