Starred repositories
Code for data-aware compression of DeepSeek models
🚀 One-stop solution for creating your digital avatar from chat history 💡 Fine-tune LLMs with your chat logs to capture your unique style, then bind to a chatbot to bring your digital self to life. …
Fully open reproduction of DeepSeek-R1
A simple CLI chatbot that demonstrates the integration of the Model Context Protocol (MCP).
Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.
A pedagogical implementation of Autograd
Benchmarking the serving capabilities of vLLM
FlashMLA: Efficient MLA decoding kernels
Serving CrewAI Agent as a REST API with BentoML, optionally with self-hosted open-source LLMs
[ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
A survey and reflection on the latest research breakthroughs in LLM-generated text detection, including data, detectors, metrics, current issues and future directions.
A modular graph-based Retrieval-Augmented Generation (RAG) system
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Build a Perplexity-Inspired Answer Engine Using Next.js, Groq, Llama-3, LangChain, OpenAI, Upstash, Brave & Serper
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transf…
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
A framework for prompt tuning using Intent-based Prompt Calibration
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
📚 A curated list of Awesome LLM Inference Papers with Codes.
Python library & examples for Masked Language Model Scoring (ACL 2020)