llm
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5); an FP6 rounding sketch follows this list.
Quick implementation of nGPT, which learns entirely on the hypersphere, from NvidiaAI; a normalization sketch follows this list.
This is the official repository for the ICML 2024 paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors"; a compression sketch follows this list.
Reverse Engineering: Decompiling Binary Code with Large Language Models
Official inference framework for 1-bit LLMs; a ternary quantization sketch follows this list.
Run PyTorch LLMs locally on servers, desktop and mobile
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization (power-of-two sketch after this list)
Lightweight, standalone C++ inference engine for Google's Gemma models.
fastllm is a high-performance LLM inference library implemented in C++, with no backend dependencies (it requires only CUDA, not PyTorch). It can run the DeepSeek R1 671B INT4 model on a single RTX 4090 at 20+ tokens/s per stream.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs; an API sketch follows below.
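Several entries above revolve around sub-8-bit floating-point formats. As a concrete illustration of the FP6 quantization named in the first entry, here is a minimal sketch that rounds weights to the FP6 grid, assuming the common E3M2 layout (1 sign, 3 exponent, 2 mantissa bits); the helper names are mine, and this is plain nearest-value rounding, not the repo's optimized GPU kernels.

```python
import torch

def fp6_grid(e_bits=3, m_bits=2, bias=3):
    # Enumerate every representable E3M2 value: zero, subnormals, normals.
    vals = {0.0}
    for m in range(1, 2 ** m_bits):                      # subnormals
        vals.add(2.0 ** (1 - bias) * m / 2 ** m_bits)
    for e in range(1, 2 ** e_bits):                      # normals
        for m in range(2 ** m_bits):
            vals.add(2.0 ** (e - bias) * (1 + m / 2 ** m_bits))
    grid = torch.tensor(sorted(vals))
    return torch.cat([(-grid).flip(0)[:-1], grid])       # mirror negatives

def quantize_fp6(w: torch.Tensor) -> torch.Tensor:
    # Round each weight to its nearest FP6 value; magnitudes beyond the
    # largest representable value (28.0 for E3M2) clamp to it.
    grid = fp6_grid().to(w.dtype)
    idx = (w.unsqueeze(-1) - grid).abs().argmin(dim=-1)
    return grid[idx]
```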
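For the nGPT entry, "learning entirely on the hypersphere" means weight vectors and hidden states are kept at unit L2 norm. Below is a minimal sketch of that idea with hypothetical helper names; the actual repo implements full normalized attention/MLP blocks with learnable step sizes.

```python
import torch
import torch.nn.functional as F

def renormalize_(linear: torch.nn.Linear) -> None:
    # After each optimizer step, project every weight vector back onto
    # the unit hypersphere (simplified version of nGPT's weight norm).
    with torch.no_grad():
        linear.weight.div_(linear.weight.norm(dim=-1, keepdim=True))

def sphere_residual(h: torch.Tensor, block_out: torch.Tensor,
                    alpha: float = 0.1) -> torch.Tensor:
    # Replace the usual residual h + f(h) with a normalized step
    # toward the block output, so hidden states stay on the sphere.
    return F.normalize(h + alpha * (block_out - h), dim=-1)
```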
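For the Flora entry, the paper's observation is that a LoRA-style low-rank update acts like compressing the gradient with a random projection that can be regenerated from a seed, so only the projected gradient needs to be stored. A minimal sketch of that compression step; the function names are mine, not the repo's API.

```python
import torch

def compress(grad: torch.Tensor, rank: int, seed: int) -> torch.Tensor:
    # Project an m-by-n gradient down to m-by-rank; storing only the
    # seed lets us regenerate the projection instead of keeping it.
    gen = torch.Generator(device=grad.device).manual_seed(seed)
    p = torch.randn(grad.shape[1], rank, generator=gen,
                    device=grad.device) / rank ** 0.5
    return grad @ p

def decompress(c: torch.Tensor, n: int, rank: int, seed: int) -> torch.Tensor:
    # Unbiased low-rank estimate of the gradient: E[G P P^T] = G when
    # P has i.i.d. N(0, 1/rank) entries.
    gen = torch.Generator(device=c.device).manual_seed(seed)
    p = torch.randn(n, rank, generator=gen, device=c.device) / rank ** 0.5
    return c @ p.T
```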
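For the 1-bit LLM entry, the BitNet b1.58 recipe quantizes weights to the ternary set {-1, 0, +1} with a single absmean scale per tensor, so matrix multiplication needs only additions plus one rescale. A minimal sketch of the quantizer only; the repo's value is its optimized inference kernels, which this does not reproduce.

```python
import torch

def absmean_ternary(w: torch.Tensor):
    # BitNet b1.58-style quantization: scale by the mean absolute
    # value, then round and clip to {-1, 0, +1}.
    scale = w.abs().mean().clamp(min=1e-8)
    wq = (w / scale).round().clamp(-1, 1)
    return wq, scale  # dequantize as wq * scale
```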
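For ShiftAddLLM, "multiplication-less" means weight-activation products are re-expressed with shifts and adds. The paper's actual reparameterization uses binary-coded quantization with several shift terms per weight; the simplified sketch below shows only the core trick, rounding weights to powers of two so a multiply becomes a sign flip plus a bit shift.

```python
import torch

def power_of_two_quantize(w: torch.Tensor) -> torch.Tensor:
    # Snap each weight to its nearest power of two; in fixed-point
    # arithmetic, multiplying by 2^k is a k-bit shift.
    sign = torch.sign(w)
    exp = torch.log2(w.abs().clamp(min=2.0 ** -8)).round()
    return sign * torch.exp2(exp.clamp(-8, 8))  # illustrative exponent range
```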
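Finally, the TensorRT-LLM entry advertises an easy-to-use Python API. A minimal sketch of its high-level LLM API, assuming a recent release; the checkpoint name is only illustrative.

```python
from tensorrt_llm import LLM, SamplingParams

# Build an engine for a Hugging Face checkpoint and run generation.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
outputs = llm.generate(
    ["What does FP6 quantization trade away?"],
    SamplingParams(max_tokens=64, temperature=0.8),
)
print(outputs[0].outputs[0].text)
```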