guopeng-gpli (guopeng li) / Starred · GitHub

Framework for AI on mobile devices and wearables, hardware-aware C/C++ backend, with wrappers for Kotlin, Java, Swift, React, Flutter.

C++ 338 15 Updated May 17, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 8,885 883 Updated May 7, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 14,060 986 Updated May 17, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,548 834 Updated Apr 29, 2025

Huawei Cloud datasets

Jupyter Notebook 68 11 Updated Apr 15, 2025

TensorZero creates a feedback loop for optimizing LLM applications — turning production data into smarter, faster, and cheaper models.

Rust 4,176 276 Updated May 17, 2025

Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]

HTML 22 3 Updated May 13, 2025

A GPU/CUDA implementation of the Hungarian algorithm

Cuda 111 19 Updated Apr 12, 2019

Redis for LLMs

Python 1,058 159 Updated May 17, 2025

Beginner-friendly serverless LLM deployment with Replicate & fly.io

Python 13 2 Updated Sep 3, 2023

Caribou is a framework for geo-distributed deployment of serverless workflows to save carbon emissions.

Python 8 3 Updated May 4, 2025

USTC (University of Science and Technology of China) thesis proposal LaTeX template

TeX 23 3 Updated Dec 26, 2019

Code for reproducing the results of the SOSP paper Bagpipe

Python 9 3 Updated Oct 20, 2023

Efficient and easy multi-instance LLM serving

Python 413 31 Updated May 16, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, Parallelism, MLA, etc.

Python 4,000 277 Updated May 15, 2025

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

Python 12,076 1,115 Updated May 12, 2025

PyTorch-based Chinese intent recognition and slot filling

Python 174 27 Updated Jul 3, 2024

Awesome Mobile LLMs

185 12 Updated Mar 23, 2025

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 97,533 14,617 Updated May 17, 2025

BERT-based intent and slots detector for chatbots.

Python 180 27 Updated Feb 21, 2025

A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems

Python 167 9 Updated Oct 15, 2024

A curated list of high-quality papers on resource-efficient LLMs 🌱

120 7 Updated Mar 15, 2025

Serverless LLM Serving for Everyone.

Python 466 42 Updated Apr 24, 2025

Systems paper reading notes

244 13 Updated Mar 3, 2022

Large Language Model (LLM) Systems Paper List

1,223 69 Updated May 17, 2025

A curated list for Efficient Large Language Models

Python 1,660 134 Updated Apr 23, 2025

Semantic Kernel (SK) is a lightweight SDK enabling integration of AI Large Language Models (LLMs) with conventional programming languages.

Mermaid 216 126 Updated May 16, 2025

🚀 Docker image proxy: uses GitHub Actions to convert images hosted outside China (docker.io, gcr.io, registry.k8s.io, k8s.gcr.io, quay.io, ghcr.io, and others) into mirrors inside China for faster downloads

Go 1,088 679 Updated Feb 25, 2025

Secure Transformer Inference is a protocol for serving Transformer-based models securely.

Python 92 22 Updated May 8, 2024

Integrate cutting-edge LLM technology quickly and easily into your apps

C# 24,480 3,824 Updated May 17, 2025