xcwanAndy (xcwan) / Starred · GitHub

Highlights

  • Pro

Organizations

@HKUST-SING


Large Language Model (LLM) Systems Paper List

1,234 69 Updated May 17, 2025

Distributed Triton for Parallel Systems

Python 748 48 Updated May 23, 2025
Python 5 Updated Mar 25, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 4,082 374 Updated May 23, 2025

Since the emergence of ChatGPT in 2022, accelerating Large Language Models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on infer…

251 12 Updated Mar 6, 2025

No fortress, purely open ground. OpenManus is Coming.

Python 45,966 8,014 Updated May 20, 2025

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 16,597 1,950 Updated May 23, 2025

Python tool for converting files and office documents to Markdown.

Python 57,794 2,963 Updated May 21, 2025

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,639 415 Updated Mar 5, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 8,911 886 Updated May 21, 2025

Analyze computation-communication overlap in V3/R1.

1,039 142 Updated Mar 21, 2025

Expert Parallelism Load Balancer

Python 1,195 190 Updated Mar 24, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,781 295 Updated Mar 10, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,369 598 Updated May 20, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,682 772 Updated May 23, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,564 835 Updated Apr 29, 2025

Official Repo for Open-Reasoner-Zero

Python 1,928 99 Updated Apr 8, 2025

A throughput-oriented high-performance serving framework for LLMs

Cuda 812 37 Updated May 10, 2025

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

JavaScript 95,796 12,366 Updated May 22, 2025

Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.

Go 141,398 11,840 Updated May 23, 2025

Fully open reproduction of DeepSeek-R1

Python 24,523 2,258 Updated May 22, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,291 2,235 Updated Feb 1, 2025

A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.

TypeScript 14,122 1,165 Updated May 23, 2025

A general and accurate MACs / FLOPs profiler for PyTorch models

Python 611 42 Updated May 5, 2024

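2025 recommendations for VPN and firewall-circumvention software in China, with pitfalls to avoid; stable and easy to use. Compares circumvention tools including SSR airports, Lantern, V2ray, Laowang VPN, and self-hosted VPS proxies, with the latest VPN download recommendations for China and access to ChatGPT.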

HTML 17,492 1,541 Updated May 8, 2025

🙌 OpenHands: Code Less, Make More

Python 55,337 6,223 Updated May 23, 2025

A simple, performant and scalable Jax LLM!

Python 1,728 350 Updated May 23, 2025

New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos

Jupyter Notebook 7,987 512 Updated Apr 29, 2025