MachineLearning-System
Popular repositories
- metaflow_sysml19 Public Forked from jiazhihao/metaflow_sysml19
Repository for SysML19 Artifacts Evaluation
C++
- vllm Public Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python
- FlexFlow Public Forked from flexflow/flexflow-train
A distributed deep learning framework that supports flexible parallelization strategies.
C++
- nnfusion Public Forked from microsoft/nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
C++
- flash-llm Public Forked from AlibabaResearch/flash-llm
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
Cuda
Repositories
- vllm Public Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
- streaming-llm Public Forked from mit-han-lab/streaming-llm
Efficient Streaming Language Models with Attention Sinks
- 24Eurosys-orion Public Forked from eth-easl/orion
An interference-aware scheduler for fine-grained GPU sharing
- 24Eurosys-DynaPipe-Megatron-LM Public Forked from chenyu-jiang/Megatron-LM
Ongoing research training transformer models at scale
- flash-llm Public Forked from AlibabaResearch/flash-llm
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
- 23sosp-paella- Public Forked from eniac/paella
Paella: Low-latency Model Serving with Virtualized GPU Scheduling
- nnfusion Public Forked from microsoft/nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
- FlexFlow Public Forked from flexflow/flexflow-train
A distributed deep learning framework that supports flexible parallelization strategies.
People
This organization has no public members. You must be a member to see who’s a part of this organization.