GitHub - osayamenja/Kleos: Complete GPU residency for ML.

osayamenja/Kleos

Kleos Conceptual Overview

Complete, efficient GPU residency for Distributed Machine Learning.


🌹Kleos: GPU-Resident Runtime for ML

Kleos is an ongoing research project exploring the design of GPU-resident infrastructure for distributed machine learning workloads. The goal is to eliminate CPU bottlenecks by fusing scheduling, communication, and compute directly on the GPU using lightweight, asynchronous primitives.

🎯 Vision and Methods

We aim to achieve complete, efficient GPU residency. We investigate optimizations that approach hardware peak performance for both distributed and single-GPU workloads in which bulk-synchronous, CPU-driven orchestration is a limiting factor. To realize this vision, we employ kernel fusion enabled by (1) algorithmic innovations with strong theoretical footing and (2) principled systems design and implementation.

This repository is a very early-stage release of the Kleos infrastructure.


🗞️ News

  • June 5, 2025 — ⚡️Introducing FlashDMoE, a fused GPU kernel for distributed MoE execution.
    ➤ See this README for details, benchmarks, and usage.

⚖️ License

This project is licensed under the BSD 3-Clause License. See LICENSE for full terms.
