Release Dynamo Release v0.2.1 · ai-dynamo/dynamo · GitHub

Dynamo Release v0.2.1

@nnshah1 released this 22 May 23:45
b950ec5

Dynamo is an open source project released under the Apache 2.0 license. It is distributed primarily as pip wheels with minimal binary size. The ai-dynamo GitHub organization hosts two repositories: dynamo and NIXL. Dynamo is designed as the next-generation inference server, building on the foundations of the Triton Inference Server. Triton focuses on single-node inference deployments, and we are committed to integrating its robust single-node capabilities into Dynamo within the next several months. We will continue to support Triton while providing a seamless migration path for existing users once Dynamo reaches feature parity. As a vendor-agnostic serving framework, Dynamo supports multiple LLM inference engines, including TRT-LLM, vLLM, and SGLang, with varying degrees of maturity and support.

Dynamo v0.2.1 features:

  • KV Block Manager! (see intro)
  • Improved vLLM performance by avoiding re-initializing sampling params
  • SGLang support! (see README.md)
  • Multi-Modal E/P/D Disaggregation! (see README.md)
  • Leader Worker Set K8s!
  • Qwen3, Gemma3 and Llama4 in Dynamo Run!

Future plans

Dynamo Roadmap

Known Issues

  • Benchmark guides are still being validated on public cloud instances (GCP / AWS)

What's Changed

🚀 Features & Improvements

🐛 Bug Fixes

  • fix: Extract tokenizer from GGUF for Qwen3 and Gemma3 arch by @grahamking in #1011

Other Changes

New Contributors

Full Changelog: v0.2.0...v0.2.1
