feat(dynamo-run): vllm and sglang subprocess engines by grahamking · Pull Request #954 · ai-dynamo/dynamo

feat(dynamo-run): vllm and sglang subprocess engines #954


Merged · 1 commit merged into main from gk-vllm-subprocess on May 6, 2025

Conversation

@grahamking (Contributor) commented May 6, 2025

New vllm and sglang engines that run in a sub-process. They will hopefully replace the existing embedded Python engines.

Why?

  • Pure Python, does not require knowing Rust to work on it. Much simpler to maintain.
  • No embedded Python interpreter, which avoids linking libpython and sidesteps the macOS virtualenv issues.
  • Should have better performance as it's "native" vllm / sglang.
  • Works with any version of vllm (including v1!) and sglang. Less upgrade struggle.

@grahamking force-pushed the gk-vllm-subprocess branch from 9fb2f04 to 0a30b3c on May 6, 2025 at 18:14
@grahamking changed the title from feat(dynamo-run): vllm subprocess engine to feat(dynamo-run): vllm and sglang subprocess engines on May 6, 2025
@rmccorm4 (Contributor) left a comment:


Sglang snuck in there since last review!

@rmccorm4 (Contributor) commented May 6, 2025

Probably want to see what can be common/shared between the examples instead of updating each the same way moving forward for a 3rd+ worker example (ex: a new argparse arg or something)

@grahamking (Contributor, Author):

> Sglang snuck in there since last review!

Hehe. The way our review process works is that I get max one PR per day, so needs must :-)

@grahamking (Contributor, Author):

> Probably want to see what can be common/shared between the examples instead of updating each the same way moving forward for a 3rd+ worker example (ex: a new argparse arg or something)

It's a bit tricky right now because those Python scripts are written to a temp file and executed as python3 <tempfile>. There's no simple way for them to share code.
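
For illustration, a minimal Rust sketch of the mechanism described above: the embedded Python source is written out to a temporary file and launched as a sub-process. The helper name, temp-file name, and argument handling here are assumptions, not the actual dynamo-run code.

```rust
use std::io::Write;
use std::process::{Child, Command};

/// Write the embedded Python source to a temporary file and run it as
/// `python3 <tempfile> <args>` in a sub-process (illustrative sketch only).
fn run_engine_script(py_source: &str, args: &[String]) -> std::io::Result<Child> {
    let mut path = std::env::temp_dir();
    path.push("engine_inc.py"); // hypothetical temp file name

    let mut file = std::fs::File::create(&path)?;
    file.write_all(py_source.as_bytes())?;

    // Execute the script, forwarding the engine arguments.
    Command::new("python3").arg(&path).args(args).spawn()
}
```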

@grahamking merged commit 28fd481 into main on May 6, 2025 (11 of 12 checks passed)
@rmccorm4 (Contributor) commented May 6, 2025

Just curious - why the intermediate temp file? Why not execute the file directly? Couldn't this break slightly more complicated files that depend on relative imports colocated to it?

(There's a WAR for that case via PYTHONPATH, but keeping it simple)

@grahamking (Contributor, Author) commented May 6, 2025

> Just curious - why the intermediate temp file? Why not execute the file directly? Couldn't this break slightly more complicated files that depend on relative imports colocated to it?
>
> (There's a WAR for that case via PYTHONPATH, but keeping it simple)

There is no file. We do `pub const PY: &str = include_str!("vllm_inc.py");`, which is a compile-time include. The whole Python program is a const string.

And yes, absolutely, the engine has to be a single self-contained file. I'm hoping to get the trt-llm example cleaned up enough to fit within this model.

The `out=vllm` and `out=sglang` engines are just syntactic sugar for `python sglang_inc.py --params ...`. It means we don't break people's scripts when we retire the old sglang and vllm engines. Serious deployments would copy the Python script, change it, and run it directly.
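
As a rough sketch of how the compile-time include and the `out=` sugar could fit together: only the `include_str!` usage is quoted from the comment above; the const names and the dispatch function are hypothetical, and the build expects vllm_inc.py / sglang_inc.py to sit next to this source file.

```rust
// Embed the whole Python programs as const strings at compile time.
pub const VLLM_PY: &str = include_str!("vllm_inc.py");
pub const SGLANG_PY: &str = include_str!("sglang_inc.py");

/// Map the `out=` value to the embedded script that should be run
/// (hypothetical dispatch, not the actual dynamo-run code).
fn engine_source(out: &str) -> Option<&'static str> {
    match out {
        "vllm" => Some(VLLM_PY),
        "sglang" => Some(SGLANG_PY),
        _ => None, // other engines handled elsewhere
    }
}
```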

@grahamking mentioned this pull request on May 6, 2025
grahamking added a commit that referenced this pull request May 6, 2025
What a cleanup! A breath of fresh air.

vllm and sglang are now the sub-process engines from #954

This means unless you build with `--feature python`, dynamo-run does not
link `libpython`.
grahamking added a commit that referenced this pull request May 7, 2025
vllm and sglang are now the sub-process engines from #954

Also updated docs on doing vllm and sglang multi-gpu (tensor parallel) and multi-node (pipeline parallel).