docs: R1 disaggregation guide by GuanLuo · Pull Request #720 · ai-dynamo/dynamo · GitHub

docs: R1 disaggregation guide #720

Merged
GuanLuo merged 4 commits into main from gluo/multi-node-disagg
Apr 22, 2025

Conversation

GuanLuo
Contributor
@GuanLuo GuanLuo commented Apr 17, 2025

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

copy-pr-bot bot commented Apr 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@pull-request-size pull-request-size bot added the size/M label Apr 17, 2025
@tlipoca9
Contributor

Has the issue #513 been resolved?

Base automatically changed from gluo/multi-node-aggregated to main April 18, 2025 22:09
@pull-request-size pull-request-size bot added size/L and removed size/M labels Apr 18, 2025
@GuanLuo GuanLuo force-pushed the gluo/multi-node-disagg branch from b8b7b80 to c962b09 Compare April 21, 2025 17:31
@pull-request-size pull-request-size bot added size/M and removed size/L labels Apr 21, 2025
@GuanLuo GuanLuo changed the title from "docs: R1 disaggregation [WIP]" to "docs: R1 disaggregation guide" Apr 21, 2025
@GuanLuo GuanLuo marked this pull request as ready for review April 21, 2025 17:32
@GuanLuo
Contributor Author
GuanLuo commented Apr 21, 2025

> Has the issue #513 been resolved?

@tlipoca9 The issue is not fully resolved: PP = 1 (no pipeline parallelism) is still a current limitation. You can still deploy a large model by splitting it purely with tensor parallelism (TP); for example, this guide uses TP = 16.
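
To make the sizing implied by that reply concrete, here is a minimal Python sketch of the GPU and node arithmetic for a TP-only deployment. The helper names `gpus_required` and `nodes_required` are hypothetical, and the default of 8 GPUs per node is an assumption for illustration; neither comes from this PR or from the dynamo codebase.

```python
import math


def gpus_required(tp: int, pp: int = 1) -> int:
    """GPUs needed for one model replica: TP ranks per stage times PP stages."""
    if tp < 1 or pp < 1:
        raise ValueError("tp and pp must be positive integers")
    return tp * pp


def nodes_required(tp: int, pp: int = 1, gpus_per_node: int = 8) -> int:
    """Nodes needed for one replica; gpus_per_node=8 is an assumed default."""
    if gpus_per_node < 1:
        raise ValueError("gpus_per_node must be a positive integer")
    return math.ceil(gpus_required(tp, pp) / gpus_per_node)


if __name__ == "__main__":
    # With PP fixed at 1, the whole model is sharded across TP ranks only.
    tp, pp = 16, 1
    print(f"TP={tp}, PP={pp}: {gpus_required(tp, pp)} GPUs, "
          f"{nodes_required(tp, pp)} nodes at 8 GPUs per node (assumed)")
```

Under the assumed node size, TP = 16 with PP = 1 spans more than one node, which is why the example in this guide is a multi-node deployment.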

@GuanLuo GuanLuo merged commit e06bfd5 into main Apr 22, 2025
6 checks passed
@GuanLuo GuanLuo deleted the gluo/multi-node-disagg branch April 22, 2025 23:20
3 participants