[Proposal] Enhance default routing and Placement Strategy using Directory Partitioning Scheme · Issue #9426 · dotnet/orleans · GitHub

[Proposal] Enhance default routing and Placement Strategy using Directory Partitioning Scheme #9426


Open
ReubenBond opened this issue Apr 7, 2025 · 0 comments

ReubenBond commented Apr 7, 2025

Current client routing (sticky bucketing) and default grain placement strategies can cause unnecessary network hops, increasing activation latency and recovery churn, as they aren't optimally aligned with the grain's directory partition location.

This proposal has two parts. The first part is to change the default placement to ResourceOptimizedPlacement and update that director so that it has four tiers instead of three:

  • PreferLocal if local silo is within load tolerance, otherwise
  • Place on the silo which hosts the grain's directory partition if that silo is within tolerance, otherwise
  • Place on the least-loaded silo (Power of K choices), otherwise
  • Place on a random compatible silo (fallback for the case where we have no resource usage stats)

This optimizes for good locality between caller and callee, and between callee and directory, while still keeping the system balanced.
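
The four tiers can be read as a simple decision cascade. The following is a minimal, language-neutral sketch in Python rather than Orleans code: `Silo`, `pick_silo`, the fixed `tolerance` threshold, and the default `k` are all hypothetical names and values chosen for illustration, and "within load tolerance" is assumed here to mean "reported load at or below a threshold".

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Silo:
    """Hypothetical silo record; load is None when no stats are available."""
    name: str
    load: Optional[float]  # assumed scale: 0.0 (idle) to 1.0 (saturated)


def within_tolerance(silo: Silo, tolerance: float) -> bool:
    # A silo without stats is never treated as "within tolerance".
    return silo.load is not None and silo.load <= tolerance


def pick_silo(local: Optional[Silo],
              directory_owner: Optional[Silo],
              compatible: Sequence[Silo],
              tolerance: float = 0.8,
              k: int = 2,
              rng=random) -> Silo:
    """Pick a silo for a new activation using the four proposed tiers."""
    # Tier 1: prefer the calling silo if it is compatible and not overloaded.
    if local in compatible and within_tolerance(local, tolerance):
        return local
    # Tier 2: prefer the silo hosting the grain's directory partition.
    if directory_owner in compatible and within_tolerance(directory_owner, tolerance):
        return directory_owner
    # Tier 3: power of K choices over silos that have reported stats.
    with_stats = [s for s in compatible if s.load is not None]
    if with_stats:
        sample = rng.sample(with_stats, min(k, len(with_stats)))
        return min(sample, key=lambda s: s.load)
    # Tier 4: no resource usage stats at all, so fall back to a random silo.
    return rng.choice(list(compatible))
```

Passing `local=None` models a call that arrives from an external client, in which case the cascade starts at the directory-partition tier.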

The second part of the idea capitalizes on the first by making clients route to the gateway corresponding to the grain's directory partition. It involves removing the sticky bucketing system that we currently use and adding functionality to subscribe to membership changes from the cluster (e.g., via an observer callback).
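
As a rough illustration of the client side, the sketch below keeps a local copy of a partition map and rebuilds it whenever a membership update arrives. It is not the Orleans client API: `DirectoryAwareRouter`, `on_membership_changed`, and `gateway_for` are hypothetical names, and it assumes that the directory is partitioned with a consistent hash ring using virtual nodes, that every active silo is also a gateway, and that the client can compute the same hash as the cluster.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def stable_hash(key: str) -> int:
    # Process-independent hash (Python's built-in hash() is salted per run).
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class DirectoryAwareRouter:
    """Hypothetical client-side router keyed on directory partition ownership."""

    def __init__(self, vnodes_per_silo: int = 16) -> None:
        self._vnodes = vnodes_per_silo
        self._ring: List[Tuple[int, str]] = []  # sorted (hash, silo) pairs

    def on_membership_changed(self, active_silos: Iterable[str]) -> None:
        # Observer callback: rebuild the ring from the latest membership view.
        ring = [(stable_hash(f"{silo}#{i}"), silo)
                for silo in active_silos
                for i in range(self._vnodes)]
        ring.sort()
        self._ring = ring

    def gateway_for(self, grain_id: str) -> str:
        # Route to the silo owning the first ring position at or after the key.
        if not self._ring:
            raise LookupError("no active silos known")
        idx = bisect.bisect_left(self._ring, (stable_hash(grain_id), ""))
        return self._ring[idx % len(self._ring)][1]
```

With a ring like this, removing a silo only moves the grains that it owned; requests for all other grains keep going to the same gateway, which is the property that makes per-grain routing a plausible replacement for sticky bucketing.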

This should result in the majority of calls submitted by an external client requiring no network round trips to the directory. Calls originating from a silo or grain can take advantage of locality (if within tolerance) and are otherwise co-located with their directory partition for improved activation performance (while still being much better balanced than RandomPlacement, since the directory partitioning scheme has a very low chance of significant imbalance). It also reduces grain directory recovery costs in the case of a silo crash.
