[Test][Autoscaler] deflaky unexpected dead actors in tests by more resources by rueian · Pull Request #3728 · ray-project/kuberay

[Test][Autoscaler] deflaky unexpected dead actors in tests by more resources #3728

Open · wants to merge 1 commit into base: master


rueian (Contributor) commented Jun 1, 2025

Why are these changes needed?

Allocate more resources to the tests to deflake them. With this change, the flaky test passes 200 times in a row without failures on my Mac.

Related issue number

Closes #3701

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

…sources

Signed-off-by: Rueian <rueiancsie@gmail.com>
@rueian rueian changed the title [Test][Autoscaler] deflaky unexpected dead actors in tests by more re… [Test][Autoscaler] deflaky unexpected dead actors in tests by more resources Jun 1, 2025
@rueian rueian marked this pull request as ready for review June 1, 2025 23:36
@rueian rueian mentioned this pull request Jun 1, 2025
2 tasks
kevin85421 (Member) commented Jun 2, 2025

Why did you conclude that the flakiness is due to resource issues? The resource configuration is unexpectedly high compared to the configuration before #3707.

This makes me feel these PRs are hot fixes rather than fixes for the real root causes.

rueian (Contributor, Author) commented Jun 2, 2025

> Why did you conclude that the flakiness is due to resource issues? The resource configuration is unexpectedly high compared to the configuration before #3707.
>
> This makes me feel these PRs are hot fixes rather than fixes for the real root causes.

#3707 has already resolved the unexpected actor exit issue on my MacBook. However, the problem still occurs on the Buildkite CI runners. I’m wondering if this could be due to the runners having lower performance. What we could try now is increasing the resource requirement once again for the CI runners.

Yes, you are right. If this still doesn't fix the flakiness on the CI runners, we will need to find another way.

Development

Successfully merging this pull request may close these issues.

[CI] Deflaky Autoscaler V2 e2e tests