8000 Do not create ApplyPolicyFailed for conflicts · Issue #6267 · karmada-io/karmada · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Do not create ApplyPolicyFailed for conflicts #6267

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
grosser opened this issue Apr 4, 2025 · 11 comments · May be fixed by #6349
Open

Do not create ApplyPolicyFailed for conflicts #6267

grosser opened this issue Apr 4, 2025 · 11 comments · May be fixed by #6349
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Milestone

Comments

@grosser
Copy link
Contributor
grosser commented Apr 4, 2025

What would you like to be added:
when ApplyPolicyFailed happens due to a conflict, either issue it as "Normal" or retry a few times first

Why is this needed:
this made our alerts spammy that alert on "Warn events" for all apps, and not random apps get these events.
We had to ignore the events, but ideally we want to still alert when "non-conflict" sync fails (were retry does not help and someone needs to fix something)

5m29s       Warning   ApplyPolicyFailed            configmap/foo                                Apply cluster policy(duplicate) failed: resourcebindings.work.karmada.io "foo" already exists

11m         Warning   ApplyPolicyFailed            deployment/foo                                Apply cluster policy(duplicate) failed: Operation cannot be fulfilled on deployments.apps "foo": the object has been modified; please apply your changes to the latest version and try again
@grosser grosser added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 4, 2025
@XiShanYongYe-Chang
Copy link
Member

Operation cannot be fulfilled on deployments.apps "foo": the object has been modified; please apply your changes to the latest version and try again

Hi @grosser, do you mean to minimize the incident by retrying?

@grosser
Copy link
Contributor Author
grosser commented Apr 8, 2025

yes, that would cut down on the noise a lot, maybe even eliminate it

@XiShanYongYe-Chang
Copy link
Member

Thanks @grosser

Can you help point out specific code changes and change logic? Others may also be concerned about this optimization.

@grosser
Copy link
Contributor Author
grosser commented Apr 9, 2025

here needs to ignore when the object already exists (I assume that is not the same as a conflict)

I did not find where the actual object update is happening, but if that does not have a retry on conflict then it should be added and it it already has one then either it needs to reload to avoid the conflict or have it's retry steps spaced out a bit more

@XiShanYongYe-Chang
Copy link
Member

I'm sorry, I didn't quite understand you.

Apply cluster policy(duplicate) failed: Operation cannot be fulfilled on deployments.apps "foo": the object has been modified; please apply your changes to the latest version and try again

You mean to reduce the report of this event or

Apply cluster policy(duplicate) failed: resourcebindings.work.karmada.io "foo" already exists

These two don't seem to be the same thing, they deal with different scenes.

@grosser
Copy link
Contributor Author
grosser commented Apr 10, 2025

Yes correct, might be better as separate issues, I was trying to fix ApplyPolicyFailed events and found these 2 examples.
We can focus on the already exists here since that seems more obvious, then tackle the other one after.

@XiShanYongYe-Chang
Copy link
Member

Okay, let's focus on one of them first

@grosser
Copy link
Contributor Author
grosser commented Apr 16, 2025

the "already exists" seems easy, would you agree that the fix would be to ignore if an already exist error occurs here

@XiShanYongYe-Chang
Copy link
Member

the "already exists" seems easy, would you agree that the fix would be to ignore if an already exist error occurs here

I think the retry here should not be "already exists", it should be an update conflict. That doesn't seem to match your description.

@grosser
Copy link
Contributor Author
grosser commented Apr 18, 2025

yeah don't know then, I'd leave the issue open for others to find, but don't have time atm to dig deeper

@RainbowMango
Copy link
Member

this made our alerts spammy that alert on "Warn events" for all apps, and not random apps get these events.

Yeah, this is indeed not good.
But, if a conflict requires a retry and the retry eventually succeeds, wouldn't the warning be cleared by the subsequent ApplyPolicySucceed event?

Anyway, who can analyze the root cause? Where exactly does the conflict come from?

@RainbowMango RainbowMango added this to the v1.14 milestone May 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Projects
Status: No status
Development

Successfully merging a pull request may close this issue.

3 participants
0