Description
Checklist:
- [x] I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
- [x] I've included steps to reproduce the bug.
- [x] I've pasted the output of `argocd version`.
Describe the bug
Argocd version: v2.12.4+27d1e64
If any CRD with a conversion webhook is installed on a cluster and that conversion webhook is down, every application on the cluster goes into an Unknown or error state:
Failed to load target state: failed to get cluster version for cluster "": failed to get cluster info for """: error synchronizing cache state : failed to sync cluster ": failed to load initial state of resource BucketServerSideEncryptionConfiguration.s3.aws.upbound.io: conversion webhook for s3.aws.upbound.io/v1beta1, Kind=BucketServerSideEncryptionConfiguration failed: Post "https://provider-aws-s3.crossplane-system.svc:9443/convert?timeout=30s": no endpoints available for service "provider-aws-s3"
With server-side apply (SSA) enabled, the UI gets stuck in "refreshing" and the controller logs show a nil pointer dereference:
time="2024-11-18T14:19:18Z" level=error msg="Recovered from panic: runtime error: invalid memory address or nil pointer dereference
goroutine 294 [running]:
runtime/debug.Stack()
/usr/local/go/src/runtime/debug/stack.go:24 +0x5e
github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).processAppRefreshQueueItem.func1()
/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:1480 +0x54
panic({0x382cd20?, 0x7756330?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/argoproj/argo-cd/v2/controller.(*appStateManager).CompareAppState(0xc00055cd20, 0xc0dae6a408, 0xc0a7114488, {0xc0a792d6c0, 0x1, 0x1}, {0xc0a7920700, 0x1, 0x1}, 0x0, ...)
/go/src/github.com/argoproj/argo-cd/controller/state.go:864 +0x5ff9
github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).processAppRefreshQueueItem(0xc0004dec40)
/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:1590 +0x1188
github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).Run.func3()
/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:830 +0x25
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
/go/pkg/mod/k8s.io/apimachinery@v0.29.6/pkg/util/wait/backoff.go:226 +0x33
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000636b00, {0x5555d00, 0xc001cec2a0}, 0x1, 0xc000081f80)
/go/pkg/mod/k8s.io/apimachinery@v0.29.6/pkg/util/wait/backoff.go:227 +0xaf
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000636b00, 0x3b9aca00, 0x0, 0x1, 0xc000081f80)
/go/pkg/mod/k8s.io/apimachinery@v0.29.6/pkg/util/wait/backoff.go:204 +0x7f
k8s.io/apimachinery/pkg/util/wait.Until(...)
/go/pkg/mod/k8s.io/apimachinery@v0.29.6/pkg/util/wait/backoff.go:161
created by github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).Run in goroutine 112
/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:829 +0x865
To Reproduce
Install a CRD whose conversion webhook points to an unavailable endpoint.
Expected behavior
I'm not sure what the expected behavior should be. At the very least, there should not be an NPE when this happens with SSA enabled.
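To illustrate the kind of behavior meant here, the following is a minimal sketch of returning an error instead of dereferencing a nil value when cluster state cannot be loaded. The trace does not show which value is nil at `state.go:864`, so everything below (`clusterInfo`, `loadClusterInfo`, `compareState`) is a hypothetical stand-in and not Argo CD's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterInfo is a hypothetical stand-in for cached cluster state.
// It is not an Argo CD type.
type clusterInfo struct {
	ServerVersion string
}

// loadClusterInfo is a hypothetical loader that fails the way this report
// describes: the cache cannot sync because a conversion webhook is down.
func loadClusterInfo(webhookUp bool) (*clusterInfo, error) {
	if !webhookUp {
		return nil, errors.New("conversion webhook unavailable")
	}
	return &clusterInfo{ServerVersion: "1.29"}, nil
}

// compareState surfaces the load failure as an error rather than
// continuing with a nil pointer.
func compareState(webhookUp bool) (string, error) {
	info, err := loadClusterInfo(webhookUp)
	if err != nil {
		return "", fmt.Errorf("failed to load target state: %w", err)
	}
	if info == nil {
		// Defensive check in case a loader returns (nil, nil).
		return "", errors.New("cluster info is nil")
	}
	return "compared against server " + info.ServerVersion, nil
}

func main() {
	if _, err := compareState(false); err != nil {
		fmt.Println("comparison error:", err)
	}
}
```

With a guard like this, the refresh would end with a reportable error condition on the application instead of a recovered panic.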
It would be nice to be able to exclude those resources on an app-by-app basis, or to skip any resources that aren't included in the application. As it stands, if fixing the webhook requires a new sync, I can't really run one.
Screenshots
Version
Argocd version: v2.12.4+27d1e64
Logs