Handle unknown cluster during replication #3619
So we basically ignore the unknown cluster during replication and still apply the task? Will this cause any issues after the cluster that applied the replication task later becomes active? I mean, I know there will be errors, but I don't see how we can recover from them other than by connecting those two clusters.
- sourceCluster := clusterMetadata.ClusterNameForFailoverVersion(true, version)
+ sourceCluster, err := clusterMetadata.ClusterNameForFailoverVersion(true, version)
+ if err != nil {
+ 	return nil, err
Will this fail the entire get replication task call, or just drop this replication task?
It will fail the apply replication task call, and the replication stack will put this task into the DLQ after some retries.
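To make the retry-then-DLQ behavior described above concrete, here is a minimal sketch. It is not the actual replication stack: `replicationTask`, `taskProcessor`, `maxAttempts`, and the in-memory `dlq` slice are all hypothetical names chosen for illustration, and the real retry policy and DLQ storage are assumed to be more involved.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnknownCluster is a hypothetical sentinel standing in for the error
// returned when a failover version maps to no known cluster.
var errUnknownCluster = errors.New("unknown cluster for failover version")

// replicationTask is a hypothetical, stripped-down task.
type replicationTask struct {
	ID int64
}

// taskProcessor retries a failing apply call a bounded number of times and
// then parks the task in a dead-letter queue (an in-memory slice here).
type taskProcessor struct {
	maxAttempts int
	apply       func(replicationTask) error
	dlq         []replicationTask
}

// process returns nil if any attempt succeeds. Otherwise it appends the task
// to the DLQ and returns the error from the last attempt.
func (p *taskProcessor) process(task replicationTask) error {
	attempts := p.maxAttempts
	if attempts < 1 {
		attempts = 1 // always try at least once
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		if lastErr = p.apply(task); lastErr == nil {
			return nil
		}
	}
	p.dlq = append(p.dlq, task)
	return lastErr
}

func main() {
	p := &taskProcessor{
		maxAttempts: 3,
		// Simulate an apply call that always hits the unknown-cluster error.
		apply: func(replicationTask) error { return errUnknownCluster },
	}
	err := p.process(replicationTask{ID: 42})
	fmt.Println("error:", err, "tasks in DLQ:", len(p.dlq))
}
```

In this sketch the task is never silently dropped: it either applies or ends up in the DLQ, where an operator can inspect it once the missing cluster is connected.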
* Handle unknown cluster during replication
What changed?
Handle unknown cluster during replication
Why?
When replicating a workflow from cluster B to cluster C, we don't know whether the workflow has any history events generated by a different cluster (e.g. cluster A). In this case, for now we blindly apply the task rather than panic.
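As a rough illustration of the situation above, the sketch below resolves a failover version to a cluster name and returns an error, instead of panicking, when the version belongs to a cluster the local side has never heard of. The function `clusterNameForFailoverVersion`, the map of initial versions, and the modulo-by-increment mapping are assumptions made for this example. They are not taken from the PR and may differ from the real `ClusterNameForFailoverVersion`.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnknownCluster is a hypothetical sentinel for versions that map to no
// cluster known locally.
var errUnknownCluster = errors.New("unknown cluster for failover version")

// clusterNameForFailoverVersion is an illustrative stand-in for the lookup.
// Assumption: each cluster owns one initial failover version, and a version
// belongs to the cluster whose initial version equals version % increment.
func clusterNameForFailoverVersion(
	initialVersions map[int64]string,
	increment int64,
	version int64,
) (string, error) {
	if increment <= 0 {
		return "", fmt.Errorf("invalid failover version increment %d", increment)
	}
	name, ok := initialVersions[version%increment]
	if !ok {
		// Return an error so the caller can decide what to do with the task.
		return "", fmt.Errorf("%w: version %d", errUnknownCluster, version)
	}
	return name, nil
}

func main() {
	// Cluster C only knows about B and itself. Cluster A (initial version 1)
	// is deliberately missing from its metadata.
	known := map[int64]string{2: "cluster-b", 3: "cluster-c"}

	for _, version := range []int64{12, 13, 11} {
		name, err := clusterNameForFailoverVersion(known, 10, version)
		if errors.Is(err, errUnknownCluster) {
			fmt.Printf("version %d: %v\n", version, err)
			continue
		}
		fmt.Printf("version %d: %s\n", version, name)
	}
}
```

Returning a wrapped sentinel lets the caller tell "unknown cluster" apart from other failures with `errors.Is` and choose a policy, such as applying the task anyway or failing the call.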
How did you test it?
Local test
Potential risks
Is hotfix candidate?
No