Handle unknown cluster during replication by yux0 · Pull Request #3619 · temporalio/temporal · GitHub

Handle unknown cluster during replication #3619

Merged: 6 commits, Nov 18, 2022

yux0 (Contributor) commented Nov 18, 2022

What changed?
Handle unknown cluster during replication

Why?
When replicating a workflow from cluster B to cluster C, we don't know whether the workflow has any history events generated by a different cluster (e.g. cluster A). In this case, for now, we blindly apply the task rather than panic.
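
As a rough illustration of the situation described above, here is a minimal sketch of a lookup that reports an unknown cluster as an error instead of panicking. It is not the code from this PR: the `metadata` and `clusterInfo` types, the cluster names, and the numeric values are hypothetical, and the sketch assumes a failover version maps to a cluster through its remainder modulo a positive, fixed increment.

```go
package main

import "fmt"

// clusterInfo is a hypothetical, simplified stand-in for per-cluster metadata.
type clusterInfo struct {
	initialFailoverVersion int64
}

// metadata is a hypothetical registry of the clusters known to the local cluster.
// failoverVersionIncrement is assumed to be positive.
type metadata struct {
	failoverVersionIncrement int64
	clusters                 map[string]clusterInfo
}

// clusterNameForFailoverVersion maps a failover version to a known cluster name.
// It returns an error, rather than panicking, when no known cluster matches.
func (m metadata) clusterNameForFailoverVersion(version int64) (string, error) {
	initial := version % m.failoverVersionIncrement
	for name, info := range m.clusters {
		if info.initialFailoverVersion == initial {
			return name, nil
		}
	}
	return "", fmt.Errorf("unknown cluster for failover version %d", version)
}

func main() {
	// Assumed setup: cluster C knows about B and itself, but not about A.
	m := metadata{
		failoverVersionIncrement: 10,
		clusters: map[string]clusterInfo{
			"cluster-b": {initialFailoverVersion: 2},
			"cluster-c": {initialFailoverVersion: 3},
		},
	}
	for _, v := range []int64{12, 23, 21} {
		name, err := m.clusterNameForFailoverVersion(v)
		if err != nil {
			// The caller decides what to do: skip, retry, or fail the task.
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(v, "->", name)
	}
}
```

Returning an error leaves the decision to the caller, which is what lets the replication path choose between applying the task anyway and failing it.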

How did you test it?
Local test

Potential risks

Is hotfix candidate?
No

yux0 requested review from meiliang86 and yycptt November 18, 2022 06:27
yux0 requested a review from a team as a code owner November 18, 2022 06:27
yycptt (Member) left a comment

So basically we ignore the unknown cluster during replication and still apply the task? Will this cause any issues if the cluster that applied the replication task later becomes active? I mean, I know there will be errors, but I don't see how we can recover from them other than by connecting those two clusters.

- sourceCluster := clusterMetadata.ClusterNameForFailoverVersion(true, version)
+ sourceCluster, err := clusterMetadata.ClusterNameForFailoverVersion(true, version)
+ if err != nil {
+     return nil, err
Member:

Will this fail the entire get replication task call, or just drop this replication task?

Contributor Author:

It will fail the apply replication task call, and the replication stack will put this task into the DLQ after some retries.
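
The retry-then-DLQ behaviour described in this reply can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual replication stack: `task`, `processWithDLQ`, `errUnknownCluster`, and the attempt count are hypothetical names and values, and a real implementation would likely add backoff between attempts.

```go
package main

import (
	"errors"
	"fmt"
)

// task is a hypothetical replication task.
type task struct{ id int64 }

// errUnknownCluster is a hypothetical error for a task whose source cluster is unknown.
var errUnknownCluster = errors.New("unknown cluster")

// processWithDLQ calls apply up to maxAttempts times (at least once).
// If every attempt fails, it hands the task and the last error to dlq
// and returns whatever dlq returns.
func processWithDLQ(t task, maxAttempts int, apply func(task) error, dlq func(task, error) error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var lastErr error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if lastErr = apply(t); lastErr == nil {
			return nil // applied successfully, nothing to park
		}
	}
	return dlq(t, lastErr)
}

func main() {
	var parked []task
	err := processWithDLQ(
		task{id: 42}, 3,
		// This apply function always fails, as it would for an unknown cluster.
		func(task) error { return errUnknownCluster },
		func(t task, cause error) error {
			fmt.Printf("task %d moved to DLQ: %v\n", t.id, cause)
			parked = append(parked, t)
			return nil
		},
	)
	fmt.Println("parked:", len(parked), "err:", err)
}
```

Parking the task keeps one bad task from blocking the tasks behind it, and leaves it available for inspection later.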

yux0 merged commit 4483f10 into temporalio:master Nov 18, 2022
yux0 deleted the unknown-cluster branch November 18, 2022 23:02
meiliang86 pushed a commit that referenced this pull request Nov 22, 2022
* Handle unknown cluster during replication
meiliang86 pushed a commit that referenced this pull request Nov 28, 2022
* Handle unknown cluster during replication