Fix the recovery issue in kubernetes by stiffme · Pull Request #126 · aembke/fred.rs · GitHub

Fix the recovery issue in kubernetes #126


Closed
wants to merge 1 commit into from


stiffme
Contributor
@stiffme stiffme commented Apr 27, 2023

fred.rs sometimes can't recover when connected to a Redis cluster inside Kubernetes. The issue can be reproduced with the following commands in k8s:

kubectl scale statefulset redis --replicas 0
kubectl scale statefulset redis --replicas 8

After the Redis cluster recovers, fred.rs sometimes fails to reconnect. I've implemented the DNS interface. According to the logs, fred.rs still connects to the old IPs of the Redis pods and tries to fetch the cluster nodes, but gets NULL.

The fix is to release the backchannel if the result is null. fred.rs will then fall back to the configured Redis URL (in our case, the FQDN of the k8s service) and do a DNS query. The query returns one of the IPs of the new Redis pods, and fred.rs can then establish a new connection to that IP to get the up-to-date cluster nodes.
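
For illustration only, the recovery logic described above can be sketched roughly as follows. This is a simplified model and not fred.rs code: `Backchannel`, `Resolve`, `StaticResolver`, `cluster_nodes`, and `refresh_topology` are all hypothetical names, and the network calls are simulated with plain data so the control flow is easy to follow.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Hypothetical stand-in for a DNS resolver: maps a hostname to its current IP.
pub trait Resolve {
    fn resolve(&self, host: &str) -> Option<IpAddr>;
}

/// Toy resolver backed by a map, used only for this illustration.
pub struct StaticResolver(pub HashMap<String, IpAddr>);

impl Resolve for StaticResolver {
    fn resolve(&self, host: &str) -> Option<IpAddr> {
        self.0.get(host).copied()
    }
}

/// Hypothetical model of a backchannel: a cached connection that is
/// reused for cluster state queries.
#[derive(Default)]
pub struct Backchannel {
    /// IP of the cached connection, if one exists.
    pub connected_to: Option<IpAddr>,
}

/// Simulated cluster nodes query: only IPs in `live` return a topology,
/// anything else yields `None` (the "NULL" result seen in the logs).
fn cluster_nodes(live: &[IpAddr], ip: IpAddr) -> Option<String> {
    if live.contains(&ip) {
        Some(format!("topology via {ip}"))
    } else {
        None
    }
}

/// Try the cached backchannel first. If it returns nothing, release it and
/// fall back to a DNS lookup of the configured host.
pub fn refresh_topology(
    backchannel: &mut Backchannel,
    resolver: &impl Resolve,
    configured_host: &str,
    live: &[IpAddr],
) -> Option<String> {
    if let Some(ip) = backchannel.connected_to {
        if let Some(topology) = cluster_nodes(live, ip) {
            return Some(topology);
        }
        // Null result: the cached connection points at a stale pod IP,
        // so drop it instead of retrying the same address.
        backchannel.connected_to = None;
    }

    // No usable backchannel: resolve the configured host again and
    // connect to whatever IP DNS currently returns.
    let ip = resolver.resolve(configured_host)?;
    let topology = cluster_nodes(live, ip)?;
    backchannel.connected_to = Some(ip);
    Some(topology)
}
```

In this sketch, a backchannel cached against an old pod IP is dropped after the first null result, and the next lookup goes through the resolver, which mirrors the behavior the fix aims for after the StatefulSet is scaled back up.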

I've implemented this change in our testing tools and tested it under traffic for a long time. It always recovers now.

@aembke
Owner
aembke commented Apr 27, 2023

This is excellent, thank you. I'll get a patch up with these changes by the end of the weekend.

@aembke aembke mentioned this pull request May 1, 2023
@aembke
Owner
aembke commented May 1, 2023

I'll merge this into main via #127
