Alice health monitoring #16
Request to add a new section, under Dashboard, for managing Nodes. Overall, we need a good mechanism for handling Node maintenance without involving AppSupport. This would also help with integration with load balancers, such as F5.
The features requested:
(1) Health OK / Not OK
This should be set automatically, with a manual override available via a UI checkbox. If a Node is OK, the health check reports OK; if the Node is down, the health check returns 404.
Health is provided via the following example URL: http://np143.wc1.yellowpages.com:5671/health
The purpose is to notify external load balancers, such as F5, that traffic should not be routed to this node. F5 can be configured to check both a URL and a TCP port to determine whether a monitored service is available.
So, an admin user might set this to "Not OK" in order to bring the node down for maintenance in a graceful manner. Existing AMQP clients would stay connected until they complete their tasks, but the F5 would not direct new clients to this node.
Regardless of the manual setting, if the AMQP service dies, Alice should detect this condition and set the Health to "Not OK" as well.
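To make the intended behavior concrete, here is a rough sketch of the decision logic in Python (not Alice's actual code). The function names are hypothetical, and returning 200 for "OK" is an assumption; the request only specifies 404 for "Not OK". The manual override can only force "Not OK", so a dead AMQP port always wins.

```python
import socket


def amqp_port_open(host: str, port: int = 5672, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the AMQP port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the service as down.
        return False


def health_status(amqp_up: bool, forced_not_ok: bool) -> int:
    """HTTP status for the /health URL.

    Assumption: 200 means OK. 404 means Not OK, as in the request.
    """
    if forced_not_ok or not amqp_up:
        return 404
    return 200
```

For example, `health_status(amqp_port_open("localhost"), forced_not_ok=True)` yields 404 even while AMQP is still serving existing clients, which is the graceful maintenance case described above.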
(2) Node Up / Down
This should trigger the rabbitmqctl stop_app and start_app commands, which manually stop and start the AMQP service itself on port 5672 while Erlang continues to run. Also, if the AMQP port is down, the health check should automatically be set to "Not OK".
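As an illustration of the Up / Down toggle, a minimal Python sketch that shells out to rabbitmqctl might look like the following. The function names are hypothetical, and it assumes `rabbitmqctl` is on the PATH and that the caller has permission to run it; the optional `-n` flag selects the target node.

```python
import subprocess
from typing import List, Optional


def rabbitmqctl_command(action: str, node: Optional[str] = None) -> List[str]:
    """Build the rabbitmqctl argument list for stop_app or start_app."""
    if action not in ("stop_app", "start_app"):
        raise ValueError(f"unsupported action: {action}")
    cmd = ["rabbitmqctl"]
    if node:
        # -n targets a specific node, e.g. a hypothetical rabbit@np143
        cmd += ["-n", node]
    cmd.append(action)
    return cmd


def set_node_state(up: bool, node: Optional[str] = None) -> bool:
    """Start or stop the AMQP application; the Erlang VM keeps running."""
    action = "start_app" if up else "stop_app"
    result = subprocess.run(
        rabbitmqctl_command(action, node), capture_output=True, text=True
    )
    return result.returncode == 0
```

After a successful `set_node_state(False)`, the port check behind the health URL should start failing, so the health would flip to "Not OK" without any extra step.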
(3) Node traversal
The UI should provide HTML links that make it easy to traverse from Node to Node.