-
Notifications
You must be signed in to change notification settings - Fork 8
Add config to escalate notifications #195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
in | ||
Lwt.return [ Slack.generate_failed_build_notification ?slack_user_id ~failed_steps n channel ]) | ||
Lwt.return [ Slack.generate_failed_build_notification ~slack_ids ~cfg ~failed_steps n channel ]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lwt.return [ Slack.generate_failed_build_notification ~slack_ids ~cfg ~failed_steps n channel ]) | |
let slack_ids = List.filter_map Fun.id slack_ids in | |
Lwt.return [ Slack.generate_failed_build_notification ~slack_ids ~cfg ~failed_steps n channel ]) |
and then remove the filter in slack.ml
I think it makes more sense to filter before and have a simpler type for the generate_failed_build_notification
function.
lib/slack.ml
Outdated
Util.Webhook.get_escalation_threshold cfg n | ||
|> Option.map_default | ||
(fun escalation_threshold -> | ||
FailedStepSet.exists | ||
(fun step -> | ||
(* escalate steps that have been broken for longer than the threshold *) | ||
Util.Webhook.is_past_span step.created_at escalation_threshold) | ||
failed_steps) | ||
false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Util.Webhook.get_escalation_threshold cfg n | |
|> Option.map_default | |
(fun escalation_threshold -> | |
FailedStepSet.exists | |
(fun step -> | |
(* escalate steps that have been broken for longer than the threshold *) | |
Util.Webhook.is_past_span step.created_at escalation_threshold) | |
failed_steps) | |
false | |
match Util.Webhook.get_escalation_threshold cfg n with | |
| None -> false | |
| Some escalation_threshold -> | |
FailedStepSet.exists | |
(fun step -> | |
(* escalate steps that have been broken for longer than the threshold *) | |
Util.Webhook.is_past_span step.created_at escalation_threshold) | |
failed_steps |
a bit simpler to write it plainer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm always on the fence on one or the other.. I thought about it, and decided for the Option version, but I think it's good to go with the match option. Thanks
let one_hour = Ptime.Span.of_int_s (60 * 60) in | ||
is_past_span escalated_at one_hour) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this constant be part of the config instead? Like reescalate_notification_delay
or something like that? It seems quite specific to some use-case (e.g., some teams might work on more like a day schedule).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, maybe it should, but i want to avoid flooding the config with options.
I think this is a sensible default: you have a first escalation warning after a certain period of time that you consider a bit too long for the build to have been broken and unattended, and then every extra hour you notify again when you have new builds that are broken due to the original author not fixing the issues.
In any case it's better to test first and tweak afterwards. You know the old "how you use your product/how users use the product" meme.. 🤷♂️
Description of the task
This change allows users to configure a time threshold value for monorobot to insist on notifying broken steps.
If configured, when a failed build notification is received, monorobot will check if any of the steps that are broken in that build (old and new failed steps) has been marked has broken for longer than the allowed threshold. If so, it will notify again, even if no new step has been broken, showing a notification saying that the
build is still failing
and the @mentions of the different users that broke steps that qualify for notification.