A Fluentd output plugin that sends check results to sensu-client, the agent process of the Sensu monitoring framework.
<match ddos>
type sensu
# Connection settings
server localhost
port 3030
# Payload settings
## The check is named "ddos_detection"
check_name ddos_detection
## The severity is read from the field "level"
check_status_field level
</match>
The type of this plugin is `sensu`. Specify `type sensu` in the match section.
- `server` (default is "localhost"): The IP address or the hostname of the host running the sensu-client daemon.
- `port` (default is 3030): The TCP port number of the Sensu client socket on which the sensu-client daemon is listening.
The payload of a check result is a JSON object which contains attributes as follows. Attributes are indicated by JSONPath expressions.
- `$.name`
  - The check name to identify the check.
- `$.output`
  - An arbitrary string to describe the check result.
  - This attribute is often used to contain metric values.
- `$.status`
  - The severity of the check result.
  - 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN or CUSTOM).
The check result can also contain other attributes. This plugin supports the attributes below.
- `$.type`
  - Either "standard" or "metric".
  - If the attribute is set to "standard", sensu-server creates an event only when the status is not OK or when the status changes to OK.
  - If the attribute is set to "metric", sensu-server creates an event even if the status is OK. This is useful when Sensu sends check results to metrics collectors such as Graphite.
- `$.ttl`
  - The time to live (TTL) in seconds, after which the check result is considered stale. If the TTL expires, sensu-server creates an event.
  - This attribute is useful when you want to be notified when logs are not output for a certain period.
  - Same as `freshness_threshold` in Nagios.
- `$.handlers`
  - The names of the handlers which process events created for the check.
- `$.low_flap_threshold` and `$.high_flap_threshold`
  - Threshold percentages used to determine whether the status is considered "flapping", that is, whether the state changes too frequently.
  - Same as the options in Nagios. See the description of Flap Detection in Nagios.
- `$.source`
  - The source of the check, such as a server or a network switch.
  - If this attribute is not specified, the host of sensu-client is considered the source.
- `$.executed`
  - The timestamp at which the check was executed.
  - Note that there is another timestamp attribute named `issued`, which is automatically measured by the sensu-client process. Uchiwa, the default dashboard of Sensu, displays `issued` as the timestamp of check results.
This plugin additionally adds a "fluentd" attribute to the check result. The value of the attribute is a JSON object whose elements are the input to the plugin.
- `$.fluentd.tag`
  - The tag of the Fluentd data.
- `$.fluentd.time`
  - The time of the Fluentd data, in seconds since the Unix epoch.
- `$.fluentd.record`
  - The record of the Fluentd data.
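To make the payload layout concrete, here is a minimal Ruby sketch that assembles a check result with the attributes listed above. The helper `build_check_result` and the sample values are hypothetical and are not taken from the plugin's source; the defaults for `type`, `handlers`, and `executed` follow the rules described in this document.

```ruby
require 'json'

# Hypothetical helper, not the plugin's actual code: assembles a check result
# hash whose keys match the JSONPath attributes described above.
def build_check_result(tag, time, record, name:, status:, output: nil)
  {
    'name'     => name,
    'output'   => output || JSON.generate(record), # falls back to the record as JSON
    'status'   => status,
    'type'     => 'standard',
    'handlers' => ['default'],
    'executed' => time,
    'fluentd'  => { 'tag' => tag, 'time' => time, 'record' => record },
  }
end

# Sample input values are made up for illustration.
payload = build_check_result('ddos', 1_700_000_000, { 'level' => 'CRITICAL' },
                             name: 'ddos_detection', status: 2)
puts JSON.generate(payload)
```

Delivering such a payload would presumably amount to writing the JSON text to the Sensu client socket configured by `server` and `port`; the sketch stops at building the JSON.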
The check name is determined as below.
- The field specified by the `check_name_field` option, if present and valid (highest priority)
  - The valid values are strings composed of ASCII alphanumerics, underscores, periods, and hyphens.
- or the `check_name` option, if present
  - The valid values are the same as above.
- or the tag name, if valid
  - The valid values are the same as above.
- or "fluent-plugin-sensu" (lowest priority)
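The priority order above can be sketched in Ruby as follows. This is an illustration under stated assumptions, not the plugin's implementation: the method name `resolve_check_name`, its keyword arguments, and the regular expression are hypothetical, and the sketch simply skips any invalid candidate (how the plugin treats an invalid `check_name` option is not stated here).

```ruby
# Assumed pattern for valid names: ASCII alphanumerics, underscores,
# periods, and hyphens, as described in the list above.
VALID_NAME = /\A[A-Za-z0-9_.\-]+\z/

# Hypothetical resolver: tries the record field, then the option, then the
# tag, and falls back to the fixed name "fluent-plugin-sensu".
def resolve_check_name(tag, record, name_field: nil, name: nil)
  candidates = []
  candidates << record[name_field] if name_field
  candidates << name
  candidates << tag
  valid = candidates.find { |c| c.is_a?(String) && c.match?(VALID_NAME) }
  valid || 'fluent-plugin-sensu'
end

puts resolve_check_name('ddos', { 'svc' => 'web.check-1' }, name_field: 'svc')
puts resolve_check_name('bad tag!', {})
```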
The check output is determined as below.
- The field specified by the `check_output_field` option, if present (highest priority)
- or the `check_output` option, if present
- or the JSON notation of the record (lowest priority)
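As a rough Ruby sketch of this fallback chain: the method name `resolve_check_output` and its keyword arguments are hypothetical, and converting the field value with `to_s` is an assumption rather than documented plugin behavior.

```ruby
require 'json'

# Hypothetical resolver for the check output: record field first, then the
# static option, then the whole record serialized as JSON.
def resolve_check_output(record, output_field: nil, output: nil)
  return record[output_field].to_s if output_field && record.key?(output_field)
  return output if output
  JSON.generate(record)
end

puts resolve_check_output({ 'msg' => 'too many requests' }, output_field: 'msg')
puts resolve_check_output({ 'count' => 12 })
```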
The severity of the check result is determined as below.
- The field specified by the `check_status_field` option, if present and permitted (highest priority)
  - The values permitted to the field for each status (case insensitive):
    - status 0: an integer `0` and strings `"0"`, `"OK"`
    - status 1: an integer `1` and strings `"1"`, `"WARNING"`, `"warn"`
    - status 2: an integer `2` and strings `"2"`, `"CRITICAL"`, `"crit"`
    - status 3: an integer `3` and strings `"3"`, `"UNKNOWN"`, `"CUSTOM"`
- or the `check_status` option, if present
  - The permitted values for each status (case insensitive):
    - status 0: `0` and `OK`
    - status 1: `1`, `WARNING`, `warn`
    - status 2: `2`, `CRITICAL`, `crit`
    - status 3: `3`, `UNKNOWN`, `CUSTOM`
  - If the value is not permitted, it causes a configuration error.
- or `3`, which means UNKNOWN or CUSTOM (lowest priority)
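The mapping and priority rules above could be expressed in Ruby roughly as below. The names `STATUS_WORDS`, `normalize_status`, and `resolve_check_status` are hypothetical. The sketch raises `ArgumentError` at resolution time for a bad `check_status` value, whereas the plugin reports that case as a configuration error.

```ruby
# Case-insensitive words permitted for each status, keyed in lower case.
STATUS_WORDS = {
  '0' => 0, 'ok' => 0,
  '1' => 1, 'warning' => 1, 'warn' => 1,
  '2' => 2, 'critical' => 2, 'crit' => 2,
  '3' => 3, 'unknown' => 3, 'custom' => 3,
}.freeze

# Returns 0..3 for a permitted value, or nil when the value is not permitted.
def normalize_status(value)
  return value if value.is_a?(Integer) && (0..3).cover?(value)
  return STATUS_WORDS[value.downcase] if value.is_a?(String)
  nil
end

# Hypothetical resolver: record field, then the option, then 3 (UNKNOWN).
def resolve_check_status(record, status_field: nil, status: nil)
  from_field = status_field ? normalize_status(record[status_field]) : nil
  return from_field unless from_field.nil?
  unless status.nil?
    from_option = normalize_status(status)
    raise ArgumentError, "invalid check_status: #{status.inspect}" if from_option.nil?
    return from_option
  end
  3
end

puts resolve_check_status({ 'level' => 'crit' }, status_field: 'level')
puts resolve_check_status({})
```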
"warn" and "crit" come from fluent-plugin-notifier.
The check type is determined as below.
- The `check_type` option (highest priority)
  - The value must be a string, either "standard" or "metric".
- or "standard" (lowest priority)
The TTL in seconds until expiration is determined as below.
- The `check_ttl` option (highest priority)
  - The value must be an integer which represents the TTL in seconds.
- or N/A (lowest priority)
  - It means no expiration detection is performed.
The handlers which process check results are determined as below.
- The `check_handlers` option (highest priority)
  - The value must be an array of strings which represent handler names.
- or `["default"]` (lowest priority)
The threshold percentages for flap detection are determined as below.
- The `check_low_flap_threshold` and `check_high_flap_threshold` options (highest priority)
  - The values must be integers of threshold percentages.
- or N/A (lowest priority)
  - It means no flap detection is performed.
The two options must either be specified together or not specified at all. If the options are specified, the following condition must be true: `0 <= check_low_flap_threshold <= check_high_flap_threshold <= 100`.
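The constraints on the two thresholds can be checked with a few lines of Ruby. This is only a sketch: the method name `validate_flap_thresholds` and the error messages are hypothetical, and the plugin's own validation may be structured differently.

```ruby
# Hypothetical validator: returns nil when flap detection is disabled,
# [low, high] when both values are acceptable, and raises otherwise.
def validate_flap_thresholds(low, high)
  return nil if low.nil? && high.nil?
  if low.nil? || high.nil?
    raise ArgumentError, 'specify both flap thresholds or neither'
  end
  unless low.is_a?(Integer) && high.is_a?(Integer) &&
         0 <= low && low <= high && high <= 100
    raise ArgumentError, 'expected 0 <= low <= high <= 100'
  end
  [low, high]
end

p validate_flap_thresholds(20, 60)
p validate_flap_thresholds(nil, nil)
```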
The source of the checks is determined as below.
- The field specified by the `check_source_field` option, if present and valid (highest priority)
- or the `check_source` option
- or N/A (lowest priority)
  - It means the host of sensu-client is considered the check source.
The executed timestamp is determined as below.
- The field specified by the `check_executed_field` option, if present and valid (highest priority)
  - The value must be an integer which represents seconds since the Unix epoch.
- or the time of the Fluentd record (lowest priority)
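A short Ruby sketch of this rule, assuming a hypothetical method `resolve_executed` and treating any non-integer field value as invalid:

```ruby
# Hypothetical resolver: uses the record field when it holds an integer
# (seconds since the Unix epoch), otherwise the time of the Fluentd record.
def resolve_executed(time, record, executed_field: nil)
  value = executed_field ? record[executed_field] : nil
  value.is_a?(Integer) ? value : time
end

puts resolve_executed(1_700_000_060, { 'ts' => 1_700_000_000 }, executed_field: 'ts')
puts resolve_executed(1_700_000_060, { 'ts' => 'yesterday' }, executed_field: 'ts')
```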
The default value of the `flush_interval` option is set to 1 second. It means that check results are delayed at most 1 second before being sent.
Except for `flush_interval`, the plugin uses the default options for buffered output plugins (defined in the Fluent::BufferedOutput class).
You can override buffering options in the configuration. For example:
<match ddos>
type sensu
...snip...
buffer_type file
buffer_path /var/lib/fluentd/buffer/ddos
flush_interval 0.1
try_flush_interval 0.1
</match>
Assume you have a web server which runs:
- Apache HTTP server
- Fluentd
- sensu-client, which listens on TCP port 3030 for the Sensu client socket
You want to be notified when Apache responds with too many server errors, for example with WARNING at 5 errors per minute and CRITICAL at 50 errors per minute.
The setting for Fluentd utilizes fluent-plugin-datacounter, fluent-plugin-record-reformer, and of course fluent-plugin-sensu. Install those plugins and add configuration as below.
# Parse Apache access log
<source>
type tail
tag access
format apache2
# The paths vary by setup
path /var/log/httpd/access_log
pos_file /var/lib/fluentd/pos/httpd-access_log.pos
</source>
# Count 5xx errors per minute
<match access>
type datacounter
tag count.access
unit minute
aggregate all
count_key code
pattern1 error ^5\d\d$
</match>
# Calculate the severity level
<match count.access>
type record_reformer
tag server_errors
enable_ruby true
<record>
level ${error_count < 5 ? 'OK' : error_count < 50 ? 'WARNING' : 'CRITICAL'}
</record>
</match>
# Send checks to sensu-client
<match server_errors>
type sensu
server localhost
port 3030
check_name server_errors
check_type standard
check_status_field level
check_ttl 100
</match>
The TTL is set to 100 seconds here, because a check result is expected every 60 seconds, plus 40 seconds as a margin.
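For reference, the severity expression in the `record_reformer` section and the TTL arithmetic can be tried out in plain Ruby. The method name `severity_level` and the constant names are hypothetical; the thresholds and the 60 + 40 second split come from the configuration and explanation above.

```ruby
# Same nested conditional as in the record_reformer configuration above.
def severity_level(error_count)
  error_count < 5 ? 'OK' : error_count < 50 ? 'WARNING' : 'CRITICAL'
end

CHECK_INTERVAL_SECONDS = 60 # datacounter emits one count per minute
MARGIN_SECONDS = 40         # slack before the result is considered stale
CHECK_TTL_SECONDS = CHECK_INTERVAL_SECONDS + MARGIN_SECONDS

[0, 5, 49, 50].each do |count|
  puts "#{count} errors/minute: #{severity_level(count)}"
end
puts "check_ttl: #{CHECK_TTL_SECONDS}"
```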
You can use the `record_transformer` filter instead of fluent-plugin-record-reformer on Fluentd 0.12.0 and above.
If you are concerned with scalability, fluent-plugin-norikra may be a better option than datacounter and record_reformer.
Another alternative configuration for the use case is sending the error count to Graphite using fluent-plugin-graphite, and making Sensu monitor the value on Graphite with check-data.rb.
Install the `fluent-plugin-sensu` gem.
Submit an issue or a pull request.
Feedback to @miyakawa_taku on Twitter is also welcome.