fluent-plugin-histogram, a plugin for Fluentd
Fluentd output plugin.
Counts input keys and builds a scalable, rough histogram to help detect hotspot problems.
A "scalable rough histogram" suits cases where there is an enormous variety of keys.
We referred to "Strauss, O.: Rough histograms for robust statistics, Pattern Recognition, 2000. Proceedings. 15th International Conference on (Volume:2)" for the "rough histogram".
In this approach, an increment is not applied to a single value; instead, several neighboring values are incremented in a triangular shape (△).
To use this, set the alpha option (alpha >= 1, default 1) in fluent.conf.
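The triangular increment can be sketched in Ruby as follows. This is a minimal illustration, not the plugin's actual code: the helper name triangular_increment, the weight parameter, and the wrap-around at the array edges are all assumptions made for the example.

```ruby
# Hypothetical helper (not the plugin's code): add a triangular bump of
# half-width `alpha`, centered on bin `center`, to the array `hist`.
def triangular_increment(hist, center, alpha: 1, weight: 1)
  # Stacking the windows [-k, k] for k = 0..alpha gives the center bin
  # alpha + 1 increments and the outermost neighbors one increment each.
  (0..alpha).each do |k|
    (-k..k).each do |offset|
      # Assumption: indices wrap around at both ends of the histogram.
      hist[(center + offset) % hist.size] += weight
    end
  end
  hist
end

hist = Array.new(8, 0)
triangular_increment(hist, 3, alpha: 1)
p hist  # => [0, 0, 1, 2, 1, 0, 0, 0]
```

With alpha 1, one key raises its own bin by 2 and each neighbor by 1, so a larger alpha spreads every key over a wider triangle.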
Moreover, we optimized the histogram for an enormous variety of keys by fixing the histogram width.
To use this, set bin_num (default 100) in fluent.conf.
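A fixed histogram width means every key, however many distinct keys exist, lands in one of bin_num slots. The sketch below shows the idea under stated assumptions: bin_index is a hypothetical helper, and Zlib.crc32 is only a stand-in hash chosen so results are repeatable; the hash the plugin really uses is not described here.

```ruby
require 'zlib'

# Hypothetical helper: map an arbitrary key to one of `bin_num` bins.
# Zlib.crc32 is a stand-in; the plugin's real hash function may differ.
def bin_index(key, bin_num = 100)
  Zlib.crc32(key.to_s) % bin_num
end

# The histogram size stays fixed no matter how many distinct keys arrive.
hist = Array.new(100, 0)
%w[one two takusan takusan takusan takusan].each do |key|
  hist[bin_index(key)] += 1
end

p hist.size  # => 100
p hist.sum   # => 6
```

Different keys may collide in the same bin, which is one reason the result is a rough histogram rather than an exact per-key count.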
Be careful: the histogram this plugin outputs is not an exact count of the provided data. However, the plugin can scale out (it can handle input of 25,000 records/sec), and the output histogram is good enough for detecting hotspot problems.
If you run the commands below,
$ echo '{"keys":"a key"}' | fluent-cat input.sample
$ echo '{"keys":["one", "two", "takusan", "takusan", "takusan", "takusan"]}' | fluent-cat input.sample
$ echo '{"keys":{"Q":2, "Y":2, "X":1, "D":1}}' | fluent-cat input.sample
the output is
2014-02-02 23:08:58 +0900 histo.sample.localhost: {
"hist":[0,0,2,4,2,0,0,0,0,1,5,7,3,0,1,7,12,7,1,0,0,0,0,0,0,0],
"sum":13,
"avg":0,
"sd":3
}
The plugin counts up the key you specified and builds a histogram-like summary.
It also calculates:
- Sum(sum)
- Average(avg)
- Standard Deviation(sd)
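How the three statistics are derived is not spelled out here, so the following Ruby sketch is only one possible reconstruction. It assumes each counted key adds (alpha + 1)**2 units to the bins in total (for the default alpha of 1, a bump of 1 + 2 + 1), that sum and avg use integer division, and that sd is taken over the raw bin values. The helper name summarize is hypothetical. These definitions were chosen because they reproduce the sample output shown earlier; the plugin's real formulas may differ.

```ruby
# Hypothetical reconstruction, not the plugin's code.
def summarize(hist, alpha: 1)
  mass = (alpha + 1)**2              # assumed units added per counted key
  sum  = hist.sum / mass             # estimated number of counted keys
  avg  = sum / hist.size             # integer average per bin
  mean = hist.sum.to_f / hist.size   # mean of the raw bin values
  variance = hist.sum { |v| (v - mean)**2 } / hist.size
  sd = Math.sqrt(variance).to_i      # truncated standard deviation
  { "sum" => sum, "avg" => avg, "sd" => sd }
end

# "hist" array copied from the sample output.
hist = [0, 0, 2, 4, 2, 0, 0, 0, 0, 1, 5, 7, 3, 0, 1, 7, 12, 7, 1,
        0, 0, 0, 0, 0, 0, 0]
p summarize(hist).values_at("sum", "avg", "sd")  # => [13, 0, 3]
```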
Run the benchmark
$ ruby bench/genload.rb input.sample 7000 -l 5