Re: Proposition - latency histogram

Josh Durgin <jdurgin@xxxxxxxxxx> · Mon, 28 Nov 2016 18:43:46 -0800

On 11/28/2016 09:43 AM, John Spray wrote:
On Mon, Nov 28, 2016 at 4:51 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
On Mon, 28 Nov 2016, Bartłomiej Święcki wrote:
Hi,

Currently we can query an OSD for op latency but it's given as an average.
An average may not give the best information in this case, i.e. spikes can
easily get hidden there.

Instead of an average we could easily do a simple histogram: quantize
the latency into a predefined set of time intervals, keep a simple
performance counter for each of them, and increment one of them on each
op. Since those are per OSD, we could have pretty high resolution with
very little memory usage. The performance impact should be negligible,
since only one of those counters (two if split into read and write)
would be incremented per OSD op.
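
As a rough illustration of the counter-per-interval scheme, here is a
minimal Python sketch. The bucket boundaries and the LatencyHistogram
name are made up for the example and are not taken from the Ceph code;
a real implementation would live in the C++ perf counters.

```python
import bisect

# Hypothetical bucket upper bounds in microseconds (an assumption for
# this sketch, not values used by Ceph).
LATENCY_BOUNDS_US = [100, 200, 500, 1000, 2000, 5000, 10000, 50000, 100000]


class LatencyHistogram:
    """Fixed-bucket histogram: one integer counter per latency interval."""

    def __init__(self, bounds):
        self.bounds = list(bounds)
        # One extra counter catches everything above the last bound.
        self.counts = [0] * (len(self.bounds) + 1)

    def record(self, latency_us):
        # bisect_left returns the index of the first bound >= latency_us,
        # so bucket i holds samples in (bounds[i-1], bounds[i]].
        idx = bisect.bisect_left(self.bounds, latency_us)
        self.counts[idx] += 1


hist = LatencyHistogram(LATENCY_BOUNDS_US)
for sample_us in (80, 150, 150, 900, 250000):
    hist.record(sample_us)

# The single 250 ms spike lands in the overflow bucket instead of being
# folded into a mean.
for upper, count in zip(hist.bounds + ["+Inf"], hist.counts):
    print(f"<= {upper}: {count}")
```

Each op costs one binary search over a short list plus one increment,
which is the "only one counter touched per op" property described above.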

In addition we could also do this in 2D, with each counter matching a
given latency range and op size range. Having such a 2D table would show
the latency histogram, the request size histogram and combinations of
those (i.e. latency histogram of ~4k ops only).
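
The 2D variant could be sketched along the same lines. Again, the
Histogram2D name, its methods and the bucket boundaries are assumptions
made for illustration only; the point is that both 1D histograms and the
per-size slices fall out of one table of counters.

```python
import bisect

# Hypothetical bucket upper bounds (assumptions for this sketch).
LATENCY_BOUNDS_US = [100, 1000, 10000, 100000]
SIZE_BOUNDS_BYTES = [4096, 65536, 1048576]


class Histogram2D:
    """Table of counters indexed by (latency bucket, op size bucket)."""

    def __init__(self, lat_bounds, size_bounds):
        self.lat_bounds = list(lat_bounds)
        self.size_bounds = list(size_bounds)
        rows = len(self.lat_bounds) + 1   # +1 for the overflow bucket
        cols = len(self.size_bounds) + 1
        self.counts = [[0] * cols for _ in range(rows)]

    def record(self, latency_us, size_bytes):
        row = bisect.bisect_left(self.lat_bounds, latency_us)
        col = bisect.bisect_left(self.size_bounds, size_bytes)
        self.counts[row][col] += 1

    def latency_marginal(self):
        # Sum over all sizes: the plain latency histogram.
        return [sum(row) for row in self.counts]

    def size_marginal(self):
        # Sum over all latencies: the request size histogram.
        return [sum(col) for col in zip(*self.counts)]

    def latency_for_size_bucket(self, col):
        # Latency histogram restricted to one size bucket.
        return [row[col] for row in self.counts]


h2 = Histogram2D(LATENCY_BOUNDS_US, SIZE_BOUNDS_BYTES)
for lat_us, size in [(80, 4096), (500, 4096), (500, 131072), (20000, 4194304)]:
    h2.record(lat_us, size)

print(h2.latency_marginal())
print(h2.size_marginal())
print(h2.latency_for_size_bucket(0))  # ops of at most 4 KiB
```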

What do you think about this idea? I can prepare some code - a simple
proof of concept looks really straightforward to implement.

This sounds like a great idea.  I think the main issue is that the data
won't be easily exposed via the perfcounter interface... at least not in a
way that generic tools can visualize.  Unless there is a standardish way
to report histogram metrics?

Newer tools are waking up to the need for histograms, e.g. Prometheus
has a histogram datatype:
https://prometheus.io/docs/concepts/metric_types/#histogram

Someone has done some work on adding support in grafana:
https://github.com/grafana/grafana/issues/600

Should be reasonably straightforward to add a histogram type to the
perf counters: people might end up flattening it to a series of scalar
time series with _bucket suffixes or whatever, but I'd definitely be
in favour of us adding an explicit histogram type internally.
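
For reference, flattening into _bucket series could look roughly like
the sketch below. It follows the Prometheus histogram convention of
cumulative buckets labelled with le="<upper bound>", a final le="+Inf"
bucket, and _sum/_count series. The flatten_histogram helper and the
osd_op_latency_us metric name are hypothetical, not an existing Ceph or
Prometheus API.

```python
def flatten_histogram(name, bounds, counts, total_sum):
    """Turn per-bucket counts into Prometheus-style cumulative series.

    `counts` has len(bounds) + 1 entries; the last one is the overflow
    bucket for samples above the highest bound.
    """
    lines = []
    cumulative = 0
    for bound, count in zip(bounds, counts):
        cumulative += count
        lines.append(f'{name}_bucket{{le="{bound}"}} {cumulative}')
    cumulative += counts[-1]
    lines.append(f'{name}_bucket{{le="+Inf"}} {cumulative}')
    lines.append(f"{name}_sum {total_sum}")
    lines.append(f"{name}_count {cumulative}")
    return lines


# Hypothetical metric name and sample data.
series = flatten_histogram("osd_op_latency_us", [100, 1000], [1, 2, 1], 5000)
print("\n".join(series))
```

Because the buckets are cumulative, a consumer can drop some of them
without invalidating the rest, which is one argument for keeping an
explicit histogram type internally and flattening only at export time.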

There are also existing libraries like HdrHistogram that have nice
serialized formats that could be extracted in windowed intervals for
monitoring systems, or later analysis, and have existing scripts for
graphing [0].

It also has support for correcting reporting of outliers in common 
benchmark architectures ("coordinated omission"), which would be handy
for a number of our benchmarks [1][2][3].
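
To illustrate the coordinated omission correction: when a load generator
is supposed to issue a request every fixed interval but stalls behind
one slow response, the requests it failed to send would also have seen
high latency. The correction back-fills those missing samples. The
following is a plain-Python sketch of that idea, not the HdrHistogram
API; the function name and the list-based storage are assumptions made
for the example.

```python
def record_with_expected_interval(samples, value, expected_interval):
    """Record `value`, plus the samples a stalled load generator missed.

    If `value` exceeds the expected interval between requests, also
    record value - interval, value - 2 * interval, ... for as long as
    the back-filled value is at least one interval.
    """
    samples.append(value)
    if expected_interval <= 0:
        return
    missing = value - expected_interval
    while missing >= expected_interval:
        samples.append(missing)
        missing -= expected_interval


samples = []
record_with_expected_interval(samples, 10, 10)  # on schedule: no back-fill
record_with_expected_interval(samples, 45, 10)  # stall: adds 35, 25, 15
print(samples)
```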

Josh

[0] https://hdrhistogram.github.io/HdrHistogram/
[1] http://psy-lob-saw.blogspot.com/2015/02/hdrhistogram-better-latency-capture.html
[2] http://repository.cmu.edu/cgi/viewcontent.cgi?article=1872&context=compsci
[3] http://www.azulsystems.com/sites/default/files/images/HowNotToMeasureLatency_LLSummit_NYC_12Nov2013.pdf