On Mon, Nov 28, 2016 at 4:51 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Mon, 28 Nov 2016, Bartłomiej Święcki wrote:
>> Hi,
>>
>> Currently we can query OSD for op latency but it's given as an average.
>> Average may not give the best information in this case - i.e. spikes can
>> easily get hidden there.
>>
>> Instead of an average we could easily do a simple histogram - quantize
>> the latency into predefined set of time intervals, for each of them have
>> a simple performance counter, at each op increase one of them. Since
>> those are per OSD, we could have pretty high resolution with fractional
>> memory usage, performance impact should be negligible since only one
>> (two if split into read and write) of those counters would be
>> incremented per one osd op.
>>
>> In addition we could also do this in 2D - each counter matching given
>> latency range and op size range. Having such 2D table would show both
>> latency histogram, request size histogram and combinations of those
>> (i.e. latency histogram of ~4k ops only).
>>
>> What do you think about this idea? I can prepare some code - a simple
>> proof of concept looks really straightforward to implement.
>
> This sounds like a great idea. I think the main issue is that the data
> won't be easily exposed via the perfcounter interface... at least not in a
> way that generic tools can visualize. Unless there is a standardish way
> to report histogram metrics?

Newer tools are waking up to the need for histograms, e.g. Prometheus
has a histogram datatype:
https://prometheus.io/docs/concepts/metric_types/#histogram

Someone has done some work on adding support in Grafana:
https://github.com/grafana/grafana/issues/600

It should be reasonably straightforward to add a histogram type to the
perf counters. People might end up flattening it to a series of scalar
time series with _bucket suffixes or whatever, but I'd definitely be in
favour of us adding an explicit histogram type internally.
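
To make the idea concrete, here is a rough sketch (in Python, purely
illustrative) of a fixed-bucket 2D counter table and of flattening it
into cumulative scalar series with _bucket suffixes, in the style
Prometheus uses. Everything named here is an assumption for the sake of
the example: the Histogram2D class, the bucket bounds and the
"op_latency" metric name are hypothetical and are not part of the
existing perf counter code.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds (inclusive). One extra, open-ended
# bucket is added above the largest bound on each axis.
LATENCY_BOUNDS_US = [100, 1_000, 10_000, 100_000]   # microseconds
SIZE_BOUNDS_BYTES = [4096, 65536, 1048576]          # bytes


class Histogram2D:
    """Fixed-bucket 2D histogram: one counter per (latency, size) cell."""

    def __init__(self, lat_bounds, size_bounds):
        self.lat_bounds = list(lat_bounds)
        self.size_bounds = list(size_bounds)
        self.counts = [[0] * (len(self.size_bounds) + 1)
                       for _ in range(len(self.lat_bounds) + 1)]

    def record(self, latency_us, size_bytes):
        # bisect_left picks the first bucket whose upper bound >= value;
        # exactly one counter is incremented per op.
        i = bisect_left(self.lat_bounds, latency_us)
        j = bisect_left(self.size_bounds, size_bytes)
        self.counts[i][j] += 1

    def latency_histogram(self, size_bucket=None):
        """Counts per latency bucket, optionally for one size bucket only."""
        if size_bucket is None:
            return [sum(row) for row in self.counts]
        return [row[size_bucket] for row in self.counts]

    def size_histogram(self):
        """Counts per size bucket, summed over all latency buckets."""
        return [sum(col) for col in zip(*self.counts)]

    def flatten(self, name):
        """Flatten the latency axis into cumulative '<name>_bucket' scalars."""
        labels = [str(b) for b in self.lat_bounds] + ["+Inf"]
        out, running = {}, 0
        for label, count in zip(labels, self.latency_histogram()):
            running += count
            out[f'{name}_bucket{{le="{label}"}}'] = running
        return out


h = Histogram2D(LATENCY_BOUNDS_US, SIZE_BOUNDS_BYTES)
for latency_us, size_bytes in [(50, 4096), (500, 4096), (500, 8192),
                               (20_000, 4096), (500_000, 2_000_000)]:
    h.record(latency_us, size_bytes)

print(h.latency_histogram())    # all ops: [1, 2, 0, 1, 1]
print(h.latency_histogram(0))   # ops of <= 4k only: [1, 1, 0, 1, 0]
print(h.flatten("op_latency"))
```

The flattened form loses the size axis, which is one reason to keep an
explicit histogram type internally and only flatten at export time.
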
John