RE: Proposition - latency histogram

Allen Samuels <Allen.Samuels@xxxxxxxxxxx> · Mon, 28 Nov 2016 16:46:20 +0000

> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-
> owner@xxxxxxxxxxxxxxx] On Behalf Of Bartlomiej Swiecki
> Sent: Monday, November 28, 2016 8:22 AM
> To: Ceph Development <ceph-devel@xxxxxxxxxxxxxxx>
> Subject: Proposition - latency histogram
> 
> Hi,
> 
> 
> Currently we can query an OSD for op latency, but it is reported only as an
> average. An average may not give the best information in this case, e.g.
> latency spikes can easily get hidden in it.
> 
> Instead of an average we could easily do a simple histogram: quantize the
> latency into a predefined set of time intervals, keep a simple performance
> counter for each interval, and increment the matching counter on each op.
> Since those are per OSD, we could have pretty high resolution with very
> little memory usage. The performance impact should be negligible, since only
> one counter (two if split into read and write) would be incremented per
> OSD op.
> 
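
To make the proposal concrete, here is a minimal Python sketch of the fixed-bucket idea. It is only an illustration, not Ceph code: the class name `LatencyHistogram` and the edges in `EDGES_MS` are assumptions chosen for the example.

```python
from bisect import bisect_right

# Hypothetical bucket edges in milliseconds (an assumption for this sketch).
EDGES_MS = [1, 2, 4, 8, 16, 32, 64, 128]


class LatencyHistogram:
    def __init__(self, edges):
        self.edges = list(edges)
        # One counter per edge, plus one open-ended overflow bucket at the end.
        self.counts = [0] * (len(self.edges) + 1)

    def record(self, latency_ms):
        # bisect_right returns the index of the first edge strictly greater
        # than the latency. Bucket 0 holds latencies below edges[0]; bucket i
        # (0 < i < len(edges)) holds [edges[i-1], edges[i]); the last bucket
        # holds everything at or above the final edge.
        self.counts[bisect_right(self.edges, latency_ms)] += 1


hist = LatencyHistogram(EDGES_MS)
samples = [1.5] * 99 + [150.0]  # 99 fast ops and a single spike
for sample in samples:
    hist.record(sample)

mean = sum(samples) / len(samples)
print(f"mean latency: {mean:.2f} ms")  # the mean stays small despite the spike
print("bucket counts:", hist.counts)   # the spike is visible in the last bucket
```

Each op costs one binary search over a handful of edges plus one increment, which matches the "one counter per op" cost described above.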

+1

A reminder: there are different latency domains for the different media types (flash, HDD). One solution is to make the bucket boundaries parameterized.
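
One way this could look, as a rough sketch: generate the bucket edges from a few parameters instead of hard-coding them. The helper `make_edges`, the `BUCKET_PARAMS` table, and all the numbers in it are hypothetical and only meant to show the shape of the idea; real values would have to come from configuration and measurement.

```python
def make_edges(first_edge_us, factor, count):
    """Return `count` geometrically spaced bucket edges in microseconds."""
    edges = []
    edge = first_edge_us
    for _ in range(count):
        edges.append(edge)
        edge *= factor
    return edges


# Hypothetical per-media parameters (assumptions, not measured values).
BUCKET_PARAMS = {
    "flash": {"first_edge_us": 50, "factor": 2, "count": 12},
    "hdd": {"first_edge_us": 1000, "factor": 2, "count": 12},
}

flash_edges = make_edges(**BUCKET_PARAMS["flash"])
hdd_edges = make_edges(**BUCKET_PARAMS["hdd"])

print("flash:", flash_edges[0], "us ..", flash_edges[-1], "us")
print("hdd:  ", hdd_edges[0], "us ..", hdd_edges[-1], "us")
```

With the same number of counters, the flash buckets concentrate their resolution in the sub-millisecond range while the HDD buckets start at one millisecond.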

> In addition we could also do this in 2D, with each counter matching a given
> latency range and op size range.
> Such a 2D table would show the latency histogram, the request size
> histogram, and combinations of the two (e.g. a latency histogram of ~4k ops only).
> 
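
A small Python sketch of the 2D variant, under assumed names and bucket edges (`Histogram2D`, `LATENCY_EDGES_MS`, and `SIZE_EDGES_BYTES` are all hypothetical). It shows how both 1D histograms and a per-size latency histogram fall out of the same table.

```python
from bisect import bisect_right

# Hypothetical bucket edges; both axes end in an open-ended overflow bucket.
LATENCY_EDGES_MS = [1, 4, 16, 64]
SIZE_EDGES_BYTES = [4096, 65536, 1048576]


class Histogram2D:
    def __init__(self, latency_edges, size_edges):
        self.latency_edges = list(latency_edges)
        self.size_edges = list(size_edges)
        rows = len(self.latency_edges) + 1
        cols = len(self.size_edges) + 1
        # counts[row][col]: row is the latency bucket, col is the size bucket.
        self.counts = [[0] * cols for _ in range(rows)]

    def record(self, latency_ms, size_bytes):
        row = bisect_right(self.latency_edges, latency_ms)
        col = bisect_right(self.size_edges, size_bytes)
        self.counts[row][col] += 1  # still a single increment per op

    def latency_histogram(self):
        # Sum across size buckets to get the plain latency histogram.
        return [sum(row) for row in self.counts]

    def size_histogram(self):
        # Sum across latency buckets to get the request size histogram.
        return [sum(col) for col in zip(*self.counts)]

    def latency_histogram_for_size_bucket(self, col):
        # One column of the table: latency histogram for one size range.
        return [row[col] for row in self.counts]


h = Histogram2D(LATENCY_EDGES_MS, SIZE_EDGES_BYTES)
ops = [(0.5, 4096), (0.7, 4096), (20.0, 4096), (3.0, 1 << 20), (100.0, 1 << 22)]
for latency_ms, size_bytes in ops:
    h.record(latency_ms, size_bytes)

print(h.latency_histogram())                   # [2, 1, 0, 1, 1]
print(h.size_histogram())                      # [0, 3, 0, 2]
print(h.latency_histogram_for_size_bucket(1))  # [2, 0, 0, 1, 0]
```

In this sketch, size bucket 1 covers 4096 up to (but not including) 65536 bytes, so it stands in for the "~4k ops" case.
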
> What do you think about this idea? I can prepare some code - a simple proof
> of concept looks really straightforward to implement.
> 
> 
> Bartek
> 