Re: Proposition - latency histogram

On Tue, Jan 31, 2017 at 3:33 PM, Bartłomiej Święcki
<bartlomiej.swiecki@xxxxxxxxxxxx> wrote:
> Attached is a sample output from test python script (included in PR)
> to display live results.

This is very cool, and inspired me to build a colored version into a
toy GUI that I've been using to exercise ceph-mgr:
http://imgur.com/a/TG5kc (it's a super-primitive rendering using a
linear color scale on the cells of a <table>).

I did that by exposing the perf counters via the MCommand (`tell`)
interface on the OSD so that the UI could poll them.

I think there are two main use cases for these plots:
 * System-wide (average of OSDs): what are my OSDs doing/experiencing?
 * Individual OSD: is this OSD healthy?  We could potentially plot this
as a delta against the system-wide average to highlight OSDs behaving
badly.
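
As a rough illustration of the delta idea, here is a minimal Python
sketch. It assumes each OSD's histogram has already been fetched as a
2D list of bucket counts with identical shape; the function names are
hypothetical and not part of the PR or of ceph-mgr. Counts are
normalized to fractions first, so that a busy OSD does not dominate the
average.

```python
def normalize(hist):
    """Convert raw bucket counts to fractions of all ops (zeros if empty)."""
    total = sum(sum(row) for row in hist)
    if total == 0:
        return [[0.0 for _ in row] for row in hist]
    return [[count / total for count in row] for row in hist]


def average(hists):
    """Element-wise mean of equally shaped 2D tables."""
    n = len(hists)
    rows, cols = len(hists[0]), len(hists[0][0])
    return [[sum(h[r][c] for h in hists) / n for c in range(cols)]
            for r in range(rows)]


def delta_vs_cluster(osd_hist, all_hists):
    """Per-bucket difference between one OSD and the cluster-wide average.

    Positive cells mean this OSD spends a larger share of its ops in that
    bucket than the average OSD does.
    """
    cluster = average([normalize(h) for h in all_hists])
    own = normalize(osd_hist)
    return [[o - a for o, a in zip(own_row, avg_row)]
            for own_row, avg_row in zip(own, cluster)]


# Made-up sample data: rows are latency buckets, columns are size buckets.
osd_a = [[8, 2], [0, 0]]
osd_b = [[2, 2], [3, 3]]
delta_b = delta_vs_cluster(osd_b, [osd_a, osd_b])
```

Since every normalized table sums to 1, the cells of a delta table sum
to zero, which makes a diverging color scale centered on zero a natural
fit for rendering it.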

Currently, ordinary perf counters are getting shipped back to ceph-mgr
continuously, so we would need to decide whether we want to do the
same for the larger histogram ones, or whether we would expose them
via `tell` so that any interested parties could fetch them on demand.
The main benefit of continuously sending them would be that ceph-mgr
could maintain a continuous sum/average across all the OSDs.  The cost
depends on how widely we use this data type: if there were only a few
histograms per OSD (osd read, osd write, store read, store write),
then I suspect we could get away with transmitting them quite freely.
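
To put rough numbers on that trade-off, here is a small Python sketch.
The bucket counts (32x32) and the 8-byte counter width are assumptions
picked for illustration, not values taken from the PR, and the function
names are hypothetical. It also shows that the aggregation ceph-mgr
would need is just an element-wise sum of the tables.

```python
def merge_histograms(hists):
    """Element-wise sum of equally shaped 2D count tables."""
    total = [[0] * len(row) for row in hists[0]]
    for hist in hists:
        for r, row in enumerate(hist):
            for c, count in enumerate(row):
                total[r][c] += count
    return total


def payload_bytes(rows, cols, histograms_per_osd, bytes_per_counter=8):
    """Raw counter payload one OSD would send per report (no encoding overhead)."""
    return rows * cols * histograms_per_osd * bytes_per_counter


# Assumed shape: 32 latency buckets x 32 size buckets, 4 histograms per OSD
# (osd read, osd write, store read, store write).
per_osd = payload_bytes(32, 32, 4)

# Summing two made-up per-OSD tables into a cluster-wide table.
cluster = merge_histograms([[[1, 2], [3, 4]], [[10, 20], [30, 40]]])
```

Under those assumptions each OSD would ship 32 KiB of raw counters per
report, so the total scales linearly with both the number of OSDs and
the reporting frequency.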

The 2D data is awesome and I can't see us not wanting this, though
there will also be at least some key places we want 1D data,
especially for the MDS where metadata ops don't have a size dimension.
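
For illustration, a minimal Python sketch of how a 2D table relates to
the 1D views. The class, its method names, and the bucket edges are all
hypothetical and chosen for the example; this is not the implementation
from the PR. Summing the table along one axis recovers the 1D histogram
for the other axis.

```python
import bisect


class Histogram2D:
    """Counts ops in a table indexed by (latency bucket, size bucket)."""

    def __init__(self, latency_edges, size_edges):
        # Edges are sorted upper bounds; one extra bucket catches overflow.
        self.latency_edges = list(latency_edges)
        self.size_edges = list(size_edges)
        self.counts = [[0] * (len(self.size_edges) + 1)
                       for _ in range(len(self.latency_edges) + 1)]

    def record(self, latency_us, size_bytes):
        """Increment exactly one counter for this op."""
        row = bisect.bisect_right(self.latency_edges, latency_us)
        col = bisect.bisect_right(self.size_edges, size_bytes)
        self.counts[row][col] += 1

    def latency_marginal(self):
        """1D latency histogram: sum over all size buckets."""
        return [sum(row) for row in self.counts]

    def size_marginal(self):
        """1D request-size histogram: sum over all latency buckets."""
        return [sum(col) for col in zip(*self.counts)]


# Assumed edges: latency in microseconds, size in bytes.
hist = Histogram2D(latency_edges=[100, 1000, 10000], size_edges=[4096, 65536])
hist.record(50, 512)
hist.record(500, 4096)
hist.record(500, 8192)
hist.record(20000, 100000)
```

A daemon with no size dimension, such as the MDS case above, could use
the same structure with an empty list of size edges, which leaves a
single column.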

John


>
>
> On 01/31/2017 04:22 PM, Bartłomiej Święcki wrote:
>>
>> Hi,
>>
>> Bringing back performance histograms:
>> https://github.com/ceph/ceph/pull/12829
>> I've updated the PR, rebased on master and made the internal changes
>> less aggressive.
>>
>> All ctest tests are passing and I haven't seen any issues with
>> performance (and I can actually see much better what the performance
>> characteristics are).
>>
>> Waiting for your comments,
>> Bartek
>>
>>
>>
>> On 01/09/2017 12:27 PM, Bartłomiej Święcki wrote:
>>>
>>> Hi,
>>>
>>> I've made a simple implementation of performance histograms.
>>> The implementation is not very sophisticated, but I think it could be
>>> a good start for a more detailed discussion.
>>>
>>> Here's the PR: https://github.com/ceph/ceph/pull/12829
>>>
>>>
>>> Regards,
>>> Bartek
>>>
>>>
>>> On 11/28/2016 05:22 PM, Bartłomiej Święcki wrote:
>>>>
>>>> Hi,
>>>>
>>>>
>>>> Currently we can query an OSD for op latency, but it's given as an
>>>> average. An average may not give the best information in this case,
>>>> i.e. spikes can easily get hidden there.
>>>>
>>>> Instead of an average we could easily keep a simple histogram: quantize
>>>> the latency into a predefined set of time intervals, keep a simple
>>>> performance counter for each of them, and increment one of them at
>>>> each op. Since those are per OSD, we could have pretty high resolution
>>>> at a small memory cost. The performance impact should be negligible,
>>>> since only one of those counters (two if split into read and write)
>>>> would be incremented per OSD op.
>>>>
>>>> In addition we could also do this in 2D, with each counter matching a
>>>> given latency range and op size range. Such a 2D table would show the
>>>> latency histogram, the request size histogram, and combinations of
>>>> those (e.g. the latency histogram of ~4k ops only).
>>>>
>>>> What do you think about this idea? I can prepare some code - a simple
>>>> proof of concept looks really
>>>> straightforward to implement.
>>>>
>>>>
>>>> Bartek
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html


