I've just started on this myself.
I started with https://ceph.com/docs/v0.80/dev/perf_counters/
I'm currently monitoring latency using, to pick one example, [op_w_latency][sum] and [op_w_latency][avgcount]. Both values are counters, so they only increase with time. The lifetime average latency of the cluster isn't very useful, so I track the deltas of those values, then divide the sum delta by the avgcount delta to get the average latency over my sample period.
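For what it's worth, the delta calculation can be sketched roughly like this. It is only an illustration, not my actual monitoring code: the function name interval_latency is made up, and it assumes each sample is the op_w_latency entry (a dict with "sum" and "avgcount") taken from two successive perf dumps of the same daemon. I believe sum is in seconds, but check that against your version.

```python
def interval_latency(prev, curr):
    """Average latency per op between two samples of one latency counter.

    prev and curr are dicts with 'sum' and 'avgcount' keys, e.g. the
    op_w_latency entry from two successive perf dumps (assumed layout).
    Returns None when no ops completed in the interval, or when a
    counter went backwards (for example after a daemon restart).
    """
    d_sum = curr["sum"] - prev["sum"]
    d_count = curr["avgcount"] - prev["avgcount"]
    if d_count <= 0 or d_sum < 0:
        return None
    return d_sum / d_count


if __name__ == "__main__":
    # Hypothetical sample values, one polling interval apart.
    earlier = {"avgcount": 1000, "sum": 12.5}
    later = {"avgcount": 1100, "sum": 13.0}
    print(interval_latency(earlier, later))
```

Returning None rather than 0 for an idle interval keeps "no writes happened" distinguishable from "writes were instantaneous" when graphing.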
Just graphing the latencies let me see a spike in write latency on all disks on one node, which eventually led me to a dead write-cache battery.
That's for the OSDs. I have similar things set up for MON and RadosGW.
I'm sure there are many more useful things to graph. One of the things I'm interested in (but haven't found time to research yet) is the journal usage, with maybe some alerts if the journal is more than 90% full.
On Mon, Oct 13, 2014 at 2:57 PM, Jakes John <jakesjohn12345@xxxxxxxxx> wrote:
Bump :). It would be helpful if someone could share info related to debugging using counters/stats.
On Sun, Oct 12, 2014 at 7:42 PM, Jakes John <jakesjohn12345@xxxxxxxxx> wrote:
Hi All,
I would like to know if there are useful performance counters in Ceph which can help to debug the cluster. I have seen hundreds of stat counters in various daemon dumps. Some of them are:
1. commit_latency_ms
2. apply_latency_ms
3. snap_trim_queue_len
4. num_snap_trimming
What do these indicate?
I have used iostat and atop for cluster statistics, but neither of them indicates the internal Ceph status. Machines might be new, but OSDs can still be slow. If some of these counters can help to debug why certain OSDs are bad (or can get bad later), it would be great. Are there counters for things like total processed requests, pending requests in the queue, average time taken to process a request, etc.?
Are there any docs for all performance counters which I can read? I couldn't find anything in the Ceph docs.
Thanks
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com