Chris Kitzmiller wrote:

>> ~# ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok perf
>>
>> [...]
>>
>> "osd": { "opq": 0,
>>       "op_wip": 0,
>>       "op": 3566,
>>       "op_in_bytes": 208803635,
>>       "op_out_bytes": 146962506,
>>       "op_latency": { "avgcount": 3566,
>>           "sum": 100.330695000},
>>       "op_process_latency": { "avgcount": 3566,
>>           "sum": 84.702772000},
>>       "op_r": 471,
>>       "op_r_out_bytes": 146851024,
>>       "op_r_latency": { "avgcount": 471,
>>           "sum": 1.329795000},
>>
>> [...]
>>
>> Is the value of "op_r_latency" (ie 1.329ms above)?
>> In this case, I don't understand the meaning of "avgcount"
>> and "sum".
>>
>> "sum" is the sum of what?
>> "avgcount" is the average of what?
>
> There are a bunch of these avgcount/sum pairs and, from what I've
> gleaned, you're to simply divide sum by avgcount to get the mean of
> that particular stat over whatever time frame it is measuring.

Err..., I'm sorry, I'm not sure I understand correctly. If I take the
values of op_r_latency above, I have:

    sum/avgcount = 1.329795000/471 = 0.002823344

Would 0.002823344 ms be the latency of my read operations? That seems
impossible to me (unfortunately ;)). Or maybe the unit is seconds? In
that case, 2.823344 ms would be a plausible value.

In any case, I don't understand the name "avgcount". The name "count"
seems more logical to me (but maybe I haven't really understood its
meaning).

Looking at the source code in ./src/common/perf_counters.cc, it seems
to me that the number is indeed in seconds, but I'm (really) not a C++
expert. Could someone confirm this?

Another thing: if I understand correctly, the value sum/avgcount is an
average latency, computed since the start of the OSD daemon. So, after
a long time, the average will become more stable and will no longer
show much variation. Is it possible to reset the counters? I noticed
that restarting the daemon works, but that's a little drastic.
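To make the arithmetic above concrete, here is a minimal Python sketch. It assumes the unit of "sum" is seconds (which is only my reading of the source, not something confirmed), and the helper names mean_latency and interval_latency are hypothetical, not part of Ceph. The second helper shows one possible way to get a recent average without resetting anything: take two dumps some time apart and divide the difference of the sums by the difference of the counts. The "later" values below are invented purely for illustration.

```python
import json


def mean_latency(counter):
    """Mean since daemon start: sum / avgcount (None if nothing was counted)."""
    count = counter["avgcount"]
    if count == 0:
        return None
    return counter["sum"] / count


def interval_latency(before, after):
    """Mean over the window between two dumps of the same counter."""
    d_count = after["avgcount"] - before["avgcount"]
    d_sum = after["sum"] - before["sum"]
    if d_count <= 0:
        # No new operations, or the counters restarted between the two dumps.
        return None
    return d_sum / d_count


# Values copied from the dump quoted above.
first = json.loads('{"osd": {"op_r_latency": {"avgcount": 471, "sum": 1.329795000}}}')
op_r_first = first["osd"]["op_r_latency"]

# Hypothetical second dump taken later (made-up numbers).
op_r_later = {"avgcount": 571, "sum": 1.529795}

lifetime = mean_latency(op_r_first)
window = interval_latency(op_r_first, op_r_later)

# Assumption: the unit is seconds, so multiply by 1000 to print milliseconds.
print(f"mean since start: {lifetime * 1000:.3f} ms")
print(f"mean over window: {window * 1000:.3f} ms")
```

With the values from the dump, the first line gives about 2.823 ms, which matches the hand calculation if the unit really is seconds.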
--
François Lafont