On Mon, 2021-09-13 at 18:43 +0530, Venky Shankar wrote:
> Right now, cumulative read/write/metadata latencies are tracked
> and are periodically forwarded to the MDS. These metrics are not
> particularly useful. A much more useful metric is the average latency
> and standard deviation (stdev) which is what this series of patches
> aims to do.
>
> The userspace (libcephfs+tool) changes are here::
>
>           https://github.com/ceph/ceph/pull/41397
>
> The math involved in keeping track of the average latency and stdev
> seems incorrect, so, this series fixes that up too (closely mimics
> how it's done in userspace with some restrictions obviously) as per::
>
>           NEW_AVG = OLD_AVG + ((latency - OLD_AVG) / total_ops)
>           NEW_STDEV = SQRT(((OLD_STDEV + (latency - OLD_AVG)*(latency - NEW_AVG)) / (total_ops - 1)))
>
> Note that the cumulative latencies are still forwarded to the MDS but
> the tool (cephfs-top) ignores them altogether.
>
> Venky Shankar (4):
>   ceph: use "struct ceph_timespec" for r/w/m latencies
>   ceph: track average/stdev r/w/m latency
>   ceph: include average/stddev r/w/m latency in mds metrics
>   ceph: use tracked average r/w/m latencies to display metrics in
>     debugfs
>
>  fs/ceph/debugfs.c | 12 +++----
>  fs/ceph/metric.c  | 81 +++++++++++++++++++++++++----------------------
>  fs/ceph/metric.h  | 64 +++++++++++++++++++++++--------------
>  3 files changed, 90 insertions(+), 67 deletions(-)
>

This looks reasonably sane. I'll plan to go ahead and pull this into the
testing kernels and do some testing with them. If anyone has objections
(Xiubo?) let me know and I can take them out.

Thanks,
--
Jeff Layton <jlayton@xxxxxxxxxx>
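
For anyone checking the math in the cover letter, here is a minimal userspace sketch of the NEW_AVG / NEW_STDEV update in Python. It is not the kernel code from the series. It assumes that the OLD_STDEV term in the formula stands for the running sum of squared deviations (the usual Welford-style accumulator), and the names LatencyStats, sq_sum, update and stdev are made up for illustration. It also uses floating point, which the kernel side would presumably avoid.

```python
import math


class LatencyStats:
    """Running average and sample stdev, updated one latency at a time."""

    def __init__(self):
        self.total_ops = 0   # number of latencies seen so far
        self.avg = 0.0       # running average (NEW_AVG / OLD_AVG)
        self.sq_sum = 0.0    # assumed meaning of OLD_STDEV: sum of squared deviations

    def update(self, latency):
        self.total_ops += 1
        old_avg = self.avg
        # NEW_AVG = OLD_AVG + ((latency - OLD_AVG) / total_ops)
        self.avg = old_avg + (latency - old_avg) / self.total_ops
        # Accumulate (latency - OLD_AVG) * (latency - NEW_AVG)
        self.sq_sum += (latency - old_avg) * (latency - self.avg)

    def stdev(self):
        # The sample stdev needs at least two operations (divides by total_ops - 1).
        if self.total_ops < 2:
            return 0.0
        return math.sqrt(self.sq_sum / (self.total_ops - 1))


# Hypothetical latencies in arbitrary units.
stats = LatencyStats()
for sample in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(sample)

print(stats.avg, stats.stdev())
```

Only three values are kept per latency type (count, average, squared-deviation sum), so nothing in this sketch requires storing individual samples.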