Re: [PATCH v11 3/4] ceph: add read/write latency metric support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2020-03-19 at 02:00 -0400, xiubli@xxxxxxxxxx wrote:
> From: Xiubo Li <xiubli@xxxxxxxxxx>
> 
> Calculate the latency for OSD read requests. Add a new r_end_stamp
> field to struct ceph_osd_request that will hold the time of that
> the reply was received. Use that to calculate the RTT for each call,
> and divide the sum of those by number of calls to get averate RTT.
> 
> Keep a tally of RTT for OSD writes and number of calls to track average
> latency of OSD writes.
> 
> URL: https://tracker.ceph.com/issues/43215
> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> ---
>  fs/ceph/addr.c                  |  18 +++++++
>  fs/ceph/debugfs.c               |  61 +++++++++++++++++++++-
>  fs/ceph/file.c                  |  26 ++++++++++
>  fs/ceph/metric.c                | 109 ++++++++++++++++++++++++++++++++++++++++
>  fs/ceph/metric.h                |  23 +++++++++
>  include/linux/ceph/osd_client.h |   1 +
>  net/ceph/osd_client.c           |   2 +
>  7 files changed, 239 insertions(+), 1 deletion(-)
> 

[...]

> +static inline void __update_avg_and_sq(atomic64_t *totalp, atomic64_t *lat_sump,
> +				       struct percpu_counter *sq_sump,
> +				       spinlock_t *lockp, unsigned long lat)
> +{
> +	s64 total, avg, sq, lsum;
> +
> +	spin_lock(lockp);
> +	total = atomic64_inc_return(totalp);
> +	lsum = atomic64_add_return(lat, lat_sump);
> +	spin_unlock(lockp);
> +
> +	if (unlikely(total == 1))
> +		return;
> +
> +	/* the sq is (lat - old_avg) * (lat - new_avg) */
> +	avg = DIV64_U64_ROUND_CLOSEST((lsum - lat), (total - 1));
> +	sq = lat - avg;
> +	avg = DIV64_U64_ROUND_CLOSEST(lsum, total);
> +	sq = sq * (lat - avg);
> +	percpu_counter_add(sq_sump, sq);
> +}
> +
> +void ceph_update_read_latency(struct ceph_client_metric *m,
> +			      unsigned long r_start,
> +			      unsigned long r_end,
> +			      int rc)
> +{
> +	unsigned long lat = r_end - r_start;
> +
> +	if (unlikely(rc < 0 && rc != -ENOENT && rc != -ETIMEDOUT))
> +		return;
> +
> +	__update_min_latency(&m->read_latency_min, lat);
> +	__update_max_latency(&m->read_latency_max, lat);
> +	__update_avg_and_sq(&m->total_reads, &m->read_latency_sum,
> +			    &m->read_latency_sq_sum,
> +			    &m->read_latency_lock,
> +			    lat);
> +

Thanks for refactoring the set, Xiubo.

This makes something very evident though. __update_avg_and_sq takes a
spinlock and we have to hit it every time we update the other values, so
there really is no reason to use atomic or percpu values for any of
this.

I think it would be best to just make all of these be normal variables,
and simply take the spinlock when you fetch or update them.

Thoughts?
-- 
Jeff Layton <jlayton@xxxxxxxxxx>




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux