Re: [PATCH] block: make iolatency avg_lat exponentially decay

Johannes Weiner <hannes@xxxxxxxxxxx> · Tue, 31 Jul 2018 17:21:50 -0400

Hi Dennis,

this generally looks good to me. Just two small nitpicks:

On Tue, Jul 31, 2018 at 01:36:47PM -0700, Dennis Zhou wrote:
> @@ -135,6 +135,24 @@ struct iolatency_grp {
>  	struct child_latency_info child_lat;
>  };
>  
> +#define BLKIOLATENCY_MIN_WIN_SIZE (100 * NSEC_PER_MSEC)
> +#define BLKIOLATENCY_MAX_WIN_SIZE NSEC_PER_SEC
> +/*
> + * These are the constants used to fake the fixed-point moving average
> + * calculation just like load average. The latency window is bucketed to
> + * try to approximately calculate average latency for the last 1 minute.
> + */
> +#define BLKIOLATENCY_NR_EXP_FACTORS 5
> +#define BLKIOLATENCY_EXP_BUCKET_SIZE (BLKIOLATENCY_MAX_WIN_SIZE / \
> +				      (BLKIOLATENCY_NR_EXP_FACTORS - 1))
> +static const u64 iolatency_exp_factors[BLKIOLATENCY_NR_EXP_FACTORS] = {
> +	2045, // exp(1/600) - 600 samples
> +	2039, // exp(1/240) - 240 samples
> +	2031, // exp(1/120) - 120 samples
> +	2023, // exp(1/80)  - 80 samples
> +	2014, // exp(1/60)  - 60 samples

It might be useful to mention FIXED_1 by name in the comment here. The
comment says "fixed-point" and "load average", but since the numbers
are scaled directly against that constant, it'd be good to name it, I
think.
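
To illustrate the relationship, here is a rough userspace sketch (not
kernel code) of how the table values appear to follow from FIXED_1,
assuming each entry is FIXED_1 / exp(1/N) rounded to the nearest
integer, with FSHIFT = 11 as in the load average code. The helper
names exp_neg() and iolat_exp_factor() are made up for illustration.

```c
#include <stdint.h>

#define FSHIFT   11                  /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)     /* 1.0 in fixed point, i.e. 2048 */

/*
 * Taylor series for exp(-x). Only meant for the small x used here
 * (x <= 1/60), where a dozen terms is far more than enough.
 */
static double exp_neg(double x)
{
	double term = 1.0, sum = 1.0;
	int k;

	for (k = 1; k <= 12; k++) {
		term *= -x / k;
		sum += term;
	}
	return sum;
}

/*
 * Hypothetical helper: decay factor for an average spanning roughly
 * nr_samples samples, expressed as a fraction of FIXED_1 and rounded
 * to the nearest integer.
 */
static uint64_t iolat_exp_factor(unsigned int nr_samples)
{
	return (uint64_t)(FIXED_1 * exp_neg(1.0 / nr_samples) + 0.5);
}
```

Under that assumption, iolat_exp_factor(600) gives 2045 and
iolat_exp_factor(60) gives 2014, matching the first and last entries
in the table.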

> @@ -462,7 +480,7 @@ static void iolatency_check_latencies(struct iolatency_grp *iolat, u64 now)
>  	struct child_latency_info *lat_info;
>  	struct blk_rq_stat stat;
>  	unsigned long flags;
> -	int cpu;
> +	int cpu, exp_idx;
>  
>  	blk_rq_stat_init(&stat);
>  	preempt_disable();
> @@ -480,11 +498,10 @@ static void iolatency_check_latencies(struct iolatency_grp *iolat, u64 now)
>  
>  	lat_info = &parent->child_lat;
>  
> -	iolat->total_lat_avg =
> -		div64_u64((iolat->total_lat_avg * iolat->total_lat_nr) +
> -			  stat.mean, iolat->total_lat_nr + 1);
> -
> -	iolat->total_lat_nr++;
> +	exp_idx = min_t(int, BLKIOLATENCY_NR_EXP_FACTORS - 1,
> +			iolat->cur_win_nsec / BLKIOLATENCY_EXP_BUCKET_SIZE);
> +	CALC_LOAD(iolat->total_lat_avg, iolatency_exp_factors[exp_idx],
> +		  stat.mean);

The load average keeps the running value in fixed-point representation
to avoid rounding errors. I guess because this is IO time in ns, the
values are so much higher than the FIXED_1 denominator (2048) that
rounding errors are negligible, and we don't need to bother with it.

Can you mention that in a comment, please?
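
For what it's worth, a quick userspace sketch of the arithmetic behind
that guess. It assumes CALC_LOAD expands to the usual multiply, add,
and shift by FSHIFT = 11; calc_load_ns() is a made-up name, not
something in the patch.

```c
#include <stdint.h>

#define FSHIFT   11                   /* bits of fractional precision */
#define FIXED_1  (1ULL << FSHIFT)     /* 1.0 in fixed point, i.e. 2048 */

/*
 * One decay step on a plain nanosecond average (no fixed-point
 * scaling of the running value). The final shift truncates, so each
 * step drops less than one unit, here less than 1 ns.
 */
static uint64_t calc_load_ns(uint64_t avg, uint64_t exp, uint64_t sample)
{
	avg *= exp;
	avg += sample * (FIXED_1 - exp);
	return avg >> FSHIFT;
}
```

With a 1 ms average, a 2 ms sample, and the factor 2014,
calc_load_ns(1000000, 2014, 2000000) returns 1016601 where the exact
result is 1016601.5625, so about half a nanosecond is lost. With a
unit-sized input such as calc_load_ns(0, 2045, 1), the result stays 0,
which is the case the load average's fixed-point running value exists
to handle. The truncation can also leave the average a few hundred ns
short of a constant input (it stops moving once the gap drops below
FIXED_1 / (FIXED_1 - exp), about 683 ns for the factor 2045), which
still looks small next to typical IO latencies.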