Hi Dennis,

this generally looks good to me. Just two small nitpicks:

On Tue, Jul 31, 2018 at 01:36:47PM -0700, Dennis Zhou wrote:
> @@ -135,6 +135,24 @@ struct iolatency_grp {
>  	struct child_latency_info child_lat;
>  };
>
> +#define BLKIOLATENCY_MIN_WIN_SIZE (100 * NSEC_PER_MSEC)
> +#define BLKIOLATENCY_MAX_WIN_SIZE NSEC_PER_SEC
> +/*
> + * These are the constants used to fake the fixed-point moving average
> + * calculation just like load average. The latency window is bucketed to
> + * try to approximately calculate average latency for the last 1 minute.
> + */
> +#define BLKIOLATENCY_NR_EXP_FACTORS 5
> +#define BLKIOLATENCY_EXP_BUCKET_SIZE (BLKIOLATENCY_MAX_WIN_SIZE / \
> +				      (BLKIOLATENCY_NR_EXP_FACTORS - 1))
> +static const u64 iolatency_exp_factors[BLKIOLATENCY_NR_EXP_FACTORS] = {
> +	2045, // exp(1/600) - 600 samples
> +	2039, // exp(1/240) - 240 samples
> +	2031, // exp(1/120) - 120 samples
> +	2023, // exp(1/80) - 80 samples
> +	2014, // exp(1/60) - 60 samples

Might be useful to drop the FIXED_1 name in a comment here. It says
"fixed-point", and "load average", but since the numbers are directly
related to that constant, it'd be good to name it I think.

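To illustrate the relationship, here is a minimal user-space sketch. It
assumes FSHIFT and FIXED_1 match the load-average definitions (FSHIFT = 11,
so FIXED_1 = 2048); the short names and the helper new_sample_weight() are
made up for this example and are not part of the patch. Each table entry
is roughly FIXED_1 / exp(1/n) for n samples, so FIXED_1 minus the entry is
the weight the newest sample gets.

```c
#include <stdint.h>

/* Assumed to mirror the load-average fixed-point definitions. */
#define FSHIFT   11
#define FIXED_1  (1 << FSHIFT)	/* 1.0 in fixed point, i.e. 2048 */

#define NR_EXP_FACTORS 5

/* Same values as in the patch: roughly FIXED_1 / exp(1/n) for n samples. */
static const uint64_t exp_factors[NR_EXP_FACTORS] = {
	2045,	/* n = 600 */
	2039,	/* n = 240 */
	2031,	/* n = 120 */
	2023,	/* n = 80 */
	2014,	/* n = 60 */
};

/* Hypothetical helper: weight (out of FIXED_1) given to the newest sample. */
static uint64_t new_sample_weight(int idx)
{
	return FIXED_1 - exp_factors[idx];
}
```

With these values the newest sample gets 3/2048 of the weight in the
600-sample bucket and 34/2048 in the 60-sample bucket.
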
> @@ -462,7 +480,7 @@ static void iolatency_check_latencies(struct iolatency_grp *iolat, u64 now)
>  	struct child_latency_info *lat_info;
>  	struct blk_rq_stat stat;
>  	unsigned long flags;
> -	int cpu;
> +	int cpu, exp_idx;
>
>  	blk_rq_stat_init(&stat);
>  	preempt_disable();
> @@ -480,11 +498,10 @@ static void iolatency_check_latencies(struct iolatency_grp *iolat, u64 now)
>
>  	lat_info = &parent->child_lat;
>
> -	iolat->total_lat_avg =
> -		div64_u64((iolat->total_lat_avg * iolat->total_lat_nr) +
> -			  stat.mean, iolat->total_lat_nr + 1);
> -
> -	iolat->total_lat_nr++;
> +	exp_idx = min_t(int, BLKIOLATENCY_NR_EXP_FACTORS - 1,
> +			iolat->cur_win_nsec / BLKIOLATENCY_EXP_BUCKET_SIZE);
> +	CALC_LOAD(iolat->total_lat_avg, iolatency_exp_factors[exp_idx],
> +		  stat.mean);

The load average keeps the running value in fixed-point representation
to avoid rounding errors. I guess because this is IO time in ns, the
values are so much higher than the FIXED_1 denominator (2048) that
rounding errors are negligible, and we don't need to bother with it.

Can you mention that in a comment, please?
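
A rough user-space sketch of that reasoning, in case it helps with the
comment. It assumes the CALC_LOAD arithmetic is: multiply by the factor,
add sample * (FIXED_1 - factor), then shift right by FSHIFT (with
FSHIFT = 11). The function name ewma_step() is made up for illustration.

```c
#include <stdint.h>

/* Assumed to mirror the load-average fixed-point definitions. */
#define FSHIFT   11
#define FIXED_1  (1 << FSHIFT)	/* 2048 */

/*
 * One CALC_LOAD-style update applied to a plain integer running value,
 * i.e. the running value itself is NOT kept scaled by FIXED_1.
 */
static uint64_t ewma_step(uint64_t avg, uint64_t factor, uint64_t sample)
{
	avg *= factor;
	avg += sample * (FIXED_1 - factor);
	avg >>= FSHIFT;		/* truncates; loses less than one unit */
	return avg;
}
```

With small values the truncation matters: ewma_step(10, 2045, 20) returns
10, so the average would never move towards 20. With nanosecond-scale
values it does not: ewma_step(1000000, 2014, 2000000) returns 1016601, and
the truncated remainder is less than 1 ns.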