On Mon, 15 Jun 2015, Coly Li wrote:

> On 15/6/15 9:06, Mikulas Patocka wrote:
> > These micro-optimizations would save one or two instructions, so they are
> > not really needed. Also, these micro-optimizations are in a code path that
> > prints the statistics, not the code path that processes i/o requests.
>
> It is not one or two instructions, it depends on how big
> s->n_histogram_entries is.

No, it doesn't depend on s->n_histogram_entries. When evaluating the
expression "s->n_histogram_entries + 1", there is one instruction that
loads s->n_histogram_entries and another instruction that adds 1 to it. If
you replace it with a pre-computed value, you save just those two
instructions, nothing else.

Mikulas

> Yes, I agree, it is not on the critical I/O path :-)
>
> Coly
>
> > On Sun, 14 Jun 2015, Coly Li wrote:
> >
> >> On 15/6/10 5:22, Mikulas Patocka wrote:
> >>> This patch adds an option to dm statistics to collect and report the
> >>> histogram of latencies.
> >>>
> >>> Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
> >>>
> >>> ---
> >>>  Documentation/device-mapper/statistics.txt |   21 ++-
> >>>  drivers/md/dm-stats.c                      |  201 +++++++++++++++++++++++++----
> >>>  2 files changed, 196 insertions(+), 26 deletions(-)
> >>>
> >>> Index: linux-4.1-rc7/drivers/md/dm-stats.c
> >>> ===================================================================
> >>> --- linux-4.1-rc7.orig/drivers/md/dm-stats.c	2015-06-08 16:38:43.000000000 +0200
> >>> +++ linux-4.1-rc7/drivers/md/dm-stats.c	2015-06-08 17:02:01.000000000 +0200
> >>> @@ -29,6 +29,7 @@ struct dm_stat_percpu {
> >>> 	unsigned long long io_ticks[2];
> >>> 	unsigned long long io_ticks_total;
> >>> 	unsigned long long time_in_queue;
> >>> +	unsigned long long *histogram;
> >>> };
> >>>
> >>> struct dm_stat_shared {
> >> [snip]
> >>> @@ -619,6 +700,11 @@ static void __dm_stat_init_temporary_per
> >>> 	shared->tmp.io_ticks[WRITE] += ACCESS_ONCE(p->io_ticks[WRITE]);
> >>> 	shared->tmp.io_ticks_total += ACCESS_ONCE(p->io_ticks_total);
> >>> 	shared->tmp.time_in_queue += ACCESS_ONCE(p->time_in_queue);
> >>> +	if (s->n_histogram_entries) {
> >>> +		unsigned i;
> >>> +		for (i = 0; i < s->n_histogram_entries + 1; i++)
> >>> +			shared->tmp.histogram[i] += ACCESS_ONCE(p->histogram[i]);
> >>
> >> s->n_histogram_entries + 1 in the for() loop will be calculated many
> >> times; maybe this way could be better:
> >>
> >> +	if (s->n_histogram_entries) {
> >> +		unsigned i, j;
> >> +		j = s->n_histogram_entries + 1;
> >> +		for (i = 0; i < j; i++)
> >> +			shared->tmp.histogram[i] += ACCESS_ONCE(p->histogram[i]);
> >>
> [snip]

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel