Re: [PATCH 11/17] find: micro-optimize for_each_{set,clear}_bit()

On Thu, Aug 26, 2021 at 03:57:13PM +0200, Petr Mladek wrote:
> On Sat 2021-08-14 14:17:07, Yury Norov wrote:
> > The macros iterate through all set/clear bits in a bitmap. They search for
> > the first bit using find_first_bit(), and the remaining bits using
> > find_next_bit().
> > 
> > Since find_next_bit() is called shortly after find_first_bit(), we can
> > save a few lines of I-cache by not using find_first_bit().
> 
> Is this only speculation, or does it fix a real performance problem?
> 
> The macro is used like:
> 
> 	for_each_set_bit(bit, addr, size) {
> 		fn(bit);
> 	}
> 
> IMHO, the micro-optimization does not help when fn() is non-trivial.
 
The effect is measurable:

Start testing for_each_bit()
for_each_set_bit:                15296 ns,   1000 iterations
for_each_set_bit_from:           15225 ns,   1000 iterations

Start testing for_each_bit() with cache flushing
for_each_set_bit:               547626 ns,   1000 iterations
for_each_set_bit_from:          497899 ns,   1000 iterations

Refer to this:

https://www.mail-archive.com/dri-devel@xxxxxxxxxxxxxxxxxxxxx/msg356151.html

Thanks,
Yury
 
> > --- a/include/linux/find.h
> > +++ b/include/linux/find.h
> > @@ -280,7 +280,7 @@ unsigned long find_next_bit_le(const void *addr, unsigned
> >  #endif
> >  
> >  #define for_each_set_bit(bit, addr, size) \
> > -	for ((bit) = find_first_bit((addr), (size));		\
> > +	for ((bit) = find_next_bit((addr), (size), 0);		\
> >  	     (bit) < (size);					\
> >  	     (bit) = find_next_bit((addr), (size), (bit) + 1))
> >  
> 
> It is not a big deal. I just think that the original code is slightly
> more self-explaining.
> 
> Best Regards,
> Petr
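
For readers who want to convince themselves that the two forms of the macro
visit the same bits, here is a minimal userspace sketch. It is not the kernel
code: sketch_find_first_bit() and sketch_find_next_bit() are naive,
hypothetical stand-ins for find_first_bit() and find_next_bit(), and the
_old/_new macro names and collect_*() helpers exist only for this
illustration. It says nothing about I-cache behaviour; it only shows that
starting find_next_bit() at offset 0 is functionally equivalent to calling
find_first_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Naive stand-in (assumption, not the kernel implementation): index of the
 * first set bit at or after 'offset', or 'size' if there is none. */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long offset)
{
	for (unsigned long i = offset; i < size; i++)
		if (addr[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
			return i;
	return size;
}

/* Naive stand-in: index of the first set bit, or 'size' if there is none. */
static unsigned long sketch_find_first_bit(const unsigned long *addr,
					   unsigned long size)
{
	for (unsigned long i = 0; i < size; i++)
		if (addr[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
			return i;
	return size;
}

/* Shape of the macro before the patch. */
#define for_each_set_bit_old(bit, addr, size)				\
	for ((bit) = sketch_find_first_bit((addr), (size));		\
	     (bit) < (size);						\
	     (bit) = sketch_find_next_bit((addr), (size), (bit) + 1))

/* Shape of the macro after the patch: only find_next_bit() is used. */
#define for_each_set_bit_new(bit, addr, size)				\
	for ((bit) = sketch_find_next_bit((addr), (size), 0);		\
	     (bit) < (size);						\
	     (bit) = sketch_find_next_bit((addr), (size), (bit) + 1))

/* Store every visited bit index in 'out' and return how many were found. */
static unsigned long collect_old(const unsigned long *addr,
				 unsigned long size, unsigned long *out)
{
	unsigned long bit, n = 0;

	for_each_set_bit_old(bit, addr, size)
		out[n++] = bit;
	return n;
}

static unsigned long collect_new(const unsigned long *addr,
				 unsigned long size, unsigned long *out)
{
	unsigned long bit, n = 0;

	for_each_set_bit_new(bit, addr, size)
		out[n++] = bit;
	return n;
}
```

Calling collect_old() and collect_new() on the same bitmap should fill both
output arrays with the same indices in the same order, so any difference
between the two macros is in code footprint, not in behaviour.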



