On Fri, May 31, 2024 at 03:29:28PM +0200, Arnd Bergmann wrote: > "Alexander Lobakin" <aleksander.lobakin@xxxxxxxxx> > Subject: Re: + gcc-disable-warray-bounds-for-gcc-9.patch added to mm-hotfixes-unstable > branch > Content-Type: text/plain > Status: RO > Content-Length: 3582 > Lines: 85 > > On Fri, May 31, 2024, at 03:18, Yury Norov wrote: > > On Wed, May 29, 2024 at 04:39:45PM +0200, Arnd Bergmann wrote: > >> On Fri, May 24, 2024, at 05:00, Andrew Morton wrote: > > > >> I think this is mixing up > >> a couple of independent issues, and makes it harder to > >> ever enable the warning again. > >> > >> The bitmap.h code looks suspicious to me, and if gcc is > >> unable to analyze this as a false positive, it probably > >> also can't optimize it correctly, > > > > In the b44759705f7d Alexander says: > > > > bloat-o-meter shows no difference in vmlinux and -2 bytes for > > gpio-pca953x.ko, which says the optimization didn't suffer due to > > that change. The converted helpers have the value width embedded > > and always compile-time constant and that helps a lot. > > > > Bloat-o-meter itself is not a measure of how effective the code is, > > but it's a good hit that code generation before/after is at least > > on par. Have you an evidence that the patch makes code generation > > worse? > > So far, it's only a suspicion based on the gcc warning: > this is in the same category as some other warnings that > depend on gcc analyzing code across function boundaries. > Another example is -Wmaybe-uninitialized. > > It's usually good at this, but if the code gets too > complex it fails to track the state correctly, resulting > both in missed optimizations and false-postiive warnings. > > This often happens in combination with sanitizers, as they > add more complexity and turn off some optimization steps > that are required here. > > >> so it may be better > >> to either not have this as an inline function at all, > >> or find an implementation that gcc can optimize better. > > > > The functions look bulky but it boild to one or at max two words fetch > > plus shifts. Inlining helps to generate better code, particularly in > > the bitmap_get/set_value8 case because masks and offsets generation is > > done at compile time. > > > > We had quite a few cycles back then... Alexander, can you please > > share on code generation, particularly inline vs outline versions? > > Right, maybe all we need here is marking it __always_inline > then. Something I've seen before is functions that > are meant to become trivial after inlining, but that > (before inlining) appear to have too many statements for > gcc to consider them candidates. It may then try to generate > a specialized version of the function that optimizes for > certain constant arguments, but is then confused about which > arguments are constant or not. > > Do you still see gcc-9 false-positive warnings with the > trivial patch below? KASAN was not enabled. I disabled all CONFIG_*SAN and CONFIG_KFENCE, but the warning is still there. __always_inline doesn't improve as well. Thanks, Yury > > Arnd > > diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h > index 8c4768c44a01..08df806df3d2 100644 > --- a/include/linux/bitmap.h > +++ b/include/linux/bitmap.h > @@ -737,7 +737,7 @@ static inline void bitmap_from_u64(unsigned long *dst, u64 mask) > * @map memory region. For @nbits = 0 and @nbits > BITS_PER_LONG the return > * value is undefined. > */ > -static inline unsigned long bitmap_read(const unsigned long *map, > +static __always_inline unsigned long bitmap_read(const unsigned long *map, > unsigned long start, > unsigned long nbits) > { > @@ -772,7 +772,7 @@ static inline unsigned long bitmap_read(const unsigned long *map, > * > * For @nbits == 0 and @nbits > BITS_PER_LONG no writes are performed. > */ > -static inline void bitmap_write(unsigned