On Mon, Dec 09, 2024 at 12:43:54PM -0800, Yury Norov wrote:
> On Mon, Dec 09, 2024 at 01:03:00PM -0700, Nathan Chancellor wrote:
> > Maybe people are not using CONFIG_WERROR=y and W=e when hitting this so
> > they do not notice? It also only became visible in 6.12 because of the
> > 'inline' -> '__always_inline' changes in bitmap.h and cpumask.h, since
> > prior to that, the size of the objects being passed to memcpy() were not
> > known, so FORTIFY could not catch them (another +1 for that change).
>
> Thanks, but I'm actually not happy with that series (ab6b1010dab68f6d4).
> The original motivation was that one part of compiler decided to outline
> the pure wrappers or lightweight inline implementation for small bitmaps,
> like those fitting inside a machine word.
>
> After that, another part of compiler started complaining that outlined
> helpers mismatch the sections - .text and .init.data.

It was not another part of the compiler that started complaining but
modpost, a kernel tool. If modpost could perform control flow analysis,
it could avoid false positives such as the one from ab6b1010dab68 by
seeing more of the callchain rather than just the outlined function
being called with a potentially discarded variable.

> (Not mentioning that the helpers were not designed to be real outlined
> functions, and doing that adds ~3k to kernel image.)

Isn't the point of '__always_inline' to convey this to the compiler? As
far as I understand it, the C standard leaves the compiler completely
free to ignore 'inline', which could happen for any number of reasons,
especially with code generation options such as the sanitizers or other
instrumentation. If you know that these functions need to be inlined to
generate better code but the compiler doesn't, why not tell it?

> I don't like forcing compiler to do this or that, but in this case I
> just don't know how to teach it to outline the function twice, if it
> wants to do that. This should be done automatically, I guess...

I do not think that I understand what you are getting at or asking for
here, sorry. Are you saying you would expect the compiler to split
bitmap_and() into basically bitmap_and_small_const_nbits() and
__bitmap_and() then decide which to call in cpumask_and() based on the
condition of small_const_nbits(nbits) at a particular site? Isn't that
basically what we are allowing the compiler to figure out by always
inlining these functions into their call sites?

> Similarly, I don't know how to teach it to keep the functions inlined,
> other than forcing it to do so.

That's pretty much what '__always_inline' is, right? It's you as the
programmer saying "I know that this needs to be inlined for xyz reason
so I really need you to do it". Otherwise, you are just asking to tweak
a heuristic.

Cheers,
Nathan
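
For reference, the pattern discussed above can be sketched in userspace
roughly as follows. This is only an illustration, not the kernel's
bitmap.h: every sketch_ / SKETCH_ name is hypothetical, and it assumes
GCC or Clang for __builtin_constant_p() and the always_inline and
noinline attributes.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's BITS_PER_LONG and last-word mask. */
#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define SKETCH_LAST_WORD_MASK(nbits) \
	(~0UL >> ((SKETCH_BITS_PER_LONG - (nbits) % SKETCH_BITS_PER_LONG) % \
		  SKETCH_BITS_PER_LONG))

/* True only when nbits is a compile-time constant that fits in one word. */
#define sketch_small_const_nbits(nbits) \
	(__builtin_constant_p(nbits) && (nbits) <= SKETCH_BITS_PER_LONG && \
	 (nbits) > 0)

/* Generic multi-word path, deliberately kept out of line. */
__attribute__((noinline))
bool sketch_bitmap_and_slow(unsigned long *dst, const unsigned long *src1,
			    const unsigned long *src2, unsigned int nbits)
{
	unsigned int k, full = nbits / SKETCH_BITS_PER_LONG;
	unsigned long result = 0;

	for (k = 0; k < full; k++)
		result |= (dst[k] = src1[k] & src2[k]);
	if (nbits % SKETCH_BITS_PER_LONG)
		result |= (dst[k] = src1[k] & src2[k] &
			   SKETCH_LAST_WORD_MASK(nbits));
	return result != 0;
}

/*
 * Thin wrapper. Only once it is inlined into a caller that passes a
 * constant nbits can the compiler resolve the condition and keep just
 * one of the two branches. If it is outlined (or built without
 * optimization), the condition is typically false and the slow path is
 * taken, which still gives the same result.
 */
static inline __attribute__((always_inline))
bool sketch_bitmap_and(unsigned long *dst, const unsigned long *src1,
		       const unsigned long *src2, unsigned int nbits)
{
	if (sketch_small_const_nbits(nbits))
		return (*dst = *src1 & *src2 &
			SKETCH_LAST_WORD_MASK(nbits)) != 0;
	return sketch_bitmap_and_slow(dst, src1, src2, nbits);
}

/* Small self-check: an 8-bit AND through the single-word case. */
int sketch_demo(void)
{
	unsigned long a = 0xF0UL, b = 0x3CUL, dst = 0;
	bool any = sketch_bitmap_and(&dst, &a, &b, 8);

	assert(any && dst == 0x30UL);
	return 0;
}
```

With the wrapper inlined, the call in sketch_demo() can reduce to a
single masked AND. Without forced inlining, the compiler may emit one
shared out-of-line copy in which nbits is no longer a known constant,
which is the situation described earlier in the thread.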