On Mon, Aug 28, 2023 at 9:30 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, 28 Aug 2023 at 03:53, David Laight <David.Laight@xxxxxxxxxx> wrote:
> >
> > From: Linus Torvalds
> > >
> > > We use this:
> > >
> > >   static __always_inline unsigned long variable__ffs(unsigned long word)
> > >   {
> > >         asm("rep; bsf %1,%0"
> > >                 : "=r" (word)
> > >                 : "rm" (word));
> > >         return word;
> > >   }
> > >
> > > for the definition, and it looks like clang royally just screws up
> > > here. Yes, "m" is _allowed_ in that input set, but it damn well
> > > shouldn't be used for something that is already in a register, since
> > > "r" is also allowed, and is the first choice.
> >
> > Why don't we just remove the "m" option?
>
> For this particular case, it would probably be the right thing to do.
> It's sad, though, because gcc handles this correctly, and always has.
>
> And in this particular case, it probably matters not at all.
>
> In many other cases where we have 'rm', we may actually be in the
> situation that having 'rm' (or other cases like "g" that also allows
> immediates) helps because register pressure can be a thing.
>
> It's mostly a thing on 32-bit x86 where you have a lot fewer
> registers, and there we've literally run into situations where we have
> had internal compiler errors because of complex inline asm statements
> running out of registers.
>
> With a simple "one input, one output" case, that just isn't an issue,
> so to work around a clang misfeature we could do it - if somebody
> finds a case where it actually matters (as opposed to "damn, when
> looking at the generated code for a function that we never actually use
> on x86, I noticed that code generation is horrendous").
>
>              Linus

Yes; it's a compiler bug, and we will fix it. Then the fix will be an
incentive for folks that care to move to a newer toolchain.

--
Thanks,
~Nick Desaulniers
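
For reference, a minimal sketch of what "remove the "m" option" would
look like for the helper quoted above. This is not a patch from the
thread: the names ffs_reg_only() and ffs_reg_only_selfcheck() are made
up for illustration, __asm__ is used instead of asm so it builds in
strict ISO modes, and the non-x86 branch exists only so the sketch
compiles elsewhere. The only change relative to the quoted code is that
the input constraint is narrowed from "rm" to "r", so the compiler has
no memory alternative to pick.

```c
#include <assert.h>

/*
 * Hypothetical variant of the quoted variable__ffs(): same instruction,
 * but the input constraint is "r" only, so the operand must be placed
 * in a register. Result is undefined for word == 0, as in the original.
 */
static inline unsigned long ffs_reg_only(unsigned long word)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__("rep; bsf %1,%0"
		: "=r" (word)
		: "r" (word));
	return word;
#else
	/* Assumption: portable fallback, not part of the thread. */
	return (unsigned long)__builtin_ctzl(word);
#endif
}

/* Small self-check: index of the lowest set bit for a few nonzero inputs. */
static inline void ffs_reg_only_selfcheck(void)
{
	assert(ffs_reg_only(1UL) == 0);     /* bit 0 set */
	assert(ffs_reg_only(8UL) == 3);     /* 0b1000 */
	assert(ffs_reg_only(0x50UL) == 4);  /* 0b0101_0000 */
}
```

The return value is the same with either constraint; what differs is
only which operand form the compiler is allowed to emit. As noted above,
dropping "m" is cheap for a one-input, one-output asm, but may not be
for larger asm statements on 32-bit x86 where register pressure matters.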