On 09/03/2025 at 19:23, David Laight wrote: > On Sun, 9 Mar 2025 01:58:53 +0000 > David Laight <david.laight.linux@xxxxxxxxx> wrote: > >> On Fri, 7 Mar 2025 18:58:08 +0900 >> Vincent Mailhol <mailhol.vincent@xxxxxxxxxx> wrote: >> >>> On 07/03/2025 at 04:23, David Laight wrote: >>>> On Thu, 06 Mar 2025 20:29:52 +0900 >>>> Vincent Mailhol via B4 Relay <devnull+mailhol.vincent.wanadoo.fr@xxxxxxxxxx> wrote: >>>> >>>>> From: Vincent Mailhol <mailhol.vincent@xxxxxxxxxx> >>>>> >>>>> In an upcoming change, GENMASK() and its friends will indirectly >>>>> depend on sizeof() which is not available in asm. >>>>> >>>>> Instead of adding further complexity to __GENMASK() to make it work >>>>> for both asm and non asm, just split the definition of the two >>>>> variants. >>>> ... >>>>> +#else /* defined(__ASSEMBLY__) */ >>>>> + >>>>> +#define GENMASK(h, l) __GENMASK(h, l) >>>>> +#define GENMASK_ULL(h, l) __GENMASK_ULL(h, l) >>>> >>>> What do those actually expand to now? >>>> As I've said a few times both UL(0) and ULL(0) are just (0) for __ASSEMBLY__ >>>> so the expansions of __GENMASK() and __GENMASK_ULL() contained the >>>> same numeric constants. >>> >>> Indeed, in asm, the UL(0) and ULL(0) expands to the same thing: 0. >>> >>> But the two macros still expand to something different on 32 bits >>> architectures: >>> >>> * __GENMASK: >>> >>> (((~(0)) << (l)) & (~(0) >> (32 - 1 - (h)))) >>> >>> * __GENMASK_ULL: >>> >>> (((~(0)) << (l)) & (~(0) >> (64 - 1 - (h)))) >>> >>> On 64 bits architecture these are the same. >> >> I've just fed those into godbolt (https://www.godbolt.org/z/Ter6WE9qE) as: >> int fi(void) >> { >> int v; >> asm("mov $(((~(0)) << (8)) & (~(0) >> (32 - 1 - (15)))),%0": "=r" (v)); >> return v -(((~(0u)) << (8)) & (~(0u) >> (32 - 1 - (15)))); >> } >> >> gas warns: >> <source>:4: Warning: 0xffffffffff00 shortened to 0xffffff00 >> >> unsigned long long fll(void) >> { >> unsigned long long v; >> asm("mov $(((~(0)) << (8)) & (~(0) >> (64 - 1 - (15)))),%0": "=r" (v)); >> return v -(((~(0ull)) << (8)) & (~(0ull) >> (64 - 1 - (15)))); >> } >> >> (for other architectures you'll need to change the opcode) >> >> For x86 and x86-32 the assembler seems to be doing 64bit maths with unsigned >> right shifts - so the second function (with the 64 in it) generates 0xff00. >> I doubt a 32bit only assembler does 64bit maths, but the '>> 48' above >> might get masked to a '>> 16' by the cpu and generate the correct result. >> >> So __GENMASK() is likely to be broken for any assembler that supports 64bits >> when generating 32bit code. >> __GENMASK_ULL() works (assuming all have unsigned >>) on 64bit assemblers >> (even when generating 32bit code). It may work on some 32bit assemblers.> > I've remembered my 'pi' has a 32bit userspace (on a 64bit kernel). > I quick test of "mov %0,#(...)" and bits 11..8 gives the correct output > for size '32' but the error message: > /tmp/ccPB7bWh.s:26: Warning: shift count out of range (56 is not between 0 and 31) > with size '64'. > > Assuming that part of the gnu assembler is consistent across architectures > you can't use either GENMASK in asm for 32bit architectures. > > Any change (probably including removing the asm support for the uapi) isn't > going to make things worse! > > David > >> >> Since most uses in the header files will be GENMASK() I doubt (hope) no >> asm code actually uses the values! After reading your message, I wanted to understand where these GENMASK*() were used in asm code. It happens that most of the architectures simply don't use it! I see these are using in aarch64, but that's about it. I couldn't find any use of neither GENMASK() nor GENMASK_ULL() in x86, arm 32 bits, m68k or powerpc asm code (I did not check the other architectures). aarch64 uses both the long and long long variants, but this being a 64 bits architecture, these are the same. So OK. So, this goes into the same direction as you: I guess that the fact that no one noticed the issue is that no one uses this on a 32 bits arch, probably for historical reasons, i.e. the asm code was written before the introduction of the GENMASK() macros. >> The headers assemble - but that is about all that can be said. >> >> Bags of worms :-) +1 (I will not open that bag) I think that the asm and non asm variant should have used a different implementation from the begining. By wanting to have a single variant that fit both the asm and the C code, we have a complex result that is hard to understand and maintain (c.f. the bug which you pointed out which have been unnoticed for ever). But now that it is in the uapi, I am not sure of what is the best approach. I sincerely hope that we can just modify it with no userland impact. Well, just to make it clear, I am not touching the asm part. I acknowledge the issue, but I do not think that I have the skill to fix it. Yours sincerely, Vincent Mailhol