On Tue, May 5, 2020 at 11:07 AM <hpa@xxxxxxxxx> wrote:
>
> On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
> >From: Sedat Dilek <sedat.dilek@xxxxxxxxx>
> >
> >It turns out that if your config tickles __builtin_constant_p via
> >differences in choices to inline or not, this now produces invalid
> >assembly:
> >
> >$ cat foo.c
> >long a(long b, long c) {
> >  asm("orb\t%1, %0" : "+q"(c): "r"(b));
> >  return c;
> >}
> >$ gcc foo.c
> >foo.c: Assembler messages:
> >foo.c:2: Error: `%rax' not allowed with `orb'
> >
> >The "q" constraint only has meaning on -m32; otherwise it is treated as
> >"r".
> >
> >This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> >or Clang+allyesconfig.
> >
> >Keep the masking operation to appease sparse (`make C=1`), and add back
> >the cast in order to select the proper 8b register alias.
> >
> > [Nick: reworded]
> >
> >Cc: stable@xxxxxxxxxxxxxxx
> >Cc: Jesse Brandeburg <jesse.brandeburg@xxxxxxxxx>
> >Link: https://github.com/ClangBuiltLinux/linux/issues/961
> >Link: https://lore.kernel.org/lkml/20200504193524.GA221287@xxxxxxxxxx/
> >Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> >Reported-by: Sedat Dilek <sedat.dilek@xxxxxxxxx>
> >Reported-by: kernelci.org bot <bot@xxxxxxxxxxxx>
> >Suggested-by: Andy Shevchenko <andriy.shevchenko@xxxxxxxxx>
> >Suggested-by: Ilie Halip <ilie.halip@xxxxxxxxx>
> >Tested-by: Sedat Dilek <sedat.dilek@xxxxxxxxx>
> >Signed-off-by: Sedat Dilek <sedat.dilek@xxxxxxxxx>
> >Signed-off-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
> >---
> > arch/x86/include/asm/bitops.h | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> >diff --git a/arch/x86/include/asm/bitops.h
> >b/arch/x86/include/asm/bitops.h
> >index b392571c1f1d..139122e5b25b 100644
> >--- a/arch/x86/include/asm/bitops.h
> >+++ b/arch/x86/include/asm/bitops.h
> >@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> > 	if (__builtin_constant_p(nr)) {
> > 		asm volatile(LOCK_PREFIX "orb %1,%0"
> > 			: CONST_MASK_ADDR(nr, addr)
> >-			: "iq" (CONST_MASK(nr) & 0xff)
> >+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
> > 			: "memory");
> > 	} else {
> > 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> >@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> > 	if (__builtin_constant_p(nr)) {
> > 		asm volatile(LOCK_PREFIX "andb %1,%0"
> > 			: CONST_MASK_ADDR(nr, addr)
> >-			: "iq" (CONST_MASK(nr) ^ 0xff));
> >+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> > 	} else {
> > 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> > 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
>
> Drop & 0xff and change ^ 0xff to ~.
>
> The redundancy is confusing.

Thanks for the review. While I would also like to have less redundancy,
we have ourselves a catch-22 that that suggestion won't resolve.

Without the cast to u8, gcc and clang will not select the low-8-bit
registers required for the `b` suffix on `orb` and `andb`, which results
in an assembler error. Without the mask, sparse will warn about the upper
bytes of the value being truncated.

(I guess that would have been a more concise commit message.)
--
Thanks,
~Nick Desaulniers
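
P.S. For anyone following along, here is a minimal userspace sketch of the
catch-22 described above. It is illustrative only; the function and variable
names are made up and this is not the kernel code:

$ cat sketch.c
/* sketch.c - illustrative only; mirrors "iq" ((u8)(CONST_MASK(nr) & 0xff)) */
typedef unsigned char u8;

void set_bit7(unsigned char *byte)
{
	unsigned long mask = 0x80;

	/*
	 * Broken: with a long-typed operand the compiler substitutes the
	 * 64-bit register name (e.g. %rax) for %1, and the assembler
	 * rejects it: Error: `%rax' not allowed with `orb'
	 *
	 * asm("orb %1,%0" : "+m"(*byte) : "q"(mask));
	 */

	/*
	 * Works: the u8 cast makes the compiler pick the 8-bit register
	 * alias (%al), while the & 0xff keeps sparse from warning that the
	 * cast truncates the upper bytes.
	 */
	asm("orb %1,%0" : "+m"(*byte) : "q"((u8)(mask & 0xff)));
}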