From: Dmitry Vyukov
> Sent: 29 May 2019 11:57
> On Wed, May 29, 2019 at 12:30 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, May 29, 2019 at 12:16:31PM +0200, Marco Elver wrote:
> > > On Wed, 29 May 2019 at 12:01, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Wed, May 29, 2019 at 11:20:17AM +0200, Marco Elver wrote:
> > > > > For the default, we decided to err on the conservative side for now,
> > > > > since it seems that e.g. x86 operates only on the byte the bit is on.
> > > >
> > > > This is not correct, see for instance set_bit():
> > > >
> > > > static __always_inline void
> > > > set_bit(long nr, volatile unsigned long *addr)
> > > > {
> > > >         if (IS_IMMEDIATE(nr)) {
> > > >                 asm volatile(LOCK_PREFIX "orb %1,%0"
> > > >                         : CONST_MASK_ADDR(nr, addr)
> > > >                         : "iq" ((u8)CONST_MASK(nr))
> > > >                         : "memory");
> > > >         } else {
> > > >                 asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> > > >                         : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> > > >         }
> > > > }
> > > >
> > > > That results in:
> > > >
> > > >         LOCK BTSQ nr, (addr)
> > > >
> > > > when @nr is not an immediate.
> > >
> > > Thanks for the clarification. Given that arm64 already instruments
> > > bitops access to whole words, and x86 may also do so for some bitops,
> > > it seems fine to instrument word-sized accesses by default. Is that
> > > reasonable?
> >
> > Eminently -- the API is defined such; for bonus points KASAN should also
> > do alignment checks on atomic ops. Future hardware will #AC on unaligned
> > [*] LOCK prefix instructions.
> >
> > (*) not entirely accurate, it will only trap when crossing a line.
> >     https://lkml.kernel.org/r/1556134382-58814-1-git-send-email-fenghua.yu@xxxxxxxxx
>
> Interesting. Does an address passed to bitops also should be aligned,
> or alignment is supposed to be handled by bitops themselves?

The bitops are defined on 'long []' and it is expected to be aligned.
Any code that casts the argument is likely to be broken on big-endian.
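To illustrate the big-endian point, here is a rough sketch (not kernel code)
that models where bit 'nr' lands in memory when the same buffer is viewed as
'long []' versus as 32-bit words. It assumes a 64-bit kernel (sizeof(long) == 8)
and a caller whose data is really a 32-bit array; the names bit_byte_offset(),
views_agree(), LONG_BYTES and U32_BYTES are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define LONG_BYTES 8u   /* assumption: 64-bit kernel, sizeof(long) == 8 */
#define U32_BYTES  4u   /* assumption: caller's buffer is really 32-bit words */

/*
 * Byte offset from the start of the buffer that holds bit 'nr' when the
 * buffer is treated as an array of 'word_bytes'-wide integers.
 * The bit's position inside that byte is nr % 8 for either word size.
 */
static size_t bit_byte_offset(unsigned int nr, unsigned int word_bytes,
                              int big_endian)
{
        unsigned int bits_per_word, byte_in_word;
        size_t word;

        assert(word_bytes == U32_BYTES || word_bytes == LONG_BYTES);

        bits_per_word = word_bytes * 8u;
        word = nr / bits_per_word;                  /* which word */
        byte_in_word = (nr % bits_per_word) / 8u;   /* 0 = least significant */

        if (big_endian)                             /* LSB is stored last */
                byte_in_word = word_bytes - 1u - byte_in_word;

        return word * word_bytes + byte_in_word;
}

/* Nonzero if the 'long []' view and the 32-bit view touch the same byte. */
static int views_agree(unsigned int nr, int big_endian)
{
        return bit_byte_offset(nr, LONG_BYTES, big_endian) ==
               bit_byte_offset(nr, U32_BYTES, big_endian);
}
```

In this model views_agree(nr, 0) holds for every nr, which is why such casts
appear to work on little-endian. With big_endian set, bit 0 lands in byte 7
under the 'long []' view but in byte 3 under the 32-bit view, so the two
32-bit halves of each long are effectively swapped.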
I did a quick grep a few weeks ago and found some very dubious code.
Not all the casts seemed to be on code that was LE only (although I didn't
try to find out what the casts were from).

The alignment trap on x86 could be avoided by only ever requesting 32-bit
cycles - and assuming the buffer is always 32-bit aligned (e.g. int []).
But on BE passing an 'int []' is just so wrong ....

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)