On Wed, May 29, 2019 at 12:16:31PM +0200, Marco Elver wrote:
> On Wed, 29 May 2019 at 12:01, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, May 29, 2019 at 11:20:17AM +0200, Marco Elver wrote:
> > > For the default, we decided to err on the conservative side for now,
> > > since it seems that e.g. x86 operates only on the byte the bit is on.
> >
> > This is not correct, see for instance set_bit():
> >
> > static __always_inline void
> > set_bit(long nr, volatile unsigned long *addr)
> > {
> >         if (IS_IMMEDIATE(nr)) {
> >                 asm volatile(LOCK_PREFIX "orb %1,%0"
> >                         : CONST_MASK_ADDR(nr, addr)
> >                         : "iq" ((u8)CONST_MASK(nr))
> >                         : "memory");
> >         } else {
> >                 asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> >                         : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> >         }
> > }
> >
> > That results in:
> >
> >         LOCK BTSQ nr, (addr)
> >
> > when @nr is not an immediate.
>
> Thanks for the clarification. Given that arm64 already instruments
> bitops access to whole words, and x86 may also do so for some bitops,
> it seems fine to instrument word-sized accesses by default. Is that
> reasonable?

Eminently -- the API is defined such; for bonus points KASAN should also
do alignment checks on atomic ops. Future hardware will #AC on unaligned
[*] LOCK prefix instructions.

[*] not entirely accurate, it will only trap when crossing a line.
    https://lkml.kernel.org/r/1556134382-58814-1-git-send-email-fenghua.yu@xxxxxxxxx
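
To make the two checks discussed above concrete, here is a minimal user-space
sketch, not kernel or KASAN code. It shows (a) which word a word-sized bitop
on bit nr touches, and (b) the difference between "unaligned" and "crosses a
cache line", which is what the footnote distinguishes. All helper names
(bitop_word, is_naturally_aligned, crosses_cache_line) and the 64-byte line
size are assumptions made for illustration only.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; real hardware may differ. */
#define SKETCH_CACHE_LINE_SIZE 64u
/* Bits in one unsigned long, the unit a word-sized bitop operates on. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: address of the word that holds bit nr of the bitmap
 * starting at addr. A checker that instruments word-sized accesses would
 * validate sizeof(unsigned long) bytes starting here.
 */
static const unsigned long *bitop_word(const unsigned long *addr,
                                       unsigned long nr)
{
        return addr + nr / SKETCH_BITS_PER_LONG;
}

/* Hypothetical helper: true if p is a multiple of size (a power of two). */
static bool is_naturally_aligned(uintptr_t p, size_t size)
{
        return (p & (size - 1)) == 0;
}

/*
 * Hypothetical helper: true if the access [p, p + size) spans two cache
 * lines, i.e. its first and last byte fall in different lines.
 */
static bool crosses_cache_line(uintptr_t p, size_t size)
{
        return p / SKETCH_CACHE_LINE_SIZE !=
               (p + size - 1) / SKETCH_CACHE_LINE_SIZE;
}
```

With these definitions, an 8-byte access at 0x1004 is unaligned but stays
within one line, while an 8-byte access at 0x103c is unaligned and crosses
into the next line. Per the footnote, only the latter would be expected to
trap; a stricter software check could still flag both.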