On Wed, 29 May 2019 at 12:01, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, May 29, 2019 at 11:20:17AM +0200, Marco Elver wrote:
> > For the default, we decided to err on the conservative side for now,
> > since it seems that e.g. x86 operates only on the byte the bit is on.
>
> This is not correct, see for instance set_bit():
>
> static __always_inline void
> set_bit(long nr, volatile unsigned long *addr)
> {
> 	if (IS_IMMEDIATE(nr)) {
> 		asm volatile(LOCK_PREFIX "orb %1,%0"
> 			: CONST_MASK_ADDR(nr, addr)
> 			: "iq" ((u8)CONST_MASK(nr))
> 			: "memory");
> 	} else {
> 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> 	}
> }
>
> That results in:
>
> 	LOCK BTSQ nr, (addr)
>
> when @nr is not an immediate.

Thanks for the clarification. Given that arm64 already instruments
bitops access to whole words, and x86 may also do so for some bitops,
it seems fine to instrument word-sized accesses by default. Is that
reasonable?
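
To make the two granularities concrete, here is a rough userspace sketch
(not kernel code) of the range each view would report for a given bit
number. The names struct access_range, bitop_byte_range() and
bitop_word_range() are hypothetical and only for illustration. It assumes
a little-endian layout as on x86 and non-negative bit numbers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper type: the byte range one bitop is assumed to touch. */
struct access_range {
	size_t offset;	/* byte offset from the start of the bitmap */
	size_t size;	/* number of bytes accessed */
};

/* Bits in one unsigned long; a stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Byte-granular view: only the byte that holds bit nr, as in the
 * "orb" path taken for an immediate nr (little-endian assumed).
 */
static struct access_range bitop_byte_range(long nr)
{
	assert(nr >= 0);	/* sketch handles non-negative bit numbers only */
	struct access_range r = { (size_t)nr / CHAR_BIT, 1 };
	return r;
}

/*
 * Word-granular view: the whole unsigned long that holds bit nr,
 * which is what word-sized instrumentation would check.
 */
static struct access_range bitop_word_range(long nr)
{
	assert(nr >= 0);	/* sketch handles non-negative bit numbers only */
	struct access_range r = {
		((size_t)nr / SKETCH_BITS_PER_LONG) * sizeof(unsigned long),
		sizeof(unsigned long)
	};
	return r;
}
```

With word-sized instrumentation the reported range always covers the
byte-sized one, so it is the more conservative of the two checks.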