On Fri, Dec 13, 2019 at 02:17:08PM +0100, Arnd Bergmann wrote:
> On Thu, Dec 12, 2019 at 9:50 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Thu, Dec 12, 2019 at 11:34 AM Will Deacon <will@xxxxxxxxxx> wrote:
> > > The root of my concern in all of this, and what started me looking at it in
> > > the first place, is the interaction with 'typeof()'. Inheriting 'volatile'
> > > for a pointer means that local variables in macros declared using typeof()
> > > suddenly start generating *hideous* code, particularly when pointless stack
> > > spills get stackprotector all excited.
> >
> > Yeah, removing volatile can be a bit annoying.
> >
> > For the particular case of the bitops, though, it's not an issue.
> > Since you know the type there, you can just cast it.
> >
> > And if we had the rule that READ_ONCE() was an arithmetic type, you could do
> >
> >     typeof(0+(*p)) __var;
> >
> > since you might as well get the integer promotion anyway (on the
> > non-volatile result).
> >
> > But that doesn't work with structures or unions, of course.
> >
> > I'm not entirely sure we have READ_ONCE() with a struct. I do know we
> > have it with 64-bit entities on 32-bit machines, but that's ok with
> > the "0+" trick.
>
> I'll have my randconfig builder look for instances, so far I found one,
> see below. My feeling is that it would be better to enforce at least
> the size being a 1/2/4/8, to avoid cases where someone thinks
> the access is atomic, but it falls back on a memcpy.

I've been using something similar built on compiletime_assert_atomic_type()
and I spotted another instance in the xdp code (xskq_validate_desc()) which
tries to READ_ONCE() on a 128-bit descriptor, although a /very/ quick read
of the code suggests that this probably can't be concurrently modified if
the ring indexes are synchronised properly.

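For illustration only, the two ideas discussed above (the "0+" typeof trick
and a compile-time check that the size is 1/2/4/8) could be combined roughly
as in the sketch below. It assumes GNU C (typeof and statement expressions)
and C11 static_assert. All of the SKETCH_* / sketch_* names are hypothetical;
this is not the kernel's READ_ONCE() or compiletime_assert_atomic_type().

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

/* Hypothetical predicate: is the object 1, 2, 4 or 8 bytes wide? */
#define SKETCH_IS_NATIVE_SIZE(t) \
	(sizeof(t) == 1 || sizeof(t) == 2 || sizeof(t) == 4 || sizeof(t) == 8)

/*
 * Hypothetical READ_ONCE()-like macro for arithmetic types only.
 * "0 + (x)" is an rvalue, so typeof() yields the promoted type without
 * the 'volatile' qualifier and the local can live in a register.
 * Structs and unions do not compile here, since "0 + struct" is invalid.
 */
#define SKETCH_READ_ONCE(x)						\
({									\
	static_assert(SKETCH_IS_NATIVE_SIZE(x),				\
		      "SKETCH_READ_ONCE: size must be 1, 2, 4 or 8");	\
	__typeof__(0 + (x)) val_ =					\
		*(const volatile __typeof__(x) *)&(x);			\
	val_;								\
})

/* Hypothetical 128-bit descriptor: too wide for the check above. */
struct sketch_desc128 {
	uint64_t addr;
	uint64_t len_and_flags;
};

/* The pointee is volatile, but the macro's local and the result are not. */
static uint64_t sketch_load_u64(const volatile uint64_t *p)
{
	return SKETCH_READ_ONCE(*p);
}

/* An unsigned char is promoted to int by the "0+" trick. */
static int sketch_load_u8(const unsigned char *p)
{
	return SKETCH_READ_ONCE(*p);
}
```

Note that passing a size check like this says nothing about whether an 8-byte
access is performed as a single load on a 32-bit machine; it only rejects
sizes that would certainly not be.
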
However, enabling this for 32-bit ARM is total carnage; as Linus mentioned,
a whole bunch of code appears to be relying on atomic 64-bit access of
READ_ONCE(); the perf ring buffer, io_uring, the scheduler, pm_runtime,
cpuidle, ... :(

Unfortunately, at least some of these *do* look like bugs, but I can't see
how we can fix them, not least because the first two are user ABI afaict. It
may also be that in practice we get 2x32-bit stores, and that works out fine
when storing a 32-bit virtual address. I'm not sure what (if anything) the
compiler guarantees in these cases.

Will