On Wed, Nov 6, 2024, at 14:40, Jason Gunthorpe wrote:
> On Wed, Nov 06, 2024 at 11:01:20AM +0100, Uros Bizjak wrote:
>> On Wed, Nov 6, 2024 at 9:55 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
>> >
>> > On Tue, Nov 5, 2024, at 13:30, Joerg Roedel wrote:
>> > > On Fri, Nov 01, 2024 at 04:22:57PM +0000, Suravee Suthikulpanit wrote:
>> > >>  include/asm-generic/rwonce.h   | 2 +-
>> > >>  include/linux/compiler_types.h | 8 +++++++-
>> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
>> > >
>> > > This patch needs Cc:
>> > >
>> > >       Arnd Bergmann <arnd@xxxxxxxx>
>> > >       linux-arch@xxxxxxxxxxxxxxx
>> > >
>> >
>> > It also needs an update to the comment about why this is safe:
>> >
>> > >> +++ b/include/asm-generic/rwonce.h
>> > >> @@ -33,7 +33,7 @@
>> > >>   * (e.g. a virtual address) and a strong prevailing wind.
>> > >>   */
>> > >>  #define compiletime_assert_rwonce_type(t) \
>> > >> - compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
>> > >> + compiletime_assert(__native_word(t) || sizeof(t) == sizeof(__dword_type), \
>> > >>          "Unsupported access size for {READ,WRITE}_ONCE().")
>> >
>> > As far as I can tell, 128-bit words don't get stored atomically on
>> > any architecture, so this seems wrong, because it would remove
>> > the assertion on someone incorrectly using WRITE_ONCE() on a
>> > 128-bit variable.
>>
>> READ_ONCE() and WRITE_ONCE() do not guarantee atomicity for double
>> word types. They only guarantee (cf. include/asm-generic/rwonce.h):
>>
>>  * Prevent the compiler from merging or refetching reads or writes. The
>>  * compiler is also forbidden from reordering successive instances of
>>  * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
>>  * particular ordering. ...
>>
>> and later:
>>
>>  * Yes, this permits 64-bit accesses on 32-bit architectures. These will
>>  * actually be atomic in some cases (namely Armv7 + LPAE), but for others we
>>  * rely on the access being split into 2x32-bit accesses for a 32-bit quantity
>>  * (e.g. a virtual address) and a strong prevailing wind.
>>
>> This is the "strong prevailing wind", mentioned in the patch review at [1].
>>
>> [1] https://lore.kernel.org/lkml/20241016130819.GJ3559746@xxxxxxxxxx/

I understand the special case for ARMv7VE. I think the more important
comment in that file is

 * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
 * atomicity. Note that this may result in tears!

The entire point of compiletime_assert_rwonce_type() is to ensure
that these accesses fit the stricter definition, and I would
prefer to not extend that to 64-bit architectures. If there are users
that need the "once" behavior but do not require atomicity of the
access, can't they just use __READ_ONCE() instead?

> Yes, there are two common uses for READ_ONCE, actually "read once" and
> prevent compiler double read "optimizations". It doesn't matter if
> things tear in this case because it would be used with cmpxchg or
> something like that.
>
> The other is an atomic relaxed memory order read, which would
> have to guarantee non-tearing.
>
> It is unfortunate the kernel has combined these two things, and we
> probably have exciting bugs on 32 bit from places using the atomic
> read variation on a u64..

Right, at the minimum, we'd need to separate READ_ONCE()/WRITE_ONCE()
from the smp_load_acquire()/smp_store_release() definitions in
asm/barrier.h. Those certainly don't work beyond word size aside
from a few special cases.

>> FYI, Processors with AVX guarantee 128bit atomic access with SSE
>> 128bit move instructions, see e.g. [2].
>>
>> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688

AVX instructions are not used in the kernel. If you want atomic
loads, that has to rely on architecture specific instructions
like cmpxchg16b on x86-64 or ldp on arm64.
Actually using these requires checking the system_has_cmpxchg128() macro.

      Arnd
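
To make the distinction above concrete, here is a rough userspace sketch
of the two flavours: a size-checked read and an unchecked one that may
tear. All sketch_* names are hypothetical stand-ins, not the kernel's
macros, and the code assumes GCC or Clang (statement expressions and
__typeof__):

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
#define sketch_native_word(x)						\
	(sizeof(x) == sizeof(char) || sizeof(x) == sizeof(short) ||	\
	 sizeof(x) == sizeof(int) || sizeof(x) == sizeof(long))

/* "Once" semantics only: one volatile access, no size check, may tear. */
#define sketch_read_once_nocheck(x)					\
	(*(const volatile __typeof__(x) *)&(x))

/*
 * Stricter variant: refuse to compile for sizes that are neither a
 * native word nor long long, then do the same volatile access.
 */
#define sketch_read_once(x)						\
({									\
	_Static_assert(sketch_native_word(x) ||				\
		       sizeof(x) == sizeof(long long),			\
		       "Unsupported access size for sketch_read_once().");	\
	sketch_read_once_nocheck(x);					\
})

/* A 16-byte type standing in for a 128-bit quantity. */
struct sketch_u128 {
	uint64_t lo;
	uint64_t hi;
};

/* A 64-bit read passes the size check. */
static uint64_t sketch_load64(const uint64_t *p)
{
	return sketch_read_once(*p);
}

/* A 128-bit read only compiles with the unchecked variant. */
static struct sketch_u128 sketch_load128(const struct sketch_u128 *p)
{
	return sketch_read_once_nocheck(*p);
}
```

Switching sketch_load128() to sketch_read_once() should fail the static
assertion at build time, which is the property I would like to keep for
128-bit types.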