Re: [PATCH v9 03/10] asm/rwonce: Introduce [READ|WRITE]_ONCE() support for __int128

"Arnd Bergmann" <arnd@xxxxxxxx> · Thu, 07 Nov 2024 11:01:58 +0100

On Wed, Nov 6, 2024, at 14:40, Jason Gunthorpe wrote:
> On Wed, Nov 06, 2024 at 11:01:20AM +0100, Uros Bizjak wrote:
>> On Wed, Nov 6, 2024 at 9:55 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
>> >
>> > On Tue, Nov 5, 2024, at 13:30, Joerg Roedel wrote:
>> > > On Fri, Nov 01, 2024 at 04:22:57PM +0000, Suravee Suthikulpanit wrote:
>> > >>  include/asm-generic/rwonce.h   | 2 +-
>> > >>  include/linux/compiler_types.h | 8 +++++++-
>> > >>  2 files changed, 8 insertions(+), 2 deletions(-)
>> > >
>> > > This patch needs Cc:
>> > >
>> > >       Arnd Bergmann <arnd@xxxxxxxx>
>> > >       linux-arch@xxxxxxxxxxxxxxx
>> > >
>> >
>> > It also needs an update to the comment about why this is safe:
>> >
>> > >> +++ b/include/asm-generic/rwonce.h
>> > >> @@ -33,7 +33,7 @@
>> > >>   * (e.g. a virtual address) and a strong prevailing wind.
>> > >>   */
>> > >>  #define compiletime_assert_rwonce_type(t)                                   \
>> > >> -    compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long),  \
>> > >> +    compiletime_assert(__native_word(t) || sizeof(t) == sizeof(__dword_type), \
>> > >>              "Unsupported access size for {READ,WRITE}_ONCE().")
>> >
>> > As far as I can tell, 128-bit words don't get stored atomically on
>> > any architecture, so this seems wrong, because it would remove
>> > the assertion on someone incorrectly using WRITE_ONCE() on a
>> > 128-bit variable.
>> 
>> READ_ONCE() and WRITE_ONCE() do not guarantee atomicity for double
>> word types. They only guarantee (cf. include/asm-generic/rwonce.h):
>> 
>>  * Prevent the compiler from merging or refetching reads or writes. The
>>  * compiler is also forbidden from reordering successive instances of
>>  * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
>>  * particular ordering. ...
>> 
>> and later:
>> 
>>  * Yes, this permits 64-bit accesses on 32-bit architectures. These will
>>  * actually be atomic in some cases (namely Armv7 + LPAE), but for others we
>>  * rely on the access being split into 2x32-bit accesses for a 32-bit quantity
>>  * (e.g. a virtual address) and a strong prevailing wind.
>> 
>> This is the "strong prevailing wind", mentioned in the patch review at [1].
>> 
>> [1] https://lore.kernel.org/lkml/20241016130819.GJ3559746@xxxxxxxxxx/

I understand the special case for ARMv7VE. I think the more important
comment in that file is

  * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
  * atomicity. Note that this may result in tears!

The entire point of compiletime_assert_rwonce_type() is to ensure
that these accesses fit the stricter definition, and I would
prefer not to extend that to 64-bit architectures. If there are users
that need the "once" behavior but do not require atomicity of the
access, can't they just use __READ_ONCE() instead?
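
To illustrate the distinction, here is a rough userspace sketch. The
names STRICT_READ_ONCE(), RELAXED_READ_ONCE(), native_word() and demo()
are made up for this example; they only approximate READ_ONCE(),
__READ_ONCE() and __native_word() and are not the kernel definitions.
It assumes GCC or clang (statement expressions, __typeof__).

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's __native_word() check. */
#define native_word(t) \
	(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
	 sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* Relaxed variant: a volatile access with no size check; it may tear. */
#define RELAXED_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Strict variant: refuse sizes that might not be loaded in one access. */
#define STRICT_READ_ONCE(x)						\
	({								\
		_Static_assert(native_word(x) ||			\
			       sizeof(x) == sizeof(long long),		\
			"Unsupported access size for STRICT_READ_ONCE()."); \
		RELAXED_READ_ONCE(x);					\
	})

static void demo(void)
{
	unsigned long word = 42;

	/* A native word passes the compile-time size check. */
	assert(STRICT_READ_ONCE(word) == 42);

#ifdef __SIZEOF_INT128__
	unsigned __int128 wide = ((unsigned __int128)1 << 64) | 2;

	/*
	 * STRICT_READ_ONCE(wide) would fail to build on a 64-bit target,
	 * so a caller that only needs the "once" behavior has to opt in
	 * to the relaxed variant explicitly.
	 */
	unsigned __int128 copy = RELAXED_READ_ONCE(wide);

	assert((unsigned long long)(copy >> 64) == 1);
	assert((unsigned long long)copy == 2);
#endif
}
```

The point of keeping the assertion in the strict variant is that a
128-bit user has to pick the relaxed form on purpose, rather than get
a possibly tearing access without noticing.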

> Yes, there are two common uses for READ_ONCE, actually "read once" and
> prevent compiler double read "optimizations". It doesn't matter if
> things tear in this case because it would be used with cmpxchg or
> something like that.
>
> The other is an atomic relaxed memory order read, which would
> have to guarantee non-tearing.
>
> It is unfortunate the kernel has combined these two things, and we
> probably have exciting bugs on 32 bit from places using the atomic
> read variation on a u64..

Right, at the minimum, we'd need to separate READ_ONCE()/WRITE_ONCE()
from the smp_load_acquire()/smp_store_release() definitions in
asm/barrier.h. Those certainly don't work beyond word size aside
from a few special cases.

>> FYI, Processors with AVX guarantee 128bit atomic access with SSE
>> 128bit move instructions, see e.g. [2].
>> 
>> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688

AVX instructions are not used in the kernel. If you want atomic
loads, that has to rely on architecture-specific instructions
like cmpxchg16b on x86-64 or ldp on arm64. Actually using these
requires checking the system_has_cmpxchg128() macro.

   Arnd