Re: [PATCH v3] Avoid memory barrier in read_seqcount() through load acquire

"Christoph Lameter (Ampere)" <cl@xxxxxxxxxx> · Wed, 23 Oct 2024 16:42:36 -0700 (PDT)

On Wed, 23 Oct 2024, Peter Zijlstra wrote:

> > I doubt anybody will notice, and smp_load_acquire() is the future. Any
> > architecture that does badly on it just doesn't matter (and, as
> > mentioned, I don't think they even exist - "smp_rmb()" is generally at
> > least as expensive).
>
> Do we want to do the complementing patch and make write_seqcount_end()
> use smp_store_release() ?
>
> I think at least ARM (the 32bit thing) has wmb but uses mb for
> store_release. But I also think I don't really care about that.

The proper instruction would be something like

atomic_inc_release(&seqcount)

The current atomics do not provide such a macro.

The closest in the current tree is atomic_inc_return_release().

We would prefer atomic_inc_release(&seqcount) because such an
atomic may be executed as a far atomic in the ARM mesh.

This could be cheaper than a local atomic and could f.e. be executed
on the memory controller of a remote NUMA node in order to avoid a costly
transfer of cacheline ownership.

The code generated is a atomic that also does a release. So there would be
no extra barrier etc needed.