On Wed, 23 Oct 2024, Peter Zijlstra wrote:

> > I doubt anybody will notice, and smp_load_acquire() is the future. Any
> > architecture that does badly on it just doesn't matter (and, as
> > mentioned, I don't think they even exist - "smp_rmb()" is generally at
> > least as expensive).
>
> Do we want to do the complementing patch and make write_seqcount_end()
> use smp_store_release() ?
>
> I think at least ARM (the 32bit thing) has wmb but uses mb for
> store_release. But I also think I don't really care about that.

The proper instruction would be something like

	atomic_inc_release(&seqcount)

The current atomics do not provide such a macro. The closest in the
current tree is atomic_inc_return_release().

We would prefer atomic_inc_release(&seqcount) because such an atomic,
which returns no value, may be executed as a far atomic in the ARM mesh.
This could be cheaper than a local atomic and could, e.g., be executed on
the memory controller of a remote NUMA node in order to avoid a costly
transfer of cacheline ownership.

The generated code is a single atomic that also has release semantics,
so no extra barrier would be needed.
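
To make the idea concrete, here is a rough user-space sketch using C11 atomics rather than kernel primitives. Everything named demo_* is hypothetical and exists only for illustration; it is not the kernel's seqcount implementation and not a proposed patch. The point is demo_write_end(): a single increment with release ordering whose result is discarded, standing in for the atomic_inc_release() that the kernel does not currently provide. The read side uses an acquire load, analogous to the smp_load_acquire() change discussed above. A single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a seqcount-protected pair of values. */
struct demo_seq {
	atomic_uint sequence;	/* odd while a write is in progress */
	atomic_int a, b;	/* payload, accessed with relaxed atomics */
};

/* Begin a write: make the count odd, then order it before the payload stores. */
static void demo_write_begin(struct demo_seq *s)
{
	unsigned int old = atomic_fetch_add_explicit(&s->sequence, 1,
						     memory_order_relaxed);

	assert((old & 1) == 0);	/* single writer assumed, no nesting */
	(void)old;
	atomic_thread_fence(memory_order_release);
}

/* End a write: one increment with release ordering, no separate barrier. */
static void demo_write_end(struct demo_seq *s)
{
	/* Result is discarded, so no value has to travel back to the CPU. */
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_release);
}

/* Start a read section: acquire load of the sequence count. */
static unsigned int demo_read_begin(struct demo_seq *s)
{
	return atomic_load_explicit(&s->sequence, memory_order_acquire);
}

/* True if the read section raced with a writer and must be retried. */
static bool demo_read_retry(struct demo_seq *s, unsigned int start)
{
	/* Order the payload loads before re-reading the count. */
	atomic_thread_fence(memory_order_acquire);
	return (start & 1) ||
	       atomic_load_explicit(&s->sequence, memory_order_relaxed) != start;
}
```

A reader would loop on demo_read_begin(), load a and b with memory_order_relaxed, and retry while demo_read_retry() returns true. Whether a compiler lowers the discarded fetch-add to a no-return atomic instruction depends on the target and compiler flags, so treat that part as an expectation rather than a guarantee.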