Re: [RFC] why do we need smp_rmb/smp_wmb pair in fd_install()/expand_fdtable()?

On Thu, 8 Aug 2024 at 06:20, Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> But then multiple times people brought up that supposedly smp_rmb() and
> smp_wmb() are cheaper because they only do load or store ordering
> whereas smp_{load,store}_{acquire,release}() do load and store ordering.

It really can go either way.

But I think we've reached a point where release/acquire is "typically
cheaper", and the reason is simply arm64.

As mentioned, on x86 none of this matters. And on older architectures
that were designed around the concept of separate memory barriers, the
rmb/wmb model matches the architecture model, tends to be natural, and
is likely the best impedance match.
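
For reference, the barrier-pair pattern looks roughly like the sketch
below. It is userspace C11, not kernel code: the fences only stand in
for smp_wmb()/smp_rmb() (C11 fences are somewhat stronger than the
kernel primitives), and the names payload, ready, publish_with_fence()
and consume_with_fence() are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state, not taken from any kernel source. */
static int payload;          /* plain data being published */
static atomic_bool ready;    /* flag telling the reader the data is there */

/* Writer: store the data, then the barrier, then the flag. */
void publish_with_fence(int value)
{
	payload = value;
	/* Stand-in for smp_wmb(): keeps the data store ahead of the flag store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader: load the flag, then the barrier, then the data. */
bool consume_with_fence(int *out)
{
	assert(out != NULL);
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return false;   /* nothing published yet */
	/* Stand-in for smp_rmb(): keeps the flag load ahead of the data load. */
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return true;
}
```

Note that neither barrier names the variable it is ordering. The
pairing between the two functions exists only in the comments.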

But the arm64 memory ordering was created after people had figured out
the rules of good memory ordering, and so we have this:

   https://developer.arm.com/documentation/102336/0100/Load-Acquire-and-Store-Release-instructions

and this particular quote:

 "Weaker ordering requirements that are imposed by Load-Acquire and
  Store-Release instructions allow for micro-architectural
  optimizations, which could reduce some of the performance impacts that
  are otherwise imposed by an explicit memory barrier.

  If the ordering requirement is satisfied using either a Load-Acquire
  or Store-Release, then it would be preferable to use these
  instructions instead of a DMB"

iow we now have a relevant architecture that gets memory ordering
right, and that officially prefers release/acquire ordering.

End result: we *used* to prefer rmb/wmb pairs, because (a) it was how
we did memory ordering originally, (b) relevant architectures didn't
care, and (c) it matched the questionable architectures.

And now, in the last few years, the equation has simply shifted.

So rmb/wmb has gone from "this is the only way to do it" to "this is
the legacy way to do it and it performs ok everywhere" to "this is the
historical way that some people are more used to".

For new code, release/acquire is preferred. And if it's *critical*
code, maybe it's even worth converting from wmb/rmb to
release/acquire.

Partly because of that "it should be better on arm64", but also partly
because I think release/acquire is both a better model conceptually,
_and_ is more self-documenting (ie it's a nice explicit hand-off,
whereas some of our subtler "this wmb pairs with that rmb" code is
very much not self-documenting and needs very explicit and clear
comments).

Now, I'm not saying you shouldn't add a comment about a
release/acquire pair, but at the same time, the very fact that you
release a _particular_ variable and acquire that variable elsewhere
*is* a big clue. So when I'm saying it's "more self-documenting", I
want to emphasize that "more". I'm not claiming it's _completely_
self-documenting ;)
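
A minimal sketch of such a hand-off, again in userspace C11 rather
than kernel code. The explicit memory_order_release store and
memory_order_acquire load play the role of smp_store_release() and
smp_load_acquire(); struct item, slot, publish_item(), lookup_item()
and handoff_demo() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object type, for illustration only. */
struct item {
	int value;
};

static struct item slot_storage;
static _Atomic(struct item *) slot;   /* NULL until something is published */

void publish_item(int value)
{
	slot_storage.value = value;   /* plain initialization of the object */
	/* Release: the initialization above is visible to whoever acquires 'slot'. */
	atomic_store_explicit(&slot, &slot_storage, memory_order_release);
}

struct item *lookup_item(void)
{
	/* Acquire on the same variable: pairs with the release store above. */
	return atomic_load_explicit(&slot, memory_order_acquire);
}

/* Single-threaded walk-through; in real use the two sides run on different CPUs. */
int handoff_demo(void)
{
	publish_item(7);
	struct item *it = lookup_item();
	assert(it != NULL);
	return it->value;
}
```

Both sides name 'slot', so the pairing can be read off the code
itself. On arm64, compilers typically turn these two accesses into
store-release and load-acquire instructions rather than a separate
DMB.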

        Linus
