Re: [RFC] why do we need smp_rmb/smp_wmb pair in fd_install()/expand_fdtable()?

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Wed, 7 Aug 2024 20:06:31 -0700

On Wed, 7 Aug 2024 at 19:50, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> What's the problem with dropping both barriers and turning that
> into
>         expanded = expand_fdtable(files, nr);
>         smp_store_release(&files->resize_in_progress, false);
> and
>         if (unlikely(smp_load_acquire(&files->resize_in_progress))) {
>                 ....
>                 return;
>         }

That should be fine. smp_store_release()->smp_load_acquire() is the
more modern model, and the better one. But I think we simply have a
long history of using the old smp_wmb()->smp_rmb() model, so we have a
lot of code that does that.
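
To make the two models concrete, here is a rough userspace sketch using
C11 atomics as a stand-in for the kernel primitives. It is not the
actual fd_install()/expand_fdtable() code: struct table, payload and
the four helper functions are hypothetical names made up for
illustration, and the C11 fences are only approximate analogues of
smp_wmb()/smp_rmb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for something like files_struct. */
struct table {
	int payload;                     /* plain data set up by the writer */
	atomic_bool resize_in_progress;  /* flag that readers check first */
};

/* Newer model, writer side: the release store orders the payload
 * write before the flag update (analogue of smp_store_release()). */
static void finish_resize_release(struct table *t, int value)
{
	t->payload = value;
	atomic_store_explicit(&t->resize_in_progress, false,
			      memory_order_release);
}

/* Newer model, reader side: the acquire load orders the flag read
 * before the payload read (analogue of smp_load_acquire()). */
static bool read_payload_acquire(struct table *t, int *out)
{
	if (atomic_load_explicit(&t->resize_in_progress,
				 memory_order_acquire))
		return false;  /* resize still running, caller retries */
	*out = t->payload;
	return true;
}

/* Older model, writer side: plain store plus a standalone fence,
 * roughly where smp_wmb() would sit. */
static void finish_resize_fence(struct table *t, int value)
{
	t->payload = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&t->resize_in_progress, false,
			      memory_order_relaxed);
}

/* Older model, reader side: plain load plus a standalone fence,
 * roughly where smp_rmb() would sit. */
static bool read_payload_fence(struct table *t, int *out)
{
	if (atomic_load_explicit(&t->resize_in_progress,
				 memory_order_relaxed))
		return false;
	atomic_thread_fence(memory_order_acquire);
	*out = t->payload;
	return true;
}
```

In both variants a reader that sees the flag cleared is also guaranteed
to see the payload written before it. The difference is only whether
the ordering is attached to the flag access itself or expressed as a
separate barrier next to it.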

On x86, there's basically no difference - in all cases it ends up
being just an instruction scheduling barrier.

On arm64, store_release->load_acquire is likely better, but obviously
micro-architectural implementation issues might make it a wash.

On other architectures, there probably isn't a huge difference, but
acquire/release can be more expensive if the architecture is
explicitly designed for the old-style rmb/wmb model.

So on alpha, for example, store_release->load_acquire ends up being a
full memory barrier in both cases (rmb is always a full memory barrier
on alpha), which is hugely more expensive than wmb (well, again, in
theory this is all obviously dependent on microarchitectures, but wmb
in particular is very cheap unless the uarch really screwed the pooch
and just messed up its barriers entirely).

End result: wmb/rmb is almost never _horrific_, while release/acquire
can be rather expensive on bad machines.

But release/acquire is the RightThing(tm), and the fact that alpha
based its ordering on the bad old model is not really our problem.

So I'm ok with just saying "screw bad memory orderings, go with the
modern model"

             Linus