On Wed, 7 Aug 2024 at 19:50, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> What's the problem with droping both barriers and turning that
> into
>         expanded = expand_fdtable(files, nr);
>         smp_store_release(&files->resize_in_progress, false);
> and
>         if (unlikely(smp_load_acquire(&files->resize_in_progress))) {
>                 ....
>                 return;
>         }

That should be fine. smp_store_release()->smp_load_acquire() is the
more modern model, and the better one. But I think we simply have a
long history of using the old smp_wmb()->smp_rmb() model, so we have a
lot of code that does that.

On x86, there's basically no difference - in all cases it ends up
being just an instruction scheduling barrier.

On arm64, store_release->load_acquire is likely better, but obviously
micro-architectural implementation issues might make it a wash.

On other architectures, there probably isn't a huge difference, but
acquire/release can be more expensive if the architecture is
explicitly designed for the old-style rmb/wmb model.

So on alpha, for example, store_release->load_acquire ends up being a
full memory barrier in both cases (rmb is always a full memory barrier
on alpha), which is hugely more expensive than wmb (well, again, in
theory this is all obviously dependent on microarchitectures, but wmb
in particular is very cheap unless the uarch really screwed the pooch
and just messed up its barriers entirely).

End result: wmb/rmb is usually never _horrific_, while release/acquire
can be rather expensive on bad machines.

But release/acquire is the RightThing(tm), and the fact that alpha
based its ordering on the bad old model is not really our problem.

So I'm ok with just saying "screw bad memory orderings, go with the
modern model"

              Linus
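
For readers who want to see the release/acquire pairing discussed above outside the kernel, here is a minimal userspace sketch using C11 atomics and pthreads. It is an illustration only, not the fs/file.c code: struct table, payload, resizer(), waiter() and run_demo() are hypothetical names, and only the resize_in_progress flag name is taken from the thread. atomic_store_explicit(..., memory_order_release) plays the role of smp_store_release(), and atomic_load_explicit(..., memory_order_acquire) plays the role of smp_load_acquire(). The waiter here simply spins, whereas the quoted kernel snippet takes a slow path and returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for files_struct; not the kernel's layout. */
struct table {
	int payload;                    /* stands in for the resized fdtable */
	atomic_bool resize_in_progress; /* flag name taken from the thread */
};

static void *resizer(void *arg)
{
	struct table *t = arg;

	t->payload = 42; /* stands in for the expand_fdtable() work */
	/* Release: the payload write is ordered before the flag is cleared. */
	atomic_store_explicit(&t->resize_in_progress, false,
			      memory_order_release);
	return NULL;
}

static void *waiter(void *arg)
{
	struct table *t = arg;

	/* Acquire: once the flag reads false, the payload write is visible. */
	while (atomic_load_explicit(&t->resize_in_progress,
				    memory_order_acquire))
		; /* spin; the kernel code would return and wait instead */
	return (void *)(intptr_t)t->payload;
}

/* Runs one resizer and one waiter; returns the payload the waiter saw. */
int run_demo(void)
{
	struct table t = { .payload = 0 };
	pthread_t w, r;
	void *seen = NULL;
	int err;

	atomic_init(&t.resize_in_progress, true);
	err = pthread_create(&w, NULL, waiter, &t);
	assert(err == 0);
	err = pthread_create(&r, NULL, resizer, &t);
	assert(err == 0);
	pthread_join(r, NULL);
	pthread_join(w, &seen);
	return (int)(intptr_t)seen;
}
```

The point of the pairing is that no separate write barrier before the store and read barrier after the load are needed: the ordering is attached to the flag accesses themselves.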