On Tue, Mar 11, 2025 at 9:21 PM Vlastimil Babka <vbabka@xxxxxxx> wrote: > > On 3/11/25 17:31, Mateusz Guzik wrote: > > On Tue, Mar 11, 2025 at 5:21 PM Sebastian Andrzej Siewior > > <bigeasy@xxxxxxxxxxxxx> wrote: > >> > >> On 2025-03-11 16:44:30 [+0100], Mateusz Guzik wrote: > >> > On Fri, Feb 21, 2025 at 06:44:22PM -0800, Alexei Starovoitov wrote: > >> > > +#define __localtry_lock(lock) \ > >> > > + do { \ > >> > > + localtry_lock_t *lt; \ > >> > > + preempt_disable(); \ > >> > > + lt = this_cpu_ptr(lock); \ > >> > > + local_lock_acquire(<->llock); \ > >> > > + WRITE_ONCE(lt->acquired, 1); \ > >> > > + } while (0) > >> > > >> > I think these need compiler barriers. > >> > > >> > I checked with gcc docs (https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html) > >> > and found this as confirmation: > >> > > Accesses to non-volatile objects are not ordered with respect to volatile accesses. > >> > > >> > Unless the Linux kernel is built with some magic to render this moot(?). > >> > >> You say we need a barrier() after the WRITE_ONCE()? If so, we need it in > >> the whole file… > >> > > > > I see the original local_lock machinery on the stock kernel works fine > > as it expands to the preempt pair which has the appropriate fences. If > > debug is added, the "locking" remains unaffected, but the debug state > > might be bogus when looked at from the "wrong" context and adding the > > compiler fences would trivially sort it out. I don't think it's a big > > deal for *their* case, but patching that up should not raise any > > eyebrows and may prevent eyebrows from going up later. > > > > The machinery added in this patch does need the addition for > > correctness in the base operation though. > > Yeah my version of this kind of lock in sheaves code had those barrier()'s, > IIRC after you or Jann told me. It's needed so that the *compiler* does not > e.g. reorder a write to the protected data to happen before the > WRITE_ONCE(lt->acquired, 1) (or after the WRITE_ONCE(lt->acquired, 0) in > unlock). I think you all are missing a fine print in gcc doc: "Unless...can be aliased". The kernel is compiled with -fno-strict-aliasing. No need for barrier()s here.