On 3/11/25 23:24, Alexei Starovoitov wrote: > On Tue, Mar 11, 2025 at 9:21 PM Vlastimil Babka <vbabka@xxxxxxx> wrote: >> >> On 3/11/25 17:31, Mateusz Guzik wrote: >> > On Tue, Mar 11, 2025 at 5:21 PM Sebastian Andrzej Siewior >> > <bigeasy@xxxxxxxxxxxxx> wrote: >> >> >> >> On 2025-03-11 16:44:30 [+0100], Mateusz Guzik wrote: >> >> > On Fri, Feb 21, 2025 at 06:44:22PM -0800, Alexei Starovoitov wrote: >> >> > > +#define __localtry_lock(lock) \ >> >> > > + do { \ >> >> > > + localtry_lock_t *lt; \ >> >> > > + preempt_disable(); \ >> >> > > + lt = this_cpu_ptr(lock); \ >> >> > > + local_lock_acquire(<->llock); \ >> >> > > + WRITE_ONCE(lt->acquired, 1); \ >> >> > > + } while (0) >> >> > >> >> > I think these need compiler barriers. >> >> > >> >> > I checked with gcc docs (https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html) >> >> > and found this as confirmation: >> >> > > Accesses to non-volatile objects are not ordered with respect to volatile accesses. >> >> > >> >> > Unless the Linux kernel is built with some magic to render this moot(?). >> >> >> >> You say we need a barrier() after the WRITE_ONCE()? If so, we need it in >> >> the whole file… >> >> >> > >> > I see the original local_lock machinery on the stock kernel works fine >> > as it expands to the preempt pair which has the appropriate fences. If >> > debug is added, the "locking" remains unaffected, but the debug state >> > might be bogus when looked at from the "wrong" context and adding the >> > compiler fences would trivially sort it out. I don't think it's a big >> > deal for *their* case, but patching that up should not raise any >> > eyebrows and may prevent eyebrows from going up later. >> > >> > The machinery added in this patch does need the addition for >> > correctness in the base operation though. >> >> Yeah my version of this kind of lock in sheaves code had those barrier()'s, >> IIRC after you or Jann told me. It's needed so that the *compiler* does not >> e.g. reorder a write to the protected data to happen before the >> WRITE_ONCE(lt->acquired, 1) (or after the WRITE_ONCE(lt->acquired, 0) in >> unlock). > > I think you all are missing a fine print in gcc doc: > "Unless...can be aliased". > The kernel is compiled with -fno-strict-aliasing. > No need for barrier()s here. Note I know next to nothing about these things, but I see here [1]: "Whether GCC actually performs type-based aliasing analysis depends on the details of the code. GCC has other ways to determine (in some cases) whether objects alias, and if it gets a reliable answer that way, it won’t fall back on type-based heuristics. [...] You can turn off type-based aliasing analysis by giving GCC the option -fno-strict-aliasing." I'd read that as -fno-strict-aliasing only disables TBAA, but that does not necessary mean anything can be assumed to be aliased with anything? An if we e.g. have a pointer to memcg_stock_pcp through which we access the stock_lock an the other (protected) fields and that pointer doesn't change between that, I imagine gcc can reliably determine these can't alias? [1] https://www.gnu.org/software/c-intro-and-ref/manual/html_node/Aliasing-Type-Rules.html