On Mon, 19 Apr 2021 11:49:04 Marco Elver wrote:
>On Mon, 19 Apr 2021 at 11:44, Marco Elver <elver@xxxxxxxxxx> wrote:
>> On Mon, 19 Apr 2021 at 11:41, Hillf Danton <hdanton@xxxxxxxx> wrote:
>> > On Mon, 19 Apr 2021 10:50:25 Marco Elver wrote:
>> > > +
>> > > +	WRITE_ONCE(kfence_timer_waiting, true);
>> > > +	smp_mb(); /* See comment in __kfence_alloc(). */
>> >
>> > This is not needed given the task state change in wait_event().
>>
>> Yes it is. We want to avoid the unconditional irq_work in
>> __kfence_alloc(). When the system is under load doing frequent
>> allocations, at least in my tests this avoids the irq_work almost
>> always. Without the irq_work you'd be correct of course.
>
>And in case this is about the smp_mb() here, yes it definitely is
>required. We *must* order the write of kfence_timer_waiting *before*
>the check of kfence_allocation_gate, which wait_event() does before
>anything else (including changing the state).

One of the reasons wait_event() checks the wait condition before anything
else is that no waker can help a waiter before the waiter has put itself
on the wait queue head's list. Nor can a waker help without scheduling on
the waiter's side, even once the waiter is on that list. So the smp_mb()
cannot make sense without that scheduling, let alone the barrier implied
by wait_event().

>Otherwise the write may
>be reordered after the read, and we could potentially never wake up
>because __kfence_alloc() did not wake us.
>
>This is documented in __kfence_alloc().
>
>> > > +	wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
>> > > +	smp_store_release(&kfence_timer_waiting, false); /* Order after wait_event(). */
>> > > +