On Tue, 13 Feb 2024 at 13:39, Oscar Salvador <osalvador@xxxxxxx> wrote:
>
> On Tue, Feb 13, 2024 at 12:34:55PM +0100, Vlastimil Babka wrote:
> > On 2/13/24 10:21, Marco Elver wrote:
> > > On Tue, 13 Feb 2024 at 10:16, Vlastimil Babka <vbabka@xxxxxxx> wrote:
> > >> Isn't this racy? Shouldn't we use some atomic cmpxchg operation to change
> > >> from REFCOUNT_SATURATED to 1?
> > >
> > > If 2 threads race here, both will want to add it to the list as well
> > > and take the lock. So this could just be solved with double-checked
> > > locking:
> > >
> > > if (count == REFCOUNT_SATURATED) {
> > >         spin_lock(...);
> >
> > Yeah probably stack_list_lock could be taken here already. But then the
> > kmalloc() of struct stack must happen also here, before taking the lock.
>
> I am thinking what would be a less expensive and safer option here.
> Of course, taking the spinlock is easier, but having the allocation
> inside is tricky, and having it outside could mean that we might free
> the struct once we checked __within__ the lock that the refcount
> is no longer REFCOUNT_SATURATED. No big deal, but a bit sub-optimal.
>
> On the other hand, IIUC, cmpxchg has some memory ordering, like
> store_and_release/load_acquire do, so would it be safe to use it
> instead of taking the lock?

Memory ordering here is secondary because the count is not used to
release and later acquire any memory (the list is used for that, you
change list head reads/writes to smp_load_acquire/smp_store_release in
the later patch). The problem is mutual exclusion. You can do mutual
exclusion with something like this as well:

>       if (refcount_read(&stack->count) == REFCOUNT_SATURATED) {
>               int old = REFCOUNT_SATURATED;
>               if (atomic_try_cmpxchg_relaxed(&stack->count.refs, &old, 1))
>                       add_stack_record_to_list(...);
>       }
>       refcount_inc(&stack->count);

Probably simpler.
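
For anyone who wants to poke at the idea outside the kernel, here is a rough user-space sketch of the same pattern using C11 atomics. It is not the patch itself. The names struct sketch_stack, sketch_init(), sketch_add_to_list(), sketch_inc() and sketch_demo() are made up for illustration, the list is reduced to a counter, and a plain atomic add stands in for refcount_inc() (which does extra saturation checking in the kernel). Only the sentinel value mirrors the kernel's REFCOUNT_SATURATED definition.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Same value the kernel uses for REFCOUNT_SATURATED; redefined here
 * because this sketch does not build against kernel headers. */
#define SKETCH_SATURATED (INT_MIN / 2)

/* Hypothetical stand-in for the stack record discussed above. */
struct sketch_stack {
	atomic_int count;     /* plays the role of stack->count.refs */
	atomic_int list_adds; /* times the record was "added to the list" */
};

static void sketch_init(struct sketch_stack *s)
{
	atomic_init(&s->count, SKETCH_SATURATED);
	atomic_init(&s->list_adds, 0);
}

/* Hypothetical stand-in for add_stack_record_to_list(). */
static void sketch_add_to_list(struct sketch_stack *s)
{
	atomic_fetch_add_explicit(&s->list_adds, 1, memory_order_relaxed);
}

static void sketch_inc(struct sketch_stack *s)
{
	if (atomic_load_explicit(&s->count, memory_order_relaxed) == SKETCH_SATURATED) {
		int old = SKETCH_SATURATED;

		/* At most one caller can move the count from SATURATED to 1;
		 * every other caller sees the exchange fail and skips the add. */
		if (atomic_compare_exchange_strong_explicit(&s->count, &old, 1,
							    memory_order_relaxed,
							    memory_order_relaxed))
			sketch_add_to_list(s);
	}
	/* Simplified stand-in for refcount_inc(). */
	atomic_fetch_add_explicit(&s->count, 1, memory_order_relaxed);
}

/* Sequential walk-through of the state transitions; returns the final count. */
int sketch_demo(void)
{
	struct sketch_stack s;

	sketch_init(&s);
	sketch_inc(&s); /* SATURATED -> 1 -> 2, record added once */
	sketch_inc(&s); /* 2 -> 3 */
	sketch_inc(&s); /* 3 -> 4 */

	assert(atomic_load(&s.list_adds) == 1);
	return atomic_load(&s.count);
}
```

sketch_demo() only walks the transitions on one thread, so it does not exercise the race. The exclusion argument rests on the compare-exchange being atomic: with concurrent callers, only one of them can observe the sentinel and replace it, so the add happens once.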