On Fri, Feb 25, 2022 at 11:07:45AM +0100, Vlastimil Babka wrote:
> On 2/25/22 10:50, Hyeonggon Yoo wrote:
> > On Fri, Feb 25, 2022 at 09:34:09AM +0000, Hyeonggon Yoo wrote:
> >> On Thu, Feb 24, 2022 at 07:16:11PM +0100, Vlastimil Babka wrote:
> >> > On 2/21/22 11:53, Hyeonggon Yoo wrote:
> >> > > Simplify deactivate_slab() by removing variable 'lock' and replacing
> >> > > 'l' and 'm' with 'mode'. Instead, remove the slab from the list and
> >> > > unlock n->list_lock when cmpxchg_double() fails, and then retry.
> >> > >
> >> > > One slight functional change is releasing and taking n->list_lock
> >> > > again when cmpxchg_double() fails. This is not harmful because SLUB
> >> > > avoids deactivating slabs as much as possible.
> >> > >
> >> > > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> >> >
> >> > Hm, I wonder if we could simplify even a bit more. Do we have to
> >> > actually place the slab on a partial (full) list before the cmpxchg,
> >> > only to remove it when the cmpxchg fails? It seems to be there to avoid
> >> > anyone else seeing the slab unfrozen but not on the list, which would
> >> > be unexpected. However, if anyone sees such a slab, they have to take
> >> > the list_lock first to start working with the slab... so this should
> >> > be safe, because we hold the list_lock here, and will place the slab
> >> > on the list before we release it. But it thus shouldn't matter whether
> >> > the placement happens before or after a successful cmpxchg, no? So we
> >> > could do it just once, after a successful cmpxchg, and need no undos?
> >> >
> >>
> >> My thought was similar. But after testing I noticed that &n->list_lock
> >> prevents a race between __slab_free() and deactivate_slab().
> >>
> >> > Specifically, AFAIK the only possible race should be with a
> >> > __slab_free() which might observe !was_frozen after we succeed an
> >> > unfreezing cmpxchg and go through the
> >> > "} else { /* Needs to be taken off a list */"
> >> > branch, but then it takes the list_lock as the first thing, so it will
> >> > be able to proceed only after the slab is actually on the list.
> >> >
> >> > Am I missing anything, or would you agree?
> >> >
> >>
> >> It's so tricky.
> >>
> >> I tried to simplify further as you said. Seeing a frozen slab on the
> >> list was not the problem. The problem was that something might interfere
> >> between cmpxchg_double() and taking the spinlock.
> >>
> >> This is what I faced:
> >>
> >>     CPU A                               CPU B
> >> deactivate_slab() {                 __slab_free() {
> >>     /* slab is full */
> >>     slab.frozen = 0;
> >>     cmpxchg_double();
> >>                                         /* Hmm...
> >>                                            slab->frozen == 0 &&
> >>                                            slab->freelist != NULL?
> >>                                            Oh, this must be on the
> >>                                            list... */
> >
> > Oh, this is wrong. slab->freelist must be NULL because it's a full
> > slab.
> >
> > It's more complex than I thought...
> >
> >>                                         spin_lock_irqsave();
> >>                                         cmpxchg_double();
> >>                                         /* Corruption: slab
> >>                                          * was not yet inserted to
> >>                                          * list but try removing */
> >>                                         remove_full();
> >>                                         spin_unlock_irqrestore();
> >>                                     }
> >>     spin_lock_irqsave();
> >>     add_full();
> >>     spin_unlock_irqrestore();
> >> }
> >
> > So it was...
> >
> >     CPU A                               CPU B
> > deactivate_slab() {                 __slab_free() {
> >     /* slab is full */
> >     slab.frozen = 0;
> >     cmpxchg_double();
> >                                         /*
> >                                            Hmm...
> >                                            !was_frozen &&
> >                                            prior == NULL?
> >                                            Let's freeze this!
> >                                         */
> >                                         put_cpu_partial();
> >                                     }
> >     spin_lock_irqsave();
>
> Yeah, in my proposal I didn't intend to only take spin_lock_irqsave() here.
> My idea for CPU A would be something like:
>

Oh, I misunderstood your proposal.
I spent hours figuring out what was wrong, haha.

>     spin_lock_irqsave();
>     slab.frozen = 0;
>     if (cmpxchg_double()) {
>         /* success */
>         add_partial(); // (or add_full())
>         spin_unlock_irqrestore();
>     } else {
>         /* fail */
>         spin_unlock_irqrestore();
>         goto redo;
>     }
>
> So we would still have the list_lock protection around the cmpxchg as in
> the current code. We just wouldn't do e.g. add_partial() before the
> cmpxchg, only to remove_partial() when the cmpxchg failed.

Now I get what you mean... I think that would work. Will try that.
Thank you for the nice proposal!

> >     add_full();
> >     /* It's now frozen by CPU B and at the same time on the full list */
> >     spin_unlock_irqrestore();
> >
> > And &n->list_lock prevents such a race.

--
Thank you, You are awesome!
Hyeonggon :-)
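For reference, below is a minimal userspace model of the ordering proposed
above. It is a sketch, not kernel code: a single atomic word stands in for
the freelist/counters pair that cmpxchg_double() updates, a pthread mutex
stands in for n->list_lock, and the names deactivate_slab_model, FROZEN_BIT
and full_list are made up for illustration. What it demonstrates is the key
point of the proposal: the slab is placed on the list only after a
successful cmpxchg, while the lock is still held, so no add/remove undo
pair is needed.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define FROZEN_BIT 1UL

struct slab {
        _Atomic unsigned long counters; /* stands in for freelist/counters */
        struct slab *next;              /* list linkage */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER; /* "n->list_lock" */
static struct slab *full_list;                                /* "n->full" */

/*
 * Model of the proposed deactivate_slab() ordering: take the lock, try the
 * cmpxchg, and only on success put the slab on the list, still under the
 * lock. On failure, drop the lock and retry with freshly loaded state.
 */
static void deactivate_slab_model(struct slab *slab)
{
        unsigned long old, new;

redo:
        old = atomic_load(&slab->counters);
        new = old & ~FROZEN_BIT;        /* unfreeze */

        pthread_mutex_lock(&list_lock);
        if (!atomic_compare_exchange_strong(&slab->counters, &old, new)) {
                /* a concurrent update raced with us; unlock and retry */
                pthread_mutex_unlock(&list_lock);
                goto redo;
        }
        /* success: insert before anyone else can take the lock */
        slab->next = full_list;
        full_list = slab;
        pthread_mutex_unlock(&list_lock);
}

int main(void)
{
        struct slab s = { .counters = FROZEN_BIT, .next = NULL };

        deactivate_slab_model(&s);
        printf("on list: %d, frozen: %lu\n",
               full_list == &s, s.counters & FROZEN_BIT);
        return 0;
}

The ordering matters because a racing __slab_free() that observes the slab
unfrozen must take list_lock before touching any list; since the unfreezing
cmpxchg happens with list_lock held and the insertion completes before the
lock is released, the freer can only proceed once the slab is actually on
the list.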