On Thu, Feb 24, 2022 at 07:16:11PM +0100, Vlastimil Babka wrote: > On 2/21/22 11:53, Hyeonggon Yoo wrote: > > Simply deactivate_slab() by removing variable 'lock' and replacing > > 'l' and 'm' with 'mode'. Instead, remove slab from list and unlock > > n->list_lock when cmpxchg_double() fails, and then retry. > > > > One slight functional change is releasing and taking n->list_lock again > > when cmpxchg_double() fails. This is not harmful because SLUB avoids > > deactivating slabs as much as possible. > > > > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> > > Hm I wonder if we could simplify even a bit more. Do we have to actually > place the slab on a partial (full) list before the cmpxchg, only to remove > it when cmpxchg fails? Seems it's to avoid anyone else seeing the slab > un-frozen, but not on the list, which would be unexpected. However if anyone > sees such slab, they have to take the list_lock first to start working with > the slab... so this should be safe, because we hold the list_lock here, and > will place the slab on the list before we release it. But it thus shouldn't > matter if the placement happens before or after a successful cmpxchg, no? So > we can only do it once after a successful cmpxchg and need no undo's? > My thought was similar. But after testing I noticed that &n->list_lock prevents race between __slab_free() and deactivate_slab(). > Specifically AFAIK the only possible race should be with a __slab_free() > which might observe !was_frozen after we succeed an unfreezing cmpxchg and > go through the > "} else { /* Needs to be taken off a list */" > branch but then it takes the list_lock as the first thing, so will be able > to proceed only after the slab is actually on the list. > > Do I miss anything or would you agree? > It's so tricky. I tried to simplify more as you said. Seeing frozen slab on list was not problem. But the problem was that something might interfere between cmpxchg_double() and taking spinlock. This is what I faced: CPU A CPU B deactivate_slab() { __slab_free() { /* slab is full */ slab.frozen = 0; cmpxchg_double(); /* Hmm... slab->frozen == 0 && slab->freelist != NULL? Oh This must be on the list.. */ spin_lock_irqsave(); cmpxchg_double(); /* Corruption: slab * was not yet inserted to * list but try removing */ remove_full(); spin_unlock_irqrestore(); } spin_lock_irqsave(); add_full(); spin_unlock_irqrestore(); } I think it's quite confusing because it's protecting code, not data. Would you have an idea to solve it, or should we add a comment for this? > > --- > > mm/slub.c | 74 +++++++++++++++++++++++++------------------------------ > > 1 file changed, 33 insertions(+), 41 deletions(-) > > > > diff --git a/mm/slub.c b/mm/slub.c > > index a4964deccb61..2d0663befb9e 100644 > > --- a/mm/slub.c > > +++ b/mm/slub.c > > @@ -2350,8 +2350,8 @@ static void deactivate_slab(struct kmem_cache *s, struct slab *slab, > > { > > enum slab_modes { M_NONE, M_PARTIAL, M_FULL, M_FREE }; > > struct kmem_cache_node *n = get_node(s, slab_nid(slab)); > > - int lock = 0, free_delta = 0; > > - enum slab_modes l = M_NONE, m = M_NONE; > > + int free_delta = 0; > > + enum slab_modes mode = M_NONE; > > void *nextfree, *freelist_iter, *freelist_tail; > > int tail = DEACTIVATE_TO_HEAD; > > unsigned long flags = 0; > > @@ -2420,57 +2420,49 @@ static void deactivate_slab(struct kmem_cache *s, struct slab *slab, > > new.frozen = 0; > > > > if (!new.inuse && n->nr_partial >= s->min_partial) > > - m = M_FREE; > > + mode = M_FREE; > > else if (new.freelist) { > > - m = M_PARTIAL; > > - if (!lock) { > > - lock = 1; > > - /* > > - * Taking the spinlock removes the possibility that > > - * acquire_slab() will see a slab that is frozen > > - */ > > - spin_lock_irqsave(&n->list_lock, flags); > > - } > > - } else { > > - m = M_FULL; > > - if (kmem_cache_debug_flags(s, SLAB_STORE_USER) && !lock) { > > - lock = 1; > > - /* > > - * This also ensures that the scanning of full > > - * slabs from diagnostic functions will not see > > - * any frozen slabs. > > - */ > > - spin_lock_irqsave(&n->list_lock, flags); > > - } > > + mode = M_PARTIAL; > > + /* > > + * Taking the spinlock removes the possibility that > > + * acquire_slab() will see a slab that is frozen > > + */ > > + spin_lock_irqsave(&n->list_lock, flags); > > + add_partial(n, slab, tail); > > + } else if (kmem_cache_debug_flags(s, SLAB_STORE_USER)) { > > + mode = M_FULL; > > + /* > > + * This also ensures that the scanning of full > > + * slabs from diagnostic functions will not see > > + * any frozen slabs. > > + */ > > + spin_lock_irqsave(&n->list_lock, flags); > > + add_full(s, n, slab); > > } > > > > - if (l != m) { > > - if (l == M_PARTIAL) > > - remove_partial(n, slab); > > - else if (l == M_FULL) > > - remove_full(s, n, slab); > > > > - if (m == M_PARTIAL) > > - add_partial(n, slab, tail); > > - else if (m == M_FULL) > > - add_full(s, n, slab); > > - } > > - > > - l = m; > > if (!cmpxchg_double_slab(s, slab, > > old.freelist, old.counters, > > new.freelist, new.counters, > > - "unfreezing slab")) > > + "unfreezing slab")) { > > + if (mode == M_PARTIAL) { > > + remove_partial(n, slab); > > + spin_unlock_irqrestore(&n->list_lock, flags); > > + } else if (mode == M_FULL) { > > + remove_full(s, n, slab); > > + spin_unlock_irqrestore(&n->list_lock, flags); > > + } > > goto redo; > > + } > > > > - if (lock) > > - spin_unlock_irqrestore(&n->list_lock, flags); > > > > - if (m == M_PARTIAL) > > + if (mode == M_PARTIAL) { > > + spin_unlock_irqrestore(&n->list_lock, flags); > > stat(s, tail); > > - else if (m == M_FULL) > > + } else if (mode == M_FULL) { > > + spin_unlock_irqrestore(&n->list_lock, flags); > > stat(s, DEACTIVATE_FULL); > > - else if (m == M_FREE) { > > + } else if (mode == M_FREE) { > > stat(s, DEACTIVATE_EMPTY); > > discard_slab(s, slab); > > stat(s, FREE_SLAB); > -- Thank you, You are awesome! Hyeonggon :-)