On 2023/10/24 00:00, Vlastimil Babka wrote: > On 10/21/23 16:43, chengming.zhou@xxxxxxxxx wrote: >> From: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx> >> >> Now we will freeze slabs when moving them out of node partial list to >> cpu partial list, this method needs two cmpxchg_double operations: >> >> 1. freeze slab (acquire_slab()) under the node list_lock >> 2. get_freelist() when pick used in ___slab_alloc() >> >> Actually we don't need to freeze when moving slabs out of node partial >> list, we can delay freeze to use slab freelist in ___slab_alloc(), so >> we can save one cmpxchg_double(). >> >> And there are other good points: >> >> 1. The moving of slabs between node partial list and cpu partial list >> becomes simpler, since we don't need to freeze or unfreeze at all. >> >> 2. The node list_lock contention would be less, since we only need to >> freeze one slab under the node list_lock. (In fact, we can first >> move slabs out of node partial list, don't need to freeze any slab >> at all, so the contention on slab won't transfer to the node list_lock >> contention.) >> >> We can achieve this because there is no concurrent path would manipulate >> the partial slab list except the __slab_free() path, which is serialized >> now. >> >> Note this patch just change the parts of moving the partial slabs for >> easy code review, we will fix other parts in the following patches. >> Specifically this patch change three paths: >> 1. get partial slab from node: get_partial_node() >> 2. put partial slab to node: __unfreeze_partials() >> 3. cache partail slab on cpu when __slab_free() > > So the problem with this approach that one patch breaks things and another > one fixes them up, is that git bisect becomes problematic, so we shouldn't > do that and instead bite the bullet and deal with a potentially large patch. > At some level it's unavoidable as one has to grasp all the moving pieces > anyway to see e.g. if the changes in allocation path are compatible with the > changes in freeing. > When possible, we can do preparatory stuff that doesn't break things like in > patches 1 and 2, maybe get_cpu_partial() could also be introduced (even if > unused) before switching the world over to the new scheme in a single patch > (and possible cleanups afterwards). So would it be possible to redo it in > such way please? Ok, I will change to this way. I didn't know how to handle this potentially large patch gracefully at first. Your detailed advice is very helpful to me, I will prepare all possible preparatory stuff, then change all parts over in one patch. > > <snip> > >> @@ -2621,23 +2622,7 @@ static void __unfreeze_partials(struct kmem_cache *s, struct slab *partial_slab) >> spin_lock_irqsave(&n->list_lock, flags); >> } >> >> - do { >> - >> - old.freelist = slab->freelist; >> - old.counters = slab->counters; >> - VM_BUG_ON(!old.frozen); >> - >> - new.counters = old.counters; >> - new.freelist = old.freelist; >> - >> - new.frozen = 0; >> - >> - } while (!__slab_update_freelist(s, slab, >> - old.freelist, old.counters, >> - new.freelist, new.counters, >> - "unfreezing slab")); >> - >> - if (unlikely(!new.inuse && n->nr_partial >= s->min_partial)) { >> + if (unlikely(!slab->inuse && n->nr_partial >= s->min_partial)) { >> slab->next = slab_to_discard; >> slab_to_discard = slab; >> } else { >> @@ -3634,18 +3619,8 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab, >> was_frozen = new.frozen; >> new.inuse -= cnt; >> if ((!new.inuse || !prior) && !was_frozen) { >> - >> - if (kmem_cache_has_cpu_partial(s) && !prior) { >> - >> - /* >> - * Slab was on no list before and will be >> - * partially empty >> - * We can defer the list move and instead >> - * freeze it. >> - */ >> - new.frozen = 1; >> - >> - } else { /* Needs to be taken off a list */ >> + /* Needs to be taken off a list */ >> + if (!kmem_cache_has_cpu_partial(s) || prior) { >> >> n = get_node(s, slab_nid(slab)); >> /* >> @@ -3675,7 +3650,7 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab, >> * activity can be necessary. >> */ >> stat(s, FREE_FROZEN); >> - } else if (new.frozen) { >> + } else if (kmem_cache_has_cpu_partial(s) && !prior) { >> /* >> * If we just froze the slab then put it onto the >> * per cpu partial list. > > I think this comment is now misleading, we didn't freeze the slab, so it's > now something like "if we started with a full slab...". Ok, will check and change these inconsistent comments by the way. Thanks!