On Fri, Jan 15, 2021 at 7:35 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
> In deactivate_slab() we currently move all but one objects on the cpu freelist
> to the page freelist one by one using the costly cmpxchg_double() operation.
> Then we unfreeze the page while moving the last object on page freelist, with
> a final cmpxchg_double().
>
> This can be optimized to avoid the cmpxchg_double() per object. Just count the
> objects on cpu freelist (to adjust page->inuse properly) and also remember the
> last object in the chain. Then splice page->freelist to the last object and
> effectively add the whole cpu freelist to page->freelist while unfreezing the
> page, with a single cmpxchg_double().

This might have some more (good) effects, although these might well be
too minuscule to notice:

 - The old code inverted the direction of the freelist, while the new
   code preserves the direction.
 - We're no longer dirtying the cachelines of objects in the middle of
   the freelist.

In the current code it probably doesn't really matter, since I think
we basically only take this path when handling NUMA mismatches,
PFMEMALLOC stuff, racing new_slab(), and flush_slab() for handling
flushing IPIs? But yeah, if we want to start automatically sending
flush IPIs, it might be a good idea, given that the next accesses to
the page will probably come from a different CPU (unless the page is
entirely unused, in which case it may be freed to the page allocator's
percpu list) and we don't want to create unnecessary cache/memory
traffic.

(And it's a good cleanup regardless, I think.)

> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>

Reviewed-by: Jann Horn <jannh@xxxxxxxxxx>

[...]
>         /*
> -        * Stage two: Ensure that the page is unfrozen while the
> -        * list presence reflects the actual number of objects
> -        * during unfreeze.
> +        * Stage two: Unfreeze the page while splicing the per-cpu
> +        * freelist to the head of page's freelist.
> +        *
> +        * Ensure that the page is unfrozen while the list presence
> +        * reflects the actual number of objects during unfreeze.

(my computer complains about trailing whitespace here)

>          *
>          * We setup the list membership and then perform a cmpxchg
>          * with the count. If there is a mismatch then the page