On 2020/7/7 2:59 PM, Christopher Lameter wrote:
> On Thu, 2 Jul 2020, Xunlei Pang wrote:
>
>> This patch introduces two counters to maintain the actual number
>> of partial objects dynamically instead of iterating the partial
>> page lists with list_lock held.
>>
>> New counters of kmem_cache_node are: pfree_objects, ptotal_objects.
>> The main operations are under list_lock in slow path, its performance
>> impact is minimal.
>
> If at all then these counters need to be under CONFIG_SLUB_DEBUG.
>
>> --- a/mm/slab.h
>> +++ b/mm/slab.h
>> @@ -616,6 +616,8 @@ struct kmem_cache_node {
>>  #ifdef CONFIG_SLUB
>>  	unsigned long nr_partial;
>>  	struct list_head partial;
>> +	atomic_long_t pfree_objects; /* partial free objects */
>> +	atomic_long_t ptotal_objects; /* partial total objects */
>
> Please in the CONFIG_SLUB_DEBUG. Without CONFIG_SLUB_DEBUG we need to
> build with minimal memory footprint.

Thanks for the comments.

show_slab_objects() also calls count_partial() under CONFIG_SYSFS:

	if (flags & SO_PARTIAL) {
		struct kmem_cache_node *n;

		for_each_kmem_cache_node(s, node, n) {
			if (flags & SO_TOTAL)
				x = count_partial(n, count_total);
			else if (flags & SO_OBJECTS)
				x = count_partial(n, count_inuse);
			else
				x = n->nr_partial;
			total += x;
			nodes[node] += x;
		}
	}

I'm not sure if it's for some historical reason, but it works without
CONFIG_SLUB_DEBUG.

>
>> #ifdef CONFIG_SLUB_DEBUG
>> 	atomic_long_t nr_slabs;
>> 	atomic_long_t total_objects;
>> diff --git a/mm/slub.c b/mm/slub.c
>
> Also this looks to be quite heavy on the cache and on execution time. Note
> that the list_lock could be taken frequently in the performance sensitive
> case of freeing an object that is not in the partial lists.

Yes, concurrent __slab_free() has potential lock/atomic contention. How
about using a percpu variable for the partial free count, like below?

	static inline void
	__update_partial_free(struct kmem_cache_node *n, long delta)
	{
		atomic_long_add(delta, this_cpu_ptr(n->partial_free_objs));
	}

-Xunlei
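
P.S. To make the percpu idea a bit more concrete, here is a rough sketch of
how such a counter could be allocated and read back. The partial_free_objs
field, the helper names, and the read-side summation are only assumptions
for discussion, not part of the posted patch:

	/*
	 * Assumed per-node field (not in the posted patch):
	 *     atomic_long_t __percpu *partial_free_objs;
	 */

	/* allocate the percpu counter when the node is set up */
	static inline int init_partial_free(struct kmem_cache_node *n)
	{
		n->partial_free_objs = alloc_percpu(atomic_long_t);
		return n->partial_free_objs ? 0 : -ENOMEM;
	}

	/* fast path: only touch the local CPU's counter */
	static inline void
	__update_partial_free(struct kmem_cache_node *n, long delta)
	{
		atomic_long_add(delta, this_cpu_ptr(n->partial_free_objs));
	}

	/* slow path (e.g. sysfs/slabinfo reads): sum all CPUs */
	static inline long partial_free_approx(struct kmem_cache_node *n)
	{
		long total = 0;
		int cpu;

		for_each_possible_cpu(cpu)
			total += atomic_long_read(
				per_cpu_ptr(n->partial_free_objs, cpu));
		return total;
	}

The read side would then only be approximate, which should be acceptable
for count_partial() users such as /proc/slabinfo and the sysfs partial
counts.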