On Thu, Jul 2, 2020 at 11:32 AM Xunlei Pang <xlpang@xxxxxxxxxxxxxxxxx> wrote:
> The node list_lock in count_partial() is held for a long time while
> iterating over large partial page lists, which can cause a thundering
> herd effect on list_lock contention, e.g. it causes business
> response-time jitters when accessing "/proc/slabinfo" in our
> production environments.

Would you have any numbers to share to quantify this jitter? I have no
objections to this approach, but I think the original design
deliberately made reading "/proc/slabinfo" more expensive to avoid
atomic operations in the allocation/deallocation paths. It would be
good to understand what the gain of this approach is before we switch
to it. Maybe even run some slab-related benchmark (not sure if there's
something better than hackbench these days) to see if the overhead of
this approach shows up.

> This patch introduces two counters to maintain the actual number
> of partial objects dynamically instead of iterating the partial
> page lists with list_lock held.
>
> The new kmem_cache_node counters are: pfree_objects, ptotal_objects.
> The main operations are done under list_lock in the slow path, so
> the performance impact is minimal.
>
> Co-developed-by: Wen Yang <wenyang@xxxxxxxxxxxxxxxxx>
> Signed-off-by: Xunlei Pang <xlpang@xxxxxxxxxxxxxxxxx>
> ---
>  mm/slab.h |  2 ++
>  mm/slub.c | 38 +++++++++++++++++++++++++++++++++++++-
>  2 files changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 7e94700..5935749 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -616,6 +616,8 @@ struct kmem_cache_node {
>  #ifdef CONFIG_SLUB
>         unsigned long nr_partial;
>         struct list_head partial;
> +       atomic_long_t pfree_objects;    /* partial free objects */
> +       atomic_long_t ptotal_objects;   /* partial total objects */

You could rename these to "nr_partial_free_objs" and
"nr_partial_total_objs" for readability.

- Pekka
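
P.S. The mm/slub.c hunk is trimmed from the quote above, so just to
check that I read the idea right, here is a rough sketch (the helper
names are mine, not from the patch): the counters are adjusted wherever
a page is added to or removed from n->partial, which already happens
under list_lock in the slow paths, and readers then avoid the list walk
entirely:

  static inline void add_partial_counters(struct kmem_cache_node *n,
                                          struct page *page)
  {
          atomic_long_add(page->objects, &n->ptotal_objects);
          atomic_long_add(page->objects - page->inuse, &n->pfree_objects);
  }

  static inline void sub_partial_counters(struct kmem_cache_node *n,
                                          struct page *page)
  {
          atomic_long_sub(page->objects, &n->ptotal_objects);
          atomic_long_sub(page->objects - page->inuse, &n->pfree_objects);
  }

  /*
   * Readers like get_slabinfo() use the counter instead of walking
   * n->partial under list_lock.  page->inuse can still change behind
   * our back via the cmpxchg fast path, so the value is approximate
   * and clamped at zero.
   */
  static unsigned long count_partial_free_approx(struct kmem_cache_node *n)
  {
          long x = atomic_long_read(&n->pfree_objects);

          return x < 0 ? 0 : x;
  }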