Re: [PATCH 6.6.y] mm: ratelimit stat flush from workingset shrinker

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jun 07, 2024 at 03:48:06PM GMT, Jesper Dangaard Brouer wrote:
> From: Shakeel Butt <shakeelb@xxxxxxxxxx>
> 
> commit d4a5b369ad6d8aae552752ff438dddde653a72ec upstream.
> 
> One of our workloads (Postgres 14 + sysbench OLTP) regressed on newer
> upstream kernel and on further investigation, it seems like the cause is
> the always synchronous rstat flush in the count_shadow_nodes() added by
> the commit f82e6bf9bb9b ("mm: memcg: use rstat for non-hierarchical
> stats").  On further inspection it seems like we don't really need
> accurate stats in this function as it was already approximating the amount
> of appropriate shadow entries to keep for maintaining the refault
> information.  Since there is already 2 sec periodic rstat flush, we don't
> need exact stats here.  Let's ratelimit the rstat flush in this code path.
> 
> Link: https://lkml.kernel.org/r/20231228073055.4046430-1-shakeelb@xxxxxxxxxx
> Fixes: f82e6bf9bb9b ("mm: memcg: use rstat for non-hierarchical stats")
> Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx>
> Cc: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
> 
> ---
> On production with kernel v6.6 we are observing issues with excessive
> cgroup rstat flushing due to the extra call to mem_cgroup_flush_stats()
> in count_shadow_nodes() introduced in commit f82e6bf9bb9b ("mm: memcg:
> use rstat for non-hierarchical stats") that commit is part of v6.6.
> We request backport of commit d4a5b369ad6d ("mm: ratelimit stat flush
> from workingset shrinker") as it have a fixes tag for this commit.
> 
> IMHO it is worth explaining call path that makes count_shadow_nodes()
> cause excessive cgroup rstat flushing calls. Function shrink_node()
> calls mem_cgroup_flush_stats() on its own first, and then invokes
> shrink_node_memcgs(). Function shrink_node_memcgs() iterates over
> cgroups via mem_cgroup_iter() for each calling shrink_slab(). The
> shrink_slab() calls do_shrink_slab() that via shrinker->count_objects()
> invoke count_shadow_nodes(), and count_shadow_nodes() does
> a mem_cgroup_flush_stats() call, that seems unnecessary.
> 

Actually at Meta production we have also replaced
mem_cgroup_flush_stats() in shrink_node() with
mem_cgroup_flush_stats_ratelimited() as it was causing too much flushing
issue. We have not observed any issue after the change. I will propose
that patch to upstream as well.




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux