The patch titled
     Subject: workingset: memcg: sleep when flushing stats in workingset_refault()
has been added to the -mm mm-unstable branch.  Its filename is
     workingset-memcg-sleep-when-flushing-stats-in-workingset_refault.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/workingset-memcg-sleep-when-flushing-stats-in-workingset_refault.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
Subject: workingset: memcg: sleep when flushing stats in workingset_refault()
Date: Thu, 30 Mar 2023 19:17:59 +0000

In workingset_refault(), we call
mem_cgroup_flush_stats_atomic_ratelimited() to read accurate stats within
an RCU read section and with sleeping disallowed.  Move the call above the
RCU read section to make it non-atomic.

Flushing is an expensive operation that scales with the number of cpus and
the number of cgroups in the system, so avoid doing it atomically where
possible.

Since workingset_refault() is the only caller of
mem_cgroup_flush_stats_atomic_ratelimited(), just make it non-atomic, and
rename it to mem_cgroup_flush_stats_ratelimited().

Link: https://lkml.kernel.org/r/20230330191801.1967435-7-yosryahmed@xxxxxxxxxx
Signed-off-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
Acked-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>
Cc: Josef Bacik <josef@xxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Michal Koutný <mkoutny@xxxxxxxx>
Cc: Muchun Song <muchun.song@xxxxxxxxx>
Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Vasily Averin <vasily.averin@xxxxxxxxx>
Cc: Zefan Li <lizefan.x@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/memcontrol.h |    4 ++--
 mm/memcontrol.c            |    4 ++--
 mm/workingset.c            |    5 +++--
 3 files changed, 7 insertions(+), 6 deletions(-)

--- a/include/linux/memcontrol.h~workingset-memcg-sleep-when-flushing-stats-in-workingset_refault
+++ a/include/linux/memcontrol.h
@@ -1039,7 +1039,7 @@ static inline unsigned long lruvec_page_
 
 void mem_cgroup_flush_stats(void);
 void mem_cgroup_flush_stats_atomic(void);
-void mem_cgroup_flush_stats_atomic_ratelimited(void);
+void mem_cgroup_flush_stats_ratelimited(void);
 
 void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 			      int val);
@@ -1541,7 +1541,7 @@ static inline void mem_cgroup_flush_stat
 {
 }
 
-static inline void mem_cgroup_flush_stats_atomic_ratelimited(void)
+static inline void mem_cgroup_flush_stats_ratelimited(void)
 {
 }
 
--- a/mm/memcontrol.c~workingset-memcg-sleep-when-flushing-stats-in-workingset_refault
+++ a/mm/memcontrol.c
@@ -674,10 +674,10 @@ void mem_cgroup_flush_stats_atomic(void)
 		do_flush_stats(true);
 }
 
-void mem_cgroup_flush_stats_atomic_ratelimited(void)
+void mem_cgroup_flush_stats_ratelimited(void)
 {
 	if (time_after64(jiffies_64, READ_ONCE(flush_next_time)))
-		mem_cgroup_flush_stats_atomic();
+		mem_cgroup_flush_stats();
 }
 
 static void flush_memcg_stats_dwork(struct work_struct *w)
--- a/mm/workingset.c~workingset-memcg-sleep-when-flushing-stats-in-workingset_refault
+++ a/mm/workingset.c
@@ -406,6 +406,9 @@ void workingset_refault(struct folio *fo
 	unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);
 	eviction <<= bucket_order;
 
+	/* Flush stats (and potentially sleep) before holding RCU read lock */
+	mem_cgroup_flush_stats_ratelimited();
+
 	rcu_read_lock();
 	/*
 	 * Look up the memcg associated with the stored ID. It might
@@ -461,8 +464,6 @@ void workingset_refault(struct folio *fo
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
-
-	mem_cgroup_flush_stats_atomic_ratelimited();
 	/*
 	 * Compare the distance to the existing workingset size. We
 	 * don't activate pages that couldn't stay resident even if
_

Patches currently in -mm which might be from yosryahmed@xxxxxxxxxx are

cgroup-rename-cgroup_rstat_flush_irqsafe-to-atomic.patch
memcg-rename-mem_cgroup_flush_stats_delayed-to-ratelimited.patch
memcg-do-not-flush-stats-in-irq-context.patch
memcg-replace-stats_flush_lock-with-an-atomic.patch
memcg-sleep-during-flushing-stats-in-safe-contexts.patch
workingset-memcg-sleep-when-flushing-stats-in-workingset_refault.patch
vmscan-memcg-sleep-when-flushing-stats-during-reclaim.patch
memcg-do-not-modify-rstat-tree-for-zero-updates.patch
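
For readers who want to see the ordering argument in isolation, here is a
minimal userspace sketch of the pattern the patch applies: a rate-limited
flush that may sleep is called before the read-side section is entered, not
inside it.  Everything in it is a hypothetical stand-in and not kernel code:
now_ticks models jiffies_64, in_read_section models being between
rcu_read_lock() and rcu_read_unlock(), the assert() models the rule that a
sleeping call must not happen there, and FLUSH_PERIOD and the way
flush_next_time is advanced are assumptions made only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these are kernel symbols. */
static uint64_t now_ticks;        /* models jiffies_64 */
static uint64_t flush_next_time;  /* earliest tick at which a flush is due */
static unsigned int flush_count;  /* number of flushes actually performed */
static bool in_read_section;      /* models "inside an RCU read section" */

#define FLUSH_PERIOD 2            /* assumed spacing between flushes */

/* Models a flush that is allowed to sleep. */
static void flush_stats(void)
{
	/* A sleeping flush must never run inside the read-side section. */
	assert(!in_read_section);
	flush_count++;
	flush_next_time = now_ticks + FLUSH_PERIOD;
}

/* Flush only once the rate-limit deadline has passed. */
static void flush_stats_ratelimited(void)
{
	if (now_ticks > flush_next_time)
		flush_stats();
}

/* Models the new ordering: flush first, then enter the read section. */
static void refault(void)
{
	flush_stats_ratelimited();

	in_read_section = true;
	/* ... lookups that rely on the freshly flushed stats ... */
	in_read_section = false;
}
```

Calling refault() at ticks 1, 2 and 4 performs two flushes in this model:
the call at tick 2 is skipped because the deadline set at tick 1 (tick 3)
has not passed yet.  Moving the flush_stats_ratelimited() call to after
in_read_section is set would trip the assert, which is the situation the
patch avoids by hoisting the call above rcu_read_lock().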