
The patch titled
     Subject: mm: don't hold css->refcnt during traversal
has been added to the -mm mm-unstable branch.  Its filename is
     mm-dont-hold-css-refcnt-during-traversal.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-dont-hold-css-refcnt-during-traversal.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Kinsey Ho <kinseyho@xxxxxxxxxx>
Subject: mm: don't hold css->refcnt during traversal
Date: Tue, 27 Aug 2024 23:07:39 +0000

To obtain the pointer to the next memcg position, mem_cgroup_iter()
currently holds css->refcnt during memcg traversal only to put css->refcnt
at the end of the routine.  This isn't necessary as an rcu_read_lock is
already held throughout the function.  The use of the RCU read lock with
css_next_descendant_pre() guarantees that sibling linkage is safe without
holding a ref on the passed-in @css.

Remove css->refcnt usage during traversal by leveraging RCU.

Link: https://lkml.kernel.org/r/20240827230753.2073580-3-kinseyho@xxxxxxxxxx
Signed-off-by: Kinsey Ho <kinseyho@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Michal Koutný <mkoutny@xxxxxxxx>
Cc: Muchun Song <muchun.song@xxxxxxxxx>
Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx>
Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
Cc: Zefan Li <lizefan.x@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memcontrol.c |   18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

--- a/mm/memcontrol.c~mm-dont-hold-css-refcnt-during-traversal
+++ a/mm/memcontrol.c
@@ -1013,20 +1013,7 @@ struct mem_cgroup *mem_cgroup_iter(struc
 		else if (reclaim->generation != iter->generation)
 			goto out_unlock;
 
-		while (1) {
-			pos = READ_ONCE(iter->position);
-			if (!pos || css_tryget(&pos->css))
-				break;
-			/*
-			 * css reference reached zero, so iter->position will
-			 * be cleared by ->css_released. However, we should not
-			 * rely on this happening soon, because ->css_released
-			 * is called from a work queue, and by busy-waiting we
-			 * might block it. So we clear iter->position right
-			 * away.
-			 */
-			(void)cmpxchg(&iter->position, pos, NULL);
-		}
+		pos = READ_ONCE(iter->position);
 	} else if (prev) {
 		pos = prev;
 	}
@@ -1067,9 +1054,6 @@ struct mem_cgroup *mem_cgroup_iter(struc
 		 */
 		(void)cmpxchg(&iter->position, pos, memcg);
 
-		if (pos)
-			css_put(&pos->css);
-
 		if (!memcg)
 			iter->generation++;
 	}
_

Patches currently in -mm which might be from kinseyho@xxxxxxxxxx are

cgroup-clarify-css-sibling-linkage-is-protected-by-cgroup_mutex-or-rcu.patch
mm-dont-hold-css-refcnt-during-traversal.patch
mm-increment-gen-before-restarting-traversal.patch
mm-restart-if-multiple-traversals-raced.patch
mm-clean-up-mem_cgroup_iter.patch
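
For readers who want to see the shape of the traversal the changelog describes, the following is a rough userspace sketch, not kernel code. It models a pre-order "next descendant" walk plus a shared position that is loaded and then updated with a compare-and-exchange, with no reference taken on the loaded node. The names struct node, struct iter, next_descendant_pre(), iter_next() and collect_walk() are all hypothetical stand-ins invented for this illustration. RCU itself is not modelled: the sketch is single-threaded, so it shows only the control flow that remains once the css_tryget()/css_put() pair is gone, and says nothing about the lifetime guarantees that rcu_read_lock() provides in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a css/memcg node; not the kernel structure. */
struct node {
	const char *name;
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/* Hypothetical stand-in for the shared reclaim iterator. */
struct iter {
	_Atomic(struct node *) position;	/* last node handed out */
	unsigned int generation;		/* completed full walks */
};

/*
 * Pre-order successor of @pos in the subtree rooted at @root.
 * A NULL @pos starts the walk at @root; a NULL return ends it.
 * The kernel reads such links under rcu_read_lock(); this
 * single-threaded model has no concurrency, so nothing is locked.
 */
static struct node *next_descendant_pre(struct node *pos, struct node *root)
{
	assert(root);
	if (!pos)
		return root;
	if (pos->first_child)
		return pos->first_child;
	while (pos != root) {
		if (pos->next_sibling)
			return pos->next_sibling;
		pos = pos->parent;
	}
	return NULL;
}

/* Advance the shared position without taking a reference on it. */
static struct node *iter_next(struct iter *it, struct node *root)
{
	struct node *pos = atomic_load(&it->position);
	struct node *next = next_descendant_pre(pos, root);

	/* Publish the new position only if nobody moved it meanwhile. */
	atomic_compare_exchange_strong(&it->position, &pos, next);
	if (!next)
		it->generation++;
	return next;
}

/* Write one full walk into @buf as space-separated node names. */
static void collect_walk(struct iter *it, struct node *root,
			 char *buf, size_t len)
{
	struct node *n;

	buf[0] = '\0';
	while ((n = iter_next(it, root)) != NULL) {
		if (buf[0])
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, n->name, len - strlen(buf) - 1);
	}
}
```

To try it, build a small tree by hand, initialise the iterator with atomic_init(&it.position, NULL), and call collect_walk() with a buffer. Each call performs one complete walk and bumps the generation counter when the walk runs off the end of the tree.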