This patch series implements the proposal from the LSF/MM/BPF 2023
conference for reducing offline/zombie memcgs by memory recharging [1].
The main difference is that this series focuses on recharging and does
not include eviction of any memory charged to offline memcgs.

Two methods of recharging are proposed:

(a) Recharging of mapped folios.

When a memcg is offlined, queue an asynchronous worker that will walk
the lruvec of the offline memcg and try to recharge any mapped folios to
the memcg of one of the processes mapping the folio. The main assumption
is that a process mapping the folio is the "rightful" owner of the
memory.

Currently, this is only supported for evictable folios, as the
unevictable lru is imaginary and we cannot iterate the folios on it. A
separate proposal [2] was made to revive the unevictable lru, which
would allow recharging of unevictable folios.

(b) Deferred recharging of folios.

For folios that are unmapped, or mapped but that we fail to recharge
with (a), we rely on deferred recharging. Simply put, any time a folio
is accessed or dirtied by a userspace process, and that folio is charged
to an offline memcg, we will try to recharge it to the memcg of the
process accessing the folio. Again, we assume this process should be the
"rightful" owner of the memory. This is also done asynchronously to
avoid slowing down the data access path.

In both cases, we never OOM kill the recharging target if it goes above
its limit. This is to avoid OOM killing a process an arbitrary amount of
time after it started using memory. This is a conservative policy that
can be revisited later.

The patches in this series are divided as follows:
- Patches 1 & 2 are preliminary refactoring and helper introduction.
- Patches 3 to 5 implement (a) and (b) above.
- Patches 6 & 7 add stats, a sysctl, and a config option.
- Patch 8 is a selftest.
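
As a rough userspace illustration of the policy shared by (a) and (b),
and not the code in this series, the recharge decision can be sketched
as below. The toy_memcg and toy_folio types and toy_recharge() are
hypothetical names invented for the sketch. Modelling "never OOM kill
the target" as simply refusing a recharge that would exceed the limit is
also an assumption; the actual patches may handle this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memcg; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	bool online;
	long usage;	/* pages currently charged */
	long limit;	/* hard limit, in pages */
};

/* Toy stand-in for a folio; NOT the kernel's struct folio. */
struct toy_folio {
	struct toy_memcg *memcg;	/* memcg the folio is charged to */
	long nr_pages;
};

/*
 * Hypothetical helper: try to move the charge of @folio from an
 * offline memcg to @target, the memcg of a process mapping or
 * accessing the folio. Returns true if the charge was moved.
 */
bool toy_recharge(struct toy_folio *folio, struct toy_memcg *target)
{
	struct toy_memcg *old = folio->memcg;

	assert(target != NULL);

	/* Only folios charged to an offline memcg are candidates. */
	if (old == NULL || old->online)
		return false;

	/* The new owner must itself be online. */
	if (!target->online)
		return false;

	/*
	 * Assumption for this sketch: instead of pushing the target over
	 * its limit (and risking an OOM kill), leave the folio where it is.
	 */
	if (target->usage + folio->nr_pages > target->limit)
		return false;

	old->usage -= folio->nr_pages;
	target->usage += folio->nr_pages;
	folio->memcg = target;
	return true;
}
```

For example, with a target at usage 8 and limit 10, a 2-page folio
charged to an offline memcg is recharged, while a second 2-page folio is
left charged to the offline memcg.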

[1]https://lore.kernel.org/linux-mm/CABdmKX2M6koq4Q0Cmp_-=wbP0Qa190HdEGGaHfxNS05gAkUtPA@xxxxxxxxxxxxxx/
[2]https://lore.kernel.org/lkml/20230618065719.1363271-1-yosryahmed@xxxxxxxxxx/

Yosry Ahmed (8):
  memcg: refactor updating memcg->moving_account
  mm: vmscan: add lruvec_for_each_list() helper
  memcg: recharge mapped folios when a memcg is offlined
  memcg: support deferred memcg recharging
  memcg: recharge folios when accessed or dirtied
  memcg: add stats for offline memcgs recharging
  memcg: add sysctl and config option to control memory recharging
  selftests: cgroup: test_memcontrol: add a selftest for memcg recharging

 include/linux/memcontrol.h                   |  14 +
 include/linux/swap.h                         |   8 +
 include/linux/vm_event_item.h                |   5 +
 kernel/sysctl.c                              |  11 +
 mm/Kconfig                                   |  12 +
 mm/memcontrol.c                              | 376 +++++++++++++++++-
 mm/page-writeback.c                          |   2 +
 mm/swap.c                                    |   2 +
 mm/vmscan.c                                  |  28 ++
 mm/vmstat.c                                  |   6 +-
 tools/testing/selftests/cgroup/cgroup_util.c |  14 +
 tools/testing/selftests/cgroup/cgroup_util.h |   1 +
 .../selftests/cgroup/test_memcontrol.c       | 310 +++++++++++++++
 13 files changed, 784 insertions(+), 5 deletions(-)

-- 
2.41.0.255.g8b1d071c50-goog