From: Zeng Jingxiang <linuszeng@xxxxxxxxxxx>

The memsw account is a very useful knob for container memory
overcommitting: it is a good abstraction of the "expected total memory
usage" of a container, so containers can't allocate too much memory by
using swap, but can still be swapped out.

As a simple example, with memsw.limit == memory.limit, containers can't
exceed their original memory limit even with swap enabled. They get OOM
killed as they used to, but the host is now able to offload cold pages.

A similar ability seems to be absent in cgroup v2: with
memory.swap.max == 0, the host can't use swap to reclaim container
memory at all. But with a value larger than that, containers are able
to overuse memory, causing delayed OOM kills and thrashing, and the
CPU/memory usage ratio can get heavily out of balance, especially with
compressed swap backends.

This patch set adds two interfaces to control the behavior of
memory.swap.max/current in cgroup v2:

  CONFIG_MEMSW_ACCOUNT_ON_DFL
  cgroup.memsw_account_on_dfl={0, 1}

When one of the interfaces is enabled, memory.swap.current and
memory.swap.max represent the usage/limit of memory+swap.
When neither is enabled (default behavior), memory.swap.current and
memory.swap.max represent the usage/limit of swap.

As discussed in [1], this patch set can change the semantics of
memory.swap.max/current to the v1-like behavior.
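
The three configurations described above can be compared with a small
model. This is a simplified sketch in Python, not kernel code: the
function name and the accounting rules are assumptions made for
illustration, and it ignores host swap availability, kernel memory, and
reclaim behavior. It only computes upper bounds implied by the limits.

```python
GiB = 1 << 30


def cgroup_bounds(memory_max, swap_max, memsw_mode):
    """Return (footprint_bound, swappable_bound) in bytes.

    footprint_bound: upper bound on resident + swapped-out memory.
    swappable_bound: upper bound on memory the host may swap out.

    Hypothetical model: with memsw_mode=True, swap_max is read as a
    combined memory+swap limit (v1 memsw-like); otherwise it is read
    as a swap-only limit (current cgroup v2 semantics).
    """
    if memsw_mode:
        # Resident + swapped is capped by the combined limit, and in
        # principle all of it may sit in swap.
        return swap_max, swap_max
    # Swap-only limit: swap usage comes on top of the memory limit.
    return memory_max + swap_max, swap_max


# v2 default, memory.swap.max == 0: no overuse, but no swap-out either.
print(cgroup_bounds(1 * GiB, 0, memsw_mode=False))

# v2 default, memory.swap.max == 1G: swap-out works, footprint doubles.
print(cgroup_bounds(1 * GiB, 1 * GiB, memsw_mode=False))

# memsw-like mode, combined limit == memory limit: footprint stays at
# 1G while cold pages can still be offloaded to swap.
print(cgroup_bounds(1 * GiB, 1 * GiB, memsw_mode=True))
```

Only the last case keeps the container at its original footprint while
still letting the host swap, which is the behavior this series is
after.
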

Link: https://lore.kernel.org/all/Zk-fQtFrj-2YDJOo@xxxxxxxxxxxxxxxxxxxxxxxxx/ [1]

linuszeng (5):
  Kconfig: add SWAP_CHARGE_V1_MODE config
  memcontrol: add boot option to enable memsw account on dfl
  mm/memcontrol: do not scan anon pages if memsw limit is hit
  mm/memcontrol: allow memsw account in cgroup v2
  Docs/cgroup-v2: add cgroup.memsw_account_on_dfl Documentation

 Documentation/admin-guide/cgroup-v2.rst       | 21 +++++--
 .../admin-guide/kernel-parameters.txt         |  7 +++
 include/linux/memcontrol.h                    |  8 +++
 init/Kconfig                                  | 16 ++++++
 mm/memcontrol-v1.c                            |  2 +-
 mm/memcontrol-v1.h                            |  4 +-
 mm/memcontrol.c                               | 55 ++++++++++++++-----
 7 files changed, 93 insertions(+), 20 deletions(-)

-- 
2.41.1