+Tejun

Hi Zeng,

On Wed, Mar 19, 2025 at 02:41:43PM +0800, Jingxiang Zeng wrote:
> From: Zeng Jingxiang <linuszeng@xxxxxxxxxxx>
>
> memsw account is a very useful knob for container memory
> overcommitting: It's a great abstraction of the "expected total
> memory usage" of a container, so containers can't allocate too
> much memory using SWAP, but still be able to SWAP out.
>
> For a simple example, with memsw.limit == memory.limit, containers
> can't exceed their original memory limit, even with SWAP enabled, they
> get OOM killed as how they used to, but the host is now able to
> offload cold pages.
>
> Similar ability seems absent with V2: With memory.swap.max == 0, the
> host can't use SWAP to reclaim container memory at all. But with a
> value larger than that, containers are able to overuse memory, causing
> delayed OOM kill, thrashing, CPU/Memory usage ratio could be heavily
> out of balance, especially with compress SWAP backends.
>
> This patch set adds two interfaces to control the behavior of the
> memory.swap.max/current in cgroupv2:
>
> CONFIG_MEMSW_ACCOUNT_ON_DFL
> cgroup.memsw_account_on_dfl={0, 1}
>
> When one of the interfaces is enabled: memory.swap.current and
> memory.swap.max represents the usage/limit of swap.
> When neither is enabled (default behavior),memory.swap.current and
> memory.swap.max represents the usage/limit of memory+swap.
>
> As discussed in [1], this patch set can change the semantics of
> of memory.swap.max/current to the v1-like behavior.
>
> Link:
> https://lore.kernel.org/all/Zk-fQtFrj-2YDJOo@xxxxxxxxxxxxxxxxxxxxxxxxx/ [1]

Overall I don't have objection but I would like to keep the changes
separate from v2 code as much as possible. More specifically:

1. Keep CONFIG_MEMSW_ACCOUNT_ON_DFL dependent on CONFIG_MEMCG_V1 and
   disabled by default (as you already did).

2. Keep the changes in memcontrol-v1.[h|c] as much as possible.

I will go over the patches but let's see what others have to say.