On Thu, May 16, 2024 at 11:35:57AM +0800, Yafang Shao wrote:
> On Thu, May 9, 2024 at 2:33 PM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
> >
> [...]
>
> Hi Shakeel,
>
> Hopefully I'm not too late. We are currently using memcg v1.
>
> One specific feature we rely on in v1 is skmem accounting. In v1, we
> account for TCP memory usage without charging it to memcg v1, which is
> useful for monitoring the TCP memory usage generated by tasks running
> in a container. However, in memcg v2, monitoring TCP memory requires
> charging it to the container, which can easily cause OOM issues. It
> would be better if we could monitor skmem usage without charging it in
> memcg v2, allowing us to account for it without the risk of triggering
> OOM conditions.

Hi Yafang,

No worries. From what I understand, you are not really using the skmem
charging of v1 but just the network memory usage stats, and you are
worried that charging network memory to cgroup memory may cause OOMs.
Is that correct? Have you tried charging network memory to cgroup
memory before and seen OOMs? If yes, then I would really like to see
the OOM reports.

I have two examples where v2's skmem charging is working fine in
production, namely Google and Meta. Google is still on v1, but for
skmem charging they have moved to v2 semantics.

Actually, I have another report from Cloudflare [0] where the TCP
throttling mechanism for v2's TCP memory accounting is too conservative
for their production traffic. Anyways, this just means that we need a
more flexible way to provide and enforce semantics for TCP memory
pressure with a decent default behavior. I will follow up on this
separately.

[0] https://lore.kernel.org/lkml/CABWYdi0G7cyNFbndM-ELTDAR3x4Ngm0AehEp5aP0tfNkXUE+Uw@xxxxxxxxxxxxxx/

thanks,
Shakeel
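
P.S. For anyone following along, below is a minimal, untested sketch of
reading the per-cgroup network memory stat being discussed. On v2 this
is the "sock" line in memory.stat (on v1 the rough counterpart is
memory.kmem.tcp.usage_in_bytes). The cgroup path is an assumption;
adjust it for your hierarchy:

/* Sketch: print the "sock" counter from a v2 cgroup's memory.stat. */
#include <stdio.h>

int main(void)
{
	/* Assumed cgroup path; replace with your container's cgroup. */
	const char *path = "/sys/fs/cgroup/mycontainer/memory.stat";
	char line[256];
	FILE *f = fopen(path, "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		unsigned long long bytes;

		/* memory.stat is "key value" pairs, one per line. */
		if (sscanf(line, "sock %llu", &bytes) == 1)
			printf("sock memory: %llu bytes\n", bytes);
	}
	fclose(f);
	return 0;
}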