On Wed, Oct 12, 2022 at 05:38:25PM -0700, Jakub Kicinski wrote:
> On Wed, 12 Oct 2022 17:17:38 -0700 Shakeel Butt wrote:
> > Did the revert of this patch fix the issue you are seeing? The reason
> > I am asking is because this patch should not change the behavior.
> > Actually someone else reported the similar issue for UDP RX at [1] and
> > they tested the revert as well. The revert did not fix the issue for
> > them.
> >
> > Wei has a better explanation at [2] why this patch is not the cause
> > for these issues.
>
> We're talking TCP here, to be clear. I haven't tested a revert, yet (not
> that easy to test with a real workload) but I'm relatively confident the
> change did introduce an "unforced" call, specifically this bit:
>
> @@ -2728,10 +2728,12 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
>  {
>         struct proto *prot = sk->sk_prot;
>         long allocated = sk_memory_allocated_add(sk, amt);
> +       bool memcg_charge = mem_cgroup_sockets_enabled && sk->sk_memcg;
>         bool charged = true;
>
> -       if (mem_cgroup_sockets_enabled && sk->sk_memcg &&
> -           !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt)))
> +       if (memcg_charge &&
> +           !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt,
> +                                               gfp_memcg_charge())))
>
> where gfp_memcg_charge() is GFP_NOWAIT in softirq.
>
> The above gets called from (inverted stack):
>   tcp_data_queue()
>   tcp_try_rmem_schedule(sk, skb, skb->truesize)
>   tcp_try_rmem_schedule()
>   sk_rmem_schedule()
>   __sk_mem_schedule()
>   __sk_mem_raise_allocated()
>
> Is my confidence unjustified? :)
>

Let me add Wei's explanation inline, which is protocol independent:

__sk_mem_raise_allocated() BEFORE the above patch is:
- mem_cgroup_charge_skmem() gets called:
    - try_charge() with GFP_NOWAIT gets called and fails
    - try_charge() with __GFP_NOFAIL
    - return false
- goto suppress_allocation:
    - mem_cgroup_uncharge_skmem() gets called
- return 0 (which means failure)

AFTER the above patch, what happens in __sk_mem_raise_allocated() is:
- mem_cgroup_charge_skmem() gets called:
    - try_charge() with GFP_NOWAIT gets called and fails
    - return false
- goto suppress_allocation:
    - We no longer call mem_cgroup_uncharge_skmem()
- return 0

So, before the patch, the memcg code may force the charge, but it still
returns false, which makes the networking code uncharge the memcg again
for SK_MEM_RECV.
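
To make the two sequences easier to compare, here is a toy userspace model
of just the failing SK_MEM_RECV path described above. It is a sketch, not
kernel code: every toy_* name is made up for illustration, the "memcg" is
only a counter with a limit, and the other checks that the real
__sk_mem_raise_allocated() performs are left out. The peak field is an
addition of the model so that a transient forced charge becomes visible.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a memcg: a page counter with a hard limit. */
struct toy_memcg {
	long usage;	/* pages currently charged */
	long limit;	/* pages allowed */
	long peak;	/* highest usage seen, to expose transient overshoot */
};

static void toy_add(struct toy_memcg *cg, long pages)
{
	cg->usage += pages;
	if (cg->usage > cg->peak)
		cg->peak = cg->usage;
}

/* Models try_charge() with GFP_NOWAIT: fail instead of going over limit. */
static bool toy_try_charge_nowait(struct toy_memcg *cg, long pages)
{
	if (cg->usage + pages > cg->limit)
		return false;
	toy_add(cg, pages);
	return true;
}

/* BEFORE: on failure the charge is forced, but false is still returned. */
static bool toy_charge_skmem_before(struct toy_memcg *cg, long pages)
{
	if (toy_try_charge_nowait(cg, pages))
		return true;
	toy_add(cg, pages);	/* models try_charge() with __GFP_NOFAIL */
	return false;
}

/* AFTER: only the GFP_NOWAIT attempt is made; nothing is forced here. */
static bool toy_charge_skmem_after(struct toy_memcg *cg, long pages)
{
	return toy_try_charge_nowait(cg, pages);
}

/* BEFORE: the caller undoes the forced charge in suppress_allocation. */
static int toy_raise_allocated_before(struct toy_memcg *cg, long pages)
{
	if (toy_charge_skmem_before(cg, pages))
		return 1;
	cg->usage -= pages;	/* models mem_cgroup_uncharge_skmem() */
	return 0;		/* failure */
}

/* AFTER: nothing was charged on failure, so there is nothing to undo. */
static int toy_raise_allocated_after(struct toy_memcg *cg, long pages)
{
	if (toy_charge_skmem_after(cg, pages))
		return 1;
	return 0;		/* failure */
}
```

With a toy memcg already at its limit, both variants return 0 and leave
usage unchanged. In this model the only difference is that the BEFORE
variant briefly pushes usage past the limit before the uncharge.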