On Wed, Oct 12, 2022 at 09:36:34PM -0700, Shakeel Butt wrote:
> On Wed, Aug 17, 2022 at 1:12 PM Gražvydas Ignotas <notasas@xxxxxxxxx> wrote:
> >
> > On Wed, Aug 17, 2022 at 9:16 PM Wei Wang <weiwan@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Aug 17, 2022 at 10:37 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> > > >
> > > > + Eric and netdev
> > > >
> > > > On Wed, Aug 17, 2022 at 10:13 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > > > >
> > > > > This is most likely a regression caused by this patch:
> > > > >
> > > > > commit 4b1327be9fe57443295ae86fe0fcf24a18469e9f
> > > > > Author: Wei Wang <weiwan@xxxxxxxxxx>
> > > > > Date: Tue Aug 17 12:40:03 2021 -0700
> > > > >
> > > > > net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()
> > > > >
> > > > > Add gfp_t mask as an input parameter to mem_cgroup_charge_skmem(),
> > > > > to give more control to the networking stack and enable it to change
> > > > > memcg charging behavior. In the future, the networking stack may decide
> > > > > to avoid oom-kills when fallbacks are more appropriate.
> > > > >
> > > > > One behavior change in mem_cgroup_charge_skmem() by this patch is to
> > > > > avoid force charging by default and let the caller decide when and if
> > > > > force charging is needed through the presence or absence of
> > > > > __GFP_NOFAIL.
> > > > >
> > > > > Signed-off-by: Wei Wang <weiwan@xxxxxxxxxx>
> > > > > Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> > > > > Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
> > > > >
> > > > > We never used to fail these allocations. Cgroups don't have a
> > > > > kswapd-style watermark reclaimer, so the network relied on
> > > > > force-charging and leaving reclaim to allocations that can block.
> > > > > Now it seems network packets could just fail indefinitely.
> > > > >
> > > > > The changelog is a bit terse given how drastic the behavior change
> > > > > is. Wei, Shakeel, can you fill in why this was changed?
> > > > > Can we revert this for the time being?
> > > >
> > > > Does reverting the patch fix the issue? However I don't think it will.
> > > >
> > > > Please note that we still have the force charging as before this
> > > > patch. Previously when mem_cgroup_charge_skmem() force charges, it
> > > > returns false and __sk_mem_raise_allocated takes suppress_allocation
> > > > code path. Based on some heuristics, it may allow it or it may
> > > > uncharge and return failure.
> > >
> > > The force charging logic in __sk_mem_raise_allocated only gets
> > > considered on tx path for STREAM socket. So it probably does not take
> > > effect on UDP path. And, that logic is NOT being altered in the above
> > > patch.
> > > So specifically for UDP receive path, what happens in
> > > __sk_mem_raise_allocated() BEFORE the above patch is:
> > > - mem_cgroup_charge_skmem() gets called:
> > >     - try_charge() with GFP_NOWAIT gets called and failed
> > >     - try_charge() with __GFP_NOFAIL
> > >     - return false
> > > - goto suppress_allocation:
> > >     - mem_cgroup_uncharge_skmem() gets called
> > > - return 0 (which means failure)
> > >
> > > AFTER the above patch, what happens in __sk_mem_raise_allocated() is:
> > > - mem_cgroup_charge_skmem() gets called:
> > >     - try_charge() with GFP_NOWAIT gets called and failed
> > >     - return false
> > > - goto suppress_allocation:
> > >     - We no longer calls mem_cgroup_uncharge_skmem()
> > > - return 0
> > >
> > > So I agree with Shakeel, that this change shouldn't alter the behavior
> > > of the above call path in such a situation.
> > > But do let us know if reverting this change has any effect on your test.
> >
> > The problem is still there (the kernel wasn't compiling after revert,
> > had to adjust another seemingly unrelated callsite). It's hard to tell
> > if it's better or worse since it happens so randomly.
> >
>
> Hello everyone, we have a better understanding why the patch pointed
> out by Johannes might have exposed this issue.
> See
> https://lore.kernel.org/all/20221013041833.rhifxw4gqwk4ofi2@xxxxxxxxxx/.

Wow, that's super subtle! Nice sleuthing.

> To summarize, the old code was depending on a subtle interaction of
> force-charge and percpu charge caches which this patch removed. The
> fix I am proposing is for the network stack to be explicit of its need
> (i.e. use GFP_ATOMIC) instead of depending on a subtle behavior.

That sounds good to me.
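
For anyone following the thread, here is a rough userspace sketch of the two UDP receive call paths Wei laid out above. It is a toy model, not kernel code: struct toy_memcg and every toy_* function are made-up stand-ins for the memcg page counter, try_charge(), mem_cgroup_charge_skmem(), mem_cgroup_uncharge_skmem() and __sk_mem_raise_allocated(). It does not model the percpu charge caches at all, so it cannot show the subtle interaction Shakeel describes; it only shows why the two paths look equivalent at first glance.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a memcg page counter; not a kernel structure. */
struct toy_memcg {
	long usage;	/* pages currently charged */
	long limit;	/* hard limit, in pages */
};

/* Charge pages; with nofail set, the charge is forced past the limit. */
static bool toy_try_charge(struct toy_memcg *memcg, long pages, bool nofail)
{
	if (!nofail && memcg->usage + pages > memcg->limit)
		return false;
	memcg->usage += pages;
	return true;
}

static void toy_uncharge(struct toy_memcg *memcg, long pages)
{
	assert(memcg->usage >= pages);
	memcg->usage -= pages;
}

/*
 * BEFORE the patch: a failed non-blocking attempt is followed by a
 * forced charge, yet false is still returned to the caller.
 */
static bool toy_charge_skmem_before(struct toy_memcg *memcg, long pages)
{
	if (toy_try_charge(memcg, pages, false))
		return true;
	toy_try_charge(memcg, pages, true);
	return false;
}

/* AFTER the patch: only the non-blocking attempt, no forced charge. */
static bool toy_charge_skmem_after(struct toy_memcg *memcg, long pages)
{
	return toy_try_charge(memcg, pages, false);
}

/* Receive path BEFORE: the forced charge is undone on failure. */
static int toy_raise_allocated_before(struct toy_memcg *memcg, long pages)
{
	if (toy_charge_skmem_before(memcg, pages))
		return 1;
	/* suppress_allocation: undo the forced charge */
	toy_uncharge(memcg, pages);
	return 0;
}

/* Receive path AFTER: nothing was charged, so nothing is undone. */
static int toy_raise_allocated_after(struct toy_memcg *memcg, long pages)
{
	if (toy_charge_skmem_after(memcg, pages))
		return 1;
	/* suppress_allocation: no uncharge needed */
	return 0;
}
```

With usage already at the limit, both toy_raise_allocated_before() and toy_raise_allocated_after() return 0 and leave usage where it started, which matches Wei's reading that the visible result on this path is the same.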