On Wed, 12 Oct 2022 18:40:50 -0700 Jakub Kicinski wrote:
> Did the fact that we used to force charge not potentially cause
> reclaim, tho?  Letting TCP accept the next packet even if it had
> to drop the current one?

I pushed this little nugget to one affected machine via KLP:

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 03ffbb255e60..c1ca369a1b77 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7121,6 +7121,10 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
 		return true;
 	}
 
+	if (gfp_mask == GFP_NOWAIT) {
+		try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
+		refill_stock(memcg, nr_pages);
+	}
 	return false;
 }

The problem normally reproduces reliably within 10min -- it has been
30min and counting, and the application-level latency has not spiked.