On Fri, Oct 21, 2022 at 09:03:04AM -0700, Jakub Kicinski wrote:
> As Shakeel explains, the commit under Fixes had the unintended
> side-effect of no longer pre-loading the cached memory allowance.
> Even though we previously dropped the first packet received when
> over the memory limit, the consecutive ones would get through by
> using the cache. The charging was happening in batches of 128kB,
> so we'd let in 128kB (truesize) worth of packets per one drop.
>
> After the change we no longer force charge, so there are no
> cache-filling side effects. This causes significant drops and
> connection stalls for workloads which use a lot of page cache,
> since we can't reclaim page cache under GFP_NOWAIT.
>
> Some of the latency can be recovered by improving SACK reneg
> handling, but nowhere near enough to get back to the pre-5.15
> performance (the application I'm experimenting with still
> sees 5-10x worse latency).
>
> Apply the suggested workaround of using GFP_ATOMIC. We will now
> be more permissive than before, as we'll drop _no_ packets
> in softirq when under pressure. But I can't think of any good
> and simple way to address that within networking.
>
> Link: https://lore.kernel.org/all/20221012163300.795e7b86@xxxxxxxxxx/
> Suggested-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> Fixes: 4b1327be9fe5 ("net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()")
> Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>

Acked-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>

Thanks!
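For intuition, here is a toy model of the pre-5.15 behavior the quoted commit message describes — this is not kernel code, and the function name, packet sizes, and structure are illustrative only. The point it sketches: when charging happens in 128kB batches, one dropped packet forces a batch charge that pre-loads a cached allowance, and subsequent packets are admitted from that cache until it runs out.

```python
BATCH = 128 * 1024  # pre-5.15 memcg charge batch size, in truesize bytes

def simulate(packets, truesize, over_limit=True):
    """Toy model (illustrative, not the kernel's logic): when over the
    memcg limit, a packet that needs a fresh charge is dropped, but the
    forced batch charge refills a cached allowance that admits the
    packets that follow it."""
    cache = 0
    dropped = admitted = 0
    for _ in range(packets):
        if cache >= truesize:
            # Admitted from the cached allowance, no new charge needed.
            cache -= truesize
            admitted += 1
        elif over_limit:
            # Over the limit: this packet is dropped, but the forced
            # 128kB charge pre-loads the cache for later packets.
            dropped += 1
            cache += BATCH
        else:
            admitted += 1
    return dropped, admitted
```

With 4kB-truesize packets, one drop admits the next 32 packets (128kB worth) — matching the "128kB (truesize) worth of packets per one drop" ratio in the commit message. After the change under Fixes, there is no forced batch charge, so under pressure every packet needing a charge fails with GFP_NOWAIT instead of one per 128kB.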