On Thu, Nov 05, 2020 at 02:02:24PM +0000, Matthew Wilcox wrote:
> On Thu, Nov 05, 2020 at 02:21:25PM +0100, Eric Dumazet wrote:
> > On 11/5/20 5:21 AM, Matthew Wilcox (Oracle) wrote:
> > > When the machine is under extreme memory pressure, the page_frag allocator
> > > signals this to the networking stack by marking allocations with the
> > > 'pfmemalloc' flag, which causes non-essential packets to be dropped.
> > > Unfortunately, even after the machine recovers from the low memory
> > > condition, the page continues to be used by the page_frag allocator,
> > > so all allocations from this page will continue to be dropped.
> > >
> > > Fix this by freeing and re-allocating the page instead of recycling it.
> > >
> > > Reported-by: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
> > > Cc: Aruna Ramakrishna <aruna.ramakrishna@xxxxxxxxxx>
> > > Cc: Bert Barbe <bert.barbe@xxxxxxxxxx>
> > > Cc: Rama Nichanamatlu <rama.nichanamatlu@xxxxxxxxxx>
> > > Cc: Venkat Venkatsubra <venkat.x.venkatsubra@xxxxxxxxxx>
> > > Cc: Manjunath Patil <manjunath.b.patil@xxxxxxxxxx>
> > > Cc: Joe Jin <joe.jin@xxxxxxxxxx>
> > > Cc: SRINIVAS <srinivas.eeda@xxxxxxxxxx>
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve")
> >
> > Your patch looks fine, although this Fixes: tag seems incorrect.
> >
> > 79930f5892e ("net: do not deplete pfmemalloc reserve") was propagating
> > the page pfmemalloc status into the skb, and seems correct to me.
> >
> > The bug was the page_frag_alloc() was keeping a problematic page for
> > an arbitrary period of time ?
>
> Isn't this the commit which unmasks the problem, though? I don't think
> it's the buggy commit, but if your tree doesn't have 79930f5892e, then
> you don't need this patch.
>
> Or are you saying the problem dates back all the way to
> c93bdd0e03e8 ("netvm: allow skb allocation to use PFMEMALLOC reserves")
>
> > > +		if (nc->pfmemalloc) {
> >
> > 		if (unlikely(nc->pfmemalloc)) {
>
> ACK.

Will make the change once we've settled on an appropriate Fixes tag. Which commit should I claim this fixes?