Re: kswapd0: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0 (Kernel v6.5.9, 32bit ppc)

"Vlastimil Babka (SUSE)" <vbabka@xxxxxxxxxx> · Thu, 6 Jun 2024 18:53:36 +0200

On 6/6/24 3:32 PM, Erhard Furtner wrote:
> On Thu, 6 Jun 2024 09:24:56 +0200
> "Vlastimil Babka (SUSE)" <vbabka@xxxxxxxxxx> wrote:
> 
>> Besides the zpool commit which might have just pushed the machine over the
>> edge, but it was probably close to it already. I've noticed a more general
>> problem that there are GFP_KERNEL allocations failing from kswapd. Those
>> could probably use be __GFP_NOMEMALLOC (or scoped variant, is there one?)
>> since it's the case of "allocating memory to free memory". Or use mempools
>> if the progress (success will lead to freeing memory) is really guaranteed.
>> 
>> Another interesting data point could be to see if traditional reclaim
>> behaves any better on this machine than MGLRU. I saw in the config:
>> 
>> CONFIG_LRU_GEN=y
>> CONFIG_LRU_GEN_ENABLED=y
>> 
>> So disabling at least the second one would revert to the traditional reclaim
>> and we could see if it handles such a constrained system better or not.
> 
> I set RANDOM_KMALLOC_CACHES=n and LRU_GEN_ENABLED=n but still hit the issue.
> 
> dmesg looks a bit different (unpatched v6.10-rc2).

What caught my eye, but it's also in some of the previous dmesg with MGRLU,
is that in one case there's:

DMA free:0kB

That means many allocations went through that are allowed to just ignore all
reserves, and depleted everything. That would mean __GFP_MEMALLOC or
PF_MEMALLOC, which I suggested earlier for the GFP_KERNEL failure, is being
used somewhere, but not leading to the expected memory freeing.

> Regards,
> Erhard