On Sun, Jun 2, 2024 at 12:03 PM Erhard Furtner <erhard_f@xxxxxxxxxxx> wrote:
>
> On Sat, 1 Jun 2024 00:01:48 -0600
> Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>
> > Hi Erhard,
> >
> > The OOM kills on both kernel versions seem to be reasonable to me.
> >
> > Your system has 2GB memory and it uses zswap with zsmalloc (which is
> > good since it can allocate from the highmem zone) and zstd/lzo (which
> > doesn't matter much). Somehow -- I couldn't figure out why -- it
> > splits the 2GB into a 0.25GB DMA zone and a 1.75GB highmem zone:
> >
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000000000-0x000000002fffffff]
> > [    0.000000]   Normal   empty
> > [    0.000000]   HighMem  [mem 0x0000000030000000-0x000000007fffffff]
> >
> > The kernel can't allocate from the highmem zone -- only userspace and
> > zsmalloc can. OOM kills were due to the low memory conditions in the
> > DMA zone where the kernel itself failed to allocate from.
> >
> > Do you know a kernel version that doesn't have OOM kills while running
> > the same workload? If so, could you send that .config to me? If not,
> > could you try disabling CONFIG_HIGHMEM? (It might not help but I'm out
> > of ideas at the moment.)
> >
> > Thanks!
>
> Hi Yu!
>
> Thanks for looking into this.
>
> The reason for this 0.25GB DMA / 1.75GB highmem split is beyond my knowledge. I can only say that it has been like this at least since kernel v4.14.x (dmesg of an old bug report of mine at https://bugzilla.kernel.org/show_bug.cgi?id=201723), and I guess earlier kernel versions had it too.
>
> Without CONFIG_HIGHMEM the memory layout looks like this:
>
> Total memory = 768MB; using 2048kB for hash table
> [...]
> Top of RAM: 0x30000000, Total RAM: 0x30000000
> Memory hole size: 0MB
> Zone ranges:
>   DMA      [mem 0x0000000000000000-0x000000002fffffff]
>   Normal   empty
> Movable zone start for each node
> Early memory node ranges
>   node   0: [mem 0x0000000000000000-0x000000002fffffff]
> Initmem setup node 0 [mem 0x0000000000000000-0x000000002fffffff]
> percpu: Embedded 29 pages/cpu s28448 r8192 d82144 u118784
> pcpu-alloc: s28448 r8192 d82144 u118784 alloc=29*4096
> pcpu-alloc: [0] 0 [0] 1
> Kernel command line: ro root=/dev/sda5 slub_debug=FZP page_poison=1 netconsole=6666@192.168.2.8/eth0,6666@192.168.2.3/A8:A1:59:16:4F:EA debug
> Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
> Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
> Built 1 zonelists, mobility grouping on.  Total pages: 194880
> mem auto-init: stack:all(pattern), heap alloc:off, heap free:off
> Kernel virtual memory layout:
>   * 0xffbdf000..0xfffff000  : fixmap
>   * 0xff8f4000..0xffbdf000  : early ioremap
>   * 0xf1000000..0xff8f4000  : vmalloc & ioremap
>   * 0xb0000000..0xc0000000  : modules
> Memory: 761868K/786432K available (7760K kernel code, 524K rwdata, 4528K rodata, 1100K init, 253K bss, 24564K reserved, 0K cma-reserved)
> [...]
>
> With only 768 MB RAM and a 2048K hash table I get pretty much the same "kswapd0: page allocation failure: order:0, mode:0xcc0(GFP_KERNEL),nodemask=(null),cpuset=/,mems_allowed=0" as with the HIGHMEM-enabled kernel when
> running "stress-ng --vm 2 --vm-bytes 1930M --verify -v".
>
> I tried the workload on v6.6.32 LTS where the issue shows up too. But v6.1.92 LTS seems ok! Triple-checked v6.1.92 to be sure.
>
> Attached please find the kernel v6.9.3 dmesg (without HIGHMEM) and the kernel v6.1.92 .config. Thanks.
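To expand on the HighMem point from my last mail: kernel allocations such as GFP_KERNEL do not carry __GFP_HIGHMEM, so the page allocator can only satisfy them from the low zones, which on your machine means the 0.25GB DMA zone. Userspace pages and zsmalloc allocate with __GFP_HIGHMEM and can therefore also use the 1.75GB HighMem zone. A minimal sketch to illustrate the flag difference (illustrative only, not actual call sites from any of the code paths involved):

/*
 * Illustrative only: shows how GFP flags decide which zones the page
 * allocator may use for a given request.
 */
#include <linux/gfp.h>

static void zone_fallback_sketch(void)
{
	/*
	 * GFP_KERNEL has no __GFP_HIGHMEM, so this request can only be
	 * satisfied from the low zones (here: the 0.25GB DMA zone).
	 */
	struct page *kernel_page = alloc_pages(GFP_KERNEL, 0);

	/*
	 * GFP_HIGHUSER_MOVABLE includes __GFP_HIGHMEM, so userspace pages
	 * (and zsmalloc, which also passes __GFP_HIGHMEM) can additionally
	 * be placed in the 1.75GB HighMem zone.
	 */
	struct page *user_page = alloc_pages(GFP_HIGHUSER_MOVABLE, 0);

	if (kernel_page)
		__free_pages(kernel_page, 0);
	if (user_page)
		__free_pages(user_page, 0);
}

That's why the OOM kills can happen even while most of the 2GB is still free: the kernel's own allocations are confined to the small low zone.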
I compared the .configs from v6.8.9 (the one you attached previously) and v6.1.92, and I didn't see any major differences: both have ZONE_DMA, HIGHMEM, MGLRU, and zswap/zsmalloc. So either something broke between v6.1.92 and v6.6.32 (as you mentioned above), or it's simply kernel allocation bloat putting the 0.25GB DMA zone under too much pressure; the latter isn't uncommon when upgrading to a newer kernel version. Could you please attach the dmesg from v6.1.92? I'd like to compare the dmesgs between the two kernel versions as well -- that might provide some hints.
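Also, if you want to see how hard the DMA zone is being squeezed while stress-ng runs, a rough userspace sketch like the one below might help correlate the allocation failures with the low zone running dry. It is illustrative only; it just prints the DMA zone's free-page count and watermarks from /proc/zoneinfo once a second (stop it with Ctrl-C):

/*
 * Rough sketch: watch the DMA zone's free pages and watermarks from
 * /proc/zoneinfo once a second while reproducing with stress-ng.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char line[256];

	for (;;) {
		FILE *f = fopen("/proc/zoneinfo", "r");
		int in_dma = 0;

		if (!f) {
			perror("/proc/zoneinfo");
			return 1;
		}

		while (fgets(line, sizeof(line), f)) {
			/* Zone headers look like "Node 0, zone      DMA". */
			if (strstr(line, ", zone"))
				in_dma = strstr(line, " DMA") && !strstr(line, "DMA32");

			/* Keep only the free-page count and the watermarks. */
			if (in_dma && (strstr(line, "pages free") ||
				       strstr(line, " min ") ||
				       strstr(line, " low ") ||
				       strstr(line, " high ")))
				fputs(line, stdout);
		}
		fclose(f);
		puts("---");
		sleep(1);
	}
}

If the free count sits at or below the min watermark right before the kswapd0 allocation failures, that would confirm the DMA zone is where the pressure builds up.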