Huge percpu memory usage on multi NUMA node system

Hello,

I use a VMware VM with the following configuration:
- 1 vCPU per NUMA node
- 4 online vCPUs and 128 possible vCPUs, which translates to
  4 online nodes and 128 possible nodes
- 192 MB of VM memory

Linux 5.15 with CONFIG_NODES_SHIFT=6 complains that the node numbers exceed
the supported maximum (1 << 6):
Nov 27 01:59:37 photon-576f8974caf.org kernel: SRAT: PXM 62 -> APIC 0x7c -> Node 62
Nov 27 01:59:37 photon-576f8974caf.org kernel: SRAT: PXM 63 -> APIC 0x7e -> Node 63
Nov 27 01:59:37 photon-576f8974caf.org kernel: SRAT: Too many proximity domains 40
Nov 27 01:59:37 photon-576f8974caf.org kernel: ACPI: SRAT: SRAT not used.
Nov 27 01:59:37 photon-576f8974caf.org kernel: No NUMA configuration found
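
For reference, the kernel derives the node limit as MAX_NUMNODES = 1 << CONFIG_NODES_SHIFT. The sketch below only illustrates that arithmetic; the helper names are made up for this example and are not kernel APIs.

```python
def max_numnodes(nodes_shift: int) -> int:
    """Node limit implied by a given CONFIG_NODES_SHIFT (1 << shift)."""
    return 1 << nodes_shift


def min_nodes_shift(possible_nodes: int) -> int:
    """Smallest shift such that (1 << shift) >= possible_nodes.

    Hypothetical helper, for illustration only.
    """
    if possible_nodes < 1:
        raise ValueError("possible_nodes must be >= 1")
    return (possible_nodes - 1).bit_length()


if __name__ == "__main__":
    print(max_numnodes(6))       # limit with CONFIG_NODES_SHIFT=6
    print(min_nodes_shift(128))  # smallest shift that covers 128 nodes
    print(max_numnodes(10))      # limit with CONFIG_NODES_SHIFT=10
```

By this arithmetic a shift of 6 covers 64 nodes, 128 possible nodes need a shift of at least 7, and a shift of 10 allows up to 1024 nodes.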

But it boots OK, and the Percpu memory amount is 53760 kB.

If I compile with CONFIG_NODES_SHIFT=10 to support 128 nodes, the boot warning
disappears and the CPU info reports the proper NUMA nodes for the existing CPUs.
But the boot process fails with an OOM in pid 1.

Increasing the VM RAM from 192 MB to 1024 MB fixed the OOM.
/proc/meminfo then reported an increase in Percpu to 718048 kB!
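
A minimal sketch of how the two Percpu readings can be extracted and compared. It assumes the usual "Percpu:  <n> kB" line format in /proc/meminfo; the function names are hypothetical, and the two numbers are the ones quoted in this mail.

```python
import re
from typing import Optional


def percpu_kb(meminfo_text: str) -> Optional[int]:
    """Return the Percpu value in kB from /proc/meminfo text, or None."""
    m = re.search(r"^Percpu:\s+(\d+)\s+kB$", meminfo_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def growth_factor(before_kb: int, after_kb: int) -> float:
    """How many times larger the second reading is than the first."""
    return after_kb / before_kb


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print("Percpu now:", percpu_kb(f.read()), "kB")
    except OSError:
        print("/proc/meminfo not available on this system")
    # Readings from the two kernel builds described in this mail.
    print("growth: %.1fx" % growth_factor(53760, 718048))
```

With the values above, the Percpu usage grows by roughly a factor of 13 between the two builds.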

This is surprising, as the number of CPUs is the same in both cases.

Initial analysis showed that each memory cgroup allocates per-node structures for
every possible node. Each of them has percpu allocations, which adds up to
128 (possible nodes) * 128 (possible CPUs) * struct size per cgroup.
See: mem_cgroup_alloc() -> alloc_mem_cgroup_per_node_info()
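
The multiplication above can be sketched as follows. This is only a back-of-the-envelope model: the function name is hypothetical, and the per-structure size is left as a parameter because I have not measured the real percpu structure size.

```python
def memcg_percpu_bytes(nodes: int, possible_cpus: int,
                       percpu_struct_bytes: int, num_cgroups: int = 1) -> int:
    """Estimate percpu bytes used by per-node memcg structures.

    Each cgroup gets one structure per node, and each structure has one
    percpu copy per possible CPU. percpu_struct_bytes is an assumption
    supplied by the caller, not a measured kernel value.
    """
    return num_cgroups * nodes * possible_cpus * percpu_struct_bytes


if __name__ == "__main__":
    size = 1  # placeholder size, so the result is a count of percpu copies
    all_possible = memcg_percpu_bytes(128, 128, size)
    online_only = memcg_percpu_bytes(4, 128, size)
    print(all_possible, online_only, all_possible // online_only)
```

Under this model, allocating for all 128 possible nodes instead of only the 4 online ones multiplies the per-cgroup cost by 32, regardless of the actual structure size.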

There is also an old comment about this in alloc_mem_cgroup_per_node_info():
      /*
       * This routine is called against possible nodes.
       * But it's BUG to call kmalloc() against offline node.
       *
       * TODO: this routine can waste much memory for nodes which will
       *       never be onlined. It's better to use memory hotplug callback
       *       function.
       */
There might be other places that use memory inefficiently for non-existent nodes.

Thanks,
—Alexey


