On 06.10.22 17:35, Uladzislau Rezki wrote:
>> Hi,
>>
>> we're currently hitting a weird vmap issue in debug kernels with KASAN
>> enabled on fairly large VMs. I reproduced it on v5.19 (did not get the
>> chance to try 6.0 yet because I don't have access to the machine right
>> now, but I suspect it persists).
>>
>> It seems to trigger when udev probes a massive number of devices in
>> parallel while the system is booting up. Once the system has booted,
>> I no longer see any such issues.
>>
>> [ 165.818200] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [ 165.836622] vmap allocation for size 315392 failed: use vmalloc=<size> to increase size
>> [ 165.837461] vmap allocation for size 315392 failed: use vmalloc=<size> to increase size
>> [ 165.840573] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [ 165.841059] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [ 165.841428] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [ 165.841819] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [ 165.842123] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [ 165.843359] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [ 165.844894] vmap allocation for size 2498560 failed: use vmalloc=<size> to increase size
>> [ 165.847028] CPU: 253 PID: 4995 Comm: systemd-udevd Not tainted 5.19.0 #2
>> [ 165.935689] Hardware name: Lenovo ThinkSystem SR950 -[7X12ABC1WW]-/-[7X12ABC1WW]-, BIOS -[PSE130O-1.81]- 05/20/2020
>> [ 165.947343] Call Trace:
>> [ 165.950075] <TASK>
>> [ 165.952425] dump_stack_lvl+0x57/0x81
>> [ 165.956532] warn_alloc.cold+0x95/0x18a
>> [ 165.960836] ? zone_watermark_ok_safe+0x240/0x240
>> [ 165.966100] ? slab_free_freelist_hook+0x11d/0x1d0
>> [ 165.971461] ? __get_vm_area_node+0x2af/0x360
>> [ 165.976341] ? __get_vm_area_node+0x2af/0x360
>> [ 165.981219] __vmalloc_node_range+0x291/0x560
>> [ 165.986087] ? __mutex_unlock_slowpath+0x161/0x5e0
>> [ 165.991447] ? move_module+0x4c/0x630
>> [ 165.995547] ? vfree_atomic+0xa0/0xa0
>> [ 165.999647] ? move_module+0x4c/0x630
>> [ 166.003741] module_alloc+0xe7/0x170
>> [ 166.007747] ? move_module+0x4c/0x630
>> [ 166.011840] move_module+0x4c/0x630
>> [ 166.015751] layout_and_allocate+0x32c/0x560
>> [ 166.020519] load_module+0x8e0/0x25c0
>
> Can it be that we do not have enough "module section" size? I mean the
> section size, i.e. MODULES_END - MODULES_VADDR, is rather small, so some
> modules cannot be loaded because there is no space left.
>
> CONFIG_RANDOMIZE_BASE also creates some offset overhead if enabled on
> your box. But it looks like that is rather negligible.

Right, I suspected both points -- but was fairly confused why the number
of CPUs would matter.
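
For reference, all module mappings have to come out of that one window;
simplified from arch/x86/kernel/module.c in v5.19 (quoting from memory,
so details may be slightly off):

void *module_alloc(unsigned long size)
{
	void *p;

	/* Fail right away if the request cannot fit at all. */
	if (PAGE_ALIGN(size) > MODULES_LEN)
		return NULL;

	/* Every module lands in [MODULES_VADDR, MODULES_END). */
	p = __vmalloc_node_range(size, MODULE_ALIGN,
				 MODULES_VADDR + get_module_load_offset(),
				 MODULES_END, GFP_KERNEL, PAGE_KERNEL,
				 VM_FLUSH_RESET_PERMS | VM_DEFER_KMEMLEAK,
				 NUMA_NO_NODE, __builtin_return_address(0));

	/* (KASAN module-shadow handling omitted here.) */
	return p;
}

And if I read get_module_load_offset() correctly, the KASLR part adds at
most a 1024-page (4 MiB) offset, chosen once per boot -- so I agree that
this part is negligible.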
What would make sense: if we're tight on module vmap space, the race I
suspect (we purge the lazily-freed vmap areas only once and then fail)
could become relevant.
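
To spell out the race I mean: the vmap allocator reclaims the
lazily-freed areas only once before giving up. Simplified from
alloc_vmap_area() in mm/vmalloc.c (v5.19, again from memory):

retry:
	/*
	 * Search for a free range in [vstart, vend) -- for modules
	 * that is [MODULES_VADDR, MODULES_END).
	 */
	addr = __alloc_vmap_area(size, align, vstart, vend);
	if (unlikely(addr == vend))
		goto overflow;
	...
overflow:
	if (!purged) {
		/* Drain the lazily-freed areas and retry exactly once. */
		purge_vmap_area_lazy();
		purged = 1;
		goto retry;
	}
	...
	pr_warn("vmap allocation for size %lu failed: "
		"use vmalloc=<size> to increase size\n", size);

With enough CPUs loading modules concurrently, several tasks can hit the
overflow path, purge and retry at the same time; whoever loses the second
search has already used up its single purge attempt and fails for good.
That would match this only triggering during parallel udev probing on a
big machine.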
> Maybe try to increase the module-section size to see if it solves the
> problem.

What would be the easiest way to do that?
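
From a quick look, the window is set up in
arch/x86/include/asm/pgtable_64_types.h:

#define MODULES_VADDR	(__START_KERNEL_map + KERNEL_IMAGE_SIZE)
/* The module sections ends with the start of the fixmap */
#define MODULES_END	_AC(0xffffffffff000000, UL)
#define MODULES_LEN	(MODULES_END - MODULES_VADDR)

MODULES_END is pinned by the fixmap, and KERNEL_IMAGE_SIZE grows from
512 MiB to 1 GiB with CONFIG_RANDOMIZE_BASE=y, which leaves only roughly
1 GiB for modules here. So one experiment I could run (just my guess at
how to test the theory) would be rebuilding with CONFIG_RANDOMIZE_BASE=n
to win back 512 MiB of module space -- unless there is a simpler knob I
am missing.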
Thanks!
--
Thanks,
David / dhildenb