> > > > OK. It is related to a module vmap space allocation when a module is
> > > > inserted. I wonder why it requires 2.5MB for a module? It seems a lot
> > > > to me.
> > >
> > > Indeed. I assume KASAN can go wild when it instruments each and every
> > > memory access.
> > >
> > > > > Really looks like only module vmap space. ~ 1 GiB of vmap module space ...
> > > > >
> > > > If an allocation request for a module is 2.5MB, we can load ~400 modules
> > > > with 1GB of address space.
> > > >
> > > > "lsmod | wc -l"? How many modules does your system have?
> > > >
> > > ~71, so not even close to 400.
> > >
> > > What I find interesting is that we have these recurring allocations of
> > > similar sizes failing. I wonder if user space is capable of loading the
> > > same kernel module concurrently to trigger a massive number of
> > > allocations, and module loading code only figures out later that it has
> > > already been loaded and backs off.
> > >
> > If there is a request to allocate memory, it should succeed unless there is
> > an error such as no space or no memory.
> >
> Yes. But as I found out we're really out of space because module loading
> code allocates module VMAP space first, before verifying if the module was
> already loaded or is concurrently getting loaded.
>
> See below.
>
> [...]
>
> > I wrote a small patch to dump the module address space when a failure occurs:
> >
> > <snip v6.0>
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 83b54beb12fa..88d323310df5 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1580,6 +1580,37 @@ preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
> >                  kmem_cache_free(vmap_area_cachep, va);
> >  }
> >
> > +static void
> > +dump_modules_free_space(unsigned long vstart, unsigned long vend)
> > +{
> > +        unsigned long va_start, va_end;
> > +        unsigned int total = 0;
> > +        struct vmap_area *va;
> > +
> > +        if (vend != MODULES_END)
> > +                return;
> > +
> > +        trace_printk("--- Dump a modules address space: 0x%lx - 0x%lx\n", vstart, vend);
> > +
> > +        spin_lock(&free_vmap_area_lock);
> > +        list_for_each_entry(va, &free_vmap_area_list, list) {
> > +                va_start = (va->va_start > vstart) ? va->va_start:vstart;
> > +                va_end = (va->va_end < vend) ? va->va_end:vend;
> > +
> > +                if (va_start >= va_end)
> > +                        continue;
> > +
> > +                if (va_start >= vstart && va_end <= vend) {
> > +                        trace_printk(" va_free: 0x%lx - 0x%lx size=%lu\n",
> > +                                va_start, va_end, va_end - va_start);
> > +                        total += (va_end - va_start);
> > +                }
> > +        }
> > +
> > +        spin_unlock(&free_vmap_area_lock);
> > +        trace_printk("--- Total free: %u ---\n", total);
> > +}
> > +
> >  /*
> >   * Allocate a region of KVA of the specified size and alignment, within the
> >   * vstart and vend.
> > @@ -1663,10 +1694,13 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
> >                  goto retry;
> >          }
> >
> > -        if (!(gfp_mask & __GFP_NOWARN) && printk_ratelimit())
> > +        if (!(gfp_mask & __GFP_NOWARN) && printk_ratelimit()) {
> >                  pr_warn("vmap allocation for size %lu failed: use vmalloc=<size> to increase size\n",
> >                          size);
> > +                dump_modules_free_space(vstart, vend);
> > +        }
> > +
> >          kmem_cache_free(vmap_area_cachep, va);
> >          return ERR_PTR(-EBUSY);
> >  }
>
> Thanks!
>
> I can spot the same module getting loaded over and over again concurrently
> from user space, only failing after all the allocations, when
> add_unformed_module() realizes that the module is in fact already loaded and
> fails with -EEXIST.
>
> That looks quite inefficient. Here is how often user space tries to load the
> same module on that system.
> Note that I print *after* allocating module VMAP space.
>
OK. It explains the problem :) Indeed it is inefficient. Allocating first and
only later figuring out that the module is already there looks weird.
Furthermore, an attack from user space could be organized this way.

> # dmesg | grep Loading | cut -d" " -f5 | sort | uniq -c
>     896 acpi_cpufreq
>       1 acpi_pad
>       1 acpi_power_meter
>       2 ahci
>       1 cdrom
>       2 compiled-in
>       1 coretemp
>      15 crc32c_intel
>     307 crc32_pclmul
>       1 crc64
>       1 crc64_rocksoft
>       1 crc64_rocksoft_generic
>      12 crct10dif_pclmul
>      16 dca
>       1 dm_log
>       1 dm_mirror
>       1 dm_mod
>       1 dm_region_hash
>       1 drm
>       1 drm_kms_helper
>       1 drm_shmem_helper
>       1 fat
>       1 fb_sys_fops
>      14 fjes
>       1 fuse
>     205 ghash_clmulni_intel
>       1 i2c_algo_bit
>       1 i2c_i801
>       1 i2c_smbus
>       4 i40e
>       4 ib_core
>       1 ib_uverbs
>       4 ice
>     403 intel_cstate
>       1 intel_pch_thermal
>       1 intel_powerclamp
>       1 intel_rapl_common
>       1 intel_rapl_msr
>     399 intel_uncore
>       1 intel_uncore_frequency
>       1 intel_uncore_frequency_common
>      64 ioatdma
>       1 ipmi_devintf
>       1 ipmi_msghandler
>       1 ipmi_si
>       1 ipmi_ssif
>       4 irdma
>     406 irqbypass
>       1 isst_if_common
>     165 isst_if_mbox_msr
>     300 kvm
>     408 kvm_intel
>       1 libahci
>       2 libata
>       1 libcrc32c
>     409 libnvdimm
>       8 Loading
>       1 lpc_ich
>       1 megaraid_sas
>       1 mei
>       1 mei_me
>       1 mgag200
>       1 nfit
>       1 pcspkr
>       1 qrtr
>     405 rapl
>       1 rfkill
>       1 sd_mod
>       2 sg
>     409 skx_edac
>       1 sr_mod
>       1 syscopyarea
>       1 sysfillrect
>       1 sysimgblt
>       1 t10_pi
>       1 uas
>       1 usb_storage
>       1 vfat
>       1 wmi
>       1 x86_pkg_temp_thermal
>       1 xfs
>
> For each of these loading requests, we'll reserve module VMAP space, and free
> it once we realize later that the module was already loaded.
>
> So with a lot of CPUs we might end up trying to load the same module so often
> at the same time that we actually run out of module VMAP space.
>
> I have a prototype patch that seems to fix this in module loading code.
>
Good! I am glad the problem can be solved :)

--
Uladzislau Rezki
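
As a rough illustration of the numbers discussed above (this is not code from
the thread, and the per-attempt size is an assumption): the sketch below
applies the ~2.5 MB request size quoted earlier to every load attempt and
tallies the worst case in which all of the repeated attempts for a module hold
their module VMAP reservation at the same time before backing off with
-EEXIST. The attempt counts come from the dmesg dump above; the file name and
the choice of modules shown are made up for the example.

<snip>
/* vmap_pressure.c - back-of-the-envelope model of the failure mode above. */
#include <stdio.h>

#define MODULE_AREA     (1ULL << 30)    /* ~1 GiB of module VMAP space */
#define REQ_SIZE        (5ULL << 19)    /* ~2.5 MB per load attempt    */

int main(void)
{
        /* A few of the biggest repeat offenders from the dmesg dump. */
        static const struct {
                const char *name;
                unsigned int attempts;
        } mods[] = {
                { "acpi_cpufreq", 896 }, { "skx_edac",  409 }, { "libnvdimm", 409 },
                { "kvm_intel",    408 }, { "irqbypass", 406 }, { "rapl",      405 },
        };
        unsigned long long total = 0;
        size_t i;

        for (i = 0; i < sizeof(mods) / sizeof(mods[0]); i++) {
                /*
                 * Allocate-then-check: every attempt reserves REQ_SIZE of
                 * module VMAP space before the -EEXIST check, so the worst
                 * case has all of them in flight at once.
                 */
                unsigned long long peak = mods[i].attempts * REQ_SIZE;

                printf("%-14s %4u attempts -> up to %5llu MiB in flight\n",
                       mods[i].name, mods[i].attempts, peak >> 20);
                total += peak;
        }

        printf("worst case for these six modules: %llu MiB, module area: %llu MiB\n",
               total >> 20, MODULE_AREA >> 20);
        return 0;
}
<snip>

Even this handful of modules would need several GiB of module VMAP space in
that worst case, far more than the ~1 GiB available, whereas checking for an
already present or in-flight module before reserving space would ideally cap
each module at a single ~2.5 MB reservation, no matter how many loaders race.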