On Wed, Oct 16, 2024 at 03:24:18PM +0300, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@xxxxxxxxxx>
>
> vmalloc allocations with VM_ALLOW_HUGE_VMAP that do not explicitly
> specify node ID will use huge pages only if size_per_node is larger than
> a huge page.
> Still the actual allocated memory is not distributed between nodes and
> there is no advantage in such approach.
> On the contrary, BPF allocates SZ_2M * num_possible_nodes() for each
> new bpf_prog_pack, while it could do with a single huge page per pack.
>
> Don't account for number of nodes for VM_ALLOW_HUGE_VMAP with
> NUMA_NO_NODE and use huge pages whenever the requested allocation size
> is larger than a huge page.
>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> ---
>  mm/vmalloc.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 634162271c00..86b2344d7461 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3763,8 +3763,6 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
>  	}
>
>  	if (vmap_allow_huge && (vm_flags & VM_ALLOW_HUGE_VMAP)) {
> -		unsigned long size_per_node;
> -
>  		/*
>  		 * Try huge pages. Only try for PAGE_KERNEL allocations,
>  		 * others like modules don't yet expect huge pages in
> @@ -3772,13 +3770,10 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
>  		 * supporting them.
>  		 */
>
> -		size_per_node = size;
> -		if (node == NUMA_NO_NODE)
> -			size_per_node /= num_online_nodes();
> -		if (arch_vmap_pmd_supported(prot) && size_per_node >= PMD_SIZE)
> +		if (arch_vmap_pmd_supported(prot) && size >= PMD_SIZE)
>  			shift = PMD_SHIFT;
>  		else
> -			shift = arch_vmap_pte_supported_shift(size_per_node);
> +			shift = arch_vmap_pte_supported_shift(size);
>
>  		align = max(real_align, 1UL << shift);
>  		size = ALIGN(real_size, 1UL << shift);
>
Looking at this place, I see that overwriting "size" makes the code a bit
hard to follow. Below we have the following code:

<snip>
...
again:
	area = __get_vm_area_node(real_size, align, shift, VM_ALLOC |
				  VM_UNINITIALIZED | vm_flags, start, end, node,
				  gfp_mask, caller);
...
<snip>

where we pass "real_size", whereas there is only one place in the
__vmalloc_node_range_noprof() function where "size" is used. It is at the
end of the function:

<snip>
...
	size = PAGE_ALIGN(size);
	if (!(vm_flags & VM_DEFER_KMEMLEAK))
		kmemleak_vmalloc(area, size, gfp_mask);

	return area->addr;
<snip>

As for this patch:

Reviewed-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>

--
Uladzislau Rezki
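
For readers following the thread, the effect of the quoted change on the
shift selection can be illustrated with a small userspace sketch. This is
not kernel code: old_shift() and new_shift() are hypothetical helpers, the
constants assume a 4K base page and a 2M PMD (as on x86-64),
arch_vmap_pmd_supported() is assumed to return true, and the
arch_vmap_pte_supported_shift() fallback is simplified to PAGE_SHIFT.

```c
/* Assumed values for illustration only; the real ones are arch-specific. */
#define PAGE_SHIFT	12
#define PMD_SHIFT	21
#define PMD_SIZE	(1UL << PMD_SHIFT)
#define SZ_2M		(2UL << 20)

/*
 * Hypothetical model of the pre-patch logic for node == NUMA_NO_NODE:
 * the requested size is divided by the number of online nodes before
 * it is compared against PMD_SIZE.
 */
static unsigned int old_shift(unsigned long size, unsigned int nr_online_nodes)
{
	unsigned long size_per_node = size / nr_online_nodes;

	return size_per_node >= PMD_SIZE ? PMD_SHIFT : PAGE_SHIFT;
}

/*
 * Hypothetical model of the post-patch logic: the requested size is
 * compared against PMD_SIZE directly.
 */
static unsigned int new_shift(unsigned long size)
{
	return size >= PMD_SIZE ? PMD_SHIFT : PAGE_SHIFT;
}
```

Under these assumptions, with four online nodes old_shift(SZ_2M, 4) yields
PAGE_SHIFT while new_shift(SZ_2M) yields PMD_SHIFT. The old logic only
reaches PMD_SHIFT once the request grows to 4 * SZ_2M, which is consistent
with the commit message's remark about BPF allocating
SZ_2M * num_possible_nodes() per bpf_prog_pack.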