Re: [PATCH v6 11/12] mm/vmalloc: Hugepage vmalloc mappings

Eric Dumazet <eric.dumazet@xxxxxxxxx> · Fri, 21 Aug 2020 08:38:31 -0700

On 8/21/20 8:12 AM, Nicholas Piggin wrote:
> Support huge page vmalloc mappings. Config option HAVE_ARCH_HUGE_VMALLOC
> enables support on architectures that define HAVE_ARCH_HUGE_VMAP and
> support PMD-sized vmap mappings.
> 
> vmalloc will attempt to allocate PMD-sized pages if allocating PMD size or
> larger, and fall back to small pages if that was unsuccessful.
> 
> Allocations that do not use PAGE_KERNEL prot are not permitted to use huge
> pages, because not all callers expect this (e.g., module allocations vs
> strict module rwx).
> 
> This reduces TLB misses by nearly 30x on a `git diff` workload on a 2-node
> POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%.
> 
> This can result in more internal fragmentation and memory overhead for a
> given allocation, so an option nohugevmalloc is added to disable it at boot.
> 
>

Thanks for working on this. I tried something similar in the past,
but could not really do more than a hack.
( https://lkml.org/lkml/2016/12/21/285 )

Note that __init alloc_large_system_hash() is used at boot time,
when NUMA policy is spreading allocations over all NUMA nodes.

This means that on a dual node system, a hash table should be 50/50 spread.

With your patch, if a hashtable is exactly the size of one huge page,
it will sit entirely on one node instead of being balanced across nodes,
which might have some unwanted impact.
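
To make the concern concrete, here is a small userspace model (not kernel
code) of round-robin placement. The helper name interleave_bytes(), the
MAX_NODES limit, and the strict "page i goes to node i % nr_nodes" rule are
assumptions made for illustration; the real interleave policy is more
involved. The sizes in the usage note (4 KiB base pages, 2 MiB huge pages)
are also assumptions and differ between architectures.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 8	/* hypothetical upper bound for this model */

/*
 * Model only: back an allocation of `size` bytes with pages of
 * `page_size` bytes, hand the pages to nodes in round-robin order
 * starting at node 0, and record how many bytes land on each node.
 */
static void interleave_bytes(size_t size, size_t page_size,
			     unsigned int nr_nodes,
			     size_t per_node[MAX_NODES])
{
	size_t nr_pages, i;
	unsigned int n;

	assert(page_size > 0);
	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);

	for (n = 0; n < MAX_NODES; n++)
		per_node[n] = 0;

	/* Round the allocation up to a whole number of backing pages. */
	nr_pages = (size + page_size - 1) / page_size;

	for (i = 0; i < nr_pages; i++)
		per_node[i % nr_nodes] += page_size;
}
```

In this model, a 2 MiB table on two nodes backed by 4 KiB pages ends up with
1 MiB on each node, while the same table backed by a single 2 MiB page ends
up with all 2 MiB on node 0 and nothing on node 1.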

Thanks !