On 10.06.20 00:54, Daniel Jordan wrote:
> Some of our servers spend significant time at kernel boot initializing
> memory block sysfs directories and then creating symlinks between them
> and the corresponding nodes.  The slowness happens because the machines
> get stuck with the smallest supported memory block size on x86 (128M),
> which results in 16,288 directories to cover the 2T of installed RAM.
> The search for each memory block is noticeable even with
> commit 4fb6eabf1037 ("drivers/base/memory.c: cache memory blocks in
> xarray to accelerate lookup").
>
> Commit 078eb6aa50dc ("x86/mm/memory_hotplug: determine block size based
> on the end of boot memory") chooses the block size based on alignment
> with memory end.  That addresses hotplug failures in qemu guests, but
> for bare metal systems whose memory end isn't aligned to even the
> smallest size, it leaves them at 128M.
>
> Make kernels that aren't running on a hypervisor use the largest
> supported size (2G) to minimize overhead on big machines.  Kernel boot
> goes 7% faster on the aforementioned servers, shaving off half a second.
>
> Signed-off-by: Daniel Jordan <daniel.m.jordan@xxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Steven Sistare <steven.sistare@xxxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> ---
>
> Applies to 5.7 and today's mainline
>
>  arch/x86/mm/init_64.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 8b5f73f5e207c..906fbdb060748 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -55,6 +55,7 @@
>  #include <asm/uv/uv.h>
>  #include <asm/setup.h>
>  #include <asm/ftrace.h>
> +#include <asm/hypervisor.h>
>
>  #include "mm_internal.h"
>
> @@ -1390,6 +1391,15 @@ static unsigned long probe_memory_block_size(void)
>  		goto done;
>  	}
>
> +	/*
> +	 * Use max block size to minimize overhead on bare metal, where
> +	 * alignment for memory hotplug isn't a concern.
> +	 */
> +	if (hypervisor_is_type(X86_HYPER_NATIVE)) {
> +		bz = MAX_BLOCK_SIZE;
> +		goto done;
> +	}

I'd assume that bioses on physical machines >= 64GB will not align
bigger (>= 2GB) DIMMs to something < 2GB.

Acked-by: David Hildenbrand <david@xxxxxxxxxx>

> +
>  	/* Find the largest allowed block size that aligns to memory end */
>  	for (bz = MAX_BLOCK_SIZE; bz > MIN_MEMORY_BLOCK_SIZE; bz >>= 1) {
>  		if (IS_ALIGNED(boot_mem_end, bz))
>

-- 
Thanks,

David / dhildenb
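
For anyone following the thread, here is a rough user-space sketch of the
selection logic discussed above. It is not the kernel code:
pick_block_size() and block_count() are hypothetical helpers, the
bare_metal flag stands in for the hypervisor_is_type(X86_HYPER_NATIVE)
check, and the 128M/2G constants are taken from the changelog. It also
assumes that the alignment loop stops at the first aligned size and
otherwise falls back to the minimum, which the quoted hunk does not show
in full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIN_BLOCK_SIZE (128ULL << 20)	/* 128M, smallest size per the changelog */
#define MAX_BLOCK_SIZE (2ULL << 30)	/* 2G, largest size per the changelog */

/*
 * Hypothetical helper: choose a memory block size.
 * bare_metal stands in for hypervisor_is_type(X86_HYPER_NATIVE).
 */
uint64_t pick_block_size(uint64_t boot_mem_end, bool bare_metal)
{
	uint64_t bz;

	/* With the patch: bare metal always gets the largest size. */
	if (bare_metal)
		return MAX_BLOCK_SIZE;

	/* Otherwise take the largest size that aligns to the end of memory. */
	for (bz = MAX_BLOCK_SIZE; bz > MIN_BLOCK_SIZE; bz >>= 1) {
		if ((boot_mem_end & (bz - 1)) == 0)
			break;
	}

	/* If nothing aligned, the loop leaves bz at MIN_BLOCK_SIZE. */
	return bz;
}

/* Hypothetical helper: blocks needed to cover mem_bytes, rounding up. */
uint64_t block_count(uint64_t mem_bytes, uint64_t bz)
{
	assert(bz != 0);
	return (mem_bytes + bz - 1) / bz;
}
```

With a memory end 64M past 2T, the alignment loop alone ends up at 128M,
while the bare-metal shortcut yields 2G. For a fully populated, contiguous
2T range that is 16384 blocks versus 1024. The sketch ignores holes in the
memory map, so it does not reproduce the 16,288 figure from the changelog
exactly.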