Some of our servers spend 14 out of the 21 seconds of kernel boot
initializing memory block sysfs directories and then creating symlinks
between them and the corresponding nodes.  The slowness happens because
the machines get stuck with the smallest supported memory block size on
x86 (128M), which results in 16,288 directories to cover the 2T of
installed RAM, and each of these paths does a linear search of the
memory blocks for every block id, with atomic ops at each step.

Commit 078eb6aa50dc ("x86/mm/memory_hotplug: determine block size based
on the end of boot memory") chooses the block size based on alignment
with memory end.  That addresses hotplug failures in qemu guests, but
for bare metal systems whose memory end isn't aligned to the smallest
size, it leaves them at 128M.

For such systems, use the largest supported size (2G) to minimize
overhead on big machines.  That saves nearly all of the 14 seconds so
the kernel boots 3x faster.

There are some simple ways to avoid the linear searches, but for now it
makes no difference with a 2G block.

Signed-off-by: Daniel Jordan <daniel.m.jordan@xxxxxxxxxx>
---
 arch/x86/mm/init_64.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 8b5f73f5e207c..d388127d1b519 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1390,6 +1390,15 @@ static unsigned long probe_memory_block_size(void)
 		goto done;
 	}
 
+	/*
+	 * Memory end isn't aligned to any allowed block size, so default to
+	 * the largest to minimize overhead on large memory systems.
+	 */
+	if (!IS_ALIGNED(boot_mem_end, MIN_MEMORY_BLOCK_SIZE)) {
+		bz = MAX_BLOCK_SIZE;
+		goto done;
+	}
+
 	/* Find the largest allowed block size that aligns to memory end */
 	for (bz = MAX_BLOCK_SIZE; bz > MIN_MEMORY_BLOCK_SIZE; bz >>= 1) {
 		if (IS_ALIGNED(boot_mem_end, bz))

base-commit: 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
-- 
2.26.2
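
For readers who want to try the selection logic outside the kernel, here
is a minimal userspace sketch of the hunk above.  It is an illustration
only, not the kernel code: the names MIN_BLOCK, MAX_BLOCK, is_aligned(),
pick_block_size() and block_count() are hypothetical, the 128M and 2G
values are taken from the commit message, and the earlier checks in
probe_memory_block_size() that fall outside the hunk are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes from the commit message; the macro names are hypothetical. */
#define MIN_BLOCK (128ULL << 20)	/* 128M, smallest x86 block size */
#define MAX_BLOCK (2048ULL << 20)	/* 2G, largest x86 block size */

/* Userspace stand-in for IS_ALIGNED(); 'a' must be a power of two. */
static int is_aligned(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x & (a - 1)) == 0;
}

/*
 * Mirrors only the logic visible in the hunk: an end of boot memory that
 * is not aligned to the smallest size gets the largest block size;
 * otherwise pick the largest size that aligns to memory end.
 */
static uint64_t pick_block_size(uint64_t boot_mem_end)
{
	uint64_t bz;

	if (!is_aligned(boot_mem_end, MIN_BLOCK))
		return MAX_BLOCK;

	for (bz = MAX_BLOCK; bz > MIN_BLOCK; bz >>= 1) {
		if (is_aligned(boot_mem_end, bz))
			break;
	}
	/* Falls through to MIN_BLOCK when nothing larger aligns. */
	return bz;
}

/* Blocks needed to cover 'mem' bytes of contiguous memory, rounded up. */
static uint64_t block_count(uint64_t mem, uint64_t bz)
{
	return (mem + bz - 1) / bz;
}
```

Under these assumptions, a memory end of 2T plus 1M is not aligned to
128M and so gets a 2G block, while 2T plus 128M still gets 128M.  For a
fully contiguous 2T, block_count() gives 16384 blocks at 128M and 1024
at 2G; the 16,288 in the commit message is the figure measured on the
real machine.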