vmalloc_sync_all() was put in the common path in __purge_vmap_area_lazy()
to fix a sync issue that only happens on X86_32 with PTI enabled. It is
not needed on X86_64, where it caused a big regression in UnixBench
Shell8 testing. A similar regression was also reported by the 0-day
kernel test robot in reaim benchmarking:
https://lists.01.org/hyperkitty/list/lkp@xxxxxxxxxxxx/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/

Fix it by only calling vmalloc_sync_all() on X86_32 PAE kernels with
!SHARED_KERNEL_PMD and PTI enabled, the only configuration that needs
the sync.

Fixes: 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
Signed-off-by: Shile Zhang <shile.zhang@xxxxxxxxxxxxxxxxx>
---
 mm/vmalloc.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a3c70e275f4e..7b9fc7966da6 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1255,11 +1255,17 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 	if (unlikely(valist == NULL))
 		return false;
 
+#if defined(CONFIG_X86_32) && defined(CONFIG_X86_PAE)
 	/*
 	 * First make sure the mappings are removed from all page-tables
 	 * before they are freed.
+	 *
+	 * This is only needed on x86-32 with !SHARED_KERNEL_PMD, which is
+	 * the case on a PAE kernel with PTI enabled.
 	 */
-	vmalloc_sync_all();
+	if (!SHARED_KERNEL_PMD && boot_cpu_has(X86_FEATURE_PTI))
+		vmalloc_sync_all();
+#endif
 
 	/*
 	 * TODO: to calculate a flush range without looping.
-- 
2.24.0.rc2