Fix wrong cc list.
On 2019/11/14 05:12, Andrew Morton wrote:
On Wed, 13 Nov 2019 17:55:30 +0800 Shile Zhang <shile.zhang@xxxxxxxxxxxxxxxxx> wrote:
vmalloc_sync_all() was put in the common path in
__purge_vmap_area_lazy() to fix a sync issue that only happens on X86_32
with PTI enabled. It is needless on X86_64, where it caused a big regression
in UnixBench Shell8 testing.
A similar regression was also reported by the 0-day kernel test robot in
reaim benchmarking:
https://lists.01.org/hyperkitty/list/lkp@xxxxxxxxxxxx/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/
That is indeed a large performance regression.
Fix it by adding more conditions.
Fixes: 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
...
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1255,11 +1255,17 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
if (unlikely(valist == NULL))
return false;
+#if defined(CONFIG_X86_32) && defined(CONFIG_X86_PAE)
Are we sure that x86_32 is the only architecture which is (or ever will
be) affected?
/*
* First make sure the mappings are removed from all page-tables
* before they are freed.
+ *
+ * This is only needed on x86-32 with !SHARED_KERNEL_PMD, which is
+ * the case on a PAE kernel with PTI enabled.
*/
- vmalloc_sync_all();
+ if (!SHARED_KERNEL_PMD && boot_cpu_has(X86_FEATURE_PTI))
+ vmalloc_sync_all();
+#endif
/*
* TODO: to calculate a flush range without looping.
CONFIG_X86_PAE depends on CONFIG_X86_32, so is there any need to check
CONFIG_X86_32?
Yes, thanks for your review and kind refactoring!
From: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Subject: mm-vmalloc-fix-regression-caused-by-needless-vmalloc_sync_all-fix
simplify config expression, use IS_ENABLED()
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Joerg Roedel <jroedel@xxxxxxx>
Cc: Qian Cai <cai@xxxxxx>
Cc: Shile Zhang <shile.zhang@xxxxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---
mm/vmalloc.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
--- a/mm/vmalloc.c~mm-vmalloc-fix-regression-caused-by-needless-vmalloc_sync_all-fix
+++ a/mm/vmalloc.c
@@ -1255,16 +1255,17 @@ static bool __purge_vmap_area_lazy(unsig
if (unlikely(valist == NULL))
return false;
-#if defined(CONFIG_X86_32) && defined(CONFIG_X86_PAE)
- /*
- * First make sure the mappings are removed from all page-tables
- * before they are freed.
- *
- * This is only needed on x86-32 with !SHARED_KERNEL_PMD, which is
- * the case on a PAE kernel with PTI enabled.
- */
- if (!SHARED_KERNEL_PMD && boot_cpu_has(X86_FEATURE_PTI))
- vmalloc_sync_all();
+ if (IS_ENABLED(CONFIG_X86_PAE)) {
+ /*
+ * First make sure the mappings are removed from all page-tables
+ * before they are freed.
+ *
+ * This is only needed on x86-32 with !SHARED_KERNEL_PMD, which
+ * is the case on a PAE kernel with PTI enabled.
+ */
+ if (!SHARED_KERNEL_PMD && boot_cpu_has(X86_FEATURE_PTI))
+ vmalloc_sync_all();
+ }
#endif
/*
_
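
For readers unfamiliar with the two guard styles discussed above, here is a
minimal userspace sketch of how "#if defined(...)" differs from
"if (IS_ENABLED(...))". It is an illustration only, not kernel code:
DEMO_IS_ENABLED is a simplified imitation of the preprocessor trick behind
the kernel's IS_ENABLED() (it ignores the =m/_MODULE case), and
CONFIG_DEMO_ON, CONFIG_DEMO_OFF, demo_sync_all() and the purge_with_*()
helpers are hypothetical names invented for this example.

```c
/*
 * Userspace imitation of the macro trick behind IS_ENABLED().
 * All DEMO_* and CONFIG_DEMO_* names are hypothetical.
 *
 * If the option is defined to 1, DEMO_ARG_PLACEHOLDER_1 expands to "0,",
 * which shifts a literal 1 into the second argument slot. If the option
 * is undefined, the pasted token is junk and the second argument is 0.
 */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND_ARG(ignored, val, ...) val
#define DEMO_IS_DEFINED_3(arg1_or_junk) \
	DEMO_TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define DEMO_IS_DEFINED_2(val) DEMO_IS_DEFINED_3(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_ENABLED(option) DEMO_IS_DEFINED_2(option)

#define CONFIG_DEMO_ON 1
/* CONFIG_DEMO_OFF is deliberately left undefined. */

static int sync_calls;	/* counts calls to the stand-in sync function */

/* Hypothetical stand-in for an expensive sync such as vmalloc_sync_all(). */
void demo_sync_all(void)
{
	sync_calls++;
}

/* Preprocessor guard: the call is removed before the compiler sees it. */
int purge_with_ifdef(void)
{
	sync_calls = 0;
#if defined(CONFIG_DEMO_OFF)
	demo_sync_all();
#endif
	return sync_calls;
}

/*
 * C-level guard, option off: the condition is the constant 0, so the call
 * is dead code, but it is still parsed and type-checked.
 */
int purge_with_is_enabled_off(void)
{
	sync_calls = 0;
	if (DEMO_IS_ENABLED(CONFIG_DEMO_OFF))
		demo_sync_all();
	return sync_calls;
}

/* C-level guard, option on: the condition is the constant 1. */
int purge_with_is_enabled_on(void)
{
	sync_calls = 0;
	if (DEMO_IS_ENABLED(CONFIG_DEMO_ON))
		demo_sync_all();
	return sync_calls;
}
```

The practical difference is that the body of an "if (IS_ENABLED(...))" block
is still seen by the compiler in every configuration, so every identifier
used inside it has to be declared even when the option is off; an "#if"
block has no such requirement.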