On 2022/7/4 at 11:59 PM, Greg Kroah-Hartman wrote:
On Mon, Jul 04, 2022 at 11:45:08PM +0800, Wen Yang wrote:
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
commit ddd07b750382adc2b78fdfbec47af8a6e0d8ef37 upstream.
CAT has happened, WBINVD is bad (even before CAT blowing away the
entire cache on a multi-core platform wasn't nice), try not to use it
ever.
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Reviewed-by: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: Bin Yang <bin.yang@xxxxxxxxx>
Cc: Mark Gross <mark.gross@xxxxxxxxx>
Link: https://lkml.kernel.org/r/20180919085947.933674526@xxxxxxxxxxxxx
Cc: <stable@xxxxxxxxxxxxxxx> # 4.19.x
Signed-off-by: Wen Yang <wenyang@xxxxxxxxxxxxxxxxx>
---
arch/x86/mm/pageattr.c | 18 ++----------------
1 file changed, 2 insertions(+), 16 deletions(-)
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 101f3ad0d6ad..ab87da7a6043 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -239,26 +239,12 @@ static void cpa_flush_array(unsigned long *start, int numpages, int cache,
int in_flags, struct page **pages)
{
unsigned int i, level;
-#ifdef CONFIG_PREEMPT
- /*
- * Avoid wbinvd() because it causes latencies on all CPUs,
- * regardless of any CPU isolation that may be in effect.
- *
- * This should be extended for CAT enabled systems independent of
- * PREEMPT because wbinvd() does not respect the CAT partitions and
- * this is exposed to unpriviledged users through the graphics
- * subsystem.
- */
- unsigned long do_wbinvd = 0;
-#else
- unsigned long do_wbinvd = cache && numpages >= 1024; /* 4M threshold */
-#endif
BUG_ON(irqs_disabled() && !early_boot_irqs_disabled);
- on_each_cpu(__cpa_flush_all, (void *) do_wbinvd, 1);
+ flush_tlb_all();
- if (!cache || do_wbinvd)
+ if (!cache)
return;
/*
--
2.19.1.6.gb485710b
Why is this needed on 4.19.y? What problem does it solve? It looks like
an optimization only, not a bugfix.
And if it is a bugfix, why only 4.19.y and not older kernels too?
We need more information here, please.
On a 128-core Intel(R) Xeon(R) Platinum 8369B CPU @ 2.90GHz server, when
a user program frequently calls nv_alloc_system_pages to allocate large
amounts of memory, the entire system often stalls for about 200
milliseconds. Other latency-sensitive tasks on the system are heavily
impacted as a result, which also causes stability issues in large-scale
clusters.
nv_alloc_system_pages
-> _set_memory_array
-> change_page_attr_set_clr
-> cpa_flush_array
-> on_each_cpu(__cpa_flush_all, (void *) do_wbinvd, 1);
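As a rough illustration only, the decision made by the hunk removed in the
patch above can be modeled in userspace. This is a sketch, not kernel code:
the function names old_do_wbinvd and new_do_wbinvd and the macro
WBINVD_PAGE_THRESHOLD are hypothetical, and CONFIG_PREEMPT is represented
as a runtime flag rather than a build option. The 1024-page figure and the
"4M threshold" comment come from the removed line; 4 KiB is the base x86
page size.

```c
#include <stdbool.h>

/* Hypothetical name: 1024 pages * 4 KiB = 4 MiB, matching the
 * "4M threshold" comment on the removed line. */
#define WBINVD_PAGE_THRESHOLD 1024

/* Model of the pre-patch decision in cpa_flush_array(): returns true
 * when the old code would have asked every CPU to run wbinvd. */
bool old_do_wbinvd(bool config_preempt, bool cache, int numpages)
{
    if (config_preempt)
        return false; /* the CONFIG_PREEMPT branch hard-coded 0 */
    return cache && numpages >= WBINVD_PAGE_THRESHOLD;
}

/* Model of the post-patch behavior: wbinvd is never chosen here;
 * only flush_tlb_all() runs before the per-page cache flush. */
bool new_do_wbinvd(bool cache, int numpages)
{
    (void)cache;
    (void)numpages;
    return false;
}
```

Under this model, a non-PREEMPT 4.19 kernel that changes cache attributes
on 1024 or more pages takes the wbinvd path on all CPUs, which would be
consistent with the system-wide stalls described above.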
This patch can be merged directly into the 4.19 kernel to solve this
problem, and most of the machines in our production environment run 4.19
kernels.
We are also happy to backport it to the 4.14 and 4.9 kernels and will send
the corresponding patches soon, although there are very few such servers
in our production clusters.
--
Best wishes,
Wen