FYI, this is our test scenario, simulating the FaaS business, every VM assign 0.1 core, starting lots VMs run in backgroud (such as 800 VM on a machine with 80 cores), then burst create 10 VMs, then got 100ms+ latency in creating "kvm-nx-lpage-recovery". On Tue, May 2, 2023 at 10:20 AM Robert Hoo <robert.hoo.linux@xxxxxxxxx> wrote: > > On 3/23/2023 3:18 PM, lirongqing@xxxxxxxxx wrote: > > From: Li RongQing <lirongqing@xxxxxxxxx> > > > > if CPU has not X86_BUG_ITLB_MULTIHIT bug, kvm-nx-lpage-re kthread > > is not needed to create > > (directed by Sean from > https://lore.kernel.org/kvm/ZE%2FR1%2FhvbuWmD8mw@xxxxxxxxxx/ here.) > > No, I think it should tie to "nx_huge_pages" value rather than > directly/partially tie to boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT). > > > > Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx> > > --- > > arch/x86/kvm/mmu/mmu.c | 19 +++++++++++++++++++ > > 1 file changed, 19 insertions(+) > > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > > index 8354262..be98c69 100644 > > --- a/arch/x86/kvm/mmu/mmu.c > > +++ b/arch/x86/kvm/mmu/mmu.c > > @@ -6667,6 +6667,11 @@ static bool get_nx_auto_mode(void) > > return boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT) && !cpu_mitigations_off(); > > } > > > > +static bool cpu_has_itlb_multihit(void) > > +{ > > + return boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT); > > +} > > + > > static void __set_nx_huge_pages(bool val) > > { > > nx_huge_pages = itlb_multihit_kvm_mitigation = val; > > @@ -6677,6 +6682,11 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp) > > bool old_val = nx_huge_pages; > > bool new_val; > > > > + if (!cpu_has_itlb_multihit()) { > > + __set_nx_huge_pages(false); > > + return 0; > > + } > > + > It's rude simply return here just because > !boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT), leaving all else behind, i.e. > leaving below sysfs node useless. > If you meant to do this, you should clear these sysfs APIs because of > !boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT). > > > /* In "auto" mode deploy workaround only if CPU has the bug. */ > > if (sysfs_streq(val, "off")) > > new_val = 0; > > @@ -6816,6 +6826,9 @@ static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel > > uint old_period, new_period; > > int err; > > > > + if (!cpu_has_itlb_multihit()) > > + return 0; > > + > > was_recovery_enabled = calc_nx_huge_pages_recovery_period(&old_period); > > > > err = param_set_uint(val, kp); > > @@ -6971,6 +6984,9 @@ int kvm_mmu_post_init_vm(struct kvm *kvm) > > { > > int err; > > > > + if (!cpu_has_itlb_multihit()) > > + return 0; > > + > It's rude to simply return. kvm_mmu_post_init_vm() by name is far more than > nx_hugepage stuff, though at present only this stuff in. > I would rather > > if (cpu_has_itlb_multihit()) { > ... > } > > Consider people in the future when they do modifications on this function. > > err = kvm_vm_create_worker_thread(kvm, kvm_nx_huge_page_recovery_worker, 0, > > "kvm-nx-lpage-recovery", > > &kvm->arch.nx_huge_page_recovery_thread); > > @@ -6982,6 +6998,9 @@ int kvm_mmu_post_init_vm(struct kvm *kvm) > > > > void kvm_mmu_pre_destroy_vm(struct kvm *kvm) > > { > > + if (!cpu_has_itlb_multihit()) > > + return; > > Ditto. It looks (wrongly) like: if !cpu_has_itlb_multihit(), no need to do > anything about pre_destroy_vm. > > > if (kvm->arch.nx_huge_page_recovery_thread) > > kthread_stop(kvm->arch.nx_huge_page_recovery_thread); > > } > > To summary, regardless of the concrete patch/implementation, what Sean more > urgently needs is real world justification to mitigate NX_hugepage; which I > believe you have at hand: why would you like to do this, what real world > issue caused by this bothers you. You could have more descriptions. > > With regards to NX_hugepage, I see people dislike it [1][2][3], but on HW > with itlb_multihit, they've no choice but to use it to mitigate. > > [1] this patch > [2] > https://lore.kernel.org/kvm/CANZk6aSv5ta3emitOfWKxaB-JvURBVu-sXqFnCz9PKXhqjbV9w@xxxxxxxxxxxxxx/ > [3] > https://lore.kernel.org/kvm/20220613212523.3436117-1-bgardon@xxxxxxxxxx/ > (merged) -- —————————— zhuangel570 ——————————