On Sat, Apr 29, 2023, Robert Hoo wrote:
> On 4/27/2023 8:38 PM, zhuangel570 wrote:
> > - kvm_vm_create_worker_thread introduce tail latency more than 100ms.
> >   This function was called when create "kvm-nx-lpage-recovery" kthread when
> >   create a new VM, this patch was introduced to recovery large page to relief
> >   performance loss caused by software mitigation of ITLB_MULTIHIT, see
> >   b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation") and 1aa9b9572b10
> >   ("kvm: x86: mmu: Recovery of shattered NX large pages").
> >
> Yes, this kthread is for NX-HugePage feature and NX-HugePage in turn is to
> SW mitigate itlb-multihit issue.
> However, HW level mitigation has been available for quite a while, you can
> check "/sys/devices/system/cpu/vulnerabilities/itlb_multihit" for your
> system's mitigation status.
> I believe most recent Intel CPUs have this HW mitigated (check
> MSR_ARCH_CAPABILITIES::IF_PSCHANGE_MC_NO), let alone non-Intel CPUs.
> But, the kvm_vm_create_worker_thread is still created anyway, nonsense I
> think. I previously had a internal patch getting rid of it but didn't get a
> chance to send out.

For the NX hugepage mitigation, I think it makes sense to restart the
discussion in the context of this thread:

  https://lore.kernel.org/all/ZBxf+ewCimtHY2XO@xxxxxxxxxx

TL;DR: I am open to providing an option to hard disable the mitigation, but
there needs to be sufficient justification, e.g. that the above 100ms latency
is a problem for real world deployments.

> As more and more old CPUs retires, I think NX-HugePage code will become more
> and more minority code path/situation, and be refactored out eventually one
> day.

Heh, yeah, one day.  But "one day" is likely 10+ years away.  Intel
discontinuing a CPU has practically zero relevance to KVM removing support for
that CPU, e.g. KVM still supports the original Core CPUs, which were launched
in 2006 and discontinued in 2008.
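
For anyone who wants to check the mitigation status mentioned in the quoted text, here is a minimal sketch that reads the sysfs file named there. The helper names (itlb_multihit_status, is_not_affected) are made up for illustration, and treating a leading "Not affected" as the unaffected case is an assumption about the kernel's status strings; verify it against the documentation for your kernel version.

```python
from pathlib import Path
from typing import Optional

# Path quoted in the thread; only present on kernels that report this issue.
ITLB_MULTIHIT = Path("/sys/devices/system/cpu/vulnerabilities/itlb_multihit")


def itlb_multihit_status(path: Path = ITLB_MULTIHIT) -> Optional[str]:
    """Return the kernel's one-line status string, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        # File missing (older kernel, non-x86, non-Linux) or not readable.
        return None


def is_not_affected(status: Optional[str]) -> bool:
    """True only if the status explicitly starts with "Not affected".

    Assumption: the kernel reports unaffected CPUs with this exact prefix.
    """
    return status is not None and status.startswith("Not affected")


if __name__ == "__main__":
    status = itlb_multihit_status()
    print(status if status is not None else "status unavailable")
```

This only reports what the kernel says about the CPU. It does not tell you whether KVM created the "kvm-nx-lpage-recovery" kthread for a given VM.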