On Mon, May 01, 2023, Jim Mattson wrote:
> On Mon, May 1, 2023 at 7:51 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Sat, Apr 29, 2023, Robert Hoo wrote:
> > > On 4/27/2023 8:38 PM, zhuangel570 wrote:
> > > > - kvm_vm_create_worker_thread introduce tail latency more than 100ms.
> > > >   This function was called when create "kvm-nx-lpage-recovery" kthread when
> > > >   create a new VM, this patch was introduced to recovery large page to relief
> > > >   performance loss caused by software mitigation of ITLB_MULTIHIT, see
> > > >   b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation") and 1aa9b9572b10
> > > >   ("kvm: x86: mmu: Recovery of shattered NX large pages").
> > > >
> > > Yes, this kthread is for NX-HugePage feature and NX-HugePage in turn is to
> > > SW mitigate itlb-multihit issue.
> > > However, HW level mitigation has been available for quite a while, you can
> > > check "/sys/devices/system/cpu/vulnerabilities/itlb_multihit" for your
> > > system's mitigation status.
> > > I believe most recent Intel CPUs have this HW mitigated (check
> > > MSR_ARCH_CAPABILITIES::IF_PSCHANGE_MC_NO), let alone non-Intel CPUs.
> > > But, the kvm_vm_create_worker_thread is still created anyway, nonsense I
> > > think. I previously had a internal patch getting rid of it but didn't get a
> > > chance to send out.
> >
> > For the NX hugepage mitigation, I think it makes sense to restart the discussion
> > in the context of this thread: https://lore.kernel.org/all/ZBxf+ewCimtHY2XO@xxxxxxxxxx
> >
> > TL;DR: I am open to providing an option to hard disable the mitigation, but there
> > needs to be sufficient justification, e.g. that the above 100ms latency is a
> > problem for real world deployments.
>
> Whatever became of
> https://lore.kernel.org/kvm/20220613212523.3436117-1-bgardon@xxxxxxxxxx/?

That's merged, but disabling the mitigation for a single VM doesn't stop the worker thread (arguably that's a bug), let alone prevent creation of the worker in the first place as KVM spawns the worker before the VM is exposed to userspace. I.e. there's no way for userspace to say "don't spawn workers, the NX hugepage mitigation will *never* be enabled".
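
For anyone following along, the sysfs status file mentioned in the quoted text
can be checked from userspace. Below is a minimal sketch in C, not code from
KVM or from any patch in this thread. The helper names read_status_line() and
itlb_multihit_not_affected() are hypothetical, and the sketch assumes that an
unaffected CPU reports the string "Not affected" in that file. Note that, per
the discussion above, a "Not affected" result does not stop KVM from spawning
the recovery worker today.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style status file into buf and strip the
 * trailing newline.  Returns 0 on success, -1 if the file cannot be opened
 * or is empty.  (Hypothetical helper, for illustration only.)
 */
int read_status_line(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Nonzero if the status string says the CPU is not affected by
 * ITLB_MULTIHIT.  Assumes the sysfs wording "Not affected".
 */
int itlb_multihit_not_affected(const char *status)
{
	return strcmp(status, "Not affected") == 0;
}
```

A caller would pass "/sys/devices/system/cpu/vulnerabilities/itlb_multihit" to
read_status_line() and feed the resulting string to
itlb_multihit_not_affected(). Any other string (or a read failure) should be
treated as "possibly affected".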