On Wed, Jul 12, 2023, Like Xu wrote:
> On 2023/6/15 03:07, Sean Christopherson wrote:
> > On Wed, Jun 14, 2023, Luiz Capitulino wrote:
> > > > Applied to kvm-x86 mmu. I kept the default as "auto" for now, as that can go on
> > > > top and I don't want to introduce that change this late in the cycle. If no one
> > > > beats me to the punch (hint, hint ;-) ), I'll post a patch to make "never" the
> > > > default for unaffected hosts so that we can discuss/consider that change for 6.6.
> > >
> > > Thanks Sean, I agree with the plan. I could give a try on the patch if you'd like.
> >
> > Yes please, thanks!
>
> As a KVM/x86 *feature*, playing with splitting and reconstructing large
> pages have other potential user scenarios, e.g. for performance test
> comparisons in a easier approach, not just for itlb_multihit mitigation.

Enabling and disabling dirty logging is a far better tool for that, as it gives
userspace much more explicit control over which pages are split/reconstituted,
and when.

> On unaffected machines (ICX and later), nx_huge_pages is already "N",
> and turning it into "never" doesn't help materially in the mitigation
> implementation, but loses flexibility.

I'm becoming more and more convinced that losing the flexibility is perfectly
acceptable. There's a very good argument to be made that mitigating DoS attacks
from the guest kernel should be done several levels up, e.g. by refusing to create
VMs for a customer that is bringing down hosts. As Jim has pointed out, plugging
the hole only works if you are 100% confident there are no other holes, and that
there never will be other holes.

> IMO, the real issue here is that the kernel thread "kvm-nx-lpage-
> recovery" is created unconditionally. We also need to be aware of the
> existence of this commit 084cc29f8bbb ("KVM: x86/MMU: Allow NX huge
> pages to be disabled on a per-vm basis").
>
> One of the technical proposals is to defer kvm_vm_create_worker_thread()
> to kvm_mmu_create() or kvm_init_mmu(), based on
> kvm->arch.disable_nx_huge_pages, even until guest paging mode is enabled
> on the first vcpu.
>
> Is this step worth taking ?

IMO, no. In hindsight, adding KVM_CAP_VM_DISABLE_NX_HUGE_PAGES was likely a
mistake; requiring CAP_SYS_BOOT makes it annoyingly difficult to safely use the
capability. My preference at this point is to make changes to the NX hugepage
mitigation only when there is a substantial benefit to an already-deployed usecase.