On Wed, Sep 21, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>
> > WARN and kill the VM if KVM attempts to double count an NX huge page,
> > i.e. attempts to re-tag a shadow page with "NX huge page disallowed".
> > KVM does NX huge page accounting only when linking a new shadow page,
> > and it should be impossible for a new shadow page to be already
> > accounted.  E.g. even in the TDP MMU case, where vCPUs can race to
> > install a new shadow page, only the "winner" will account the
> > installed page.
> >
> > Kill the VM instead of continuing on as either KVM has an egregious
> > bug, e.g. didn't zero-initialize the data, or there's host data
> > corruption, in which case carrying on is dangerous, e.g. could cause
> > silent data corruption in the guest.
> >
> > Reported-by: David Matlack <dmatlack@xxxxxxxxxx>
> > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > Reviewed-by: Mingwei Zhang <mizhang@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 32b60a6b83bd..74afee3f2476 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -804,7 +804,7 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
> >
> >  void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
> >  {
> > -	if (sp->lpage_disallowed)
> > +	if (KVM_BUG_ON(sp->lpage_disallowed, kvm))
> >  		return;
> >
> >  	++kvm->stat.nx_lpage_splits;
>
> This patch (now in sean/for_paolo/6.1) causes nested Hyper-V guests to
> break early in the boot sequence but the fault is not
> Hyper-V-enlightenments related, e.g. even without them I see:

...

> [ 962.257992] ept_fetch+0x504/0x5a0 [kvm]
> [ 962.261959] ept_page_fault+0x2d7/0x300 [kvm]
> [ 962.287701] kvm_mmu_page_fault+0x258/0x290 [kvm]
> [ 962.292451] vmx_handle_exit+0xe/0x40 [kvm_intel]
> [ 962.297173] vcpu_enter_guest+0x665/0xfc0 [kvm]
> [ 962.307580] vcpu_run+0x33/0x250 [kvm]
> [ 962.311367] kvm_arch_vcpu_ioctl_run+0xf7/0x460 [kvm]
> [ 962.316456] kvm_vcpu_ioctl+0x271/0x670 [kvm]
> [ 962.320843] __x64_sys_ioctl+0x87/0xc0
> [ 962.324602] do_syscall_64+0x38/0x90
> [ 962.328192] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Ugh, past me completely forgot the basics of shadow paging[*].  The
shadow MMU can reuse existing shadow pages, whereas the TDP MMU always
links in new pages.  I got turned around by the "doesn't exist" check,
which only means "is there already a _SPTE_ here", not "is there an
existing SP for the target gfn+role that can be used".

I'll drop the series from the queue, send a new pull request, and spin
a v5 targeting 6.2, which amusingly will look a lot like v1...

Thanks for catching this!

[*] https://lore.kernel.org/all/Yt8uwMt%2F3JPrSWM9@xxxxxxxxxx
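For reference, a minimal, self-contained sketch of why the KVM_BUG_ON() is
wrong for the shadow MMU: because the shadow MMU can hand back an existing
SP for a given gfn+role, the accounting helper can legitimately be reached
twice for the same page, which the old early return silently tolerated but
KVM_BUG_ON() turns into a dead VM.  This is plain userspace C, not KVM code;
struct fake_sp and the helper name are made up purely for illustration.

#include <stdbool.h>
#include <stdio.h>

struct fake_sp {
	bool lpage_disallowed;
	long *nx_lpage_splits;
};

/* Mirrors the pre-patch behavior: silently tolerate double accounting. */
static void account_huge_nx_page_old(struct fake_sp *sp)
{
	if (sp->lpage_disallowed)
		return;

	++(*sp->nx_lpage_splits);
	sp->lpage_disallowed = true;
}

int main(void)
{
	long nx_lpage_splits = 0;
	struct fake_sp sp = { .nx_lpage_splits = &nx_lpage_splits };

	/* First fault: a "new" shadow page is linked and accounted. */
	account_huge_nx_page_old(&sp);

	/*
	 * Later fault resolving to the same gfn+role: the shadow MMU reuses
	 * the existing SP and accounts it again.  With the early return this
	 * is harmless; replacing it with KVM_BUG_ON() kills the VM, which is
	 * what broke the nested Hyper-V guest.  The TDP MMU never hits this
	 * because it only accounts freshly allocated pages.
	 */
	account_huge_nx_page_old(&sp);

	printf("nx_lpage_splits = %ld (still 1, double accounting avoided)\n",
	       nx_lpage_splits);
	return 0;
}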