On Fri, 5 Mar 2021 at 09:12, Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > Check the validity of the PDPTRs before allocating any of the PAE roots, > otherwise a bad PDPTR will cause KVM to leak any previously allocated > roots. > > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > --- > arch/x86/kvm/mmu/mmu.c | 20 ++++++++++++++------ > 1 file changed, 14 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > index 7ebfbc77b050..9fc2b46f8541 100644 > --- a/arch/x86/kvm/mmu/mmu.c > +++ b/arch/x86/kvm/mmu/mmu.c > @@ -3269,7 +3269,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) > static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) > { > struct kvm_mmu *mmu = vcpu->arch.mmu; > - u64 pdptr, pm_mask; > + u64 pdptrs[4], pm_mask; > gfn_t root_gfn, root_pgd; > hpa_t root; > int i; > @@ -3280,6 +3280,17 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) > if (mmu_check_root(vcpu, root_gfn)) > return 1; > > + if (mmu->root_level == PT32E_ROOT_LEVEL) { > + for (i = 0; i < 4; ++i) { > + pdptrs[i] = mmu->get_pdptr(vcpu, i); > + if (!(pdptrs[i] & PT_PRESENT_MASK)) > + continue; > + > + if (mmu_check_root(vcpu, pdptrs[i] >> PAGE_SHIFT)) > + return 1; > + } > + } > + I saw this splatting: BUG: sleeping function called from invalid context at arch/x86/kvm/kvm_cache_regs.h:115 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3090, name: qemu-system-x86 3 locks held by qemu-system-x86/3090: #0: ffff8d499f8d00d0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x8e/0x680 [kvm] #1: ffffb1b540f873e8 (&kvm->srcu){....}-{0:0}, at: vcpu_enter_guest+0xb30/0x1b10 [kvm] #2: ffffb1b540f7d018 (&(kvm)->mmu_lock){+.+.}-{2:2}, at: kvm_mmu_load+0xb5/0x540 [kvm] Preemption disabled at: [<ffffffffc0787365>] kvm_mmu_load+0xb5/0x540 [kvm] CPU: 2 PID: 3090 Comm: qemu-system-x86 Tainted: G W OE 5.12.0-rc3+ #3 Call Trace: dump_stack+0x87/0xb7 ___might_sleep+0x202/0x250 __might_sleep+0x4a/0x80 kvm_pdptr_read+0x20/0x60 [kvm] kvm_mmu_load+0x3bd/0x540 [kvm] vcpu_enter_guest+0x1297/0x1b10 [kvm] kvm_arch_vcpu_ioctl_run+0x372/0x690 [kvm] kvm_vcpu_ioctl+0x3ca/0x680 [kvm] __x64_sys_ioctl+0x27a/0x800 do_syscall_64+0x38/0x50 entry_SYSCALL_64_after_hwframe+0x44/0xae There is a might_sleep() in kvm_pdptr_read(), however, the original commit didn't explain more. I can send a formal one if the below fix is acceptable. diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index efb41f3..f33026f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3289,17 +3289,24 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) if (mmu_check_root(vcpu, root_gfn)) return 1; + write_unlock(&vcpu->kvm->mmu_lock); if (mmu->root_level == PT32E_ROOT_LEVEL) { for (i = 0; i < 4; ++i) { pdptrs[i] = mmu->get_pdptr(vcpu, i); if (!(pdptrs[i] & PT_PRESENT_MASK)) continue; - if (mmu_check_root(vcpu, pdptrs[i] >> PAGE_SHIFT)) + if (mmu_check_root(vcpu, pdptrs[i] >> PAGE_SHIFT)) { + write_lock(&vcpu->kvm->mmu_lock); return 1; + } } } + write_lock(&vcpu->kvm->mmu_lock); + if (make_mmu_pages_available(vcpu)) + return -ENOSPC; + /* * Do we shadow a long mode page table? If so we need to * write-protect the guests page table root.