From: Sean Christopherson <seanjc@xxxxxxxxxx> commit 2867eb782cf7f64c2ac427596133b6f9c3f64b7a upstream. Apply make_spte()'s optimization to skip trying to unsync shadow pages if and only if the old SPTE was a leaf SPTE, as non-leaf SPTEs in direct MMUs are always writable, i.e. could trigger a false positive and incorrectly lead to KVM creating a SPTE without write-protecting or marking shadow pages unsync. This bug only affects the TDP MMU, as the shadow MMU only overwrites a shadow-present SPTE when synchronizing SPTEs (and only 4KiB SPTEs can be unsync). Specifically, mmu_set_spte() drops any non-leaf SPTEs *before* calling make_spte(), whereas the TDP MMU can do a direct replacement of a page table with the leaf SPTE. Opportunistically update the comment to explain why skipping the unsync stuff is safe, as opposed to simply saying "it's someone else's problem". Cc: stable@xxxxxxxxxxxxxxx Tested-by: Alex Bennée <alex.bennee@xxxxxxxxxx> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> Tested-by: Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> Message-ID: <20241010182427.1434605-5-seanjc@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- arch/x86/kvm/mmu/spte.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) --- a/arch/x86/kvm/mmu/spte.c +++ b/arch/x86/kvm/mmu/spte.c @@ -194,12 +194,20 @@ bool make_spte(struct kvm_vcpu *vcpu, st spte |= PT_WRITABLE_MASK | shadow_mmu_writable_mask; /* - * Optimization: for pte sync, if spte was writable the hash - * lookup is unnecessary (and expensive). Write protection - * is responsibility of kvm_mmu_get_page / kvm_mmu_sync_roots. - * Same reasoning can be applied to dirty page accounting. + * When overwriting an existing leaf SPTE, and the old SPTE was + * writable, skip trying to unsync shadow pages as any relevant + * shadow pages must already be unsync, i.e. the hash lookup is + * unnecessary (and expensive). + * + * The same reasoning applies to dirty page/folio accounting; + * KVM will mark the folio dirty using the old SPTE, thus + * there's no need to immediately mark the new SPTE as dirty. + * + * Note, both cases rely on KVM not changing PFNs without first + * zapping the old SPTE, which is guaranteed by both the shadow + * MMU and the TDP MMU. */ - if (is_writable_pte(old_spte)) + if (is_last_spte(old_spte, level) && is_writable_pte(old_spte)) goto out; /* Patches currently in stable-queue which might be from seanjc@xxxxxxxxxx are queue-6.1/kvm-x86-mmu-skip-the-try-unsync-path-iff-the-old-spte-was-a-leaf-spte.patch