On Fri, Nov 04, 2022, Chao Peng wrote: > On Thu, Oct 27, 2022 at 11:29:14AM +0100, Fuad Tabba wrote: > > Hi, > > > > On Tue, Oct 25, 2022 at 4:19 PM Chao Peng <chao.p.peng@xxxxxxxxxxxxxxx> wrote: > > > > > > Currently in mmu_notifier validate path, hva range is recorded and then > > > checked against in the mmu_notifier_retry_hva() of the page fault path. > > > However, for the to be introduced private memory, a page fault may not > > > have a hva associated, checking gfn(gpa) makes more sense. > > > > > > For existing non private memory case, gfn is expected to continue to > > > work. The only downside is when aliasing multiple gfns to a single hva, > > > the current algorithm of checking multiple ranges could result in a much > > > larger range being rejected. Such aliasing should be uncommon, so the > > > impact is expected small. > > > > > > It also fixes a bug in kvm_zap_gfn_range() which has already been using > > > > nit: Now it's kvm_unmap_gfn_range(). > > Forgot to mention: the bug is still with kvm_zap_gfn_range(). It calls > kvm_mmu_invalidate_begin/end with a gfn range but before this series > kvm_mmu_invalidate_begin/end actually accept a hva range. Note it's > unrelated to whether we use kvm_zap_gfn_range() or kvm_unmap_gfn_range() > in the following patch (patch 05). Grr, in the future, if you find an existing bug, please send a patch. At the very least, report the bug. The APICv case that this was added for could very well be broken because of this, and the resulting failures would be an absolute nightmare to debug. Compile tested only... -- From: Sean Christopherson <seanjc@xxxxxxxxxx> Date: Fri, 4 Nov 2022 22:20:33 +0000 Subject: [PATCH] KVM: x86/mmu: Block all page faults during kvm_zap_gfn_range() When zapping a GFN range, pass 0 => ALL_ONES for the to-be-invalidated range to effectively block all page faults while the zap is in-progress. The invalidation helpers take a host virtual address, whereas zapping a GFN obviously provides a guest physical address and with the wrong unit of measurement (frame vs. byte). Alternatively, KVM could walk all memslots to get the associated HVAs, but thanks to SMM, that would require multiple lookups. And practically speaking, kvm_zap_gfn_range() usage is quite rare and not a hot path, e.g. MTRR and CR0.CD are almost guaranteed to be done only on vCPU0 during boot, and APICv inhibits are similarly infrequent operations. Fixes: edb298c663fc ("KVM: x86/mmu: bump mmu notifier count in kvm_zap_gfn_range") Cc: stable@xxxxxxxxxxxxxxx Cc: Maxim Levitsky <mlevitsk@xxxxxxxxxx> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> --- arch/x86/kvm/mmu/mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 6f81539061d6..1ccb769f62af 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6056,7 +6056,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) write_lock(&kvm->mmu_lock); - kvm_mmu_invalidate_begin(kvm, gfn_start, gfn_end); + kvm_mmu_invalidate_begin(kvm, 0, -1ul); flush = kvm_rmap_zap_gfn_range(kvm, gfn_start, gfn_end); @@ -6070,7 +6070,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) kvm_flush_remote_tlbs_with_address(kvm, gfn_start, gfn_end - gfn_start); - kvm_mmu_invalidate_end(kvm, gfn_start, gfn_end); + kvm_mmu_invalidate_end(kvm, 0, -1ul); write_unlock(&kvm->mmu_lock); } base-commit: c12879206e47730ff5ab255bbf625b28ade4028f --