Re: [PATCH v2 11/27] KVM: x86/mmu: Zap only the relevant pages when removing a memslot

On Tue, Aug 13, 2019 at 10:04:58AM -0600, Alex Williamson wrote:
> On Tue,  5 Feb 2019 13:01:21 -0800
> Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
> 
> > Modify kvm_mmu_invalidate_zap_pages_in_memslot(), a.k.a. the x86 MMU's
> > handler for kvm_arch_flush_shadow_memslot(), to zap only the pages/PTEs
> > that actually belong to the memslot being removed.  This improves
> > performance, especially when the deleted memslot has only a few shadow
> > entries, or even no entries.  E.g. a microbenchmark to access regular
> > memory while concurrently reading PCI ROM to trigger memslot deletion
> > showed a 5% improvement in throughput.
> > 
> > Cc: Xiao Guangrong <guangrong.xiao@xxxxxxxxx>
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> > ---
> >  arch/x86/kvm/mmu.c | 33 ++++++++++++++++++++++++++++++++-
> >  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> A number of vfio users are reporting VM instability issues since v5.1,
> some have traced it back to this commit 4e103134b862 ("KVM: x86/mmu: Zap
> only the relevant pages when removing a memslot"), which I've confirmed
> via bisection of the 5.1 merge window KVM pull (636deed6c0bc) and
> re-verified on current 5.3-rc4 using the below patch to toggle the
> broken behavior.
> 
> My reproducer is a Windows 10 VM with assigned GeForce GPU running a
> variety of tests, including FurMark and PassMark Performance Test.
> > With the code enabled as it exists upstream currently, PassMark will
> generally introduce graphics glitches or hangs.  Sometimes it's
> necessary to reboot the VM to see these issues.

As in, the issue only shows up when the VM is rebooted?  Just want to
double check that that's not a typo.

> Flipping the 0/1 in the below patch appears to resolve the issue.
> 
> I'd appreciate any insights into further debugging this block of code
> so that we can fix this regression.  Thanks,

If it's not too painful to reproduce, I'd say start by determining whether
it's a problem with the basic logic or if the cond_resched_lock() handling
is wrong.  I.e. comment/ifdef out this chunk:

		if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
			kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush);
			flush = false;
			cond_resched_lock(&kvm->mmu_lock);
		}
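
For example (untested sketch), the same #if 0 toggle from your debug patch
could be scoped to just that chunk:

#if 0		/* untested: skip only the yield/flush handling, keep the zap logic */
		if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
			kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush);
			flush = false;
			cond_resched_lock(&kvm->mmu_lock);
		}
#endif

If the glitches/hangs go away with only that chunk disabled, the bug is in the
resched/flush handling rather than in the per-memslot zap logic itself.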


> 
> Alex
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 24843cf49579..6a77306940f7 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5653,6 +5653,9 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
>  			struct kvm_memory_slot *slot,
>  			struct kvm_page_track_notifier_node *node)
>  {
> +#if 0
> +	kvm_mmu_zap_all(kvm);
> +#else
>  	struct kvm_mmu_page *sp;
>  	LIST_HEAD(invalid_list);
>  	unsigned long i;
> @@ -5685,6 +5688,7 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
>  
>  out_unlock:
>  	spin_unlock(&kvm->mmu_lock);
> +#endif
>  }
>  
>  void kvm_mmu_init_vm(struct kvm *kvm)


