Re: [PATCH v6 3/8] KVM: Add support for using dirty ring in conjunction with bitmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 25, 2022, Marc Zyngier wrote:
> On Tue, 25 Oct 2022 01:24:19 +0100, Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
> > > That's why I asked if it's possible for KVM to require a dirty_bitmap when KVM
> > > might end up collecting dirty information without a vCPU.  KVM is still
> > > technically prescribing a solution to userspace, but only because there's only
> > > one solution.
> > 
> > I was trying to allude to something like this by flat-out requiring
> > ring + bitmap on arm64.
> 
> And I claim that this is wrong. It may suit a particular use case, but
> that's definitely not a universal truth.

Agreed, KVM should not unconditionally require a dirty bitmap for arm64.

> > Otherwise, we'd either need to:
> > 
> >  (1) Document the features that explicitly depend on ring + bitmap (i.e.
> >  GIC ITS, whatever else may come) such that userspace sets up the
> >  correct configuration based on what its using. The combined likelihood
> >  of both KVM and userspace getting this right seems low.
> 
> But what is there to get wrong? Absolutely nothing.

I strongly disagree.  On x86, we've had two bugs escape where KVM attempted to
mark a page dirty without an active vCPU.

  2efd61a608b0 ("KVM: Warn if mark_page_dirty() is called without an active vCPU") 
  42dcbe7d8bac ("KVM: x86: hyper-v: Avoid writing to TSC page without an active vCPU")

Call us incompetent, but I have zero confidence that KVM will never unintentionally
add a path that invokes mark_page_dirty_in_slot() without a running vCPU.

By completely dropping the rule that KVM must have an active vCPU on architectures
that support ring+bitmap, those types of bugs will go silently unnoticed, and will
manifest as guest data corruption after live migration.

And ideally such bugs would detected without relying on userspace to enabling
dirty logging, e.g. the Hyper-V bug lurked for quite some time and was only found
when mark_page_dirty_in_slot() started WARNing.

I'm ok if arm64 wants to let userspace shoot itself in the foot with the ITS, but
I'm not ok dropping the protections in the common mark_page_dirty_in_slot().

One somewhat gross idea would be to let architectures override the "there must be
a running vCPU" rule, e.g. arm64 could toggle a flag in kvm->arch in its
kvm_write_guest_lock() to note that an expected write without a vCPU is in-progress:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8c5c69ba47a7..d1da8914f749 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3297,7 +3297,10 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
        struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
 
 #ifdef CONFIG_HAVE_KVM_DIRTY_RING
-       if (WARN_ON_ONCE(!vcpu) || WARN_ON_ONCE(vcpu->kvm != kvm))
+       if (!kvm_arch_allow_write_without_running_vcpu(kvm) && WARN_ON_ONCE(!vcpu))
+               return;
+
+       if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm))
                return;
 #endif
 
@@ -3305,10 +3308,10 @@ void mark_page_dirty_in_slot(struct kvm *kvm,
                unsigned long rel_gfn = gfn - memslot->base_gfn;
                u32 slot = (memslot->as_id << 16) | memslot->id;
 
-               if (kvm->dirty_ring_size)
+               if (kvm->dirty_ring_size && vcpu)
                        kvm_dirty_ring_push(&vcpu->dirty_ring,
                                            slot, rel_gfn);
-               else
+               else if (memslot->dirty_bitmap)
                        set_bit_le(rel_gfn, memslot->dirty_bitmap);
        }
 }




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux