Re: [PATCH v8 3/7] KVM: Support dirty ring in conjunction with bitmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 06 Nov 2022 21:40:49 +0000,
Gavin Shan <gshan@xxxxxxxxxx> wrote:
> 
> Hi Marc,
> 
> On 11/6/22 11:43 PM, Marc Zyngier wrote:
> > On Fri, 04 Nov 2022 23:40:45 +0000,
> > Gavin Shan <gshan@xxxxxxxxxx> wrote:
> >> 
> >> ARM64 needs to dirty memory outside of a VCPU context when VGIC/ITS is
> >> enabled. It's conflicting with that ring-based dirty page tracking always
> >> requires a running VCPU context.
> >> 
> >> Introduce a new flavor of dirty ring that requires the use of both VCPU
> >> dirty rings and a dirty bitmap. The expectation is that for non-VCPU
> >> sources of dirty memory (such as the VGIC/ITS on arm64), KVM writes to
> >> the dirty bitmap. Userspace should scan the dirty bitmap before migrating
> >> the VM to the target.
> >> 
> >> Use an additional capability to advertise this behavior. The newly added
> >> capability (KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP) can't be enabled before
> >> KVM_CAP_DIRTY_LOG_RING_ACQ_REL on ARM64. In this way, the newly added
> >> capability is treated as an extension of KVM_CAP_DIRTY_LOG_RING_ACQ_REL.
> >> 
> >> Suggested-by: Marc Zyngier <maz@xxxxxxxxxx>
> >> Suggested-by: Peter Xu <peterx@xxxxxxxxxx>
> >> Co-developed-by: Oliver Upton <oliver.upton@xxxxxxxxx>
> >> Signed-off-by: Oliver Upton <oliver.upton@xxxxxxxxx>
> >> Signed-off-by: Gavin Shan <gshan@xxxxxxxxxx>
> >> Acked-by: Peter Xu <peterx@xxxxxxxxxx>
> >> ---
> >>   Documentation/virt/kvm/api.rst | 33 ++++++++++++++++++-----
> >>   include/linux/kvm_dirty_ring.h |  7 +++++
> >>   include/linux/kvm_host.h       |  1 +
> >>   include/uapi/linux/kvm.h       |  1 +
> >>   virt/kvm/Kconfig               |  8 ++++++
> >>   virt/kvm/dirty_ring.c          | 10 +++++++
> >>   virt/kvm/kvm_main.c            | 49 +++++++++++++++++++++++++++-------
> >>   7 files changed, 93 insertions(+), 16 deletions(-)
> >> 
> >> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> >> index eee9f857a986..2ec32bd41792 100644
> >> --- a/Documentation/virt/kvm/api.rst
> >> +++ b/Documentation/virt/kvm/api.rst
> >> @@ -8003,13 +8003,6 @@ flushing is done by the KVM_GET_DIRTY_LOG ioctl).  To achieve that, one
> >>   needs to kick the vcpu out of KVM_RUN using a signal.  The resulting
> >>   vmexit ensures that all dirty GFNs are flushed to the dirty rings.
> >>   -NOTE: the capability KVM_CAP_DIRTY_LOG_RING and the
> >> corresponding
> >> -ioctl KVM_RESET_DIRTY_RINGS are mutual exclusive to the existing ioctls
> >> -KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG.  After enabling
> >> -KVM_CAP_DIRTY_LOG_RING with an acceptable dirty ring size, the virtual
> >> -machine will switch to ring-buffer dirty page tracking and further
> >> -KVM_GET_DIRTY_LOG or KVM_CLEAR_DIRTY_LOG ioctls will fail.
> >> -
> >>   NOTE: KVM_CAP_DIRTY_LOG_RING_ACQ_REL is the only capability that
> >>   should be exposed by weakly ordered architecture, in order to indicate
> >>   the additional memory ordering requirements imposed on userspace when
> >> @@ -8018,6 +8011,32 @@ Architecture with TSO-like ordering (such as x86) are allowed to
> >>   expose both KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LOG_RING_ACQ_REL
> >>   to userspace.
> >>   +After using the dirty rings, the userspace needs to detect the
> >> capability
> > 
> > using? or enabling? What comes after suggest the latter.
> > 
> 
> s/using/enabling in next revision :)
> 
> >> +of KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP to see whether the ring structures
> >> +need to be backed by per-slot bitmaps. With this capability advertised
> >> +and supported, it means the architecture can dirty guest pages without
> > 
> > If it is advertised, it is supported, right?
> > 
> 
> Yes, s/advertised and supported/advertised in next revision.
> 
> >> +vcpu/ring context, so that some of the dirty information will still be
> >> +maintained in the bitmap structure. KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP
> >> +can't be enabled until the capability of KVM_CAP_DIRTY_LOG_RING_ACQ_REL
> >> +has been enabled.
> >> +
> >> +Note that the bitmap here is only a backup of the ring structure, and
> >> +normally should only contain a very small amount of dirty pages, which
> > 
> > I don't think we can claim this. It is whatever amount of memory is
> > dirtied outside of a vcpu context, and we shouldn't make any claim
> > regarding the number of dirty pages.
> > 
> 
> It's the pre-requisite to use the backup bitmap. Otherwise, the guest
> will experience long down-time during migration, as mentioned by Peter
> in another thread. So it's appropriate to mention the limit of dirty
> pages here.

See my alternative wording for this in the other sub-thread.

> 
> >> +needs to be transferred during VM downtime. Collecting the dirty bitmap
> >> +should be the very last thing that the VMM does before transmitting state
> >> +to the target VM. VMM needs to ensure that the dirty state is final and
> >> +avoid missing dirty pages from another ioctl ordered after the bitmap
> >> +collection.
> >> +
> >> +To collect dirty bits in the backup bitmap, the userspace can use the
> >> +same KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG shouldn't be needed
> >> +and its behavior is undefined since collecting the dirty bitmap always
> >> +happens in the last phase of VM's migration.
> > 
> > It isn't clear to me why KVM_CLEAR_DIRTY_LOG should be called out. If
> > you have multiple devices that dirty the memory, such as multiple
> > ITSs, why shouldn't userspace be allowed to snapshot the dirty state
> > multiple time? This doesn't seem like a reasonable restriction, and I
> > really dislike the idea of undefined behaviour here.
> > 
> 
> It was actually documenting the expected QEMU's usage. With QEMU
> excluded, KVM_CLEAR_DIRTY_LOG can be used as usual. Undefined behavior
> seems not precise here. We can improve it like below, to avoid talking
> about 'undefined behaviour'.
> 
>   To collect dirty bits in the backup bitmap, the userspace can use the
>   same KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG shouldn't be needed
>   since collecting the dirty bitmap always happens in the last phase of
>   VM's migration.

That's better, but the "shouldn't be needed" wording makes things
ambiguous, and we shouldn't mention migration at all (this is not the
only purpose of this API). I'd suggest this:

   To collect dirty bits in the backup bitmap, userspace can use the
   same KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG isn't needed as
   long as all the generation of the dirty bits is done in a single
   pass.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux