Re: [PATCH v8 3/7] KVM: Support dirty ring in conjunction with bitmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 07 Nov 2022 10:45:34 +0000,
Gavin Shan <gshan@xxxxxxxxxx> wrote:
> 
> Hi Marc, Peter, Oliver and Sean,
> 
> On 11/5/22 7:40 AM, Gavin Shan wrote:
> > ARM64 needs to dirty memory outside of a VCPU context when VGIC/ITS is
> > enabled. It's conflicting with that ring-based dirty page tracking always
> > requires a running VCPU context.
> > 
> > Introduce a new flavor of dirty ring that requires the use of both VCPU
> > dirty rings and a dirty bitmap. The expectation is that for non-VCPU
> > sources of dirty memory (such as the VGIC/ITS on arm64), KVM writes to
> > the dirty bitmap. Userspace should scan the dirty bitmap before migrating
> > the VM to the target.
> > 
> > Use an additional capability to advertise this behavior. The newly added
> > capability (KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP) can't be enabled before
> > KVM_CAP_DIRTY_LOG_RING_ACQ_REL on ARM64. In this way, the newly added
> > capability is treated as an extension of KVM_CAP_DIRTY_LOG_RING_ACQ_REL.
> > 
> > Suggested-by: Marc Zyngier <maz@xxxxxxxxxx>
> > Suggested-by: Peter Xu <peterx@xxxxxxxxxx>
> > Co-developed-by: Oliver Upton <oliver.upton@xxxxxxxxx>
> > Signed-off-by: Oliver Upton <oliver.upton@xxxxxxxxx>
> > Signed-off-by: Gavin Shan <gshan@xxxxxxxxxx>
> > Acked-by: Peter Xu <peterx@xxxxxxxxxx>
> > ---
> >   Documentation/virt/kvm/api.rst | 33 ++++++++++++++++++-----
> >   include/linux/kvm_dirty_ring.h |  7 +++++
> >   include/linux/kvm_host.h       |  1 +
> >   include/uapi/linux/kvm.h       |  1 +
> >   virt/kvm/Kconfig               |  8 ++++++
> >   virt/kvm/dirty_ring.c          | 10 +++++++
> >   virt/kvm/kvm_main.c            | 49 +++++++++++++++++++++++++++-------
> >   7 files changed, 93 insertions(+), 16 deletions(-)
> > 
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index eee9f857a986..2ec32bd41792 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -8003,13 +8003,6 @@ flushing is done by the KVM_GET_DIRTY_LOG ioctl).  To achieve that, one
> >   needs to kick the vcpu out of KVM_RUN using a signal.  The resulting
> >   vmexit ensures that all dirty GFNs are flushed to the dirty rings.
> >   -NOTE: the capability KVM_CAP_DIRTY_LOG_RING and the corresponding
> > -ioctl KVM_RESET_DIRTY_RINGS are mutual exclusive to the existing ioctls
> > -KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG.  After enabling
> > -KVM_CAP_DIRTY_LOG_RING with an acceptable dirty ring size, the virtual
> > -machine will switch to ring-buffer dirty page tracking and further
> > -KVM_GET_DIRTY_LOG or KVM_CLEAR_DIRTY_LOG ioctls will fail.
> > -
> >   NOTE: KVM_CAP_DIRTY_LOG_RING_ACQ_REL is the only capability that
> >   should be exposed by weakly ordered architecture, in order to indicate
> >   the additional memory ordering requirements imposed on userspace when
> > @@ -8018,6 +8011,32 @@ Architecture with TSO-like ordering (such as x86) are allowed to
> >   expose both KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LOG_RING_ACQ_REL
> >   to userspace.
> >   +After using the dirty rings, the userspace needs to detect the
> > capability
> > +of KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP to see whether the ring structures
> > +need to be backed by per-slot bitmaps. With this capability advertised
> > +and supported, it means the architecture can dirty guest pages without
> > +vcpu/ring context, so that some of the dirty information will still be
> > +maintained in the bitmap structure. KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP
> > +can't be enabled until the capability of KVM_CAP_DIRTY_LOG_RING_ACQ_REL
> > +has been enabled.
> > +
> > +Note that the bitmap here is only a backup of the ring structure, and
> > +normally should only contain a very small amount of dirty pages, which
> > +needs to be transferred during VM downtime. Collecting the dirty bitmap
> > +should be the very last thing that the VMM does before transmitting state
> > +to the target VM. VMM needs to ensure that the dirty state is final and
> > +avoid missing dirty pages from another ioctl ordered after the bitmap
> > +collection.
> > +
> > +To collect dirty bits in the backup bitmap, the userspace can use the
> > +same KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG shouldn't be needed
> > +and its behavior is undefined since collecting the dirty bitmap always
> > +happens in the last phase of VM's migration.
> > +
> > +NOTE: One example of using the backup bitmap is saving arm64 vgic/its
> > +tables through KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on
> > +KVM device "kvm-arm-vgic-its" during VM's migration.
> > +
> 
> In order to speed up the review and reduce unnecessary respins. After
> collecting comments on PATCH[v8 3/7] from Marc and Peter, I would change
> above description as below. Could you please confirm it looks good to you?
> 
> In the 4th paragraph, the words starting from "Collecting the dirty bitmap..."
> to the end, was previously suggested by Oliver, even Marc suggested to avoid
> mentioning "migration".
> 
>   After enabling the dirty rings, the userspace needs to detect the
>   capability of KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP to see whether the ring
>   structures need to be backed by per-slot bitmaps. With this capability

s/need/can/. If there was a *need*, it should happen automatically
without user intervention.

>   advertised, it means the architecture can dirty guest pages without
>   vcpu/ring context, so that some of the dirty information will still be
>   maintained in the bitmap structure. KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP
>   can't be enabled if the capability of KVM_CAP_DIRTY_LOG_RING_ACQ_REL
>   hasn't been enabled, or any memslot has been existing.
> 
>   Note that the bitmap here is only a backup of the ring structure. The
>   use of the ring and bitmap combination is only beneficial if there is
>   only a very small amount of memory that is dirtied out of vcpu/ring
>   context. Otherwise, the stand-alone per-slot bitmap mechanism needs to
>   be considered.
> 
>   To collect dirty bits in the backup bitmap, userspace can use the same
>   KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG isn't needed as long as all
>   the generation of the dirty bits is done in a single pass. Collecting
>   the dirty bitmap should be the very last thing that the VMM does before
>   transmitting state to the target VM. VMM needs to ensure that the dirty
>   state is final and avoid missing dirty pages from another ioctl ordered
>   after the bitmap collection.

I would replace "transmitting state to the target VM" with
"considering the state as complete", as I still object to casting this
API into the migration mold. People use this stuff more far more than
migration (checkpointing, for example).

> 
>   NOTE: One example of using the backup bitmap is saving arm64 vgic/its
>   tables through KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on
>   KVM device "kvm-arm-vgic-its" during VM's migration.

Same remark about migration.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux