Hi Oliver,

On 11/5/22 4:12 AM, Oliver Upton wrote:
> On Fri, Nov 04, 2022 at 02:57:15PM +0800, Gavin Shan wrote:
>> On 11/4/22 9:06 AM, Oliver Upton wrote:
>> [...]
>>> Just to make sure we're on the same page, there are two issues:
>>>
>>>   (1) If DIRTY_LOG_RING is enabled before memslot creation and
>>>       RING_WITH_BITMAP is enabled after memslots have been created w/
>>>       dirty logging enabled, memslot->dirty_bitmap == NULL and the
>>>       kernel will fault when attempting to save the ITS tables.
>>>
>>>   (2) Not your code, but a similar issue. If DIRTY_LOG_RING[_ACQ_REL] is
>>>       enabled after memslots have been created w/ dirty logging enabled,
>>>       memslot->dirty_bitmap != NULL and that memory is wasted until the
>>>       memslot is freed.
>>>
>>> I don't expect you to fix #2, though I've mentioned it because using the
>>> same approach to #1 and #2 would be nice.
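
To make the ordering in (1) concrete, here is a minimal userspace
sketch; vm_fd, ring_size, mem_size and hva are hypothetical placeholders
and error handling is omitted:

#include <linux/kvm.h>
#include <sys/ioctl.h>

static void reproduce_issue_1(int vm_fd, __u32 ring_size, __u64 mem_size,
			      void *hva)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_DIRTY_LOG_RING,
		.args[0] = ring_size,
	};
	struct kvm_userspace_memory_region region = {
		.slot = 0,
		.flags = KVM_MEM_LOG_DIRTY_PAGES,
		.guest_phys_addr = 0,
		.memory_size = mem_size,
		.userspace_addr = (__u64)hva,
	};

	/* Ring enabled first: memslots created from now on get no bitmap. */
	ioctl(vm_fd, KVM_ENABLE_CAP, &cap);

	/* Memslot created w/ dirty logging: memslot->dirty_bitmap == NULL. */
	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);

	/* Bitmap cap enabled too late: saving the ITS tables will fault. */
	cap.cap = KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP;
	cap.args[0] = 1;
	ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}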

>> Yes, I got your points. Case (2) can still happen with consumers other
>> than QEMU. However, QEMU always enables DIRTY_LOG_RING[_ACQ_REL] before
>> any memory slot is created. I agree that we need to ensure there are no
>> memory slots when DIRTY_LOG_RING[_ACQ_REL] is enabled.
>>
>> For case (1), we can ensure RING_WITH_BITMAP is enabled before any
>> memory slot is added, as below. QEMU needs a new helper (as above) to
>> enable it at the board level.
>>
>> Let's fix both with a new helper in PATCH[v8 4/9] like below?
>
> I agree that we should address (1) like this, but in (2) requiring that
> no memslots were created before enabling the existing capabilities would
> be a change in ABI. If we can get away with that, great, but otherwise
> we may need to delete the bitmaps associated with all memslots when the
> cap is enabled.

I had assumed that QEMU and kvm/selftests are the only consumers of
DIRTY_RING. In that case, requiring that no memslots exist when
DIRTY_RING is enabled won't break userspace. Following your thoughts,
the dirty pages tracked in the bitmap would also need to be synchronized
into the per-vcpu rings before the bitmaps could be destroyed, and the
per-vcpu rings don't exist at that stage.
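
To illustrate, deleting the bitmaps when the capability is enabled, as
you suggest for (2), would look roughly like the sketch below. This is
hypothetical and not part of the series; kvm_for_each_memslot() and
kvm_destroy_dirty_bitmap() are the existing kvm_main.c helpers. Note how
any dirty bits already recorded would simply be dropped:

static void kvm_destroy_all_dirty_bitmaps(struct kvm *kvm)
{
	struct kvm_memory_slot *memslot;
	struct kvm_memslots *slots;
	int i, bkt;

	lockdep_assert_held(&kvm->slots_lock);

	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
		slots = __kvm_memslots(kvm, i);
		kvm_for_each_memslot(memslot, bkt, slots) {
			/*
			 * Any bits set here are lost; they would have to
			 * be pushed into a per-vcpu ring first, and the
			 * rings don't exist at this point.
			 */
			if (memslot->dirty_bitmap)
				kvm_destroy_dirty_bitmap(memslot);
		}
	}
}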

>> static inline bool kvm_vm_has_memslot_pages(struct kvm *kvm)
>> {
>> 	bool has_memslot_pages;
>>
>> 	mutex_lock(&kvm->slots_lock);
>> 	has_memslot_pages = !!kvm->nr_memslot_pages;
>> 	mutex_unlock(&kvm->slots_lock);
>>
>> 	return has_memslot_pages;
>> }
>
> Do we need to build another helper for this? kvm_memslots_empty() will
> tell you whether or not a memslot has been created by checking the gfn
> tree.

The helper was introduced so it could be shared by the paths that enable
DIRTY_RING[_ACQ_REL] and DIRTY_RING_WITH_BITMAP. Since issue (2) isn't a
concern for us, let's put it aside; the helper isn't needed then.
kvm_memslots_empty() has the same effect as checking
'kvm->nr_memslot_pages', so it's fine to use kvm_memslots_empty(), which
is more generic.
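
For reference, as I read kvm_main.c, kvm_memslots_empty() boils down to
a check on the gfn tree (the caller is expected to hold the slots_lock):

static bool kvm_memslots_empty(struct kvm_memslots *slots)
{
	return RB_EMPTY_ROOT(&slots->gfn_tree);
}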

> On top of that, the memslot check and setting
> kvm->dirty_ring_with_bitmap must happen behind the slots_lock. Otherwise
> you could still wind up creating memslots w/o bitmaps.

Agree.

> Something like:
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 91cf51a25394..420cc101a16e 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4588,6 +4588,32 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
>  			return -EINVAL;
>
>  		return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]);
> +
> +	case KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP: {
> +		struct kvm_memslots *slots;
> +		int r = -EINVAL;
> +
> +		if (!IS_ENABLED(CONFIG_HAVE_KVM_DIRTY_RING_WITH_BITMAP) ||
> +		    !kvm->dirty_ring_size)
> +			return r;
> +
> +		mutex_lock(&kvm->slots_lock);
> +
> +		slots = kvm_memslots(kvm);
> +
> +		/*
> +		 * Avoid a race between memslot creation and enabling the ring +
> +		 * bitmap capability to guarantee that no memslots have been
> +		 * created without a bitmap.
> +		 */
> +		if (kvm_memslots_empty(slots)) {
> +			kvm->dirty_ring_with_bitmap = cap->args[0];
> +			r = 0;
> +		}
> +
> +		mutex_unlock(&kvm->slots_lock);
> +		return r;
> +	}
>  	default:
>  		return kvm_vm_ioctl_enable_cap(kvm, cap);
>  	}

The proposed changes look good to me. They will be integrated into
PATCH[v8 4/9].
By the way, v8 will be posted shortly.

Thanks,
Gavin