Hi Sean,
On 11/8/22 12:05 AM, Sean Christopherson wrote:
On Sat, Nov 05, 2022, Gavin Shan wrote:
diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index fecbb7d75ad2..758679724447 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -21,6 +21,16 @@ u32 kvm_dirty_ring_get_rsvd_entries(void)
return KVM_DIRTY_RING_RSVD_ENTRIES + kvm_cpu_dirty_log_size();
}
+bool kvm_use_dirty_bitmap(struct kvm *kvm)
+{
lockdep_assert_held(&kvm->slots_lock);
To guard against accessing kvm->dirty_ring_with_bitmap without holding slots_lock.
Right, it makes sense to do so :)
+ return !kvm->dirty_ring_size || kvm->dirty_ring_with_bitmap;
+}
+
@@ -4588,6 +4594,31 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
return -EINVAL;
return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]);
+ case KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP: {
+ struct kvm_memslots *slots;
+ int r = -EINVAL;
+
+ if (!IS_ENABLED(CONFIG_HAVE_KVM_DIRTY_RING_WITH_BITMAP) ||
+ !kvm->dirty_ring_size)
+ return r;
+
+ mutex_lock(&kvm->slots_lock);
+
+ slots = kvm_memslots(kvm);
Sadly, this needs to iterate over all possible memslots thanks to x86's SMM
address space. Might be worth adding a separate helper (that's local to kvm_main.c
to discourage use), e.g.
static bool kvm_are_all_memslots_empty(struct kvm *kvm)
{
int i;
lockdep_assert_held(&kvm->slots_lock);
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
if (!kvm_memslots_empty(__kvm_memslots(kvm, i)))
return false;
}
return true;
}
Strictly speaking, x86 doesn't reach this point because we bail out
on !CONFIG_HAVE_KVM_DIRTY_RING_WITH_BITMAP. However, it's still nice
to make the check complete. Besides, the lockdep_assert_held()
duplicates the one inside __kvm_memslots(). I would move the whole
hunk of code into kvm_vm_ioctl_enable_dirty_log_ring_with_bitmap();
that way kvm_vm_ioctl_enable_cap_generic() at least stays clean.
+
+ /*
+ * Avoid a race between memslot creation and enabling the ring +
+ * bitmap capability to guarantee that no memslots have been
+ * created without a bitmap.
Nit, it's not just enabling, the below also allows disabling the bitmap. The
enabling case is definitely the most interesting, but the above wording makes it
sound like the enabling case is the only thing that's being given protection. That's
kinda true since KVM frees bitmaps without checking kvm_use_dirty_bitmap(), but
that's not a strict requirement.
And there's no race required, e.g. without this check userspace could simply create
a memslot and then toggle on the capability. Acquiring slots_lock above is what
guards against races.
Might also be worth alluding to the alternative solution of allocating the bitmap
for all memslots here, e.g. something like
/*
* For simplicity, allow toggling ring+bitmap if and only if
* there are no memslots, e.g. to ensure all memslots allocate a
* bitmap after the capability is enabled.
*/
Frankly, I don't expect the capability to be disabled. Similar to KVM_CAP_DIRTY_LOG_RING
or KVM_CAP_DIRTY_LOG_RING_ACQ_REL, it would be a one-shot capability where only enablement
is allowed. The ability to disable it was suggested by Oliver without a clear rationale,
even though I dropped it several times. I would like to know if there is a particular
reason why Oliver wants the capability to be disablable.
kvm->dirty_ring_with_bitmap = cap->args[0];
If Oliver agrees that the capability needn't be disabled, the whole chunk of
code can be squeezed into kvm_vm_ioctl_enable_dirty_log_ring_with_bitmap() to
make kvm_vm_ioctl_enable_cap_generic() a bit cleaner, as I said above.
Sean and Oliver, could you confirm whether the changes below look good to you? :)
static int kvm_vm_ioctl_enable_dirty_log_ring_with_bitmap(struct kvm *kvm)
{
int i, r = 0;
if (!IS_ENABLED(CONFIG_HAVE_KVM_DIRTY_RING_WITH_BITMAP) ||
!kvm->dirty_ring_size)
return -EINVAL;
mutex_lock(&kvm->slots_lock);
/* We only allow it to be set once */
if (kvm->dirty_ring_with_bitmap) {
r = -EINVAL;
goto out_unlock;
}
/*
* Avoid a race between memslot creation and enabling the ring +
* bitmap capability to guarantee that no memslots have been
* created without a bitmap.
*/
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
if (!kvm_memslots_empty(__kvm_memslots(kvm, i))) {
r = -EINVAL;
goto out_unlock;
}
}
kvm->dirty_ring_with_bitmap = true;
out_unlock:
mutex_unlock(&kvm->slots_lock);
return r;
}
static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
struct kvm_enable_cap *cap)
{
switch (cap->cap) {
:
case KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP:
return kvm_vm_ioctl_enable_dirty_log_ring_with_bitmap(kvm);
default:
return kvm_vm_ioctl_enable_cap(kvm, cap);
}
}
+ */
+ if (kvm_memslots_empty(slots)) {
+ kvm->dirty_ring_with_bitmap = cap->args[0];
+ r = 0;
+ }
+
+ mutex_unlock(&kvm->slots_lock);
+ return r;
+ }
default:
return kvm_vm_ioctl_enable_cap(kvm, cap);
}
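
For reference, the expected ordering on the userspace side would be roughly as
below. This is only a sketch, assuming a <linux/kvm.h> that already carries this
series; enable_ring_with_bitmap() is an illustrative name, and whether
KVM_CAP_DIRTY_LOG_RING or KVM_CAP_DIRTY_LOG_RING_ACQ_REL is enabled depends on
the architecture. The point is that both capabilities have to be enabled before
the first memslot is created.

#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Enable the dirty ring and its bitmap companion on a freshly created VM fd. */
static int enable_ring_with_bitmap(int vm_fd, __u64 ring_bytes)
{
	struct kvm_enable_cap cap;
	int ret;

	/* The ring itself comes first; args[0] is the ring size in bytes. */
	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_DIRTY_LOG_RING_ACQ_REL;
	cap.args[0] = ring_bytes;
	ret = ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
	if (ret)
		return ret;

	/*
	 * The bitmap companion has to be enabled before any memslot is
	 * created, otherwise the kvm_memslots_empty() check above fails
	 * with -EINVAL. args[0] = 1 matches the cap->args[0] assignment
	 * in this version of the patch.
	 */
	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP;
	cap.args[0] = 1;
	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}

Memslot creation (KVM_SET_USER_MEMORY_REGION) would only happen after this
returns 0.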
--
Thanks,
Gavin