Re: [PATCH 2/3] KVM: selftests: Introduce kvm_vm_dead_free

Oliver Upton <oliver.upton@xxxxxxxxx> · Thu, 7 Nov 2024 11:08:06 -0800

On Thu, Nov 07, 2024 at 09:55:52AM -0800, Sean Christopherson wrote:
> On Thu, Nov 07, 2024, Eric Auger wrote:
> > In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent
> > KVM ioctls will fail and cause test failure. This now happens
> > with an aarch64 vgic test where the kvm_vm_free() fails. Let's
> > add a new kvm_vm_dead_free() helper that does all the deallocation
> > besides the KVM_SET_USER_MEMORY_REGION2 ioctl.
> 
> Please no.  I don't want to bleed the kvm->vm_dead behavior all over selftests.
> The hack in __TEST_ASSERT_VM_VCPU_IOCTL() is there purely to provide users with
> a more helpful error message, it is most definitely not intended to be an "official"
> way to detect and react to the VM being dead.
> 
> IMO, tests that intentionally result in a dead VM should assert that subsequent
> VM/vCPU ioctls return -EIO, and that's all.  Attempting to gracefully free
> resources adds complexity and pollutes the core selftests APIs, with very little
> benefit.

Encouraging tests to explicitly leak resources to fudge around assertions
in the selftests library seems off to me.

IMO, the better approach would be to provide a helper that gives the
impression of freeing the VM but implicitly leaks it, paired with some
reasoning for it.

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index bc7c242480d6..75ad05c3c429 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -426,6 +426,7 @@ void vm_enable_dirty_ring(struct kvm_vm *vm, uint32_t ring_size);
 const char *vm_guest_mode_string(uint32_t i);
 
 void kvm_vm_free(struct kvm_vm *vmp);
+void __kvm_vm_free_dead(struct kvm_vm *vmp);
 void kvm_vm_restart(struct kvm_vm *vmp);
 void kvm_vm_release(struct kvm_vm *vmp);
 void kvm_vm_elf_load(struct kvm_vm *vm, const char *filename);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index a2b7df5f1d39..34ed397d7811 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -771,6 +771,21 @@ void kvm_vm_free(struct kvm_vm *vmp)
 	free(vmp);
 }
 
+/*
+ * For use with the *extremely* uncommon case that a test expects a VM to be
+ * marked as dead.
+ *
+ * Deliberately leak the VM to avoid hacking up kvm_vm_free() to cope with a
+ * dead VM while giving the outward impression of 'doing the right thing'.
+ */
+void __kvm_vm_dead_free(struct kvm_vm *vmp)
+{
+	if (!vmp)
+		return;
+
+	TEST_ASSERT(vm_dead(vmp));
+}
+
 int kvm_memfd_alloc(size_t size, bool hugepages)
 {
 	int memfd_flags = MFD_CLOEXEC;

> Marking a VM dead should be a _very_ rare event; it's not something that I think
> we should encourage, i.e. we shouldn't make it easier to deal with.  Ideally,
> use of kvm_vm_dead() should be limited to things like sev_vm_move_enc_context_from(),
> where KVM needs to prever accessing the source VM to protect the host.  IMO, the
> vGIC case and x86's enter_smm() are hacks.  E.g. I don't see any reason why the
> enter_smm() case can't synthesize a triple fault.

The VGIC case is at least better than the alternative of slapping
bandaids all over the shop to cope with a half-baked VM and ensure we
tear it down correctly. Userspace is far up shit creek at the point the
VM is marked as dead, so I don't see any value in hobbling along
afterwards.

-- 
Thanks,
Oliver