On Thu, Nov 07, 2024 at 09:55:52AM -0800, Sean Christopherson wrote: > On Thu, Nov 07, 2024, Eric Auger wrote: > > In case a KVM_REQ_VM_DEAD request was sent to a VM, subsequent > > KVM ioctls will fail and cause test failure. This now happens > > with an aarch64 vgic test where the kvm_vm_free() fails. Let's > > add a new kvm_vm_dead_free() helper that does all the deallocation > > besides the KVM_SET_USER_MEMORY_REGION2 ioctl. > > Please no. I don't want to bleed the kvm->vm_dead behavior all over selftests. > The hack in __TEST_ASSERT_VM_VCPU_IOCTL() is there purely to provide users with > a more helpful error message, it is most definitely not intended to be an "official" > way to detect and react to the VM being dead. > > IMO, tests that intentionally result in a dead VM should assert that subsequent > VM/vCPU ioctls return -EIO, and that's all. Attempting to gracefully free > resources adds complexity and pollutes the core selftests APIs, with very little > benefit. Encouraging tests to explicitly leak resources to fudge around assertions in the selftests library seems off to me. IMO, the better approach would be to provide a helper that gives the impression of freeing the VM but implicitly leaks it, paired with some reasoning for it. diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index bc7c242480d6..75ad05c3c429 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -426,6 +426,7 @@ void vm_enable_dirty_ring(struct kvm_vm *vm, uint32_t ring_size); const char *vm_guest_mode_string(uint32_t i); void kvm_vm_free(struct kvm_vm *vmp); +void __kvm_vm_free_dead(struct kvm_vm *vmp); void kvm_vm_restart(struct kvm_vm *vmp); void kvm_vm_release(struct kvm_vm *vmp); void kvm_vm_elf_load(struct kvm_vm *vm, const char *filename); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index a2b7df5f1d39..34ed397d7811 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -771,6 +771,21 @@ void kvm_vm_free(struct kvm_vm *vmp) free(vmp); } +/* + * For use with the *extremely* uncommon case that a test expects a VM to be + * marked as dead. + * + * Deliberately leak the VM to avoid hacking up kvm_vm_free() to cope with a + * dead VM while giving the outward impression of 'doing the right thing'. + */ +void __kvm_vm_dead_free(struct kvm_vm *vmp) +{ + if (!vmp) + return; + + TEST_ASSERT(vm_dead(vmp)); +} + int kvm_memfd_alloc(size_t size, bool hugepages) { int memfd_flags = MFD_CLOEXEC; > Marking a VM dead should be a _very_ rare event; it's not something that I think > we should encourage, i.e. we shouldn't make it easier to deal with. Ideally, > use of kvm_vm_dead() should be limited to things like sev_vm_move_enc_context_from(), > where KVM needs to prever accessing the source VM to protect the host. IMO, the > vGIC case and x86's enter_smm() are hacks. E.g. I don't see any reason why the > enter_smm() case can't synthesize a triple fault. The VGIC case is at least better than the alternative of slapping bandaids all over the shop to cope with a half-baked VM and ensure we tear it down correctly. Userspace is far up shit creek at the point the VM is marked as dead, so I don't see any value in hobbling along afterwards. -- Thanks, Oliver