On 25/10/19 17:22, Sean Christopherson wrote: > On Fri, Oct 25, 2019 at 04:56:23PM +0200, Paolo Bonzini wrote: >> On 25/10/19 16:48, Sean Christopherson wrote: >>>> It seems to me that kvm_get_kvm() in >>>> kvm_arch_init_vm() should be okay as long as it is balanced in >>>> kvm_arch_destroy_vm(). So we can apply patch 2 first, and then: >>> No, this will effectively leak the VM because you'll end up with a cyclical >>> reference to kvm_put_kvm(), i.e. users_count will never hit zero. >>> >>> void kvm_put_kvm(struct kvm *kvm) >>> { >>> if (refcount_dec_and_test(&kvm->users_count)) >>> kvm_destroy_vm(kvm); >>> | >>> -> kvm_arch_destroy_vm() >>> | >>> -> kvm_put_kvm() >>> } >> >> There's two parts to this: >> >> - if kvm_arch_init_vm() calls kvm_get_kvm(), then kvm_arch_destroy_vm() >> won't be called until the corresponding kvm_put_kvm(). >> >> - if the error case causes kvm_arch_destroy_vm() to be called early, >> however, that'd be okay and would not leak memory, as long as >> kvm_arch_destroy_vm() detects the situation and calls kvm_put_kvm() itself. >> >> One case could be where you have some kind of delayed work, where the >> callback does kvm_put_kvm. You'd have to cancel the work item and call >> kvm_put_kvm in kvm_arch_destroy_vm, and you would go through that path >> if kvm_create_vm() fails after kvm_arch_init_vm(). > > But do we really want/need to allow handing out references to KVM during > kvm_arch_init_vm()? AFAICT, it's not currently required by any arch. Probably not, but the full code paths are long, so I don't see much value in outright forbidding it. There are very few kvm_get_kvm() calls anyway in arch-dependent code, so it's easy to check that they're not causing reference cycles. Paolo > If an actual use case comes along then we can move refcount_set() back, > but with the requirement that the arch/user provide a mechanism to > handle the reference with respect to kvm_arch_destroy_vm(). As opposed > to the current behavior, which allows an arch to naively do get()/put() > in init_vm()/destroy_vm() without any hint that what they're doing is > broken. >