On Wed, 18 Oct 2023 13:46:21 -0700, Sean Christopherson wrote:
> Clean up a KVM module refcounting mess that Al pointed out in the context
> of the guest_memfd series. The worst behavior was recently introduced by
> an ill-fated attempt to fix a bug in x86's async #PF code. Instead of
> fixing the underlying bug of not flushing a workqueue (see patch 2), KVM
> fudged around the bug by gifting every VM a reference to the KVM module.
>
> That made the reproducer happy (hopefully there was actually a reproducer
> at one point), but it didn't fully fix the use-after-free bug, it just made
> the bug harder to hit. E.g. as pointed out by Al, if kvm_destroy_vm() is
> preempted after putting the last KVM module reference, KVM can be unloaded
> before kvm_destroy_vm() completes, and scheduling back in the associated
> task will explode (preemption isn't strictly required, it's just the most
> obvious path to failure).
>
> [...]

Applied 1 and 3 (the .owner fixes) to kvm-x86 fixes. I'll follow-up with a
separate series to tackle the async #PF mess.

[1/3] KVM: Set file_operations.owner appropriately for all such structures
      https://github.com/kvm-x86/linux/commit/087e15206d6a
[2/3] KVM: Always flush async #PF workqueue when vCPU is being destroyed
      (no commit info)
[3/3] Revert "KVM: Prevent module exit until all VMs are freed"
      https://github.com/kvm-x86/linux/commit/ea61294befd3

--
https://github.com/kvm-x86/linux/tree/next