On 2023-10-18 01:46 PM, Sean Christopherson wrote:
> Always flush the per-vCPU async #PF workqueue when a vCPU is clearing its
> completion queue, i.e. when a VM and all its vCPUs is being destroyed.

nit: ... or when the guest toggles CR0.PG or async_pf support.

> KVM must ensure that none of its workqueue callbacks is running when the
> last reference to the KVM _module_ is put. Gifting a reference to the
> associated VM prevents the workqueue callback from dereferencing freed
> vCPU/VM memory, but does not prevent the KVM module from being unloaded
> before the callback completes.
>
> Drop the misguided VM refcount gifting, as calling kvm_put_kvm() from
> async_pf_execute() if kvm_put_kvm() flushes the async #PF workqueue will
> result in deadlock. async_pf_execute() can't return until kvm_put_kvm()
> finishes, and kvm_put_kvm() can't return until async_pf_execute() finishes:
> [...]
>
> Note, commit 5f6de5cbebee ("KVM: Prevent module exit until all VMs are
> freed") *tried* to fix the module refcounting issue by having VMs grab a
> reference to the module, but that only made the bug slightly harder to hit
> as it gave async_pf_execute() a bit more time to complete before the KVM
> module could be unloaded.

Blegh! Thanks for the fix.

>
> Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out")
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: David Matlack <dmatlack@xxxxxxxxxx>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>

Reviewed-by: David Matlack <dmatlack@xxxxxxxxxx>
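
For anyone skimming the thread, the deadlock described in the quoted commit message can be sketched with a toy user-space model. This is not the KVM code and not the patch itself: every name below (toy_vm, toy_work, toy_put_vm, and so on) is hypothetical, there are no real threads or workqueues, and a "flush" that would have to wait on its own caller is recorded in a flag instead of actually hanging.

```c
/* Toy single-threaded model, NOT kernel code. All names are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

struct toy_vm;

struct toy_work {
	struct toy_vm *vm;
	bool running;		/* true while the callback is executing */
};

struct toy_vm {
	int refcount;
	struct toy_work *pending;	/* work that teardown must flush */
	bool would_deadlock;		/* set where a real flush would hang */
	bool destroyed;
};

/*
 * Models "wait for the callback to finish". Returns false when the
 * callback is still running, which in this single-threaded model can
 * only mean the caller *is* the callback, i.e. the wait would never end.
 */
bool toy_flush_work(struct toy_work *w)
{
	return !w->running;
}

/* Models dropping a VM reference; the last put flushes, then destroys. */
void toy_put_vm(struct toy_vm *vm)
{
	if (--vm->refcount > 0)
		return;

	if (vm->pending && !toy_flush_work(vm->pending)) {
		vm->would_deadlock = true;
		return;
	}
	vm->destroyed = true;
}

/* Old scheme in the model: the callback owns a gifted VM reference. */
void toy_execute_with_gifted_ref(struct toy_work *w)
{
	w->running = true;
	toy_put_vm(w->vm);	/* may be the last put, which flushes us */
	w->running = false;
}

/* New scheme in the model: no reference held; teardown flushes from outside. */
void toy_execute_no_ref(struct toy_work *w)
{
	w->running = true;
	/* ... the real callback would fault in the page here ... */
	w->running = false;
}
```

In this model, if the callback ends up dropping the final reference, toy_put_vm() tries to flush the very callback that called it, and would_deadlock is set. With no gifted reference, the callback finishes on its own and the final put, made from outside the callback, flushes and destroys the VM cleanly.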