On Mon, Feb 19, 2024 at 07:51:24AM -0800, Sean Christopherson wrote:
> On Mon, Feb 19, 2024, Xu Yilun wrote:
> > >  void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > > @@ -114,7 +132,6 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > >  #else
> > >  		if (cancel_work_sync(&work->work)) {
> > >  			mmput(work->mm);
> > > -			kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
> > >  			kmem_cache_free(async_pf_cache, work);
> > >  		}
> > >  #endif
> > > @@ -126,7 +143,18 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > >  			list_first_entry(&vcpu->async_pf.done,
> > >  					 typeof(*work), link);
> > >  		list_del(&work->link);
> > > -		kmem_cache_free(async_pf_cache, work);
> > > +
> > > +		spin_unlock(&vcpu->async_pf.lock);
> > > +
> > > +		/*
> > > +		 * The async #PF is "done", but KVM must wait for the work item
> > > +		 * itself, i.e. async_pf_execute(), to run to completion.  If
> > > +		 * KVM is a module, KVM must ensure *no* code owned by the KVM
> > > +		 * (the module) can be run after the last call to module_put(),
> > > +		 * i.e. after the last reference to the last vCPU's file is put.
> > > +		 */
> > > +		kvm_flush_and_free_async_pf_work(work);
> >
> > I have a new concern now that I'm revisiting this patchset.
> >
> > From kvm_check_async_pf_completion(), I see async_pf.queue is always a
> > superset of async_pf.done (except for wake-all work, which is not a
> > concern here). And done work would be skipped from sync
> > (cancel_work_sync()) by:
> >
> > 		if (!work->vcpu)
> > 			continue;
> >
> > But now with this patch we also sync done work, so how about we just sync
> > all queued work instead?
>
> Hmm, IIUC, I think we can simply revert commit 22583f0d9c85 ("KVM: async_pf: avoid
> recursive flushing of work items").

Ah, yes. This also makes the history of the confusing spin_lock clear to me.
Reverting sounds good to me.

Thanks,
Yilun