On Mon, Feb 19, 2024, Xu Yilun wrote:
> >  void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > @@ -114,7 +132,6 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> >  #else
> >  		if (cancel_work_sync(&work->work)) {
> >  			mmput(work->mm);
> > -			kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
> >  			kmem_cache_free(async_pf_cache, work);
> >  		}
> >  #endif
> > @@ -126,7 +143,18 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> >  			list_first_entry(&vcpu->async_pf.done,
> >  					 typeof(*work), link);
> >  		list_del(&work->link);
> > -		kmem_cache_free(async_pf_cache, work);
> > +
> > +		spin_unlock(&vcpu->async_pf.lock);
> > +
> > +		/*
> > +		 * The async #PF is "done", but KVM must wait for the work item
> > +		 * itself, i.e. async_pf_execute(), to run to completion.  If
> > +		 * KVM is a module, KVM must ensure *no* code owned by the KVM
> > +		 * (the module) can be run after the last call to module_put(),
> > +		 * i.e. after the last reference to the last vCPU's file is put.
> > +		 */
> > +		kvm_flush_and_free_async_pf_work(work);
>
> I have a new concern when I re-visit this patchset.
>
> From kvm_check_async_pf_completion(), I see async_pf.queue is always a
> superset of async_pf.done (except wake-all work, which is not within
> concern).  And done work would be skipped from sync (cancel_work_sync()) by:
>
> 	if (!work->vcpu)
> 		continue;
>
> But now with this patch we also sync done works, how about we just sync all
> queued work instead?

Hmm, IIUC, I think we can simply revert commit 22583f0d9c85 ("KVM: async_pf:
avoid recursive flushing of work items").