On Fri, Oct 15, 2021, Paolo Bonzini wrote: > On 15/10/21 18:15, Sean Christopherson wrote: > > > > > > - now vCPU1 finally starts running the page fault code. > > > > > > - vCPU1 AVIC is still enabled > > > (because vCPU1 never handled KVM_REQ_APICV_UPDATE), > > > so the page fault code will populate the SPTE. > > But vCPU1 won't install the SPTE if it loses the race to acquire mmu_lock, because > > kvm_zap_gfn_range() bumps the notifier sequence and so vCPU1 will retry the fault. > > If vCPU1 wins the race, i.e. sees the same sequence number, then the zap is > > guaranteed to find the newly-installed SPTE. > > > > And IMO, retrying is the desired behavior. Installing a SPTE based on the global > > state works, but it's all kinds of weird to knowingly take an action the directly > > contradicts the current vCPU state. > > I think both of you are correct. :) > > Installing a SPTE based on global state is weird because this is a vCPU > action; installing it based on vCPU state is weird because it is knowingly > out of date. If that's the argument, then kvm_faultin_page() should explicitly check for a pending KVM_REQ_APICV_UPDATE, because I would then argue that contuining on when KVM _knows_ its new SPTE will either get zapped (page fault wins the race) or will get rejected (kvm_zap_gfn_range() wins the race) is just as wrong. The SPTE _cannot_ be used even if the page fault wins the race, becuase all vCPUs need to process KVM_REQ_APICV_UPDATE and thus will be blocked until the initiating vCPU zaps the range and drops the APICv lock. And I personally do _not_ want to add a check for the request because it implies the check is sufficient, which it is not, because the page fault doesn't yet hold mmu_lock. Since all answers are some form of wrong, IMO we should at least be coherent with respect to the original page fault.