On Fri, May 28, 2021, Lai Jiangshan wrote: > > On 2021/5/28 00:13, Sean Christopherson wrote: > > And making a request won't work without revamping the order of request handling > > in vcpu_enter_guest(), e.g. KVM_REQ_MMU_RELOAD and KVM_REQ_MMU_SYNC are both > > serviced before KVM_REQ_STEAL_UPDATE. > > Yes, it just fixes the said problem in the simplest way. > I copied KVM_REQ_MMU_RELOAD from kvm_handle_invpcid(INVPCID_TYPE_ALL_INCL_GLOBAL). > (If the guest is not preempted, it will call invpcid_flush_all() and will be handled > by this way) The problem is that record_steal_time() is called after KVM_REQ_MMU_RELOAD in vcpu_enter_guest() and so the reload request won't be recognized until the next VM-Exit. It works for kvm_handle_invpcid() because vcpu_enter_guest() is guaranteed to run between the invcpid code and VM-Enter. > The improvement code will go later, and will not be backported. I would argue that introducing a potential performance regression is in itself a bug. IMO, going straight to kvm_mmu_sync_roots() is not high risk. > The proper way to flush guest is to use code in > > https://lore.kernel.org/lkml/20210525213920.3340-1-jiangshanlai@xxxxxxxxx/ > as: > + kvm_mmu_sync_roots(vcpu); > + kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu); //or just call flush_current directly > + for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) > + vcpu->arch.mmu->prev_roots[i].need_sync = true; > > If need_sync patch is not accepted, we can just use kvm_mmu_sync_roots(vcpu) > to keep the current pagetable and use kvm_mmu_free_roots() to free all the other > roots in prev_roots. I like the idea, I just haven't gotten around to reviewing that patch yet. > > Cleaning up and documenting the MMU related requests is on my todo list, but the > > immediate fix should be tiny and I can do my cleanups on top. > > > > I believe the minimal fix is: > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 81ab3b8f22e5..b0072063f9bf 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -3072,6 +3072,9 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu) > > static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu) > > { > > ++vcpu->stat.tlb_flush; > > + > > + if (!tdp_enabled) > > + kvm_mmu_sync_roots(vcpu); > > it doesn't handle prev_roots which are also needed as > shown in kvm_handle_invpcid(INVPCID_TYPE_ALL_INCL_GLOBAL). Ya, I belated realized this :-) > > static_call(kvm_x86_tlb_flush_guest)(vcpu); > > For tdp_enabled, I think it is better to use kvm_x86_tlb_flush_current() > to make it consistent with other shadowpage code. > > > } > >