Re: [PATCH] KVM: x86/mmu: optimizing the code in mmu_try_to_unsync_pages

On Fri, May 20, 2022, Yuan Yao wrote:
> On Fri, May 20, 2022 at 02:09:07PM +0800, Yun Lu wrote:
> > There is no need to check can_unsync and prefetch in the loop
> > every time, just move this check before the loop.
> >
> > Signed-off-by: Yun Lu <luyun@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 311e4e1d7870..e51e7735adca 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -2534,6 +2534,12 @@ int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot,
> >  	if (kvm_slot_page_track_is_active(kvm, slot, gfn, KVM_PAGE_TRACK_WRITE))
> >  		return -EPERM;
> >
> > +	if (!can_unsync)
> > +		return -EPERM;
> > +
> > +	if (prefetch)
> > +		return -EEXIST;
> > +
> >  	/*
> >  	 * The page is not write-tracked, mark existing shadow pages unsync
> >  	 * unless KVM is synchronizing an unsync SP (can_unsync = false).  In
> > @@ -2541,15 +2547,9 @@ int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot,
> >  	 * allowing shadow pages to become unsync (writable by the guest).
> >  	 */
> >  	for_each_gfn_indirect_valid_sp(kvm, sp, gfn) {
> > -		if (!can_unsync)
> > -			return -EPERM;
> > -
> >  		if (sp->unsync)
> >  			continue;
> >
> > -		if (prefetch)
> > -			return -EEXIST;
> > -
> 
> Consider the case where the for_each_gfn_indirect_valid_sp() loop body
> is never executed, which means the gfn is not an MMU page table page:
> 
> The old behavior: return 0;
> The new behavior with this change: return -EPERM / -EEXIST;
> 
> It at least breaks FNAME(sync_page) -> make_spte(prefetch = true, can_unsync = false)
> which removes PT_WRITABLE_MASK from the last level mapping unexpectedly.

Yep, the flags should be queried if and only if there's at least one valid, indirect
SP for the gfn.  And querying whether there's such an SP is quite expensive and requires
looping over a list, so checking on every iteration of the loop is far cheaper.  E.g. each
check is a single uop on modern CPUs as both gcc and clang are smart enough to stash
the flags in registers so that there's no reload from memory on each iteration.  And that
also means the CPU can more than likely correctly predict subsequent iterations.
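
To make the behavioral difference concrete, here is a minimal userspace sketch of
the two orderings.  It is a toy model, not the kernel code: struct toy_sp and both
function names are hypothetical, the array stands in for the list walked by
for_each_gfn_indirect_valid_sp(), and locking, write tracking and the real unsync
work are omitted.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shadow page; not struct kvm_mmu_page. */
struct toy_sp {
	bool unsync;
};

/* Original ordering: the flags are consulted only once an SP is found. */
static int try_unsync_checks_in_loop(struct toy_sp *sps, size_t nr_sps,
				     bool can_unsync, bool prefetch)
{
	for (size_t i = 0; i < nr_sps; i++) {
		if (!can_unsync)
			return -EPERM;

		if (sps[i].unsync)
			continue;

		if (prefetch)
			return -EEXIST;

		sps[i].unsync = true;
	}
	return 0;
}

/* Patched ordering: the flags are consulted even when no SP exists. */
static int try_unsync_checks_hoisted(struct toy_sp *sps, size_t nr_sps,
				     bool can_unsync, bool prefetch)
{
	if (!can_unsync)
		return -EPERM;

	if (prefetch)
		return -EEXIST;

	for (size_t i = 0; i < nr_sps; i++) {
		if (sps[i].unsync)
			continue;

		sps[i].unsync = true;
	}
	return 0;
}
```

With nr_sps == 0, prefetch = true and can_unsync = false (the FNAME(sync_page) case
above), the first variant returns 0 while the hoisted variant returns -EPERM, which
is the regression Yuan describes.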


