On Tue, Sep 27, 2016 at 2:29 AM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 26/09/2016 19:45, Peter Feiner wrote:
>> The MMU notifier sequence number keeps GPA->HPA mappings in sync when
>> GPA->HPA lookups are done outside of the MMU lock (e.g., in
>> tdp_page_fault). Since kvm_age_hva doesn't change GPA->HPA, it's
>> unnecessary to increment the sequence number.
>>
>> Signed-off-by: Peter Feiner <pfeiner@xxxxxxxxxx>
>
> Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>
> This needed some pondering, but it made sense in the end. :) Do you
> actually see an effect or is it just a cleanup?

Thanks, Paolo. This is just a cleanup. I wanted to run it by you to make
sure *I* understood the role of the notifier sequence number.

I'm reworking this function in a fix for a nested virtualization bug
exposed by PML. When PML is on, EPT A/D is enabled in the vmcs02 EPTP
regardless of the vmcs12's EPTP value. The problem is that enabling A/D
changes the behavior of L2's x86 page table walks as seen by L1. With
A/D enabled, x86 page table walks are always treated as EPT writes. The
EPT test in kvm-unit-tests' x86/vmx.c shows this failure.

> Related to this, what do you think of this patch?
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 3d4cc8cc56a3..23683d5f6640 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1617,16 +1617,8 @@ static int kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
>  	struct rmap_iterator iter;
>  	int young = 0;
>
> -	/*
> -	 * If there's no access bit in the secondary pte set by the
> -	 * hardware it's up to gup-fast/gup to set the access bit in
> -	 * the primary pte or in the page structure.
> -	 */
> -	if (!shadow_accessed_mask)
> -		goto out;
> -
>  	for_each_rmap_spte(rmap_head, &iter, sptep) {
> -		if (*sptep & shadow_accessed_mask) {
> +		if (!shadow_accessed_mask || (*sptep & shadow_accessed_mask)) {
>  			young = 1;
>  			break;
>  		}
>
> Since kvm_age_hva blows up "old" sPTEs, all PTEs that are there must be
> young...
>
> Paolo

The patch looks good. You can add:

Reviewed-by: Peter Feiner <pfeiner@xxxxxxxxxx>

>> ---
>>  arch/x86/kvm/mmu.c | 10 +---------
>>  1 file changed, 1 insertion(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 3d4cc8cc..dc6d1e8 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -1660,17 +1660,9 @@ int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
>>  	 * This has some overhead, but not as much as the cost of swapping
>>  	 * out actively used pages or breaking up actively used hugepages.
>>  	 */
>> -	if (!shadow_accessed_mask) {
>> -		/*
>> -		 * We are holding the kvm->mmu_lock, and we are blowing up
>> -		 * shadow PTEs. MMU notifier consumers need to be kept at bay.
>> -		 * This is correct as long as we don't decouple the mmu_lock
>> -		 * protected regions (like invalidate_range_start|end does).
>> -		 */
>> -		kvm->mmu_notifier_seq++;
>> +	if (!shadow_accessed_mask)
>>  		return kvm_handle_hva_range(kvm, start, end, 0,
>>  					    kvm_unmap_rmapp);
>> -	}
>>
>>  	return kvm_handle_hva_range(kvm, start, end, 0, kvm_age_rmapp);
>>  }
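
For anyone following along, the lockless lookup pattern that the
sequence number protects looks roughly like this. It's a condensed
sketch of the tdp_page_fault / mmu_notifier_retry interplay (error
handling and most details elided), not the literal kernel code:

	unsigned long mmu_seq;
	kvm_pfn_t pfn;

	/*
	 * Fault side: snapshot the sequence number, then do the
	 * GPA->HPA translation without holding mmu_lock.
	 */
	mmu_seq = kvm->mmu_notifier_seq;
	smp_rmb();

	pfn = gfn_to_pfn(kvm, gfn);	/* may sleep; mmu_lock not held */

	spin_lock(&kvm->mmu_lock);
	/*
	 * Recheck under the lock: if an invalidation bumped the
	 * sequence number in the meantime, pfn may be stale, so drop
	 * it and retry the fault instead of installing an sPTE.
	 */
	if (mmu_notifier_retry(kvm, mmu_seq))
		goto out_unlock;
	/* ... install the sPTE for pfn ... */
	spin_unlock(&kvm->mmu_lock);

Anything that actually changes a GPA->HPA mapping increments
kvm->mmu_notifier_seq under mmu_lock, which forces concurrent faults to
retry. kvm_age_hva only clears accessed state (and, without
shadow_accessed_mask, zaps sPTEs that a later fault rebuilds with the
same translation), so skipping the increment is safe.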
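
To make the PML interaction above concrete: the vmcs02 EPTP is built by
construct_eptp() in arch/x86/kvm/vmx.c, which keys the A/D bit off the
module-wide enable_ept_ad_bits flag and never off what L1 put in the
vmcs12 EPTP. Quoting roughly from memory, so treat this as a sketch
rather than the exact source:

	static u64 construct_eptp(unsigned long root_hpa)
	{
		u64 eptp;

		eptp = VMX_EPT_DEFAULT_MT |
		       VMX_EPT_DEFAULT_GAW << VMX_EPT_GAW_EPTP_SHIFT;
		/* A/D is a global decision; vmcs12's EPTP isn't consulted. */
		if (enable_ept_ad_bits)
			eptp |= VMX_EPT_AD_ENABLE_BIT;
		eptp |= (root_hpa & PAGE_MASK);

		return eptp;
	}

With PML on, enable_ept_ad_bits is set, so L2 runs with A/D enabled
even when L1 cleared the bit in its EPTP, and L1 then observes L2's x86
page table walks as EPT writes.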