On Thu, Apr 18, 2013 at 12:00:16PM +0800, Xiao Guangrong wrote:
> >
> > What is the justification for this?
>
> We want the rmap of the memslot being deleted to be remove-only; that
> is needed for unmapping the rmap outside of mmu-lock.
>
> ======
> 1) do not corrupt the rmap
> 2) keep pte-list-descs available
> 3) keep shadow pages available
>
> Resolve 1):
> we make the invalid rmap remove-only, which means we only delete and
> clear sptes from the rmap; no new sptes can be added to it.
> This is reasonable since kvm can not do address translation on an
> invalid rmap (gfn_to_pfn fails on an invalid memslot) and the sptes on
> an invalid rmap can not be reused (they belong to invalid shadow
> pages).
> ======
>
> clear_flush_young / test_young / change_pte of mmu-notify can rewrite
> the rmap with a present spte (P bit set), so we should unmap the rmap
> in these handlers.
>
>
> >> +
> >> +	/*
> >> +	 * To ensure that all vcpus and mmu-notify are not clearing
> >> +	 * spte and rmap entry.
> >> +	 */
> >> +	synchronize_srcu_expedited(&kvm->srcu);
> >> +}
> >> +
> >>  #ifdef MMU_DEBUG
> >>  static int is_empty_shadow_page(u64 *spt)
> >>  {
> >> @@ -2219,6 +2283,11 @@ static void clear_sp_write_flooding_count(u64 *spte)
> >>  	__clear_sp_write_flooding_count(sp);
> >>  }
> >>
> >> +static bool is_valid_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> >> +{
> >> +	return likely(sp->mmu_valid_gen == kvm->arch.mmu_valid_gen);
> >> +}
> >> +
> >>  static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> >>  					     gfn_t gfn,
> >>  					     gva_t gaddr,
> >> @@ -2245,6 +2314,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> >>  		role.quadrant = quadrant;
> >>  	}
> >>  	for_each_gfn_sp(vcpu->kvm, sp, gfn) {
> >> +		if (!is_valid_sp(vcpu->kvm, sp))
> >> +			continue;
> >> +
> >>  		if (!need_sync && sp->unsync)
> >>  			need_sync = true;
> >>
> >> @@ -2281,6 +2353,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> >>
> >>  		account_shadowed(vcpu->kvm, gfn);
> >>  	}
> >> +
> >> 	sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen;
> >>  	init_shadow_page_table(sp);
> >>  	trace_kvm_mmu_get_page(sp, true);
> >>  	return sp;
> >> @@ -2451,8 +2524,12 @@ static int kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
> >>  	ret = mmu_zap_unsync_children(kvm, sp, invalid_list);
> >>  	kvm_mmu_page_unlink_children(kvm, sp);
>
> > The rmaps[] arrays linked to !is_valid_sp() shadow pages should not
> > be accessed (as they have been freed already).
> >
> > I suppose the is_valid_sp() conditional below should be moved
> > earlier, before kvm_mmu_unlink_parents or any other rmap access.
> >
> > This is fine: the !is_valid_sp() shadow pages are only reachable
> > by SLAB and the hypervisor itself.
>
> Unfortunately we can not do this. :(
>
> The sptes in a shadow page can be linked into the rmaps of many slots.
> If a spte is linked into the rmap of the memslot being deleted, that
> is ok; otherwise, the rmap of a still-used memslot would be left
> without the update.
>
> For example: slot 0 is being deleted, sp->spte[0] is linked on
> slot[0].rmap, and sp->spte[1] is linked on slot[1].rmap. If we do not
> access the rmaps of this 'sp', the already-freed spte[1] stays linked
> on slot[1].rmap.
>
> We could let kvm update the rmap for sp->spte[1] and not unlink
> sp->spte[0]. But this is also not allowed, since mmu-notify can access
> the invalid rmap before the memslot is destroyed: mmu-notify would
> then find an already-freed spte on the rmap, or page Accessed/Dirty
> state would be mis-tracked (if we made mmu-notify skip the invalid
> rmap).

Why not release all rmaps?

Subject: [PATCH v2 3/7] KVM: x86: introduce kvm_clear_all_gfn_page_info

This function is used to reset the rmaps and page info of all guest
pages; it will be used in a later patch.

Signed-off-by: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html