Re: [PATCH v7 09/11] KVM: MMU: introduce kvm_mmu_prepare_zap_obsolete_page

Gleb Natapov <gleb@xxxxxxxxxx> · Thu, 23 May 2013 11:09:22 +0300

On Thu, May 23, 2013 at 03:50:16PM +0800, Xiao Guangrong wrote:
> On 05/23/2013 03:37 PM, Gleb Natapov wrote:
> > On Thu, May 23, 2013 at 02:31:47PM +0800, Xiao Guangrong wrote:
> >> On 05/23/2013 02:18 PM, Gleb Natapov wrote:
> >>> On Thu, May 23, 2013 at 02:13:06PM +0800, Xiao Guangrong wrote:
> >>>> On 05/23/2013 01:57 PM, Gleb Natapov wrote:
> >>>>> On Thu, May 23, 2013 at 03:55:58AM +0800, Xiao Guangrong wrote:
> >>>>>> It is only used to zap obsolete pages. Since an obsolete page
> >>>>>> will not be used again, we need not spend time finding its unsync
> >>>>>> children. Also, we delete the page from the shadow page cache so that
> >>>>>> the page is completely isolated after calling this function.
> >>>>>>
> >>>>>> A later patch will use it to collapse TLB flushes.
> >>>>>>
> >>>>>> Signed-off-by: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxxxxxxxxx>
> >>>>>> ---
> >>>>>>  arch/x86/kvm/mmu.c |   46 +++++++++++++++++++++++++++++++++++++++++-----
> >>>>>>  1 files changed, 41 insertions(+), 5 deletions(-)
> >>>>>>
> >>>>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >>>>>> index 9b57faa..e676356 100644
> >>>>>> --- a/arch/x86/kvm/mmu.c
> >>>>>> +++ b/arch/x86/kvm/mmu.c
> >>>>>> @@ -1466,7 +1466,7 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, int nr)
> >>>>>>  static void kvm_mmu_free_page(struct kvm_mmu_page *sp)
> >>>>>>  {
> >>>>>>  	ASSERT(is_empty_shadow_page(sp->spt));
> >>>>>> -	hlist_del(&sp->hash_link);
> >>>>>> +	hlist_del_init(&sp->hash_link);
> >>>>> Why do you need hlist_del_init() here? Why not move it into
> >>>>
> >>>> Because otherwise the hash link would be deleted twice. We will use it like this:
> >>>>
> >>>> kvm_mmu_prepare_zap_obsolete_page(page, list);
> >>>> kvm_mmu_commit_zap_page(list);
> >>>>    kvm_mmu_free_page(page);
> >>>>
> >>>> The first deletion happens in kvm_mmu_prepare_zap_obsolete_page(page),
> >>>> which has already removed the page from the hash list; the second
> >>>> happens in kvm_mmu_free_page().
> >>>>
> >>>>> kvm_mmu_prepare_zap_page() like we discussed it here:
> >>>>> https://patchwork.kernel.org/patch/2580351/ instead of doing
> >>>>> it differently for obsolete and non obsolete pages?
> >>>>
> >>>> It can break the hash-list walk: we would have to rescan the
> >>>> hash list each time a page is zapped in the "prepare" step.
> >>>>
> >>>> I mentioned it in the changelog:
> >>>>
> >>>>   4): drop the patch which deleted page from hash list at the "prepare"
> >>>>       time since it can break the walk based on hash list.
> >>> Can you elaborate on how this can happen?
> >>
> >> Here is an example:
> >>
> >> int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
> >> {
> >> 	struct kvm_mmu_page *sp;
> >> 	LIST_HEAD(invalid_list);
> >> 	int r;
> >>
> >> 	pgprintk("%s: looking for gfn %llx\n", __func__, gfn);
> >> 	r = 0;
> >> 	spin_lock(&kvm->mmu_lock);
> >> 	for_each_gfn_indirect_valid_sp(kvm, sp, gfn) {
> >> 		pgprintk("%s: gfn %llx role %x\n", __func__, gfn,
> >> 			 sp->role.word);
> >> 		r = 1;
> >> 		kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
> >> 	}
> >> 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
> >> 	spin_unlock(&kvm->mmu_lock);
> >>
> >> 	return r;
> >> }
> >>
> >> It works fine since kvm_mmu_prepare_zap_page() does not touch the hash list.
> >> If we deleted the page from the hash list in kvm_mmu_prepare_zap_page(),
> >> this kind of code would have to be changed to:
> >>
> >> restart:
> >> 	for_each_gfn_indirect_valid_sp(kvm, sp, gfn) {
> >> 		pgprintk("%s: gfn %llx role %x\n", __func__, gfn,
> >> 			 sp->role.word);
> >> 		r = 1;
> >> 		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> >> 			goto restart;
> >> 	}
> >> 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
> >>
> > Hmm, yes. So let's leave it as is and always commit invalid_list before
> 
> So, you mean drop this patch and the patch
> "KVM: MMU: collapse TLB flushes when zap all pages"?
> 
We still want to add kvm_reload_remote_mmus() to
kvm_mmu_invalidate_zap_all_pages(). But yes, we would lose a nice
optimization here. So maybe skipping obsolete pages while walking the
hash table is the better solution.
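
To make the "skip obsolete pages" option concrete, here is a minimal
userspace sketch. It is not the kernel code: struct toy_page, struct toy_mmu
and the toy_* helpers are hypothetical stand-ins, and it assumes a page counts
as obsolete when its generation number differs from the current one. The
"prepare" step only marks the page and leaves the chain link alone, so the
walk needs no restart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_mmu_page; not the kernel layout. */
struct toy_page {
	unsigned long gfn;	/* guest frame number the page shadows */
	unsigned long gen;	/* generation in which the page was created */
	int zapped;		/* set by the "prepare" step */
	struct toy_page *next;	/* hash-chain link, untouched by "prepare" */
};

/* Hypothetical stand-in for the per-VM MMU state: one hash chain only. */
struct toy_mmu {
	unsigned long valid_gen;	/* current valid generation */
	struct toy_page *bucket;	/* head of the single hash chain */
};

/* Assumption: a page is obsolete when its generation is not current. */
static int toy_is_obsolete(const struct toy_mmu *mmu, const struct toy_page *sp)
{
	return sp->gen != mmu->valid_gen;
}

/* Bumping the generation makes every existing page obsolete at once. */
static void toy_invalidate_all(struct toy_mmu *mmu)
{
	assert(mmu != NULL);
	mmu->valid_gen++;
}

/*
 * Walk the chain and "prepare" every live page that maps gfn.
 * Obsolete and already-zapped pages are skipped instead of being unlinked,
 * so sp->next stays valid for the whole walk.
 * Returns the number of pages prepared.
 */
static int toy_unprotect(struct toy_mmu *mmu, unsigned long gfn)
{
	struct toy_page *sp;
	int prepared = 0;

	for (sp = mmu->bucket; sp != NULL; sp = sp->next) {
		if (sp->gfn != gfn || sp->zapped || toy_is_obsolete(mmu, sp))
			continue;
		sp->zapped = 1;	/* mark only; the chain is not modified */
		prepared++;
	}
	return prepared;
}
```

With a chain holding one stale and one current page for the same gfn,
toy_unprotect() prepares only the current one. The cost of this variant is
that every hash walker has to carry the obsolete check.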

> But this patch introduces only a little code, and most of it reuses
> the code of __kvm_mmu_prepare_zap_page...
> 
> Furthermore, though maybe unrelated to this patch, I do not think calling
> mmu_zap_unsync_children() in kvm_mmu_prepare_zap_page() is necessary,
> but I need to test that very carefully. Why not take
> kvm_mmu_prepare_zap_obsolete_page as the first step? :(

Yes, I want Marcelo's opinion on skipping mmu_zap_unsync_children() first.

> > releasing the lock in kvm_zap_obsolete_pages(), or skip obsolete pages
> > while walking the hash table. The former is clearer, I think.
> > 
> > --
> > 			Gleb.
> > 
> > 
> > 

--
			Gleb.