RE: [PATCH 4/4] KVM: MMU: fast cleanup D bit based on fast write protect

> -----Original Message-----
> From: Sean Christopherson [mailto:sean.j.christopherson@xxxxxxxxx]
> Sent: Tuesday, January 22, 2019 11:17 PM
> To: Zhuangyanying <ann.zhuangyanying@xxxxxxxxxx>
> Cc: xiaoguangrong@xxxxxxxxxxx; pbonzini@xxxxxxxxxx; Gonglei (Arei)
> <arei.gonglei@xxxxxxxxxx>; qemu-devel@xxxxxxxxxx; kvm@xxxxxxxxxxxxxxx;
> wangxin (U) <wangxinxin.wang@xxxxxxxxxx>; Liujinsong (Paul)
> <liu.jinsong@xxxxxxxxxx>; Zhoujian (jay) <jianjay.zhou@xxxxxxxxxx>;
> xiaoguangrong.eric@xxxxxxxxx
> Subject: Re: [PATCH 4/4] KVM: MMU: fast cleanup D bit based on fast write
> protect
> 
> On Mon, Jan 21, 2019 at 06:37:36AM +0000, Zhuangyanying wrote:
> >
> > > >  	u64 wp_all_indicator, kvm_wp_all_gen;
> > > >
> > > > -	mutex_lock(&kvm->slots_lock);
> > > >  	wp_all_indicator = get_write_protect_all_indicator(kvm);
> > > >  	kvm_wp_all_gen = get_write_protect_all_gen(wp_all_indicator);
> > > >
> > > > @@ -6134,8 +6136,8 @@ void kvm_mmu_write_protect_all_pages(struct kvm *kvm, bool write_protect)
> > > >  	 */
> > > >  	if (write_protect)
> > > >  		kvm_reload_remote_mmus(kvm);
> > > > -	mutex_unlock(&kvm->slots_lock);
> > >
> > > Why is the lock removed?  And why was it added in the first place?
> > >
> > The original purpose of fast write protect was to implement a
> > lock-free get_dirty_log; kvm_mmu_write_protect_all_pages is a
> > stand-alone KVM API. See
> > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg04370.html
> > Of the 7 patches in that series, I only need the first 3 to achieve
> > step-by-step page table traversal. To preserve the integrity of
> > xiaoguangrong's series, I did not modify his patches directly.
> 
> That's not a sufficient argument for adding locking and removing it one patch
> later.
> 
> > > >  }
> > > > +EXPORT_SYMBOL_GPL(kvm_mmu_write_protect_all_pages);
> > > >
> > > >  static unsigned long
> > > >  mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > > index f6915f1..5236a07 100644
> > > > --- a/arch/x86/kvm/vmx/vmx.c
> > > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > > @@ -7180,8 +7180,7 @@ static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu)
> > > >  static void vmx_slot_enable_log_dirty(struct kvm *kvm,
> > > >  				     struct kvm_memory_slot *slot)
> > > >  {
> > > > -	kvm_mmu_slot_leaf_clear_dirty(kvm, slot);
> > > > -	kvm_mmu_slot_largepage_remove_write_access(kvm, slot);
> > > > +	kvm_mmu_write_protect_all_pages(kvm, true);
> > >
> > > What's the purpose of having @write_protect if
> > > kvm_mmu_write_protect_all_pages() is only ever called to enable
> > > protection?  If there's no known scenario where write protection is
> > > explicitly disabled then there's no need for WP_ALL_ENABLE_MASK,
> > > i.e. a non-zero generation would indicate write protection is
> > > enabled.  That'd simplify the code and clean up the atomic usage.
> > >
> > In live migration, the large page split depends on the creation of
> > memslot->dirty_bitmap in __kvm_set_memory_region(). The interface
> > between qemu and kvm enables the dirty log slot by slot, so enabling
> > dirty page tracking for the entire VM requires calling
> > kvm_mmu_write_protect_all_pages multiple times; the atomic usage lets
> > the page table update requests be merged. This method is not elegant,
> > but it works. Creating every slot's dirty_bitmap in a single API, so
> > that kvm_mmu_write_protect_all_pages need only be called once, would
> > require more implementation changes, even in qemu.
> 
> Calling kvm_mmu_write_protect_all_pages() multiple times is fine.  My
> question was regarding the 'write_protect' parameter.  If all callers always
> pass %true for 'write_protect' then why does the parameter exist?
> And eliminating the parameter means you don't need an 'enable' flag buried in
> the generation, which would simplify the implementation.
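If the disable case really could go away, I agree the enable flag would
be unnecessary and a bare non-zero generation could mean "write
protection on". A minimal sketch of that scheme (illustrative names
only, not code from the series):

#include <linux/kvm_host.h>

/*
 * Sketch only: with no explicit disable, a bare generation counter is
 * enough; non-zero means write protection is enabled, so the packed
 * WP_ALL_ENABLE_MASK bit disappears and the atomic usage is a plain
 * increment/read.
 */
static void wp_all_enable(struct kvm *kvm, atomic64_t *wp_all_gen)
{
	/* A new generation lazily invalidates the old SPTEs. */
	atomic64_inc(wp_all_gen);
	kvm_reload_remote_mmus(kvm);
}

static bool wp_all_enabled(atomic64_t *wp_all_gen)
{
	return atomic64_read(wp_all_gen) != 0;
}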

There is a disable case, though. When cancelling migration of a 2T VM,
memory_global_dirty_log_stop() holds the BQL for 12s:
vmx_slot_disable_log_dirty() accounts for 4s and
kvm_mmu_zap_collapsible_sptes() for 2s, each doubled because SMM is
enabled by default.
What about:
static void vmx_slot_disable_log_dirty(struct kvm *kvm,
				       struct kvm_memory_slot *slot)
{
-	kvm_mmu_slot_set_dirty(kvm, slot);
+	kvm_mmu_write_protect_all_pages(kvm, false);
}
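The idea is that the disable side becomes O(1): clear the enable bit in
the shared indicator and let write protection be dropped lazily as
SPTEs are touched, instead of walking every slot's rmap under the BQL
the way kvm_mmu_slot_set_dirty() does. Roughly, as I understand the
packed-indicator encoding (the field name below is made up):

/*
 * Sketch only: assumes the indicator is an atomic64_t that packs
 * WP_ALL_ENABLE_MASK next to the generation number.
 */
static void write_protect_all_disable(struct kvm *kvm)
{
	atomic64_and(~WP_ALL_ENABLE_MASK, &kvm->arch.wp_all_indicator);
}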

Sorry, this patch is not complete. I will send patch v2 soon.

Best regards,
-Zhuang Yanying


