On Mon, 4 Feb 2013 11:50:00 -0200
Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:

> On Wed, Jan 23, 2013 at 07:18:11PM +0900, Takuya Yoshikawa wrote:
> > We noticed that kvm_mmu_zap_all() could take hundreds of milliseconds
> > for zapping mmu pages with mmu_lock held.
> >
> > Although conditional rescheduling is needed to fix this issue
> > completely, we can reduce the hold time to some extent by moving
> > free_zapped_mmu_pages() out of the protection of mmu_lock.  Since
> > invalid_list can be very long, the effect is not negligible.
> >
> > Note: this patch does not treat non-trivial cases.
> >
> > Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@xxxxxxxxxxxxx>
>
> Can you describe the case that's biting?  Is it

Non-trivial cases: a few functions that indirectly call zap_page() and
cannot access invalid_list outside of mmu_lock.  Not worth fixing, I
think.

> 	/*
> 	 * If memory slot is created, or moved, we need to clear all
> 	 * mmio sptes.
> 	 */
> 	if (npages && old.base_gfn != mem->guest_phys_addr >> PAGE_SHIFT) {
> 		kvm_mmu_zap_all(kvm);
> 		kvm_reload_remote_mmus(kvm);
> 	}
>
> Because conditional rescheduling for kvm_mmu_zap_all() might not be
> desirable: KVM_SET_USER_MEMORY has low latency requirements.

This case is problematic.  With huge pages in use, things improve to
some extent: big guests need TDP and THP anyway.

But as Avi noted once, we need a way to make long mmu_lock holders
break out of the lock to achieve lock-less TLB flushes.

Xiao's work may help the zap_all() case.  But in general, protecting
the post-zap work with a spinlock is not desirable.

I'll think about the inaccurate n_used_mmu_pages problem you pointed
out.

Thanks,
	Takuya
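
P.S.: To make the idea concrete, the shape of the change is roughly the
following (a sketch reconstructed from the description above, not the
patch itself; kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page()
are the existing helpers, and free_zapped_mmu_pages() is assumed here
to do only the actual freeing):

	struct kvm_mmu_page *sp, *node;
	LIST_HEAD(invalid_list);

	spin_lock(&kvm->mmu_lock);
restart:
	/* Unlink shadow pages and queue them on invalid_list. */
	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link)
		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
			goto restart;

	/* TLB flush and list handling still happen under mmu_lock. */
	kvm_mmu_commit_zap_page(kvm, &invalid_list);
	spin_unlock(&kvm->mmu_lock);

	/*
	 * The freeing itself is now outside the lock, so a very long
	 * invalid_list no longer extends the mmu_lock hold time.
	 */
	free_zapped_mmu_pages(kvm, &invalid_list);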
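
And for comparison, conditional rescheduling inside the zap loop would
look roughly like this, using the same declarations as above (again
only a sketch; cond_resched_lock() drops and retakes mmu_lock, which is
exactly where the extra KVM_SET_USER_MEMORY latency would come from):

	spin_lock(&kvm->mmu_lock);
restart:
	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
			goto restart;
		/*
		 * Periodically let other mmu_lock users and the
		 * scheduler in; the list may change while the lock is
		 * dropped, hence the restart.
		 */
		if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
			kvm_mmu_commit_zap_page(kvm, &invalid_list);
			cond_resched_lock(&kvm->mmu_lock);
			goto restart;
		}
	}
	kvm_mmu_commit_zap_page(kvm, &invalid_list);
	spin_unlock(&kvm->mmu_lock);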