On Mon, 4 Feb 2013 11:50:00 -0200
Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:

> On Wed, Jan 23, 2013 at 07:18:11PM +0900, Takuya Yoshikawa wrote:
> > We noticed that kvm_mmu_zap_all() could take hundreds of milliseconds
> > for zapping mmu pages with mmu_lock held.
> >
> > Although conditional rescheduling is needed to fix this issue
> > completely, we can reduce the hold time to some extent by moving
> > free_zapped_mmu_pages() out of the protection of mmu_lock.  Since
> > invalid_list can be very long, the effect is not negligible.
> >
> > Note: this patch does not treat non-trivial cases.
> >
> > Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@xxxxxxxxxxxxx>
>
> Can you describe the case that's biting?  Is it

Non-trivial cases: a few functions that indirectly call zap_page() and
cannot access invalid_list outside of mmu_lock.  Not worth fixing, I
think.

> 	/*
> 	 * If memory slot is created, or moved, we need to clear all
> 	 * mmio sptes.
> 	 */
> 	if (npages && old.base_gfn != mem->guest_phys_addr >> PAGE_SHIFT) {
> 		kvm_mmu_zap_all(kvm);
> 		kvm_reload_remote_mmus(kvm);
> 	}
>
> Because conditional rescheduling for kvm_mmu_zap_all() might not be
> desirable: KVM_SET_USER_MEMORY has low latency requirements.

This case is problematic.  With huge pages in use, things improve to
some extent: big guests need TDP and THP anyway.

But as Avi noted once, we need a way to make long mmu_lock holders
break out of the lock to achieve lock-less TLB flushes.

Xiao's work may help the zap_all() case.  But in general, protecting
the post-zap work with a spinlock is not desirable.

I'll think about the inaccurate n_used_mmu_pages problem you pointed
out.

Thanks,
	Takuya
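
P.S.: To make the idea concrete, the shape of the change is roughly the
following (a sketch reconstructed from the description above, not the
patch itself; kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page()
are the existing helpers, and free_zapped_mmu_pages() is assumed here
to do only the actual freeing):

	struct kvm_mmu_page *sp, *node;
	LIST_HEAD(invalid_list);

	spin_lock(&kvm->mmu_lock);
restart:
	/* Unlink shadow pages and queue them on invalid_list. */
	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link)
		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
			goto restart;

	/* TLB flush and list handling still happen under mmu_lock. */
	kvm_mmu_commit_zap_page(kvm, &invalid_list);
	spin_unlock(&kvm->mmu_lock);

	/*
	 * The freeing itself is now outside the lock, so a very long
	 * invalid_list no longer extends the mmu_lock hold time.
	 */
	free_zapped_mmu_pages(kvm, &invalid_list);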
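
And for comparison, conditional rescheduling inside the zap loop would
look roughly like this, using the same declarations as above (again
only a sketch; cond_resched_lock() drops and retakes mmu_lock, which is
exactly where the extra KVM_SET_USER_MEMORY latency would come from):

	spin_lock(&kvm->mmu_lock);
restart:
	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
			goto restart;
		/*
		 * Periodically let other mmu_lock users and the
		 * scheduler in; the list may change while the lock is
		 * dropped, hence the restart.
		 */
		if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
			kvm_mmu_commit_zap_page(kvm, &invalid_list);
			cond_resched_lock(&kvm->mmu_lock);
			goto restart;
		}
	}
	kvm_mmu_commit_zap_page(kvm, &invalid_list);
	spin_unlock(&kvm->mmu_lock);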