Hi! On Fri, Jan 10, 2014 at 06:38:06PM +0200, Izik Eidus wrote: > It look like commit 6bdb913f0a70a4dfb7f066fb15e2d6f960701d00 break the > semantic of set_pte_at_notify. > The change of calling first to mmu_notifier_invalidate_range_start, then > to set_pte_at_notify, and then to mmu_notifier_invalidate_range_end > not only increase the amount of locks kvm have to take and release by > factor of 3, but in addition mmu_notifier_invalidate_range_start is zapping > the pte entry from kvm, so when set_pte_at_notify get called, it doesn`t > have any spte to set and it acctuly get called for nothing, the result is > increasing of vmexits for kvm from both do_wp_page and replace_page, and > broken semantic of set_pte_at_notify. Agreed. I would suggest to change set_pte_at_notify to return if change_pte was missing in some mmu notifier attached to this mm, so we can do something like: ptep = page_check_address(page, mm, addr, &ptl, 0); [..] notify_missing = false; if (... ) { entry = ptep_clear_flush(...); [..] notify_missing = set_pte_at_notify(mm, addr, ptep, entry); } pte_unmap_unlock(ptep, ptl); if (notify_missing) mmu_notifier_invalidate_page_if_missing_change_pte(mm, addr); and drop the range calls. This will provide sleepability and at the same time it won't screw the ability of change_pte to update sptes (by leaving those established by the time change_pte runs). This assuming the mutex are going to stay mutex for anon_vma lock and i_mmap_mutex as I hope. Otherwise the commit could be as well reverted, it would be pointless then to try to keep the invalidate_page call outside the PT lock if all other invalidate_page calls are inside rmap spinlocks. I think giving a runtime or compiler option to switch the locks to spinlocks is just fine, cellphones I think would be better off with those locks as spinlocks for example, but completely removing the ability to run those locks as mutex even on server setups, doesn't look a too attractive development to me. A build option especially wouldn't be too painful to maintain. So I'd be positive for an update like above to retain the sleeability feature but without harming change_pte users like KVM anymore. Thanks! Andrea -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html