Re: set_pte_at_notify regression

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi!

On Fri, Jan 10, 2014 at 06:38:06PM +0200, Izik Eidus wrote:
> It look like commit 6bdb913f0a70a4dfb7f066fb15e2d6f960701d00 break the 
> semantic of set_pte_at_notify.
> The change of calling first to mmu_notifier_invalidate_range_start, then 
> to set_pte_at_notify, and then to mmu_notifier_invalidate_range_end
> not only increase the amount of locks kvm have to take and release by 
> factor of 3, but in addition mmu_notifier_invalidate_range_start is zapping
> the pte entry from kvm, so when set_pte_at_notify get called, it doesn`t 
> have any spte to set and it acctuly get called for nothing, the result is
> increasing of vmexits for kvm from both do_wp_page and replace_page, and 
> broken semantic of set_pte_at_notify.

Agreed.

I would suggest to change set_pte_at_notify to return if change_pte
was missing in some mmu notifier attached to this mm, so we can do
something like:

   ptep = page_check_address(page, mm, addr, &ptl, 0);
   [..]
   notify_missing = false;
   if (... ) {
      	entry = ptep_clear_flush(...);
        [..]
	notify_missing = set_pte_at_notify(mm, addr, ptep, entry);
   }
   pte_unmap_unlock(ptep, ptl);
   if (notify_missing)
   	mmu_notifier_invalidate_page_if_missing_change_pte(mm, addr);

and drop the range calls. This will provide sleepability and at the
same time it won't screw the ability of change_pte to update sptes (by
leaving those established by the time change_pte runs).

This assuming the mutex are going to stay mutex for anon_vma lock and
i_mmap_mutex as I hope. Otherwise the commit could be as well
reverted, it would be pointless then to try to keep the
invalidate_page call outside the PT lock if all other invalidate_page
calls are inside rmap spinlocks.

I think giving a runtime or compiler option to switch the locks to
spinlocks is just fine, cellphones I think would be better off with
those locks as spinlocks for example, but completely removing the
ability to run those locks as mutex even on server setups, doesn't
look a too attractive development to me. A build option especially
wouldn't be too painful to maintain. So I'd be positive for an update
like above to retain the sleeability feature but without harming
change_pte users like KVM anymore.

Thanks!
Andrea
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux