On Tue, Apr 03, 2018 at 01:40:18PM -0700, David Rientjes wrote: > On Tue, 3 Apr 2018, Jerome Glisse wrote: > > > > diff --git a/mm/memory.c b/mm/memory.c > > > index 21b1212a0892..4bc7b0bdcb40 100644 > > > --- a/mm/memory.c > > > +++ b/mm/memory.c > > > @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf) > > > * parts, do_swap_page must check under lock before unmapping the pte and > > > * proceeding (but do_wp_page is only called after already making such a check; > > > * and do_anonymous_page can safely check later on). > > > + * > > > + * pte_unmap_same() returns: > > > + * 0 if the PTE are the same > > > + * VM_FAULT_PTNOTSAME if the PTE are different > > > + * VM_FAULT_RETRY if the VMA has changed in our back during > > > + * a speculative page fault handling. > > > */ [...] > > > > > > > This change what do_swap_page() returns ie before it was returning 0 > > when locked pte lookup was different from orig_pte. After this patch > > it returns VM_FAULT_PTNOTSAME but this is a new return value for > > handle_mm_fault() (the do_swap_page() return value is what ultimately > > get return by handle_mm_fault()) > > > > Do we really want that ? This might confuse some existing user of > > handle_mm_fault() and i am not sure of the value of that information > > to caller. > > > > Note i do understand that you want to return retry if anything did > > change from underneath and thus need to differentiate from when the > > pte value are not the same. > > > > I think VM_FAULT_RETRY should be handled appropriately for any user of > handle_mm_fault() already, and would be surprised to learn differently. > Khugepaged has the appropriate handling. I think the concern is whether a > user is handling anything other than VM_FAULT_RETRY and VM_FAULT_ERROR > (which VM_FAULT_PTNOTSAME is not set in)? I haven't done a full audit of > the users. I am not worried about VM_FAULT_RETRY and barely have any worry about VM_FAULT_PTNOTSAME either as they are other comparable new return value (VM_FAULT_NEEDDSYNC for instance which is quite recent). I wonder if adding a new value is really needed here. I don't see any value to it for caller of handle_mm_fault() except for stats. Note that I am not oppose, but while today we have free bits, maybe tomorrow we will run out, i am always worried about thing like that :) Cheers, Jérôme