On Fri, Dec 15, 2017 at 3:38 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, Dec 15, 2017 at 11:25:29AM +0100, Peter Zijlstra wrote:
>> The memory one is also clearly wrong, not having access does not a write
>> fault make. If we have pte_write() set we should not do_wp_page() just
>> because we don't have access. This falls under the "doing anything other
>> than hard failure for !access is crazy" header.
>
> So per the very same reasoning I think the below is warranted too; also
> rename that @dirty variable, because its also wrong.
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 5eb3d2524bdc..0d43b347eb0a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3987,7 +3987,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>  		.pgoff = linear_page_index(vma, address),
>  		.gfp_mask = __get_fault_gfp_mask(vma),
>  	};
> -	unsigned int dirty = flags & FAULT_FLAG_WRITE;
> +	unsigned int write = flags & FAULT_FLAG_WRITE;
>  	struct mm_struct *mm = vma->vm_mm;
>  	pgd_t *pgd;
>  	p4d_t *p4d;
> @@ -4013,7 +4013,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>
>  		/* NUMA case for anonymous PUDs would go here */
>
> -		if (dirty && !pud_access_permitted(orig_pud, WRITE)) {
> +		if (write && !pud_write(orig_pud)) {
>  			ret = wp_huge_pud(&vmf, orig_pud);
>  			if (!(ret & VM_FAULT_FALLBACK))
>  				return ret;
> @@ -4046,7 +4046,7 @@ static int __handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>  		if (pmd_protnone(orig_pmd) && vma_is_accessible(vma))
>  			return do_huge_pmd_numa_page(&vmf, orig_pmd);
>
> -		if (dirty && !pmd_access_permitted(orig_pmd, WRITE)) {
> +		if (write && !pmd_write(orig_pmd)) {
>  			ret = wp_huge_pmd(&vmf, orig_pmd);
>  			if (!(ret & VM_FAULT_FALLBACK))
>  				return ret;
>
>
> I still cannot make sense of what the intention behind these changes
> were, the Changelog that went with them is utter crap, it doesn't
> explain anything.

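For anyone following along, the distinction the quoted patch turns on
can be modelled in a few lines of userspace C. This is only a toy
sketch, not kernel code: struct toy_entry, toy_write(),
toy_access_permitted(), decide_old() and decide_new() are hypothetical
names invented for illustration, and the single pkey_allows_write flag
is an assumed simplification of whatever extra restrictions a
p??_access_permitted() check applies beyond the write bit.

```c
#include <stdbool.h>

/* Toy stand-in for a page-table entry; NOT the kernel's pte/pmd/pud types. */
struct toy_entry {
	bool present;           /* entry maps a page */
	bool writable;          /* models the bit a p??_write() check tests */
	bool pkey_allows_write; /* assumed extra restriction, e.g. a protection key */
};

/* Models a p??_write()-style check: only the write bit matters. */
static bool toy_write(struct toy_entry e)
{
	return e.writable;
}

/* Models a p??_access_permitted()-style check: write bit plus extra rules. */
static bool toy_access_permitted(struct toy_entry e, bool write)
{
	if (!e.present)
		return false;
	if (write && !e.writable)
		return false;
	if (write && !e.pkey_allows_write)
		return false;
	return true;
}

enum toy_action {
	TOY_NO_WP,	/* do not take the write-protect path */
	TOY_WP_FAULT,	/* take the write-protect path (wp_huge_*() in the diff) */
};

/* Condition before the quoted patch: any lack of write access counts. */
static enum toy_action decide_old(struct toy_entry e, bool write_fault)
{
	if (write_fault && !toy_access_permitted(e, true))
		return TOY_WP_FAULT;
	return TOY_NO_WP;
}

/* Condition after the quoted patch: only a clear write bit counts. */
static enum toy_action decide_new(struct toy_entry e, bool write_fault)
{
	if (write_fault && !toy_write(e))
		return TOY_WP_FAULT;
	return TOY_NO_WP;
}
```

For an entry that is present and writable but has write access denied
by the extra restriction, decide_old() asks for write-protect handling
while decide_new() does not. For an entry whose write bit is simply
clear, both agree.
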
The motivation was that I noticed get_user_pages_fast() was doing a
full pud_access_permitted() check, while the get_user_pages() slow path
was only doing a pud_write() check. That was inconsistent, so I set out
to resolve it across all the page-table entry types and ended up making
a mess of things. I'm fine if the answer is that we should have gone
the other way and only done write checks.

However, when I was investigating which way to go, what persuaded me to
start sprinkling p??_access_permitted() checks around was that
application behavior differed between mmap access and direct-I/O access
to the same buffer. I assumed that different access behavior between
those two paths would be an inconsistent surprise to userspace. That
said, looping infinitely in handle_mm_fault() is an even worse
surprise; apologies for that.