On Thu, Jan 27, 2022 at 05:19:56PM +0800, Peter Xu wrote:
> > > > diff --git a/mm/gup.c b/mm/gup.c
> > > > index f0af462ac1e2..8ebc04058e97 100644
> > > > +++ b/mm/gup.c
> > > > @@ -440,7 +440,7 @@ static int follow_pfn_pte(struct vm_area_struct *vma, unsigned long address,
> > > >                 pte_t *pte, unsigned int flags)
> > > >  {
> > > >         /* No page to get reference */
> > > > -       if (flags & FOLL_GET)
> > > > +       if (flags & (FOLL_GET | FOLL_PIN))
> > > >                 return -EFAULT;
> > >
> > > Yes. This clearly fixes the problem that the patch describes, and also
> > > clearly matches up with the Fixes tag. So that's correct.
> >
> > It is really confusing though, why not just always return -EEXIST
> > here?
>
> Because in current code GUP handles -EEXIST and -EFAULT differently?

That has nothing to do with here. We shouldn't be deciding what the
top layer does way down here. Return the correct error code for what
was discovered at this layer; the upper loop should decide what to do
with it.

> We do early bail out on -EFAULT. -EEXIST was first introduced in 2015 from
> Kirill for not failing some mlock() or mmap(MAP_POPULATE) on dax (1027e4436b6).
> Then in 2017 it got used again with pud-sized thp (a00cc7d9dd93d) on dax too.
> They seem to service the same goal and it seems to be designed that -EEXIST
> shouldn't fail GUP immediately.

It must fail GUP immediately if there is a pages list. Callers that do
not want an early failure must pass in NULL for pages; it is just that
simple. It has nothing to do with the FOLL flags.

A WARN_ON would be appropriate to compare the FOLL flags against
pages, e.g. FOLL_GET without a pages array is nonsense and should be
immediately aborted. On the other hand, we already avoid this by
construction internal to gup.c.

> > > Here, however, I think we need to consider this a little more carefully,
> > > and attempt to actually fix up this case.
> > > It is never going to be OK here, to return a **pages array that
> > > has these little landmines of potentially uninitialized pointers.
> > > And so continuing on *at all* seems very wrong.
> >
> > Indeed, it should just be like this:
> >
> > @@ -1182,6 +1182,10 @@ static long __get_user_pages(struct mm_struct *mm,
> >                          * Proper page table entry exists, but no corresponding
> >                          * struct page.
> >                          */
> > +                       if (pages) {
> > +                               page = ERR_PTR(-EFAULT);
> > +                               goto out;
> > +                       }
> >                         goto next_page;
> >                 } else if (IS_ERR(page)) {
> >                         ret = PTR_ERR(page);
>
> IIUC not failing -EEXIST immediately seems to be what we want.

Which is what this does, for the only case where it is acceptable: a
NULL pages list.

> From that POV, WARN_ON_ONCE() helps better on exposing an illegal return of
> -EEXIST (as mentioned in the commit message) than the -EFAULT conversion, IMHO.

Again, that is upside down. -EEXIST should not be an illegal return.
It should be valid, with a defined meaning: 'the vaddr exists but has
no struct page'. The top loop, and only the top loop, decides what to
do about it.

Jason
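
A rough userspace sketch of the layering argued for above. It is not
kernel code and not the actual patch: struct fake_page, enum slot_kind,
lookup_slot() and walk_range() are hypothetical stand-ins for struct
page, the page table state, the follow_*() helpers and the
__get_user_pages() loop. The low layer only reports what it found, and
the top loop alone decides what -EEXIST means, based on whether a pages
array was passed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct fake_page { int id; };

/* Hypothetical state of one address slot. */
enum slot_kind { SLOT_NORMAL, SLOT_PFN_ONLY, SLOT_UNMAPPED };

static struct fake_page backing[4] = { {0}, {1}, {2}, {3} };

/*
 * Low layer: report what was found at this address and nothing more.
 *   0       -> *page is set
 *   -EEXIST -> the address is mapped but has no struct page
 *   -EFAULT -> nothing is mapped
 */
int lookup_slot(const enum slot_kind *slots, long i, struct fake_page **page)
{
	switch (slots[i]) {
	case SLOT_NORMAL:
		*page = &backing[i % 4];
		return 0;
	case SLOT_PFN_ONLY:
		return -EEXIST;
	default:
		return -EFAULT;
	}
}

/*
 * Top loop: the only place that interprets -EEXIST.
 * With a pages array every slot must be filled, so stop right away.
 * With pages == NULL there is nothing to store, so skip the slot.
 * Returns the number of slots handled, or a negative errno if none were.
 */
long walk_range(const enum slot_kind *slots, long nr, struct fake_page **pages)
{
	long i;

	assert(slots != NULL && nr >= 0);

	for (i = 0; i < nr; i++) {
		struct fake_page *page = NULL;
		int ret = lookup_slot(slots, i, &page);

		if (ret == -EEXIST && !pages)
			continue;	/* no array to fill, keep going */
		if (ret == -EEXIST)
			ret = -EFAULT;	/* never leave a hole in pages[] */
		if (ret)
			return i ? i : ret;
		if (pages)
			pages[i] = page;
	}
	return i;
}
```

In this sketch, a range of { SLOT_NORMAL, SLOT_PFN_ONLY, SLOT_NORMAL }
stops after the first slot when a pages array is supplied, and walks
all three slots when pages is NULL.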