Alex reported invalid page pointer returned with pin_user_pages_remote() from vfio after upstream commit 4b6c33b32296 ("vfio/type1: Prepare for batched pinning with struct vfio_batch"). This problem breaks NVIDIA vfio mdev. It turns out that it's not the fault of the vfio commit; however after vfio switches to a full page buffer to store the page pointers it starts to expose the problem easier. The problem is for VM_PFNMAP vmas we should normally fail with an -EFAULT then vfio will carry on to handle the MMIO regions. However when the bug triggered, follow_page_mask() returned -EEXIST for such a page, which will jump over the current page, leaving that entry in **pages untouched. However the caller is not aware of it, hence the caller will reference the page as usual even if the pointer data can be anything. We had that -EEXIST logic since commit 1027e4436b6a ("mm: make GUP handle pfn mapping unless FOLL_GET is requested") which seems very reasonable. It could be that when we reworked GUP with FOLL_PIN we could have overlooked that special path in commit 3faa52c03f44 ("mm/gup: track FOLL_PIN pages"), even if that commit rightfully touched up follow_devmap_pud() on checking FOLL_PIN when it needs to return an -EEXIST. Since at it, add another WARN_ON_ONCE() at the -EEXIST handling to make sure we mustn't have **pages set when reaching there, because otherwise it means the caller will try to read a garbage right after __get_user_pages() returns. Attaching the Fixes to the FOLL_PIN rework commit, as it happened later than 1027e4436b6a. Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Cc: Jan Kara <jack@xxxxxxx> Cc: Jérôme Glisse <jglisse@xxxxxxxxxx> Cc: John Hubbard <jhubbard@xxxxxxxxxx> Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages") Reported-by: Alex Williamson <alex.williamson@xxxxxxxxxx> Debugged-by: Alex Williamson <alex.williamson@xxxxxxxxxx> Signed-off-by: Peter Xu <peterx@xxxxxxxxxx> --- mm/gup.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/mm/gup.c b/mm/gup.c index f0af462ac1e2..8ebc04058e97 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -440,7 +440,7 @@ static int follow_pfn_pte(struct vm_area_struct *vma, unsigned long address, pte_t *pte, unsigned int flags) { /* No page to get reference */ - if (flags & FOLL_GET) + if (flags & (FOLL_GET | FOLL_PIN)) return -EFAULT; if (flags & FOLL_TOUCH) { @@ -1181,7 +1181,13 @@ static long __get_user_pages(struct mm_struct *mm, /* * Proper page table entry exists, but no corresponding * struct page. + * + * Warn if we jumped over even with a valid **pages. + * It shouldn't trigger in practise, but when there's + * buggy returns on -EEXIST we'll warn before returning + * an invalid page pointer in the array. */ + WARN_ON_ONCE(pages); goto next_page; } else if (IS_ERR(page)) { ret = PTR_ERR(page); -- 2.32.0