The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@xxxxxxxxxxxxxxx>.

Possible dependencies:

873aefb376bb ("vfio/type1: Unpin zero pages")
4b6c33b32296 ("vfio/type1: Prepare for batched pinning with struct vfio_batch")
be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()")
aae7a75a821a ("vfio/type1: Add proper error unwind for vfio_iommu_replay()")
64019a2e467a ("mm/gup: remove task_struct pointer for all gup code")
bce617edecad ("mm: do page fault accounting in handle_mm_fault")
ed03d924587e ("mm/gup: use a standard migration target allocation callback")
bbe88753bd42 ("mm/hugetlb: make hugetlb migration callback CMA aware")
41b4dc14ee80 ("mm/gup: restrict CMA region by using allocation scope API")
19fc7bed252c ("mm/migrate: introduce a standard migration target allocation function")
d92bbc2719bd ("mm/hugetlb: unify migration callbacks")
b4b382238ed2 ("mm/migrate: move migration helper from .h to .c")
c7073bab5772 ("mm/page_isolation: prefer the node of the source page")
3e4e28c5a8f0 ("mmap locking API: convert mmap_sem API comments")
d8ed45c5dcd4 ("mmap locking API: use coccinelle to convert mmap_sem rwsem call sites")
ca5999fde0a1 ("mm: introduce include/linux/pgtable.h")
420c2091b65a ("mm/gup: introduce pin_user_pages_locked()")
5a36f0f3f518 ("Merge tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio")

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4 Mon Sep 17 00:00:00 2001
From: Alex Williamson <alex.williamson@xxxxxxxxxx>
Date: Mon, 29 Aug 2022 21:05:40 -0600
Subject: [PATCH] vfio/type1: Unpin zero pages

There's currently a reference count leak on the zero page.  We increment
the reference via pin_user_pages_remote(), but the page is later handled
as an invalid/reserved page, therefore it's not accounted against the
user and not unpinned by our put_pfn().

Introducing special zero page handling in put_pfn() would resolve the
leak, but without accounting of the zero page, a single user could
still create enough mappings to generate a reference count overflow.

The zero page is always resident, so for our purposes there's no reason
to keep it pinned.  Therefore, add a loop to walk pages returned from
pin_user_pages_remote() and unpin any zero pages.

Cc: stable@xxxxxxxxxxxxxxx
Reported-by: Luboslav Pivarc <lpivarc@xxxxxxxxxx>
Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
Link: https://lore.kernel.org/r/166182871735.3518559.8884121293045337358.stgit@omen
Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index db516c90a977..8706482665d1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -558,6 +558,18 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
 	ret = pin_user_pages_remote(mm, vaddr, npages, flags | FOLL_LONGTERM,
 				    pages, NULL, NULL);
 	if (ret > 0) {
+		int i;
+
+		/*
+		 * The zero page is always resident, we don't need to pin it
+		 * and it falls into our invalid/reserved test so we don't
+		 * unpin in put_pfn().  Unpin all zero pages in the batch here.
+		 */
+		for (i = 0 ; i < ret; i++) {
+			if (unlikely(is_zero_pfn(page_to_pfn(pages[i]))))
+				unpin_user_page(pages[i]);
+		}
+
 		*pfn = page_to_pfn(pages[0]);
 		goto done;
 	}
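
For anyone preparing a backport, the leak described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct model_page, model_pin(), model_put() and model_leaked_refs() are hypothetical stand-ins invented here for the page, pin_user_pages_remote(), put_pfn() and the pin/unpin cycle, and a plain counter stands in for the real pin accounting.

```c
/* Userspace model only: none of these helpers are the kernel's real code. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_PAGES 16

struct model_page {
	long refcount;	/* stand-in for the page's reference count */
	bool reserved;	/* treated as invalid/reserved on the put path */
	bool zero;	/* marks the shared zero page */
};

/* Hypothetical stand-in for pin_user_pages_remote(): one reference per entry. */
static void model_pin(struct model_page **pages, int npages)
{
	for (int i = 0; i < npages; i++)
		pages[i]->refcount++;
}

/* Hypothetical stand-in for put_pfn(): reserved pages are skipped, never unpinned. */
static bool model_put(struct model_page *page)
{
	if (page->reserved)
		return false;
	page->refcount--;
	return true;
}

/*
 * Pin a batch in which every entry is the zero page, optionally drop the
 * zero-page pins right away (the approach taken by the patch), then run the
 * put path.  Returns the references still held on the zero page afterwards.
 */
static long model_leaked_refs(int npages, bool unpin_zero_pages)
{
	struct model_page zero_page = { .refcount = 0, .reserved = true, .zero = true };
	struct model_page *pages[MODEL_MAX_PAGES];

	assert(npages >= 0 && npages <= MODEL_MAX_PAGES);

	for (int i = 0; i < npages; i++)
		pages[i] = &zero_page;

	model_pin(pages, npages);

	if (unpin_zero_pages) {
		for (int i = 0; i < npages; i++) {
			if (pages[i]->zero)
				pages[i]->refcount--;	/* models unpin_user_page() */
		}
	}

	/* Teardown: the put path skips the zero page because it is reserved. */
	for (int i = 0; i < npages; i++)
		model_put(pages[i]);

	return zero_page.refcount;
}
```

In this model, model_leaked_refs(4, false) leaves four references on the zero page, while model_leaked_refs(4, true) leaves none, which mirrors why the patch unpins zero pages immediately after pinning instead of relying on put_pfn().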