The patch titled Subject: vpda: try to fix the potential crash due to misusing __GFP_NOFAIL has been added to the -mm mm-unstable branch. Its filename is vpda-try-to-fix-the-potential-crash-due-to-misusing-__gfp_nofail.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/vpda-try-to-fix-the-potential-crash-due-to-misusing-__gfp_nofail.patch This patch will later appear in the mm-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Barry Song <v-songbaohua@xxxxxxxx> Subject: vpda: try to fix the potential crash due to misusing __GFP_NOFAIL Date: Wed, 31 Jul 2024 12:01:52 +1200 Patch series "mm: clarify nofail memory allocation", v2. __GFP_NOFAIL carries the semantics of never failing, so its callers do not check the return value: %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller cannot handle allocation failures. The allocation could block indefinitely but will never return with failure. Testing for failure is pointless. However, __GFP_NOFAIL can sometimes fail if it exceeds size limits or is used with GFP_ATOMIC/GFP_NOWAIT in a non-sleepable context. This can expose security vulnerabilities due to potential NULL dereferences. Since __GFP_NOFAIL does not support non-blocking allocation, we introduce GFP_NOFAIL with inclusive blocking semantics and encourage using GFP_NOFAIL as a replacement for __GFP_NOFAIL in non-mm. If we must still fail a nofail allocation, we should trigger a BUG rather than exposing NULL dereferences to callers who do not check the return value. * The discussion started from this topic: [PATCH RFC] mm: warn potential return NULL for kmalloc_array and kvmalloc_array with __GFP_NOFAIL https://lore.kernel.org/linux-mm/20240717230025.77361-1-21cnbao@xxxxxxxxx/ This patch (of 4): mm doesn't support non-blockable __GFP_NOFAIL allocation. Because __GFP_NOFAIL without direct reclamation may just result in a busy loop within non-sleepable contexts. static inline struct page * __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order, struct alloc_context *ac) { ... /* * Make sure that __GFP_NOFAIL request doesn't leak out and make sure * we always retry */ if (gfp_mask & __GFP_NOFAIL) { /* * All existing users of the __GFP_NOFAIL are blockable, so warn * of any new users that actually require GFP_NOWAIT */ if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask)) goto fail; ... } ... fail: warn_alloc(gfp_mask, ac->nodemask, "page allocation failure: order:%u", order); got_pg: return page; } Let's move the memory allocation out of the atomic context and use the normal sleepable context to get pages. Link: https://lkml.kernel.org/r/20240731000155.109583-1-21cnbao@xxxxxxxxx Link: https://lkml.kernel.org/r/20240731000155.109583-2-21cnbao@xxxxxxxxx Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx> Cc: "Michael S. Tsirkin" <mst@xxxxxxxxxx> Cc: Jason Wang <jasowang@xxxxxxxxxx> Cc: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> Cc: "Eugenio Pérez" <eperezma@xxxxxxxxxx> Cc: Maxime Coquelin <maxime.coquelin@xxxxxxxxxx> Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx> Cc: Christoph Lameter <cl@xxxxxxxxx> Cc: David Rientjes <rientjes@xxxxxxxxxx> Cc: Hailong.Liu <hailong.liu@xxxxxxxx> Cc: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Cc: Lorenzo Stoakes <lstoakes@xxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxx> Cc: Pekka Enberg <penberg@xxxxxxxxxx> Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx> Cc: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx> Cc: Vlastimil Babka <vbabka@xxxxxxx> Cc: Kees Cook <kees@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- drivers/vdpa/vdpa_user/iova_domain.c | 31 ++++++++++++++++++++----- drivers/vdpa/vdpa_user/iova_domain.h | 5 +++- drivers/vdpa/vdpa_user/vduse_dev.c | 4 ++- 3 files changed, 33 insertions(+), 7 deletions(-) --- a/drivers/vdpa/vdpa_user/iova_domain.c~vpda-try-to-fix-the-potential-crash-due-to-misusing-__gfp_nofail +++ a/drivers/vdpa/vdpa_user/iova_domain.c @@ -283,7 +283,23 @@ out: return ret; } -void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain) +struct page **vduse_domain_alloc_pages_to_remove_bounce(struct vduse_iova_domain *domain) +{ + struct page **pages; + unsigned long count, i; + + if (!domain->user_bounce_pages) + return NULL; + + count = domain->bounce_size >> PAGE_SHIFT; + pages = kmalloc_array(count, sizeof(*pages), GFP_KERNEL | __GFP_NOFAIL); + for (i = 0; i < count; i++) + pages[i] = alloc_page(GFP_KERNEL | __GFP_NOFAIL); + + return pages; +} + +void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain, struct page **pages) { struct vduse_bounce_map *map; unsigned long i, count; @@ -294,15 +310,16 @@ void vduse_domain_remove_user_bounce_pag count = domain->bounce_size >> PAGE_SHIFT; for (i = 0; i < count; i++) { - struct page *page = NULL; + struct page *page = pages[i]; map = &domain->bounce_maps[i]; - if (WARN_ON(!map->bounce_page)) + if (WARN_ON(!map->bounce_page)) { + put_page(page); continue; + } /* Copy user page to kernel page if it's in use */ if (map->orig_phys != INVALID_PHYS_ADDR) { - page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL); memcpy_from_page(page_address(page), map->bounce_page, 0, PAGE_SIZE); } @@ -310,6 +327,7 @@ void vduse_domain_remove_user_bounce_pag map->bounce_page = page; } domain->user_bounce_pages = false; + kfree(pages); out: write_unlock(&domain->bounce_lock); } @@ -543,10 +561,13 @@ static int vduse_domain_mmap(struct file static int vduse_domain_release(struct inode *inode, struct file *file) { struct vduse_iova_domain *domain = file->private_data; + struct page **pages; + + pages = vduse_domain_alloc_pages_to_remove_bounce(domain); spin_lock(&domain->iotlb_lock); vduse_iotlb_del_range(domain, 0, ULLONG_MAX); - vduse_domain_remove_user_bounce_pages(domain); + vduse_domain_remove_user_bounce_pages(domain, pages); vduse_domain_free_kernel_bounce_pages(domain); spin_unlock(&domain->iotlb_lock); put_iova_domain(&domain->stream_iovad); --- a/drivers/vdpa/vdpa_user/iova_domain.h~vpda-try-to-fix-the-potential-crash-due-to-misusing-__gfp_nofail +++ a/drivers/vdpa/vdpa_user/iova_domain.h @@ -74,7 +74,10 @@ void vduse_domain_reset_bounce_map(struc int vduse_domain_add_user_bounce_pages(struct vduse_iova_domain *domain, struct page **pages, int count); -void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain); +void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain, + struct page **pages); + +struct page **vduse_domain_alloc_pages_to_remove_bounce(struct vduse_iova_domain *domain); void vduse_domain_destroy(struct vduse_iova_domain *domain); --- a/drivers/vdpa/vdpa_user/vduse_dev.c~vpda-try-to-fix-the-potential-crash-due-to-misusing-__gfp_nofail +++ a/drivers/vdpa/vdpa_user/vduse_dev.c @@ -1030,6 +1030,7 @@ unlock: static int vduse_dev_dereg_umem(struct vduse_dev *dev, u64 iova, u64 size) { + struct page **pages; int ret; mutex_lock(&dev->mem_lock); @@ -1044,7 +1045,8 @@ static int vduse_dev_dereg_umem(struct v if (dev->umem->iova != iova || size != dev->domain->bounce_size) goto unlock; - vduse_domain_remove_user_bounce_pages(dev->domain); + pages = vduse_domain_alloc_pages_to_remove_bounce(dev->domain); + vduse_domain_remove_user_bounce_pages(dev->domain, pages); unpin_user_pages_dirty_lock(dev->umem->pages, dev->umem->npages, true); atomic64_sub(dev->umem->npages, &dev->umem->mm->pinned_vm); _ Patches currently in -mm which might be from v-songbaohua@xxxxxxxx are mm-extend-usage-parameter-so-that-cluster_swap_free_nr-can-be-reused.patch mm-swap-add-nr-argument-in-swapcache_prepare-and-swapcache_clear-to-support-large-folios.patch vpda-try-to-fix-the-potential-crash-due-to-misusing-__gfp_nofail.patch mm-document-__gfp_nofail-must-be-blockable.patch mm-bug_on-to-avoid-null-deference-while-__gfp_nofail-fails.patch mm-prohibit-null-deference-exposed-for-unsupported-non-blockable-__gfp_nofail.patch