From: Barry Song <v-songbaohua@xxxxxxxx> We have cases we still fail though callers might have __GFP_NOFAIL. Since they don't check the return, we are exposed to the security risks for NULL deference. Though BUG_ON() is not encouraged by Linus, this is an unrecoverable situation. Christoph Hellwig: The whole freaking point of __GFP_NOFAIL is that callers don't handle allocation failures. So in fact a straight BUG is the right thing here. Vlastimil Babka: It's just not a recoverable situation (WARN_ON is for recoverable situations). The caller cannot handle allocation failure and at the same time asked for an impossible allocation. BUG_ON() is a guaranteed oops with stracktrace etc. We don't need to hope for the later NULL pointer dereference (which might if really unlucky happen from a different context where it's no longer obvious what lead to the allocation failing). Michal Hocko: Linus tends to be against adding new BUG() calls unless the failure is absolutely unrecoverable (e.g. corrupted data structures etc.). I am not sure how he would look at simply incorrect memory allocator usage to blow up the kernel. Now the argument could be made that those failures could cause subtle memory corruptions or even be exploitable which might be a sufficient reason to stop them early. Cc: Michal Hocko <mhocko@xxxxxxxx> Cc: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx> Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx> Cc: Lorenzo Stoakes <lstoakes@xxxxxxxxx> Cc: Christoph Lameter <cl@xxxxxxxxx> Cc: Pekka Enberg <penberg@xxxxxxxxxx> Cc: David Rientjes <rientjes@xxxxxxxxxx> Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> Cc: Vlastimil Babka <vbabka@xxxxxxx> Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx> Cc: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Cc: Kees Cook <kees@xxxxxxxxxx> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx> --- include/linux/slab.h | 4 +++- mm/page_alloc.c | 10 +++++----- mm/util.c | 1 + 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index c9cb42203183..4a4d1fdc2afe 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -827,8 +827,10 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) { size_t bytes; - if (unlikely(check_mul_overflow(n, size, &bytes))) + if (unlikely(check_mul_overflow(n, size, &bytes))) { + BUG_ON(flags & __GFP_NOFAIL); return NULL; + } return kvmalloc_node_noprof(bytes, flags, node); } diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 45d2f41b4783..4d6af00fccd4 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4435,11 +4435,11 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order, */ if (gfp_mask & __GFP_NOFAIL) { /* - * All existing users of the __GFP_NOFAIL are blockable, so warn - * of any new users that actually require GFP_NOWAIT + * All existing users of the __GFP_NOFAIL are blockable + * otherwise we introduce a busy loop with inside the page + * allocator from non-sleepable contexts */ - if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask)) - goto fail; + BUG_ON(!can_direct_reclaim); /* * PF_MEMALLOC request from this context is rather bizarre @@ -4470,7 +4470,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order, cond_resched(); goto retry; } -fail: + warn_alloc(gfp_mask, ac->nodemask, "page allocation failure: order:%u", order); got_pg: diff --git a/mm/util.c b/mm/util.c index 0ff5898cc6de..a1be50c243f1 100644 --- a/mm/util.c +++ b/mm/util.c @@ -668,6 +668,7 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node) /* Don't even allow crazy sizes */ if (unlikely(size > INT_MAX)) { WARN_ON_ONCE(!(flags & __GFP_NOWARN)); + BUG_ON(flags & __GFP_NOFAIL); return NULL; } -- 2.34.1